Proceedings of the National Academy of Sciences of the United States of America. 2016 Sep 19;113(38):10613–10618. doi: 10.1073/pnas.1602575113

Convergent evolution of caffeine in plants by co-option of exapted ancestral enzymes

Ruiqi Huang a, Andrew J O’Donnell a,1, Jessica J Barboline a, Todd J Barkman a,2
PMCID: PMC5035902  PMID: 27638206

Significance

Convergent evolution is responsible for generating similar traits in unrelated organisms, such as wings that allow flight in birds and bats. In plants, one of the most prominent examples of convergence is that of caffeine production, which has independently evolved in numerous species. In this study, we reveal that even though the caffeine molecule is identical in the cacao, citrus, guaraná, coffee, and tea lineages, it is produced by different, previously unknown, biosynthetic pathways. Furthermore, by resurrecting extinct enzymes that ancient plants once possessed, we show that the novel pathways would have evolved rapidly because the ancestral enzymes were co-opted from previous biochemical roles to those of caffeine biosynthesis for which they were already primed.

Keywords: convergent evolution, caffeine biosynthesis, enzyme evolution, paleomolecular biology

Abstract

Convergent evolution is a process that has occurred throughout the tree of life, but the historical genetic and biochemical context promoting the repeated independent origins of a trait is rarely understood. The well-known stimulant caffeine, and its xanthine alkaloid precursors, has evolved multiple times in flowering plant history for various roles in plant defense and pollination. We have shown that convergent caffeine production, surprisingly, has evolved by two previously unknown biochemical pathways in chocolate, citrus, and guaraná plants using either caffeine synthase- or xanthine methyltransferase-like enzymes. However, the pathway and enzyme lineage used by any given plant species is not predictable from phylogenetic relatedness alone. Ancestral sequence resurrection reveals that this convergence was facilitated by co-option of genes maintained over 100 million y for alternative biochemical roles. The ancient enzymes of the Citrus lineage were exapted for reactions currently used for various steps of caffeine biosynthesis and required very few mutations to acquire modern-day enzymatic characteristics, allowing for the evolution of a complete pathway. Future studies aimed at manipulating caffeine content of plants will require the use of different approaches given the metabolic and genetic diversity revealed by this study.


Convergent evolution has resulted in the independent origins of many traits dispersed throughout the tree of life. Whereas some convergent traits are known to be generated via similar developmental or biochemical pathways, others arise from different paths (1–5). Likewise, similar (orthologous) or different (paralogous or even unrelated) genes may encode the regulatory or structural proteins composing the components of pathways that build convergent traits (6–10). One of the most prominent examples of convergence in plants is that of caffeine biosynthesis, which appears to have evolved at least five times during flowering plant history (11). The phylogenetic distribution of caffeine, or xanthine alkaloids more generally, is highly sporadic and usually restricted to only a few species within a given genus (12, 13). Caffeine accumulates in various tissues, where it may deter herbivory (14, 15) or enhance pollinator memory (16). Numerous studies over the past 30 y have indicated that although several possible routes exist, the same canonical pathway to caffeine biosynthesis has evolved independently in Coffea (coffee) and Camellia (tea) involving three methylation reactions to sequentially convert xanthosine to 7-methylxanthine to theobromine to caffeine (Fig. 1) (17, 18). In Coffea, three xanthine methyltransferase (XMT)-type enzymes from the SABATH (salicylic acid, benzoic acid, theobromine methyltransferase) family (19) are used to catalyze the methylation steps of the pathway, whereas Camellia uses a paralogous, convergently evolved caffeine synthase (CS)-type enzyme (20–22) (Fig. 1). Because most SABATH enzymes catalyze the methylation of oxygen atoms of a wide diversity of carboxylic acids such as anthranilic, benzoic, gibberellic, jasmonic, loganic, salicylic, and indole-3-acetic acid for floral scent, defense, and hormone modulation (23–25), methylation of xanthine alkaloid nitrogen atoms by XMT and CS is likely a recently evolved activity.

Fig. 1.

Caffeine biosynthetic network has 12 potential paths. The only path characterized from plants is shown by solid black arrows and involves sequential methylation of xanthosine at N-7, 7-methylxanthine at N-3, and theobromine at N-1 of the heterocyclic ring. Each methylation step is performed by a separate xanthine alkaloid methyltransferase in Coffea. In contrast, Camellia employs the distantly related caffeine synthase enzyme, TCS1, for both the second and third methylation steps, whereas the enzyme that catalyzes the first reaction remains uncharacterized. Other potential biochemical pathways to caffeine are shown by dashed arrows, but enzymes specialized for those conversions are unknown. Cleavage of ribose from 7-methylxanthosine is not shown, but may occur concomitantly with N-7 methylation of xanthosine. CF, caffeine; PX, paraxanthine; TB, theobromine; TP, theophylline; X, xanthine; 1X, 1-methylxanthine; 3X, 3-methylxanthine; 7X, 7-methylxanthine; XR, xanthosine.

Although convergence has been documented at multiple hierarchical levels, fundamental questions remain unanswered about the evolutionary gain of traits such as caffeine that are formed via a multistep pathway. First, although convergently co-opted genes, such as XMT or CS, may evolve to encode enzymes for the same biosynthetic pathway, it is unknown what ancestral functions they historically provided that allowed for their maintenance over millions of years of divergence. Second, it is unknown how multiple protein components are evolutionarily assembled into an ordered, functional pathway like that for caffeine biosynthesis. Under the cumulative hypothesis (26), it is predicted that enzymes catalyzing earlier reactions of a pathway must evolve first; otherwise, enzymes that perform later reactions would have no substrates with which to react. Subsequently, duplication of the gene encoding the first enzyme would give rise to enzymes catalyzing later steps. This hypothesis assumes that, initially, the intermediates in a pathway are advantageous, because it is unlikely that multiple enzymatic steps in a pathway could evolve simultaneously. Alternatively, the retrograde hypothesis (27) states that enzymes catalyzing reactions that occur at the end of a pathway evolved first. Gene duplication of the sequence encoding the first-evolved enzyme would eventually result in new enzymes that perform the preceding pathway steps. This hypothesis assumes that the intermediates of a given pathway would be produced nonenzymatically and be available for catalysis; as such, it may have less general explanatory application. Finally, the patchwork hypothesis (28, 29) explains the origins of novel pathways by the recruitment of enzymes from alternative preexisting pathways. 
This hypothesis assumes that the older, recruited enzymes were ancestrally promiscuous with respect to the substrates catalyzed such that they were exapted for the activities that they later become specialized for in the novel pathway. Unlike the cumulative and retrograde hypotheses, there is no prediction for the relative ages of enzymes performing each step of the novel pathway under the patchwork hypothesis. The patchwork hypothesis is compatible with the innovation, amplification, and duplication model of protein functional change (30) and those that have emerged from protein engineering studies (31) in that promiscuous enzyme activities are nearly universal properties of modern-day enzymes and have been shown to serve as the basis for evolution of specialized, novel enzyme activities.

Here we report on a comparative molecular and biochemical approach that dissects how caffeine convergence occurred at the level of the genes involved and biochemical pathways catalyzed in the five economically important plants Theobroma (chocolate), Paullinia (guaraná), Citrus (orange), Camellia, and Coffea. We further use the Citrus lineage to test hypotheses related to the mechanisms allowing for the convergent evolution of caffeine by using paleomolecular biology coupled with experimental mutagenesis, thereby demonstrating how multistep pathways may independently evolve.

Results and Discussion

Novel Biosynthetic Pathways for Caffeine in Modern-Day Plants.

To uncover the genes and pathways used by modern-day plants to synthesize caffeine, we used bioinformatic and phylogenetic analyses, which revealed that both Theobroma cacao (Tc) (Malvales) and Paullinia cupana (Pc) (Sapindales) express multiple CS-type sequences in their caffeinated leaves and/or fruits that are orthologous to those used by Camellia sinensis (Ericales) in leaves and shoots (Fig. 2A and Figs. S1 and S2) (32, 33). However, contrary to expectations from Camellia (Fig. 1), heterologous expression and assays of the Theobroma and Paullinia CS enzymes indicate that they catalyze a different pathway to synthesize xanthine alkaloids. Specifically, both species possess one enzyme (CS1) that preferentially methylates xanthine to produce 3-methylxanthine, as well as a second enzyme (CS2) that preferentially methylates 3-methylxanthine to produce theobromine (Fig. 2 B and C). Surprisingly, even though the four enzymes are part of the CS lineage, TcCS1 is more closely related to TcCS2 than to the enzymatically similar PcCS1, which, in turn, is more closely related to PcCS2 (Fig. 2 B and C and Fig. S1). This pattern of relationships indicates that the relative methylation preferences of these enzymes, although similar, have convergently evolved in Theobroma and Paullinia, likely after gene duplication independently occurred in each lineage (Fig. 2 B and C and Fig. S1). No enzymes have been previously reported to specialize in the methylation of xanthine or 3-methylxanthine, and the biochemical route to caffeine implied by these enzyme activities (Fig. 2 B and C) has not been implicated as the primary pathway in these or any other plants. However, there is evidence for this pathway in Theobroma fruits and leaves from metabolomic analyses and radiolabeled tracer studies that showed 3-methylxanthine as an intermediate formed during theobromine accumulation when xanthine or various purine bases and nucleosides are provided as substrates (34, 35). 
Analyses of fruit and leaf extracts and isotope tracer studies also report the accumulation of 7-methylxanthine to a lesser extent (34, 35). Our liquid chromatography-mass spectrometry (LC-MS) results indicate that this metabolite can be formed by methylation of xanthine by TcCS2 as a secondary activity (Fig. 2B and Fig. S3). Thus, in Theobroma, it is possible that theobromine is produced via methylation of this intermediate by BTS (36) and/or TcCS1 (Fig. 2B), in addition to 3-methylxanthine by TcCS2 (Fig. 2B). It is not yet clear which enzyme contributes to the low levels of caffeine accumulation in Theobroma, but this is not surprising, because its biosynthesis is reported to be very slow (35). In Paullinia, a third enzyme, PcCS, is reported to convert theobromine to caffeine (37) (Fig. 2C). Theobromine is reported from Paullinia tissues (37, 38), which is consistent with the pathway shown in Fig. 2C, but no analyses have surveyed for the presence of intermediates such as 3-methylxanthine or others. For Camellia, bioinformatic and phylogenetic analyses show that TCS1 and TCS2 are expressed in leaves and recently duplicated (Fig. 2E and Figs. S1 and S2C) (36). Although the biochemical role of TCS1 is clear (39), an associated activity for TCS2 has remained elusive (36); however, we were able to demonstrate maximal methyl transfer activity at N-7 of xanthosine (Fig. 2E), consistent with the reported pathway (Fig. 1).

Fig. 2.

Caffeine has convergently evolved in five flowering plant species using different combinations of genes and pathways. (A) Phylogenetic relationships among orders of Rosids and Asterids show multiple origins of caffeine biosynthesis. Lime-green lineages trace the ancient CS lineage of enzymes that has been independently recruited for use in caffeine-accumulating tissues in Theobroma, Paullinia, and Camellia. Turquoise lineages trace the ancient XMT lineage that was independently recruited in Citrus and Coffea. (B and C) Theobroma and Paullinia have converged upon similar biosynthetic pathways catalyzed by CS-type enzymes. (D) Citrus has evolved a different pathway catalyzed by XMT-type enzymes, despite its close relationship to Paullinia. (E and F) Camellia and Coffea catalyze the same pathway using different enzymes. Proposed biochemical pathways are based on relative enzyme activities shown by corresponding bar charts that indicate mean relative activities (from 0 to 1) with eight xanthine alkaloid substrates. CisXMT1 and TCS1 catalyze more than one reaction in the proposed pathways. XMT and CS have recently and independently duplicated in each of the five lineages (see Fig. S1 for a detailed gene tree). #Data taken from the literature; *substrate not assayed.

Fig. S1.

Phylogenetic relationships among 356 SABATH protein sequences. Sequences were extracted from 11 complete genomes of land plants in addition to selected CS and XMT transcriptome sequences from the oneKP database. Lineages with functionally characterized sequences are labeled by enzyme name, whereas those without known functions are arbitrarily numbered from MT1 to MT6. Bootstrap support values are shown for selected nodes that define major enzyme lineages. Enzymes from Camellia (CS) or Coffea (XMT) known to be involved in caffeine biosynthesis are shown in lime-green and turquoise, respectively. Sequences expressed in Theobroma and Paullinia fruits are clearly orthologous to CS sequences from Camellia. Sequences expressed in Citrus flowers are clearly orthologous to XMT sequences of Coffea. Arrows point to CS and XMT lineages to show recent duplication events within Theobroma, Paullinia, Camellia, Citrus, and Coffea. Nodes for which ancestral resurrected proteins were studied are labeled A–C. Accession numbers for the oneKP and GenBank databases are shown before and after relevant sequences, respectively.

Fig. S2.

(A) The closely related TcCS1 and TcCS2 of Theobroma cacao are highly represented in fruits where theobromine and caffeine accumulate. EST counts in various tissues are shown for all full-length SABATH genes from the genome of T. cacao (Matina). Relationships among the sequences are shown below the chart. GenBank accession numbers are as follows: IAMT (Thecc1EG030787), FAMTa (Thecc1EG019318), FAMTb (Thecc1EG019315), FAMTc (Thecc1EG019314), MT4a (Thecc1EG011287), MT4b (Thecc1EG011286), MT4c (Thecc1EG011290), MT4d (Thecc1EG011291), MT1a (Thecc1EG045368), MT1b (Thecc1EG045372), MT1c (Thecc1EG045370), MT5a (Thecc1EG012604), MT5b (Thecc1EG031006), BAMTa (Thecc1EG000331), BAMTb (Thecc1EG000328), BAMTc (Thecc1EG040854), XMT (Thecc1EG006850), MT3 (Thecc1EG000336), JMTa (Thecc1EG034091), JMTb (Thecc1EG034089), SAMTa (Thecc1EG000326), SAMTb (Thecc1EG000324), BTS (Thecc1EG042576), TcCS1 (Thecc1EG042578), and TcCS2 (Thecc1EG042587, Thecc1EG042590). The two accession numbers for TcCS2 are indistinguishable in the ORFs and are therefore classified together in this chart. (B) The closely related PcCS1 to PcCS5 are highly represented in fruits of Paullinia cupana var. sorbilis, where caffeine accumulates. EST counts in fruit tissues are shown for all full-length SABATH genes. SABATH sequences from the Citrus genome were used to BLAST for ESTs from Paullinia, because a genome is not yet characterized for it. Relationships among the sequences are shown below the chart. Only CS-type sequences were found from Paullinia ESTs. 
GenBank accession numbers are as follows: GAMT (Citrus: 1g014333m), IAMT (Citrus: 1g016644m), FAMT (Citrus: 1g017702m), MT4 (Citrus: 1g040129m), MT2 (Citrus: 1g018119m), XMT (Citrus: 1g044727m), SAMT (Citrus: 1g017514m), JMT (Citrus: XM_006478399), MT3 (Citrus: 1g017747m), PcCS3 (EC763988, EC777706, EC777629, EC777248, EC774687, EC768101, EC769415, EC769184, EC769690, EC773071, EC764367), PcCS1 (59), PcCS4 (EC765512, EC766603, EC766748, EC777652, EC769966, EC775308, EC766876), PcCS5 (EC765614, EC764433, EC776114, EC764220, EC778019, EC768317, EC764348, EC774623, EC765624, EC772805), and PcCS2 (EC774880, EC768886, EC775438, EC772015, EC764433, EC771794, EC774462, EC772993, EC770731, EC773205, EC765633, EC764023, 776855, EC767182, EC770596). Enzyme activities of PcCS 1 and 4 were highly comparable, as were PcCS2 and 5, so we only present data for one of each in the manuscript. PcCS3 had no detectable activity with any xanthine alkaloid substrate tested. (C) The closely related TCS1 and TCS2 of Camellia sinensis are highly represented in leaves and buds where caffeine accumulates. EST counts in various tissues are shown for representatives of each SABATH gene lineage. Full-length genes from each major lineage of the SABATH family in the Mimulus genome were used to BLAST for ESTs from Camellia, because its genome is not yet characterized. Relationships among the sequences are shown below the chart. Sequences used for BLAST/phylogenetic analysis are as follows: Mimulus IAMT (Phytozome accession no. M01704), Camellia sinensis MT4 (GenBank accession no. GB-GBBZ01008401), Mimulus MT1 (Phytozome accession no. H02148), Mimulus FAMT (Phytozome accession no. H00254), Camellia sinensis SAMT (GenBank accession no. KA284044), Camellia sinensis JMT (GenBank accession no. KA286401), Camellia sinensis MT3 (GenBank accession no. FS950428), Mimulus BAMT (Phytozome accession no. N01694), Camellia sinensis TCS2 (GenBank accession no. 
AB031281), and Camellia sinensis TCS1 (GenBank accession no. AB031280). No XMT ortholog is known from Mimulus or Ericales.

Fig. S3.

TcCS2 converts xanthine to 7-methylxanthine. Eight mass spectrometry scans show parent ion–fragment ion pairs that are largely unique for each xanthine alkaloid product detected in enzyme assays supplied with 100 µM xanthine. Scans for authentic standards are shown below to verify product identity.

In contrast, bioinformatic and phylogenetic analyses revealed that Citrus sinensis (Cis) (Sapindales) expresses two recently duplicated XMT-type enzymes in caffeinated flowers orthologous to those found in Coffea arabica (Gentianales) tissues (Fig. 2A and Figs. S1 and S4A). Surprisingly, assays of the Citrus XMT enzymes imply yet another pathway, different from that catalyzed by Coffea XMT enzymes (Fig. 1), which has led to the convergent evolution of caffeine. Specifically, CisXMT1 not only methylates xanthine to produce 1- and 3-methylxanthine; it also methylates both 1- and 3-methylxanthine to produce theophylline (Fig. 2D and Fig. S5). A second enzyme, CisXMT2, preferentially methylates theophylline to produce caffeine (Fig. 2D). LC-MS analyses of flower buds (Fig. S6) and assays of crude enzyme extracts from Citrus x limon stamens (40) are consistent with the pathway shown in Fig. 2D. Furthermore, in Citrus, theophylline is a conspicuous metabolite in developing flower buds that accumulates early and decreases in concentration as caffeine levels increase (41), suggesting that it is involved in the accumulation of caffeine. On the other hand, theobromine, the long-assumed universal precursor to caffeine in plants (11), is undetectable or present only at low levels in developing buds (41). These findings are particularly intriguing, given that no enzymes have been previously reported to be specialized for these methylation reactions to form theophylline or caffeine and because theophylline is usually considered a degradation product of caffeine (42, 43). We expected Citrus to use the same gene family members and pathway as Paullinia because both are members of Sapindales (Fig. 2A). However, we could not detect in vitro activity with xanthine alkaloid substrates by the single Citrus CS-type enzyme (Fig. S1), nor is that enzyme represented by ESTs in flowers (Fig. S4A), the principal site of caffeine accumulation (41).

Fig. S4.

(A) The closely related CisXMT1 and CisXMT2 of Citrus sinensis are highly represented in flowers where caffeine accumulates. EST counts in various tissues are shown for all full-length SABATH genes from the genome of Citrus sinensis. Relationships among the sequences are shown below the chart. GenBank accession numbers are as follows: GAMT (1g014333m), IAMT (1g016644m), MT4a (1g040129m), MT4b (1g044174m), MT2 (1g018119m), FAMTa (1g017702m), FAMTb (1g037735m), FAMTc (1g018250m), FAMTd (1g017363m), FAMTe (017439m), CS (1g18139m), MT3 (1g017747m), CisXMT5 (1g036911m), CisXMT4 (XM_006469416), CisXMT3 (1g045960m), CisXMT2 (1g047625m), CisXMT1 (1g044727m), JMT (XM_006478399), and SAMT (1g017514m). (B) Relative enzyme activity profile for SAMT from Citrus sinensis. Bar charts show relative enzyme activities (from 0 to 1) for three substrates. AA, anthranilic acid; BA, benzoic acid; SA, salicylic acid.

Fig. S5.

Mass spectrometry scans show that Citrus CisXMT1 can form 1-methylxanthine, 3-methylxanthine, and theophylline from xanthine alone. Citrus CisXMT1 converts xanthine to both 1-methylxanthine and 3-methylxanthine, as indicated by the presence of parent ion–fragment ion peaks unique for both compounds. The presence of the theophylline peak likely is a result of methylation of both 1-methylxanthine and 3-methylxanthine, although which may be preferred cannot be discerned from these traces alone. However, it should be noted that the catalytic efficiency by which 3-methylxanthine is converted to theophylline is ca. four times lower than that of 1-methylxanthine (Table S1). Scans for authentic standards are shown below to confirm product identities.

Fig. S6.

Citrus flower buds appear to accumulate xanthine, 1-methylxanthine, theophylline, and caffeine as well as minor amounts of 3-methylxanthine. Mass spectrometry scans show parent ion–fragment ion pairs unique for each xanthine alkaloid product detected in Citrus flower buds. The presence of these metabolites is consistent with the enzyme assays (Fig. 2D and Fig. S5) and suggests that the primary pathways by which caffeine is produced in Citrus flowers are those shown in Fig. 2D. Table S2 shows parent ion–fragment ion masses for xanthine alkaloid standards.

For more than 30 y, published studies have indicated that caffeine is produced via a single canonical pathway in plants (44, 45). Our results show that flowering plants have a much broader biochemical repertoire whereby at least three pathways lead to caffeine biosynthesis catalyzed by enzymes that derive from one of two methyltransferase lineages. These enzymes have substrate affinities (as measured by KM) comparable to XMT and CS in Coffea and Camellia, respectively, as well as those of other SABATH family members, which are in the 10–1,000 µM range (21, 23, 46, 47) (Table S1). Additionally, although xanthine alkaloids may not be homogeneously distributed at the subcellular and tissue level (48), reported concentrations of the relevant intermediates in Theobroma, Paullinia, and Citrus may be conservatively estimated to be in the 10–1,000 µM range (34, 38, 41), which are comparable to the KM values we obtained (Table S1). Although it is apparent that three analogous pathways for caffeine biosynthesis have evolved in flowering plants, it is unclear what historical genetic and biochemical conditions facilitated this convergence.
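The three pathways described above can be compared by the order in which the ring nitrogens are methylated. The following is a minimal sketch, not part of the original study: it encodes each xanthine alkaloid from Fig. 1 as the set of methylated positions and checks that every step of a route adds exactly one methyl group. The names `METHYLATED`, `PATHWAYS`, and `methylation_order` are illustrative, and treating xanthosine as "unmethylated xanthine" (ignoring ribose cleavage) is a simplifying assumption. The Theobroma route is shown through to caffeine even though the enzyme responsible for its final step is not yet clear.

```python
# Each xanthine alkaloid is represented by the set of ring nitrogens that
# carry a methyl group. Abbreviations follow Fig. 1.
METHYLATED = {
    "X": frozenset(),
    "1X": frozenset({1}),
    "3X": frozenset({3}),
    "7X": frozenset({7}),
    "TP": frozenset({1, 3}),     # theophylline
    "PX": frozenset({1, 7}),     # paraxanthine
    "TB": frozenset({3, 7}),     # theobromine
    "CF": frozenset({1, 3, 7}),  # caffeine
    # Assumption: xanthosine (XR) is treated as unmethylated and its ribose
    # is ignored, so XR -> 7X counts as a single N-7 methylation.
    "XR": frozenset(),
}

# Routes as described in the text and in Figs. 1 and 2.
PATHWAYS = {
    "Coffea / Camellia": ["XR", "7X", "TB", "CF"],
    "Theobroma / Paullinia": ["X", "3X", "TB", "CF"],
    "Citrus via 1X": ["X", "1X", "TP", "CF"],
    "Citrus via 3X": ["X", "3X", "TP", "CF"],
}


def methylation_order(route):
    """Return the ring position methylated at each step of a route."""
    order = []
    for start, end in zip(route, route[1:]):
        added = METHYLATED[end] - METHYLATED[start]
        removed = METHYLATED[start] - METHYLATED[end]
        if len(added) != 1 or removed:
            raise ValueError(f"{start} -> {end} is not a single methylation")
        order.append(next(iter(added)))
    return order


if __name__ == "__main__":
    for name, route in PATHWAYS.items():
        steps = ", ".join(f"N-{n}" for n in methylation_order(route))
        print(f"{name}: {' -> '.join(route)} ({steps})")
```

Under this encoding the canonical route methylates N-7, then N-3, then N-1, as stated in the Fig. 1 legend, whereas the Theobroma/Paullinia and Citrus routes reach caffeine through different orders and different dimethylxanthine intermediates.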

Table S1.

Enzyme kinetic parameter estimates for modern-day and ancestral xanthine alkaloid-producing enzymes with selected substrates

Enzyme (+substrate)      KM (µM)    kcat (s−1)    kcat/KM (s−1⋅M−1)
Modern-day enzymes
 TcCS1 (X)               95.8       8.37E-05      0.87
 TcCS2 (3X)              49.1       9.81E-05      2.00
 PcCS1 (X)               95.38      1.52E-03      15.94
 PcCS2 (3X)              677        9.33E-04      1.38
 CsXMT1 (1X)             657.6      6.32E-04      0.97
 CsXMT1 (3X)             1,144      2.99E-04      0.26
 CsXMT1 (X)              759        8.74E-05      0.12
Ancestral enzymes
 CsAncXMT2 (X)           2,740      2.48E-04      0.09
 CsAncXMT2 P25S (X)      1,690      3.97E-04      0.23
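
As a worked check of the table, the catalytic efficiency column can be recomputed from the KM and kcat columns after converting KM from µM to M. The sketch below is illustrative only: the dictionary `TABLE_S1` and the function `catalytic_efficiency` are names introduced here, and the values are copied from Table S1 as printed. Small differences from the reported column are expected because the published KM and kcat values are rounded.

```python
# Values copied from Table S1:
# (KM in µM, kcat in 1/s, reported kcat/KM in 1/(s*M)).
TABLE_S1 = {
    "TcCS1 (X)": (95.8, 8.37e-05, 0.87),
    "TcCS2 (3X)": (49.1, 9.81e-05, 2.00),
    "PcCS1 (X)": (95.38, 1.52e-03, 15.94),
    "PcCS2 (3X)": (677.0, 9.33e-04, 1.38),
    "CsXMT1 (1X)": (657.6, 6.32e-04, 0.97),
    "CsXMT1 (3X)": (1144.0, 2.99e-04, 0.26),
    "CsXMT1 (X)": (759.0, 8.74e-05, 0.12),
    "CsAncXMT2 (X)": (2740.0, 2.48e-04, 0.09),
    "CsAncXMT2 P25S (X)": (1690.0, 3.97e-04, 0.23),
}


def catalytic_efficiency(km_micromolar, kcat_per_s):
    """Return kcat/KM in 1/(s*M), converting KM from µM to M."""
    return kcat_per_s / (km_micromolar * 1e-6)


if __name__ == "__main__":
    for enzyme, (km, kcat, reported) in TABLE_S1.items():
        computed = catalytic_efficiency(km, kcat)
        print(f"{enzyme:22s} computed {computed:6.2f}  reported {reported:6.2f}")

    # Fold change in efficiency with xanthine caused by the P25S replacement.
    ancestral = catalytic_efficiency(*TABLE_S1["CsAncXMT2 (X)"][:2])
    mutant = catalytic_efficiency(*TABLE_S1["CsAncXMT2 P25S (X)"][:2])
    print(f"P25S / ancestral efficiency with xanthine: {mutant / ancestral:.2f}")
```

The ratio for the P25S mutant relative to the ancestral enzyme comes out at roughly 2.5 to 2.6, in line with the 2.5-fold increase in catalytic efficiency with xanthine reported in the text.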

Historical Maintenance of Ancestral XMT Enzymes Allowed for Convergence.

In the case of convergent caffeine production in Citrus and Coffea, XMT needed to be maintained for more than 100 My from their common ancestor (49) to then independently give rise to xanthine alkaloid-methylating enzymes, because it is unlikely that their progenitor was producing caffeine given that they currently use completely different biosynthetic pathways (Fig. 2 D and F). To understand how this long-term maintenance occurred, we resurrected ancestral enzymes (50) for the XMT lineage at nodes A–C (Fig. 3 and Fig. S1). Surprisingly, both the putative ca. 100-My-old Rosid-Asterid ancestral enzyme, RAAncXMT (node A), as well as its descendant, CisAncXMT1 (node B), exhibit high relative activity with benzoic acid and salicylic acid (to form methyl benzoate and methyl salicylate, respectively) but very little with xanthine alkaloids (Fig. 3). Ancestral O-methylation of benzoic acid is maintained in a modern-day XMT from Mangifera, which is a relative of Citrus in Sapindales but is not known to synthesize xanthine alkaloids (Fig. 3 and Fig. S1). On the other hand, ancestral activities with benzoic and salicylic acid were completely lost in the modern-day descendant enzymes of Citrus, CisXMT1 and CisXMT2, and they now appear specialized for only N-methylation of xanthine alkaloids (Fig. 3). These specialized modern-day enzymes were most recently derived from CisAncXMT2 (node C), which exhibits both O- and N-methylation activities and seems to be a transitional enzyme associated with the gain of xanthine alkaloid production (Fig. 3). Although CisAncXMT2 appears to have low levels of activity with benzoic and salicylic acid compared with that of 1-methylxanthine, the specific activity with these substrates (0.6 and 2.3 pkat/mg, respectively) is comparable to heterologously expressed, modern-day SAMT- and BSMT-type enzymes (51). 
Modern-day Citrus possesses an SAMT that is capable of methylating both benzoic and salicylic acid, thereby compensating for the eventual loss of those activities from ancestral XMT enzymes (Fig. S4B). Ancestral O-methylation of the carboxyl moiety of benzoic and salicylic acid might have promoted the evolution of N-methylation of xanthine alkaloids because of common attributes of the active sites that would need to accommodate the largely planar rings of both classes of substrates. Indeed, a paralogous SABATH methyltransferase specialized for methylation of the carboxyl group of nicotinic acid (which is also an N-heterocyclic substrate) also recently arose from ancestral enzymes that exhibited activities with benzoic and salicylic acid (46). Although these data indicate that ancestral activity with benzoic and salicylic acid in the XMT lineage allowed for subsequent co-option of the descendant enzymes to form caffeine, how recruitment of the enzymes into a functional pathway occurred remains unknown.

Fig. 3.

Resurrected ancestral XMT proteins reveal the historical context for convergent evolution of caffeine biosynthesis. Bar charts show mean relative enzyme activities (from 0 to 1) for 10 substrates. BA, benzoic acid; SA, salicylic acid; all others are as in Fig. 1. Node A shows the resurrected enzyme of the >100-My-old ancestor of Rosids and Asterids that exhibits high relative activity with benzoic and salicylic acid. Although those ancestral activities were maintained in CisAncXMT1 at node B and modern-day Mangifera, they were eventually replaced by increased relative preference for xanthine alkaloid methylation as seen at node C and its descendants. CisAncXMT2 mutants (P25S and H150N indicated on lineages C′ and C′′, respectively) show that very few amino acid replacements are necessary to re-evolve modern-day enzyme activity patterns and form a complete caffeine biosynthetic pathway. Product formation from assays and implied pathway connections are shown by color-coded dots (Insets). For example, a connection between TP (green) and CF (black) implies that the enzyme in question converts theophylline to caffeine. Unshaded rectangles exhibit complete metabolic connections to caffeine, whereas shaded rectangles do not. Average site-specific posterior probabilities are shown for each resurrected ancestral enzyme. Select substrate structures are shown to specify the atom to which a methyl group is transferred.

Exaptation Facilitates Multistep Pathway Evolution in a Cumulative Manner.

To understand how convergent caffeine production evolved via an entirely novel pathway in the Citrus lineage, we mapped ancestral XMT pathway connections of the caffeine biosynthetic network (Fig. 3, Insets, dot boxes). At nodes A and B, ancestral enzymes exhibited very low activity with xanthine alkaloids, such that quantities were too low to allow for product identification by HPLC, making it unlikely that a complete pathway existed at those times. Subsequently, the derived ancestral Citrus enzyme, CisAncXMT2 (node C), had activity with numerous xanthine alkaloids; in particular, highest relative activity with 1-methylxanthine resulted in paraxanthine formation, and 3-methylxanthine was methylated to form theophylline (Fig. S7 A and B). CisAncXMT2 could also convert theophylline to caffeine, such that it would have performed two of the three steps necessary to form caffeine from 3-methylxanthine (Fig. S7C). However, this ancestral enzyme exhibits only a low level of activity and specificity with xanthine to form 1-methylxanthine (Fig. 3, Fig. S8, and Table S1), which could have subsequently been converted to paraxanthine by CisAncXMT2, but not caffeine (Fig. 3, node C). Thus, if CisAncXMT2 was used for methylation of benzoic and salicylic acid, then it appears to have been exapted for several later reactions of the xanthine alkaloid biosynthetic network used by modern-day Citrus, because a complete pathway to caffeine was not likely catalyzed by this enzyme alone. Alternatively, it remains formally possible that the ancestor of Citrus possessed a different, now extinct, enzyme that could have converted xanthine to 3-methylxanthine, so that caffeine may have been produced by CisAncXMT2, yet only modern-day XMT- and CS-type enzymes (Fig. 2) are capable of that conversion, making it unclear why an enzyme specialized for that reaction would subsequently be lost given its importance today.

Fig. S7.

(A) CisAncXMT2 converts 1-methylxanthine to paraxanthine. Mass spectrometry scans show parent ion–fragment ion pairs largely unique for each xanthine alkaloid product detected in enzyme assays supplied with 100 µM 1-methylxanthine. Scans for authentic standards are shown below to confirm product identity. (B) CisAncXMT2 converts 3-methylxanthine to theophylline. Mass spectrometry scans show parent ion–fragment ion pairs largely unique for each xanthine alkaloid product detected in enzyme assays supplied with 100 µM 3-methylxanthine. Scans for authentic standards are shown below to confirm product identity. (C) CisAncXMT2 converts theophylline to caffeine. Mass spectrometry scans show parent ion–fragment ion pairs largely unique for each xanthine alkaloid product detected in enzyme assays supplied with 100 µM theophylline. Scans for authentic standards are shown below to confirm product identity.

Fig. S8.

HPLC analyses show the evolution of product formation in ancestral and modern-day Citrus XMT enzymes. CisAncXMT2 converts xanthine to 1-methylxanthine but not 3-methylxanthine (Insets). In contrast, CisAncXMT2 P25S and CisXMT1 methylate xanthine to both 1-methylxanthine and 3-methylxanthine, which are further converted into theophylline. Enzyme assays were conducted using 2 mM xanthine, and products were measured by absorbance at 272 nm. mAU, milliabsorption units.

Next, to recapitulate the evolutionary steps required to generate complete caffeine biosynthetic pathway linkages, we performed experimental mutagenesis of CisAncXMT2 (node C), which was duplicated to give rise to the two modern-day enzymes of Citrus, CisXMT1 and CisXMT2. In the lineage leading to CisXMT1, 17 amino acids were replaced and resulted in the evolution of increased activity with xanthine as well as specialization with 1- and 3-methylxanthine (Fig. 3). We experimentally replaced Pro25 by Ser in CisAncXMT2 (Fig. 3; lineage C′), because this site is predicted to be part of the active site of Coffea DXMT (52) and differs in CisXMT1 and CisXMT2 (Fig. S9). This single mutation resulted in the evolution of three important biochemical changes. First, near-complete loss of ancestral activity with theophylline as well as with benzoic and salicylic acid occurred such that CisAncXMT2 P25S acquired a relative activity profile very similar to modern-day CisXMT1 (Fig. 3). Second, CisAncXMT2 P25S exhibited a 2.5-fold increased catalytic efficiency with xanthine compared with CisAncXMT2 (Table S1) to produce both 1- and 3-methylxanthine (Fig. 3 and Fig. S8). Third, activity of CisAncXMT2 P25S changed such that 1-methylxanthine is methylated to form theophylline instead of paraxanthine (Fig. S10). The importance of this single amino acid replacement is that a connected biosynthetic network from xanthine to theophylline via both 1- and 3-methylxanthine would have rapidly evolved, in part, due to exaptation of CisAncXMT2 (Fig. 3). The existence of exapted ancestral enzymes such as CisAncXMT2 resolves one of the fundamental problems of the cumulative hypothesis because multiple steps of a pathway could evolve simultaneously, thereby avoiding the need to assume the existence of selectively advantageous intermediates. 
These results also point to a crucial role for ancestral promiscuous activities postulated as part of the patchwork hypothesis and protein engineering studies (31) and reported previously for other SABATH enzymes (46).

Fig. S9.

Aligned protein sequences show that very few sites differ between ancestral and modern-day enzymes and that mutations at very few sites account for functional change. (A) Amino acid alignment of functionally characterized modern-day XMT and CS sequences. Also shown are resurrected ancestral XMT protein sequences. Amino acids that were experimentally replaced to recapitulate evolutionary changes in the Citrus lineage are highlighted in turquoise. Alternative ancestral alleles were generated by mutations shown in green. (B) Posterior probabilities of original and mutated sites are shown for the four alternative ancestral alleles generated and assayed. (C) Bar charts show relative enzyme activities of additional site-directed mutants made to recapitulate evolutionary changes from CisAncXMT2 to either CisXMT1 or CisXMT2. Mutated amino acid positions are highlighted in magenta.

Fig. S10.

HPLC analyses show that CisAncXMT2 P25S converts 1-methylxanthine to theophylline instead of paraxanthine, the product formed by its ancestor, CisAncXMT2. Enzyme assays were conducted using 200 µM 1-methylxanthine, and products were measured by absorbance at 272 nm. Unlabeled peaks in Middle and Lower are unidentified molecules found in both the enzyme assay and negative control to which no xanthine alkaloids were added.

Finally, although the immediate, postduplication daughter enzyme of lineage C′′ would have initially retained the ancestral activities of its progenitor, CisAncXMT2, it eventually gave rise to the modern-day descendant CisXMT2, which exhibits near-complete specialization with theophylline (Fig. 3). A total of 16 amino acid replacements occurred along this lineage, one of which was His150, which was replaced by Asn (Fig. S9). This residue is likely part of the active site and known to control substrate preference in other SABATH enzymes (46). Experimental mutagenesis of His150 to Asn in CisAncXMT2 (Fig. 3, node C′′) resulted in an enzyme that is similar to modern-day CisXMT2. Specifically, H150N nearly completely abolished methylation activity with every substrate except theophylline and, to a lesser extent, 1-methylxanthine (Fig. 3). Because methylation of theophylline results in the formation of caffeine, the combined activities of CisAncXMT2 H150N and CisAncXMT2 P25S would have allowed for a complete caffeine biosynthetic pathway given xanthine as a starting substrate, much like the two modern-day XMT enzymes in Citrus. Although mutations other than H150N and P25S might have shifted ancestral enzymes toward the modern-day specialized activities of CisXMT1 and CisXMT2, we show that only these two replacements need to be implicated in specialization, because other mutations do not recapitulate the inferred relative activity changes (Fig. S9C).
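The pathway-completeness argument made above can be restated as a reachability question on a small reaction graph. The following is a minimal sketch, assuming a purely qualitative encoding of the substrate-to-product steps attributed to each enzyme in the text; the dictionary `REACTIONS` and the function `reachable_products` are illustrative names, not part of the study, and activity levels and kinetics are ignored.

```python
from collections import deque

# Substrate -> product methylation steps attributed to each enzyme in the text.
# Qualitative encoding only (an assumption of this sketch): activity levels
# and kinetic parameters are deliberately ignored.
REACTIONS = {
    "CisAncXMT2": {
        ("xanthine", "1-methylxanthine"),  # reported as low activity
        ("1-methylxanthine", "paraxanthine"),
        ("3-methylxanthine", "theophylline"),
        ("theophylline", "caffeine"),
    },
    "CisAncXMT2 P25S": {
        ("xanthine", "1-methylxanthine"),
        ("xanthine", "3-methylxanthine"),
        ("1-methylxanthine", "theophylline"),
        ("3-methylxanthine", "theophylline"),
    },
    "CisAncXMT2 H150N": {
        ("theophylline", "caffeine"),
    },
}


def reachable_products(enzymes, start="xanthine"):
    """Return every compound reachable from `start` with the given enzymes."""
    edges = set().union(*(REACTIONS[name] for name in enzymes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the combined reaction graph
        current = queue.popleft()
        for substrate, product in edges:
            if substrate == current and product not in seen:
                seen.add(product)
                queue.append(product)
    return seen


ancestor_only = reachable_products(["CisAncXMT2"])
both_mutants = reachable_products(["CisAncXMT2 P25S", "CisAncXMT2 H150N"])
print("caffeine" in ancestor_only, "caffeine" in both_mutants)
```

Under this encoding, the ancestral enzyme alone stops at paraxanthine when started from xanthine, whereas the two single mutants together connect xanthine to caffeine.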

Conclusion

The results for the XMT lineage indicate that convergent evolution of caffeine biosynthesis was possible partly because ancient lineages of enzymes were maintained over 100 My for alternative biochemical functions. Furthermore, like the fortuitous roles of feathers for flight in birds (53) or ligand binding in ancestral hormone receptors (54), it appears that exapted activities of the ancestral XMT enzymes ultimately promoted their co-option for caffeine biosynthesis. These exaptations became biochemically relevant when, as predicted under the cumulative hypothesis (26), the initial reactions of the caffeine pathway evolved. The fact that very few substitutions to CisAncXMT2 were required to promote substrate preference switches suggests a relatively facile mutational basis for the evolution of caffeine biosynthetic pathways. Therefore, it is likely that caffeine biosynthesis would evolve in flowering plants again if the evolutionary tape of life were to be replayed (55). What is more difficult to predict is which of the 12 potential biochemical pathways any particular lineage will use, which methyltransferase enzyme will be co-opted, or which amino acids will be substituted to provide for particular substrate preferences due to the role of historical contingency associated with any given evolutionary transition.

Materials and Methods

Heterologous Expression and Purification of Enzymes.

Gene sequences for Theobroma and Paullinia were synthesized with codon use optimized for gene expression in Escherichia coli (GenScript). Gene sequences for Citrus and Camellia were cloned from fresh flowers or leaves, respectively, using primers designed from the EST and genomic sequences. cDNA was generated using the SuperScript II/Platinum Taq One-Step RT-PCR Kit (Invitrogen). Protein overexpression used either pET-15b (Novagen) or Expresso T7 SUMO (Lucigen) expression vectors, and induction of His6–protein was achieved in 50-mL BL-21(DE3) cell cultures with the addition of 1 mM isopropyl β-d-1-thiogalactopyranoside at 23 °C for 6 h. Purification of the His6-tagged protein was achieved by TALON spin columns (Clontech) according to the manufacturer’s instructions. To determine protein concentration, a standard Bradford assay was used. Recombinant protein purity was evaluated by SDS/PAGE.

Enzyme Assays.

All enzymes were tested for activity with the eight xanthine alkaloid substrates shown in Fig. 1. In addition, all enzymes were tested with benzoic and salicylic acid, but we only report results for XMT enzymes, as shown in Fig. 3, because CS enzymes do not show activity with those substrates. Xanthine alkaloid substrates were dissolved in 0.5 M NaOH, whereas benzoic and salicylic acid were dissolved in ethanol. Radiochemical assays were performed in 50-μL reactions with 0.01 μCi (0.5 μL) 14C-labeled SAM, 100 μM methyl acceptor substrate, and 10–20 μL purified protein in 50 mM Tris⋅HCl buffer at 24 °C for 20 min. Negative controls were composed of the same reagents except that the methyl acceptor substrate was omitted and the corresponding solvent was added instead. Methylated products were extracted in ethyl acetate and quantified using a liquid scintillation counter. Raw disintegrations per min obtained from the scintillation counter were corrected using empirically determined extraction efficiencies of products in ethyl acetate. The highest enzyme activity reached with a specific substrate was set to 1.0, and relative activities with the remaining substrates were calculated. Each assay was run at least twice so that the mean and SD could be calculated. All assays shown were performed on purified protein unless activity was abolished after purification. In such cases (only CisXMT2 and TCS2), we present total protein data. The specific activity for TCS2 with xanthosine was 0.025 pkat/mg, and the specific activity for CisXMT2 with theophylline was 0.12 pkat/mg.
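The normalization described above (correct raw counts for extraction efficiency, then scale to the best substrate) can be sketched as follows. This is a minimal illustration, not the authors' script: the function name, the example counts, and the efficiencies are hypothetical, and subtracting the negative-control counts is an assumption of this sketch because the text does not state how controls entered the calculation.

```python
def relative_activities(raw_dpm, extraction_efficiency, control_dpm=0.0):
    """Scale corrected scintillation counts so the best substrate equals 1.0.

    raw_dpm: substrate name -> raw disintegrations per minute.
    extraction_efficiency: substrate name -> fraction (0-1] of the methylated
        product recovered in ethyl acetate.
    control_dpm: negative-control counts to subtract (assumption of this
        sketch; defaults to no subtraction).
    """
    corrected = {}
    for substrate, dpm in raw_dpm.items():
        net = max(dpm - control_dpm, 0.0)
        corrected[substrate] = net / extraction_efficiency[substrate]
    top = max(corrected.values())
    if top == 0:
        raise ValueError("no activity detected with any substrate")
    return {substrate: value / top for substrate, value in corrected.items()}


# Hypothetical counts and efficiencies, for illustration only.
example = relative_activities(
    {"xanthine": 1000.0, "theophylline": 4000.0},
    {"xanthine": 0.5, "theophylline": 0.8},
)
print(example)
```

Correcting before normalizing matters because products that extract poorly into ethyl acetate would otherwise be systematically underrepresented in the relative activity profile.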

Ancestral Sequence Resurrection and Mutagenesis.

CODEML (56) was used to estimate ancestral sequences for the XMT lineage of enzymes of the SABATH family assuming the Jones, Taylor, Thornton (JTT) + gamma model of amino acid substitution. Regions with alignment gaps were analyzed with parsimony to determine ancestral residue numbers. The estimated sequences were subsequently synthesized by GenScript with codons chosen for optimal protein expression in E. coli. Alternative ancestral alleles were generated by site-directed mutagenesis using the QuikChange Site-Directed Mutagenesis Kit (Stratagene) to change amino acids that differed among analyses using different subsets of sequences, trees, and models of substitution. Although posterior probabilities were high for most sites of most alleles (see average site-specific posterior probabilities in Fig. 3), different analyses did result in different estimated ancestral alleles in some cases. Therefore, at least two ancestral enzymes were characterized for each node A–C in Fig. 3, and an alignment showing each resurrected allele is provided in Fig. S9A. Amino acid sites differing between alternate alleles are shown in Fig. S9B, with posterior probabilities listed for each amino acid that was mutated. The positions of mutations are shown in the alignment of Fig. S9A. Because assays at each node were represented by at least two alleles, mean and SE were calculated for relative activity with each substrate assayed.
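Averaging relative activity across the alternative alleles at a node can be sketched as below. This is a minimal example that assumes the conventional definition of the standard error (sample SD divided by the square root of the number of alleles); the function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(relative_activities):
    """Mean and standard error of relative activity across alternative alleles."""
    n = len(relative_activities)
    if n < 2:
        raise ValueError("at least two alleles are needed to estimate the SE")
    return mean(relative_activities), stdev(relative_activities) / sqrt(n)


# Hypothetical relative activities of two alleles with one substrate.
node_mean, node_se = mean_and_se([0.8, 1.0])
print(node_mean, node_se)
```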

Detailed procedures for bioinformatic, phylogenetic, enzyme kinetics, HPLC, and LC-MS/MS analyses are provided in SI Materials and Methods. See Table S2 for MS/MS parameters and LC retention time for target xanthine alkaloids.

Table S2.

MS/MS parameters and LC retention time for target xanthine alkaloids

Compound          Parent ion (m/z)  Fragment ion (m/z)  Collision energy (eV)  MS scan function  Retention time (min)
Xanthine          153               110                 17                     1                 4.67
1-Methylxanthine  167               110                 17                     2                 9.36
3-Methylxanthine  167               124                 17                     2                 8.82
7-Methylxanthine  167               124                 17                     2                 8.23
Theobromine       181               138                 17                     3                 12.09
Theophylline*     181               124, 96             30                     3                 12.92
Paraxanthine*     181               124, 55             35                     3                 12.70
Caffeine          195               138                 14                     4                 15.69

*The most abundant fragment ion is listed first.

SI Materials and Methods

Bioinformatics.

For Citrus and Theobroma, we obtained all SABATH gene family members from their respective genomic sequences and used these for BLAST analyses of the EST database of GenBank. Only full-length SABATH genes predicted to encode more than 300 amino acids were included. All ESTs were assembled according to the reference sequence using Sequencher (Gene Codes), and the number of positive EST matches was tallied and the tissue origin was recorded. For Theobroma, only two genes were represented by >5 ESTs in caffeinated cocoa seeds/fruits relative to all other 23 SABATH family members: TcCS1 (Thecc1EG042578) (30 ESTs) and TcCS2 (Thecc1EG042587 and Thecc1EG04290) (>120 ESTs) (Fig. S2A). Thecc1EG042587 and Thecc1EG04290 were tallied together because they encode identical ORFs. One gene, TcBTS, was previously isolated from leaves and reported to have activity with 7-methylxanthine (36). This gene is represented by <5 ESTs in fruits and its KM for that substrate is high (2 mM) (36). Thus, although it may participate in xanthine alkaloid production in leaves, any role in fruits must be relatively minor given the kinetics of the enzyme, particularly relative to those of TcCS1 and TcCS2 (Table S1). In Citrus (57), three SABATH genes are represented by more than five ESTs in flowers, which accumulate both caffeine and theophylline (Fig. S4A). Of these, one of the most abundant sets of ESTs represents an SAMT (1g017514m) that we have experimentally investigated for enzyme activity. It does not methylate any of the xanthine alkaloids at detectable levels; instead, it prefers to methylate salicylic, benzoic, and anthranilic acid (Fig. S4B), the latter of which correlates with the presence of methylanthranilate in the flowers of Citrus (58). Because the in vitro activity of this enzyme appears to be irrelevant for xanthine alkaloid methylation, we do not discuss it further. CisXMT1 (1g044727m) was represented by >30 ESTs, whereas CisXMT2 (1g047625m) was represented by 7 ESTs (Fig. S4A).

Because no genomic sequences exist for Camellia and Paullinia, we used a different approach to survey the EST sets. First, we chose full-length SABATH family sequences from an Asterid, Mimulus, to query Camellia ESTs and a Rosid, Citrus, to query Paullinia ESTs (32). The genomic reference sequences were then used for BLAST analyses of the EST database of GenBank. All Camellia and Paullinia matches were assembled using Sequencher, and phylogenetic analyses were performed to verify the orthology of each putative EST assembly to the query sequence. After orthology was established, EST number and tissue type were recorded. For Camellia, only two genes were represented by any ESTs in leaves and leaf buds, where caffeine accumulates (Fig. S2C). These sequences correspond to the previously studied CS sequences TCS1 (AB031280) (9 ESTs) and TCS2 (AB031281) (12 ESTs) (36). For Paullinia, ESTs were assembled and hypothesized to represent the same gene if they possessed 98% identity over a 100-bp window of the coding sequence. Out of 105 ESTs, five contigs were assembled. These five putative genes were the only SABATH sequences represented by any ESTs, and all were expressed in fruits/seeds where caffeine is known to accumulate (Fig. S2B). One of the contigs contained 18 ESTs and putatively represents the full-length CS sequence, grn006 (which we refer to as PcCS1), that was reported previously but not experimentally characterized (59). The contig with the highest coverage is referred to as PcCS2 (30 ESTs). One contig, PcCS3, was represented by 15 ESTs, but we could not demonstrate activity with any substrate. Therefore, it is not discussed further in the paper. Two other contigs, PcCS4 (27 ESTs) and PcCS5 (15 ESTs), are minor sequence variants of PcCS1 and PcCS2, respectively; they showed the same in vitro enzyme activities as those genes and therefore are not discussed further. It should be noted that the recently reported PcCS (Fig. 2C) that was reported to methylate theobromine to produce caffeine (37) was most similar to ESTs making up PcCS1 and PcCS4, but there were numerous differences in the 3′ end of the gene that did not match any ESTs in GenBank.

Estimation of Michaelis–Menten Parameters.

For kinetic measurements (Table S1), enzyme assays were performed by varying methyl acceptor substrate concentrations while SAM was held at saturating levels. All kinetic studies were performed in two independent experiments with incubation times chosen so that reaction velocity was linear and less than 10% substrate depletion occurred. Initial velocities versus substrate concentrations were plotted using GraphPad Prism to fit the hyperbolic Michaelis–Menten equation for the calculation of Vmax and KM. Vmax was then converted to apparent kcat and expressed in units of s−1.
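The fit described above was carried out in GraphPad Prism. As an illustration of the underlying calculation only, the following sketch fits the Michaelis–Menten equation v = Vmax·[S]/(KM + [S]) by least squares using nothing beyond the Python standard library. It swaps in a different numerical method from Prism's: for each trial KM the best Vmax has a closed form, and KM itself is located by a golden-section search on log10(KM), which assumes the error surface has a single minimum within the search bounds. All function names, bounds, and example numbers are hypothetical.

```python
import math


def fit_michaelis_menten(substrate, velocity, km_low=1e-3, km_high=1e4,
                         iterations=200):
    """Least-squares estimate of (Vmax, KM) from initial-velocity data.

    km_low and km_high bound the search and share the units of `substrate`.
    """
    def vmax_and_sse(km):
        # For a fixed KM the model is linear in Vmax, so Vmax is closed-form.
        x = [s / (km + s) for s in substrate]
        vmax = sum(xi * vi for xi, vi in zip(x, velocity)) / sum(xi * xi for xi in x)
        sse = sum((vi - vmax * xi) ** 2 for xi, vi in zip(x, velocity))
        return vmax, sse

    ratio = (math.sqrt(5) - 1) / 2  # golden-section ratio, about 0.618
    a, b = math.log10(km_low), math.log10(km_high)
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if vmax_and_sse(10 ** c)[1] < vmax_and_sse(10 ** d)[1]:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = 10 ** ((a + b) / 2)
    vmax, _ = vmax_and_sse(km)
    return vmax, km


def apparent_kcat(vmax_mol_per_s, enzyme_mol):
    """Apparent kcat (s^-1): Vmax divided by the amount of enzyme in the assay."""
    return vmax_mol_per_s / enzyme_mol


# Synthetic, noise-free data generated with Vmax = 10 and KM = 50 (arbitrary units).
conc = [5, 10, 25, 50, 100, 250, 500]
rate = [10 * s / (50 + s) for s in conc]
print(fit_michaelis_menten(conc, rate))
```

Fitting the hyperbola directly, rather than a linearized double-reciprocal plot, avoids overweighting the low-concentration points, where measurement error is proportionally largest.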

High-Performance Liquid Chromatography.

Product identity was determined using HPLC on 500-μL scaled-up reactions using all of the same reagents as described above except that nonradioactive SAM was used as the methyl donor and reactions were allowed to progress for 2 h. Whole reactions were filtered through Vivaspin columns (Sartorius) to remove all protein before injection onto the HPLC. Mixtures were separated by HPLC using a two-solvent system with a 250 mm × 4.6 mm Kinetex 5 μm EVO C18 column (Phenomenex). Solvent A was 99.9% (vol/vol) water with 0.1% TFA and solvent B was 80% (vol/vol) acetonitrile, 19.9% (vol/vol) water, and 0.1% TFA; a 0–10% gradient of solvent B was generated over 16 min with a flow rate of 1.0 mL/min. Subsequently, solvent B was increased to 100% over 4 min and then held at that percentage for 20 min. Equilibration back to 0% solvent B was achieved over a 20-min period. Product identity was determined by comparing retention times and absorbance at 254 and 272 nm with those of authentic standards. All reactions were compared with negative controls in which no methyl acceptor substrates were added.
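The solvent program described above can be written as a piecewise-linear function of run time. This is a sketch under stated assumptions: each segment, including the 20-min return to 0% B, is treated as a linear ramp, which the text does not specify, and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, percent solvent B) breakpoints taken from the description;
# linear interpolation between breakpoints is an assumption of this sketch.
GRADIENT = [
    (0.0, 0.0),
    (16.0, 10.0),   # 0-10% B over 16 min
    (20.0, 100.0),  # increase to 100% B over 4 min
    (40.0, 100.0),  # hold at 100% B for 20 min
    (60.0, 0.0),    # return to 0% B over 20 min
]


def percent_b(t_min):
    """Percent solvent B at time t_min (minutes after injection)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time lies outside the programmed run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


print(percent_b(8.0), percent_b(18.0), percent_b(30.0))
```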

Liquid Chromatography–Tandem Mass Spectrometry.

Enzyme assays were analyzed by LC-MS/MS to confirm the product identities inferred from UV absorbance. LC-MS/MS analysis was performed using an Agilent 1100 HPLC inline with a Quattro Micro mass spectrometer using the established HPLC conditions described above except (i) mobile phase A was changed to 0.1% formic acid/0.01% trifluoroacetic acid/water and mobile phase B was 0.1% formic acid/0.01% trifluoroacetic acid/acetonitrile, (ii) the flow rate was reduced to 0.5 mL/min, and (iii) 0.1% formic acid in acetonitrile was added postcolumn via a PEEK tee at a flow rate of 100 μL/min. Compound elution was performed using a linear gradient of 0–16% mobile phase B over 16 min followed by 2 min of 95% B for a run time of 20 min. Under these conditions, mass spectrometry scans allowed for unambiguous identification of all eight xanthine alkaloid metabolites used or produced in the enzyme assays. First, xanthine is clearly separated from all other alkaloids and shows a unique 153.4 > 110 fragmentation pattern. Second, 7-methylxanthine and 3-methylxanthine may be identified from different retention times even though they both show a 167.4 > 124 fragmentation pattern. Third, 1-methylxanthine has a different retention time from all other alkaloids and a unique 167.4 > 110 fragment. Fourth, the 181.5 > 138 fragmentation pattern is specific for theobromine, which also has a clear retention time difference from other alkaloids. Fifth, paraxanthine and theophylline are difficult to separate under these LC conditions. In addition, both have a common 181.5 > 124 fragment. Therefore, to reliably identify each, we also scanned for 181.5 > 96, which is more abundant for theophylline, whereas 181.5 > 55 is more abundant for paraxanthine. Finally, caffeine separated clearly from all other alkaloids and had a specific parent ion–fragment ion signature that allowed its unambiguous identification. Table S2 provides additional details about mass spectrometry fragmentation conditions. Detection was initially optimized using pure standards of the expected products diluted to 1 μM in 0.1% formic acid/50% (vol/vol) acetonitrile and infused directly onto a Waters Quattro Micro mass spectrometer via an electrospray ion source.
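The identification logic laid out above can be summarized as a small decision procedure. The sketch below uses the nominal parent and fragment masses and the retention times listed in Table S2; the function name, the retention-time tolerance of 0.2 min, and the representation of qualifier-ion intensities as a dictionary are assumptions made for illustration, not settings reported in the study.

```python
# compound -> (parent ion m/z, primary fragment m/z, retention time in min),
# using the nominal values listed in Table S2.
TRANSITIONS = {
    "xanthine": (153, 110, 4.67),
    "1-methylxanthine": (167, 110, 9.36),
    "3-methylxanthine": (167, 124, 8.82),
    "7-methylxanthine": (167, 124, 8.23),
    "theobromine": (181, 138, 12.09),
    "theophylline": (181, 124, 12.92),
    "paraxanthine": (181, 124, 12.70),
    "caffeine": (195, 138, 15.69),
}


def identify(parent, fragment, retention_min, qualifiers=None, rt_tolerance=0.2):
    """Return the matching compound name, or None if the call is ambiguous.

    qualifiers: optional dict mapping a secondary fragment m/z (96 or 55) to
        its observed intensity. rt_tolerance (min) is a hypothetical setting.
    """
    candidates = [name for name, (p, f, _) in TRANSITIONS.items()
                  if p == parent and f == fragment]
    if set(candidates) == {"theophylline", "paraxanthine"}:
        # These two coelute and share 181 > 124, so retention time is not
        # used; the more abundant qualifier fragment decides.
        if not qualifiers:
            return None
        i96, i55 = qualifiers.get(96, 0), qualifiers.get(55, 0)
        if i96 == i55:
            return None
        return "theophylline" if i96 > i55 else "paraxanthine"
    matches = [name for name in candidates
               if abs(TRANSITIONS[name][2] - retention_min) <= rt_tolerance]
    return matches[0] if len(matches) == 1 else None


print(identify(167, 124, 8.80))
print(identify(181, 124, 12.80, qualifiers={96: 500, 55: 40}))
```

Returning None rather than a best guess keeps ambiguous peaks, such as a 181 > 124 signal recorded without qualifier scans, from being assigned to either isomer.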

Phylogenetic Analyses.

Amino acid sequences from all enzymatically characterized SABATH gene family members and those from various land plant complete genomes were obtained from GenBank and the PlantTribes database (60). In addition, a limited sample of XMT and CS sequences was obtained from the oneKP database (www.onekp.com) to provide more detailed branching relationships of the recently evolved caffeine biosynthetic enzymes of Theobroma (Malvales), Camellia (Ericales), Coffea (Gentianales), as well as Paullinia and Citrus (Sapindales). Sequences were aligned using MAFFT version 7 (61) using the auto search strategy to maximize accuracy and speed. PhyML (62) was used to generate a maximum-likelihood phylogenetic estimate for the SABATH family members. We assumed the JTT model for amino acid substitution with an invariant and gamma parameter for among-site rate heterogeneity. Bootstrap support values were obtained from 100 replicates.

Acknowledgments

We thank Greg Cavey, Talline Martins, Andre Venter, James Kiddle, Matt Gibson, Logan Rowe, Rachel Steuf, and Kevin Blair for helpful discussions and assistance; Gane Ka-Shu Wong for providing access to the oneKP database to confirm placements of gene sequences shown in Fig. S1; the Chemistry Department at Western Michigan University for facilitating our HPLC analyses; and Thomas Baumann for allowing us to use his photos of Camellia and Paullinia in Fig. 2. This study was supported by National Science Foundation Grant MCB-1120624 (to T.J.B.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1602575113/-/DCSupplemental.

References

  • 1. Rajakumar R, et al. Ancestral developmental potential facilitates parallel evolution in ants. Science. 2012;335(6064):79–82. doi: 10.1126/science.1211451.
  • 2. Sanger TJ, Revell LJ, Gibson-Brown JJ, Losos JB. Repeated modification of early limb morphogenesis programmes underlies the convergence of relative limb length in Anolis lizards. Proc Biol Sci. 2012;279(1729):739–748. doi: 10.1098/rspb.2011.0840.
  • 3. Pichersky E, Lewinsohn E. Convergent evolution in plant specialized metabolism. Annu Rev Plant Biol. 2011;62:549–566. doi: 10.1146/annurev-arplant-042110-103814.
  • 4. Des Marais DL, Rausher MD. Parallel evolution at multiple levels in the origin of hummingbird pollinated flowers in Ipomoea. Evolution. 2010;64(7):2044–2054. doi: 10.1111/j.1558-5646.2010.00972.x.
  • 5. Yoon HS, Baum DA. Transgenic study of parallelism in plant morphological evolution. Proc Natl Acad Sci USA. 2004;101(17):6524–6529. doi: 10.1073/pnas.0401824101.
  • 6. Reed RD, et al. optix drives the repeated convergent evolution of butterfly wing pattern mimicry. Science. 2011;333(6046):1137–1141. doi: 10.1126/science.1208227.
  • 7. Berens AJ, Hunt JH, Toth AL. Comparative transcriptomics of convergent evolution: Different genes but conserved pathways underlie caste phenotypes across lineages of eusocial insects. Mol Biol Evol. 2015;32(3):690–703. doi: 10.1093/molbev/msu330.
  • 8. Schwarze K, et al. The globin gene repertoire of lampreys: Convergent evolution of hemoglobin and myoglobin in jawed and jawless vertebrates. Mol Biol Evol. 2014;31(10):2708–2721. doi: 10.1093/molbev/msu216.
  • 9. Rosenblum EB, Römpler H, Schöneberg T, Hoekstra HE. Molecular and functional basis of phenotypic convergence in white lizards at White Sands. Proc Natl Acad Sci USA. 2010;107(5):2113–2117. doi: 10.1073/pnas.0911042107.
  • 10. Losos JB. Convergence, adaptation, and constraint. Evolution. 2011;65(7):1827–1840. doi: 10.1111/j.1558-5646.2011.01289.x.
  • 11. Ashihara H, Suzuki T. Distribution and biosynthesis of caffeine in plants. Front Biosci. 2004;9:1864–1876. doi: 10.2741/1367.
  • 12. Hammerstone JF, Romanczyk LJ, Aitken WM. Purine alkaloid distribution within Herrania and Theobroma. Phytochemistry. 1994;35(5):1237–1240.
  • 13. Weckerle CS, Stutz MA, Baumann TW. Purine alkaloids in Paullinia. Phytochemistry. 2003;64(3):735–742. doi: 10.1016/s0031-9422(03)00372-8.
  • 14. Nathanson JA. Caffeine and related methylxanthines: Possible naturally occurring pesticides. Science. 1984;226(4671):184–187. doi: 10.1126/science.6207592.
  • 15. Kim YS, et al. Resistance against beet armyworms and cotton aphids in caffeine-producing transgenic chrysanthemum. Plant Biotechnol. 2011;28(4):393–395.
  • 16. Wright GA, et al. Caffeine in floral nectar enhances a pollinator’s memory of reward. Science. 2013;339(6124):1202–1204. doi: 10.1126/science.1228806.
  • 17. Suzuki T, Takahashi E. Caffeine biosynthesis in Camellia sinensis. Phytochemistry. 1976;15(8):1235–1239.
  • 18. Ashihara H, Monteiro AM, Gillies FM, Crozier A. Biosynthesis of caffeine in leaves of coffee. Plant Physiol. 1996;111(3):747–753. doi: 10.1104/pp.111.3.747.
  • 19. D’Auria JC, Chen F, Pichersky E. The SABATH family of MTS in Arabidopsis thaliana and other plant species. In: Romeo JT, editor. Recent Advances in Phytochemistry. Elsevier Science; Oxford: 2003. pp. 253–283.
  • 20. Kato M, et al. Caffeine biosynthesis in young leaves of Camellia sinensis: In vitro studies on N-methyltransferase activity involved in the conversion of xanthosine to caffeine. Physiol Plant. 1996;98(3):629–636.
  • 21. Uefuji H, Ogita S, Yamaguchi Y, Koizumi N, Sano H. Molecular cloning and functional characterization of three distinct N-methyltransferases involved in the caffeine biosynthetic pathway in coffee plants. Plant Physiol. 2003;132(1):372–380. doi: 10.1104/pp.102.019679.
  • 22. Denoeud F, et al. The coffee genome provides insight into the convergent evolution of caffeine biosynthesis. Science. 2014;345(6201):1181–1184. doi: 10.1126/science.1255274.
  • 23. Effmert U, et al. Floral benzenoid carboxyl methyltransferases: From in vitro to in planta function. Phytochemistry. 2005;66(11):1211–1230. doi: 10.1016/j.phytochem.2005.03.031.
  • 24. Seo HS, et al. Jasmonic acid carboxyl methyltransferase: A key enzyme for jasmonate-regulated plant responses. Proc Natl Acad Sci USA. 2001;98(8):4788–4793. doi: 10.1073/pnas.081557298.
  • 25. Varbanova M, et al. Methylation of gibberellins by Arabidopsis GAMT1 and GAMT2. Plant Cell. 2007;19(1):32–45. doi: 10.1105/tpc.106.044602.
  • 26. Granick S. Speculations on the origins and evolution of photosynthesis. Ann N Y Acad Sci. 1957;69(2):292–308. doi: 10.1111/j.1749-6632.1957.tb49665.x.
  • 27. Horowitz NH. On the evolution of biochemical syntheses. Proc Natl Acad Sci USA. 1945;31(6):153–157. doi: 10.1073/pnas.31.6.153.
  • 28. Ycas M. On earlier states of the biochemical system. J Theor Biol. 1974;44(1):145–160. doi: 10.1016/s0022-5193(74)80035-4.
  • 29. Jensen RA. Enzyme recruitment in evolution of new function. Annu Rev Microbiol. 1976;30:409–425. doi: 10.1146/annurev.mi.30.100176.002205.
  • 30. Bergthorsson U, Andersson DI, Roth JR. Ohno’s dilemma: Evolution of new genes under continuous selection. Proc Natl Acad Sci USA. 2007;104(43):17004–17009. doi: 10.1073/pnas.0707158104.
  • 31. Aharoni A, et al. The ‘evolvability’ of promiscuous protein functions. Nat Genet. 2005;37(1):73–76. doi: 10.1038/ng1482.
  • 32. Angelo PCSA, et al. Brazilian Amazon Consortium for Genomic Research (REALGENE). Guarana (Paullinia cupana var. sorbilis), an anciently consumed stimulant from the Amazon rain forest: The seeded-fruit transcriptome. Plant Cell Rep. 2008;27(1):117–124. doi: 10.1007/s00299-007-0456-y.
  • 33. Argout X, et al. Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions. BMC Genomics. 2008;9:512. doi: 10.1186/1471-2164-9-512.
  • 34. Koyama Y, Tomoda Y, Kato M, Ashihara H. Metabolism of purine bases, nucleosides and alkaloids in theobromine-forming Theobroma cacao leaves. Plant Physiol Biochem. 2003;41(11–12):977–984.
  • 35. Zheng XQ, Koyama Y, Nagai C, Ashihara H. Biosynthesis, accumulation and degradation of theobromine in developing Theobroma cacao fruits. J Plant Physiol. 2004;161(4):363–369. doi: 10.1078/0176-1617-01253.
  • 36. Yoneyama N, et al. Substrate specificity of N-methyltransferase involved in purine alkaloids synthesis is dependent upon one amino acid residue of the enzyme. Mol Genet Genomics. 2006;275(2):125–135. doi: 10.1007/s00438-005-0070-z.
  • 37. Schimpl FC, et al. Molecular and biochemical characterization of caffeine synthase and purine alkaloid concentration in guarana fruit. Phytochemistry. 2014;105:25–36. doi: 10.1016/j.phytochem.2014.04.018.
  • 38. Baumann TW, Schulthess BH, Hanni K. Guaraná (Paullinia cupana) rewards seed dispersers without intoxicating them by caffeine. Phytochemistry. 1995;39(5):1063–1070.
  • 39. Kato M, Mizuno K, Crozier A, Fujimura T, Ashihara H. Caffeine synthase gene from tea leaves. Nature. 2000;406(6799):956–957. doi: 10.1038/35023072.
  • 40. Stutz MA. 2001. Purinalkaloide in Blüten. Zeitliche und räumliche Allokation. Aspekte der Biosynthese. Masters thesis (University of Zurich, Zurich).
  • 41. Kretschmar JA, Baumann TW. Caffeine in Citrus flowers. Phytochemistry. 1999;52(1):19–23.
  • 42. Mazzafera P. Catabolism of caffeine in plants and microorganisms. Front Biosci. 2004;9:1348–1359. doi: 10.2741/1339.
  • 43. Ashihara H, Crozier A. Biosynthesis and metabolism of caffeine and related purine alkaloids in plants. Adv Bot Res. 1999;30:117–205.
  • 44. Ashihara H, Crozier A. Caffeine: A well known but little mentioned compound in plant science. Trends Plant Sci. 2001;6(9):407–413. doi: 10.1016/s1360-1385(01)02055-6.
  • 45. Suzuki T, Takahashi E. Biosynthesis of caffeine by tea-leaf extracts. Enzymic formation of theobromine from 7-methylxanthine and of caffeine from theobromine. Biochem J. 1975;146(1):87–96. doi: 10.1042/bj1460087.
  • 46. Huang R, et al. Enzyme functional evolution through improved catalysis of ancestrally nonpreferred substrates. Proc Natl Acad Sci USA. 2012;109(8):2966–2971. doi: 10.1073/pnas.1019605109.
  • 47. Kato M, et al. Purification and characterization of caffeine synthase from tea leaves. Plant Physiol. 1999;120(2):579–586. doi: 10.1104/pp.120.2.579.
  • 48. van Breda SV, van der Merwe CF, Robbertse H, Apostolides Z. Immunohistochemical localization of caffeine in young Camellia sinensis (L.) O. Kuntze (tea) leaves. Planta. 2013;237(3):849–858. doi: 10.1007/s00425-012-1804-x.
  • 49. Wikstrom N, Savolainen V, Chase MW. Evolution of the angiosperms: Calibrating the family tree. Proc Biol Sci. 2001;268(1482):2211–2220. doi: 10.1098/rspb.2001.1782.
  • 50. Thornton JW. Resurrecting ancient genes: Experimental analysis of extinct molecules. Nat Rev Genet. 2004;5(5):366–375. doi: 10.1038/nrg1324.
  • 51. Hippauf F, et al. Enzymatic, expression and structural divergences among carboxyl O-methyltransferases after gene duplication and speciation in Nicotiana. Plant Mol Biol. 2010;72(3):311–330. doi: 10.1007/s11103-009-9572-0.
  • 52. McCarthy AA, McCarthy JG. The structure of two N-methyltransferases from the caffeine biosynthetic pathway. Plant Physiol. 2007;144(2):879–889. doi: 10.1104/pp.106.094854.
  • 53. Prum RO, Brush AH. The evolutionary origin and diversification of feathers. Q Rev Biol. 2002;77(3):261–295. doi: 10.1086/341993.
  • 54. Bridgham JT, Carroll SM, Thornton JW. Evolution of hormone-receptor complexity by molecular exploitation. Science. 2006;312(5770):97–101. doi: 10.1126/science.1123348.
  • 55. Gould SJ. Wonderful Life. WW Norton; New York: 1989.
  • 56. Yang Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–1591. doi: 10.1093/molbev/msm088.
  • 57. Forment J, et al. Development of a citrus genome-wide EST collection and cDNA microarray as resources for genomic studies. Plant Mol Biol. 2005;57(3):375–391. doi: 10.1007/s11103-004-7926-1.
  • 58. Azam M, et al. Comparative analysis of flower volatiles from nine citrus at three blooming stages. Int J Mol Sci. 2013;14(11):22346–22367. doi: 10.3390/ijms141122346.
  • 59. Figueirêdo LC, Faria-Campos AC, Astolfi-Filho S, Azevedo JL. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana). Genet Mol Res. 2011;10(2):1188–1199. doi: 10.4238/vol10-2gmr1124.
  • 60.Wall PK, et al. PlantTribes: A gene and gene family resource for comparative genomics in plants. Nucleic Acids Res. 2008;36(Database issue):D970–D976. doi: 10.1093/nar/gkm972. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–780. doi: 10.1093/molbev/mst010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Guindon S, et al. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst Biol. 2010;59(3):307–321. doi: 10.1093/sysbio/syq010. [DOI] [PubMed] [Google Scholar]

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
