Proceedings of the National Academy of Sciences of the United States of America. 2006 Dec 28;104(2):559–564. doi: 10.1073/pnas.0610012104

Functionally important glycosyltransferase gain and loss during catarrhine primate emergence

Chihiro Koike, Monica Uddin, Derek E Wildman, Edward A Gray, Massimo Trucco, Thomas E Starzl, Morris Goodman
PMCID: PMC1766424  PMID: 17194757

Abstract

A glycosyltransferase, α1,3galactosyltransferase, catalyzes the terminal step in biosynthesis of Galα1,3Galβ1–4GlcNAc-R (αGal), an oligosaccharide cell surface epitope. This epitope or antigenically similar epitopes are widely distributed among the different forms of life. Although abundant in most mammals, αGal is not normally found in catarrhine primates (Old World monkeys and apes, including humans), all of which produce anti-αGal antibodies from infancy onward. Natural selection favoring enhanced resistance to αGal-positive pathogens has been the primary reason offered to account for the loss of αGal in catarrhines. Here, we question the primacy of this immune defense hypothesis with results that elucidate the evolutionary history of GGTA1 gene and pseudogene loci. One such locus, GGTA1P, a processed (intronless) pseudogene (PPG), is present in platyrrhines, i.e., New World monkeys, and catarrhines but not in prosimians. PPG arose in an early ancestor of anthropoids (catarrhines and platyrrhines), and GGTA1 itself became an unprocessed pseudogene in the late catarrhine stem lineage. Strong purifying selection, denoted by low nonsynonymous substitutions per nonsynonymous site/synonymous substitutions per synonymous site values, preserved GGTA1 in noncatarrhine mammals, indicating that the functional gene product is subjected to considerable physiological constraint. Thus, we propose that a pattern of alternative and/or more beneficial glycosyltransferase activity had to first evolve in the stem catarrhines before GGTA1 inactivation could occur. Enhanced defense against αGal-positive pathogens could then have accelerated the replacement of αGal-positive catarrhines by αGal-negative catarrhines. However, we emphasize that positively selected regulatory changes in sugar chain metabolism might well have contributed in a major way to catarrhine origins.

Keywords: adaptive evolution, glycobiology, pseudogene


The catarrhine primates (Old World monkeys and apes, including humans) are distinguished from other primates by a suite of biological characteristics among which are relatively long life spans, extended periods of growth and development, long generation times, large body sizes, atrophied vomeronasal organs but trichromatic vision, and increased encephalization (1–4). Unearthing the genetic, epigenetic, environmental, and behavioral changes associated with the emergence of these features is a major goal of evolutionary primatologists. It is evident from the fossil record and molecular divergence estimates that many of the genetic changes associated with the emergence of catarrhines occurred between 40 and 25 million years ago (mya) (5). Some of the genetic changes during this period of stem-catarrhine evolution drastically altered the prevalence of different cell surface sugar chains in tissues. In this article, we focus on elucidating the evolutionary history of the genetic locus, GGTA1, that was involved in an especially enigmatic alteration, the loss of the cell-surface sugar chain epitope termed αGal (Galα1,3Galβ1–4GlcNAc-R). The GGTA1 locus encodes a particular glycosyltransferase, the Golgi transmembrane-bound α1,3 galactosyltransferase (α1,3GT, EC 2.4.1.87) that catalyzes the terminal step in biosynthesis of the αGal epitope (6).

With the sole exception of catarrhines (7), the αGal epitope is expressed on the surface of cells in all mammalian species examined to date, including artiodactyls, carnivores, rodents, prosimians, and platyrrhines (New World monkeys) (7, 8). The epitope is abundant on such different cells as those of the vascular system (8), the digestive mucosa (9), and the vomeronasal sensory epithelium (10). The αGal-negative catarrhines produce high titers of anti-αGal antibodies from infancy onward (11). These antibodies are similar to the anti-A and anti-B isoagglutinins of humans who lack the A or B antigens of the ABH histo-blood type oligosaccharides. Because of the high titer of catarrhine anti-αGal antibodies, immediate (hyperacute) rejection occurs when organs, tissues, or cells are transplanted from αGal-positive species to αGal-negative catarrhines (12).

The molecular basis for αGal expression was established by the cloning of bovine (13), mouse (14), and porcine (15) GGTA1 cDNA, by examination of the mouse (16) and porcine (17–20) GGTA1 genomic organization, and by production of GGTA1 knockout mice (21) and pigs (22). Insight into why catarrhines lack a functional α1,3GT enzyme was forestalled by the shortage of information on the catarrhine GGTA1 coding region. The initial information was limited to partial sequences and chromosomal locations of an unprocessed human pseudogene and a processed (i.e., intronless) human pseudogene (23, 24), i.e., an orthologue and a paralogue of GGTA1. Based on the finding that the unprocessed pseudogene sequence shared the same in-frame termination codon with the processed pseudogene sequence, it was assumed that the processed pseudogene was generated from an already inactivated source gene (8, 23). It was further assumed from partial catarrhine sequences (25) that the GGTA1 locus became an unprocessed pseudogene twice, once in apes and separately in Old World monkeys. Because the αGal epitope or antigenically similar epitopes occur widely among the different forms of life, including pathogens, it has been suggested that the primary reason the catarrhines would lose αGal is the immunological advantage afforded by anti-αGal antibodies (8, 23, 25–27). There might also be enhanced defense against those pathogens that have binding sites for αGal epitopes, because these pathogens might then not be able to attach themselves to the αGal-negative cells of catarrhines.

The molecular basis for the absence of αGal epitope expression in catarrhine primates was definitively elucidated by delineation of the full coding region and exon–intron structure of the unprocessed pseudogene (termed UPG) in rhesus, orangutan, and human genomes (28). In contrast to the hypothesis of independent inactivation in apes and Old World monkeys, the presence of two shared derived substitutions in the UPG of the three catarrhine species indicated that GGTA1 inactivation occurred in the stem lineage of catarrhines (28). Sequence comparisons also indicated that the origin of the processed pseudogene (GGTA1P, termed PPG) preceded the origin of UPG, in contradiction to the previous view that the PPG arose from the UPG (8, 23).

The uncertainty as to PPG's origin has been resolved by the finding in this study that the PPG locus is present in both platyrrhines and catarrhines but not in the prosimians that are strepsirrhine primates. We identified the expressed GGTA1 locus and sequenced cDNA representing this locus in two strepsirrhines (lemur and loris) and partially sequenced such cDNA in the haplorhine prosimian (tarsier). With a data set of GGTA1, UPG, and PPG orthologues and paralogues, we reconstructed the evolutionary history of these sequences, dating not only when the PPG arose in the stem anthropoids but also when the active gene became a pseudogene, i.e., UPG, in the stem catarrhines. Our data contradict the view that the active gene was relatively neutral to natural selection. Instead, our data suggest that purifying selection pressure favored retention of the GGTA1 gene in noncatarrhine mammals even though they then lacked the possible immunological advantage that would be afforded by producing anti-αGal antibodies. Furthermore, we infer that pseudogenization of the functional GGTA1 gene in catarrhines became possible only after alternative and/or more beneficial glycosyltransferase activity evolved in the stem lineage of the catarrhine primates.

Results

Alignment File.

In this study, full coding region GGTA1 sequences were generated from lemur, loris, and howler monkey samples, and PPG sequences were generated from capuchin, marmoset, rhesus, orangutan, and chimpanzee samples. Also, a partial coding region sequence from a putative GGTA1 locus was generated from a tarsier sample. These sequences and previously reported sequences for the active GGTA1 gene and the PPG and UPG homologues [see supporting information (SI) Table 2] were aligned for subsequent phylogenetic analyses. An alignment file containing the full set of GGTA1, UPG, and PPG nucleotide sequences referred to in this study is available in Nexus format as SI Data Set 1. The species with the active α1,3GT enzyme show many conserved amino acid residues, among which are the 16 that have been previously described as required for the enzyme to function properly (29–32). These 16 residues are identical in all species with an active GGTA1 coding region except in the howler monkey, which differed at three of the critical residues: S207C, R210S, and H288Y (numbered according to the marmoset amino acid sequence). The species with an inactivated GGTA1 coding region, i.e., all catarrhine UPG sequences, show in the alignment file a single nucleotide position gap located where the marmoset sequence codes for amino acid residue 81. This gap designates a frame shift deletion just 5′ of the coding sequence region specifying the enzyme's catalytic domain (28). Thus, if any transcribed message with this frame shift deletion had been translated in the stem catarrhines, the resulting polypeptide would not have been functional.
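The consequence of a single-nucleotide deletion for the downstream reading frame can be illustrated with a toy example. The sketch below does not use the GGTA1 sequence: the 15-nt string, the deleted position, and the helper names `codons`, `first_stop`, and `delete_base` are hypothetical and serve only to show how one missing base alters every subsequent codon and can expose a premature termination codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a coding sequence into complete in-frame triplets."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop(seq):
    """Return the index of the first in-frame stop codon, or None."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


def delete_base(seq, position):
    """Remove the single nucleotide at a 0-based position."""
    return seq[:position] + seq[position + 1:]


# Hypothetical toy sequence, not taken from any GGTA1 locus.
toy = "ATGGCATTGACCGGA"
shifted = delete_base(toy, 5)

print(codons(toy), first_stop(toy))
print(codons(shifted), first_stop(shifted))
```

In this toy case the intact sequence reads through without a stop codon, whereas the deletion shifts the frame and brings a TGA into frame at the third codon.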

Nonsynonymous to Synonymous Rates (ω).

Phylogenetic Analysis by Maximum Likelihood (PAML) (33) rejected the hypothesis (χ2 = 116.44, 37 df, P < 0.0001) of a molecular clock under the “one-ratio” M0 model (−lnL = 7014.74) with a single ω [i.e., nonsynonymous substitutions per nonsynonymous site (dN)/synonymous substitutions per synonymous site (dS)] value on all branches in favor of a “free ratio” M1 model (−lnL = 6956.52) that allowed the ω value to vary on every branch. The M1 tree is depicted in Fig. 1. In general, the ω values are well below 1 for all lineages in which GGTA1 was active. This pattern of purifying selection is consistent with the hypothesis that GGTA1 is a functionally important gene. The highest ω values for active lineages are on the lineage leading to the howler monkey and on the stem primate lineage. The ω values for the UPG and PPG are much more compatible with neutrally evolving lineages, with ω values that tend toward a value of 1. Branch test results comparing the M0 (−lnL = 4766.92) and M2 (−lnL = 4764.96) models for the active gene data set were significantly different (χ2 = 3.93, 1 df, P < 0.05), with the howler lineage showing a dN/dS ratio more than two times higher than the ratio estimated for the rest of the tree (0.64 vs. 0.28, respectively). Maximum parsimony (MP) and maximum likelihood (ML)-reconstructed ancestral sequences yielded dN/dS ratios that were similar to those obtained by PAML analysis except that the primate stem showed a lower ratio (0.36 and 0.20 for MP and ML, respectively) (data not shown), indicative of an active gene evolving under strong purifying selection.
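The likelihood ratio tests reported here can be checked from the stated −lnL values. The following is a minimal sketch, not the PAML implementation: the function names `lrt_statistic` and `chi2_sf` are ours, and the chi-square tail probability is computed with the standard closed-form sums for integer degrees of freedom so that only the Python standard library is needed.

```python
import math


def lrt_statistic(neg_lnl_null, neg_lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (neg_lnl_null - neg_lnl_alt)


def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    m = df // 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i<m} (x/2)^i / i!
        total = sum(half ** i / math.factorial(i) for i in range(m))
        return math.exp(-half) * total
    # Odd df: erfc term plus half-integer gamma terms.
    total = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# M0 (one ratio) versus M1 (free ratio), 37 df, full data set.
clock_stat = lrt_statistic(7014.74, 6956.52)
clock_p = chi2_sf(clock_stat, 37)

# M0 versus M2 (howler lineage separate), 1 df, active genes only.
branch_stat = lrt_statistic(4766.92, 4764.96)
branch_p = chi2_sf(branch_stat, 1)

print(round(clock_stat, 2), clock_p < 0.0001)
print(round(branch_stat, 2), branch_p < 0.05)
```

Recomputing from the rounded −lnL values gives 3.92 rather than the reported 3.93 for the branch test; the small difference presumably reflects rounding of the published likelihoods and does not change the conclusion at the 0.05 level.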

Fig. 1.

PAML-estimated ω (i.e., dN/dS) and (N*dN, S*dS) values under the M1 “free ratio” model, which allows the ω value to vary on each branch. Estimated number of nonsynonymous and synonymous sites under this model are 827.4 and 318.6, respectively. Branches leading to unprocessed (UPG) and processed (PPG) loci are depicted in black. Branches leading to GGTA1 genes are depicted in color, with green indicating those branches with dN/dS values <0.5 (i.e., purifying selection) and red indicating those branches with dN/dS values >0.5 (i.e., relaxed purifying selection). Depicted in gray is the branch on which the UPG locus originated. Tree topology also reflects the optimal maximum likelihood and Bayesian phylogenetic results (−lnL = 7125.71 and 7140.76, respectively); maximum parsimony bootstrap results differed only in showing a trichotomy among the active New World monkey genes.
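The relation between the (N*dN, S*dS) counts and ω in the legend can be sketched as follows. The site totals are those given in the legend; the helper names and the synonymous count of 6.0 used in the example are hypothetical illustrations, not values read from the figure. The 0.5 cutoff mirrors the color scheme described in the legend.

```python
N_SITES = 827.4  # nonsynonymous sites reported for the M1 model
S_SITES = 318.6  # synonymous sites reported for the M1 model


def omega(n_changes, s_changes):
    """dN/dS from branch counts of N*dN and S*dS changes."""
    if s_changes <= 0:
        raise ValueError("omega is undefined without synonymous changes")
    dn = n_changes / N_SITES
    ds = s_changes / S_SITES
    return dn / ds


def legend_class(value):
    """Apply the 0.5 cutoff used for branch coloring in the legend."""
    return "purifying" if value < 0.5 else "relaxed purifying"


# 3.5 nonsynonymous changes with a hypothetical 6.0 synonymous changes.
example = omega(3.5, 6.0)
print(round(example, 3), legend_class(example))
```

Because there are roughly 2.6 times as many nonsynonymous as synonymous sites, a branch with equal raw counts of the two kinds of change still has ω well below 1.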

Phylogenetic Analyses.

The optimal MP, ML, and Bayesian phylogenetic results are in close agreement (Fig. 1). Parsimony searches recovered three most parsimonious trees of length 1,121, and the only source of incongruence among these topologies is the branching order of the GGTA1 sequences within the New World monkey clade. The optimal ML topology agrees with both the MP and Bayesian topologies. Branch support for all clades (SI Table 3) is generally quite high.

Time of PPG and UPG Origin.

The PPG lineage is estimated to have originated 57.6 mya, indicating an appearance on the stem anthropoid lineage at a date close to that of the tarsier-anthropoid last common ancestor (LCA): i.e., the crown haplorhine LCA. The UPG lineage is estimated to have originated 28.1 mya, supporting the inference that the conversion of the GGTA1 to the UPG locus preceded the crown catarrhine LCA by only a few million years.

Tarsier Sequence.

This study also provides sequence information from the genus Tarsius for its putative GGTA1 locus. It is not certain whether the sequences obtained represent PPG or GGTA1 sequences. In favor of the former, phylogenetic analyses group the tarsier sequences with sequences from the PPG clade, albeit with low branch support (SI Table 4). This finding is consistent with the calculated appearance of the PPG at a time close to the haplorhine LCA (see above). Conversely, there are no disruptive mutations in the sequence that would cause either reading frameshifts or premature termination codons. Also, in the phylogenetic tree that included the partial tarsier sequence, the tarsier lineage has a low ω value (0.21, SI Fig. 3), which indicates a functional gene under purifying selection. It will be necessary to obtain either intronic or full-length cDNA sequence data to fully determine whether the tarsier possesses an active GGTA1 locus.

Evolutionary Rates.

The “Hominoid Slowdown” hypothesis was originally proposed on the basis of data that revealed very small degrees of serum albumin antigenic divergence among hominoids (apes, including humans) and also on the basis of ideas concerning such hominoid organismal features as lengthened gestations and increased generation times (34–36). That a slowdown in rates of molecular evolution occurred has been affirmed in numerous studies since then (37–42). A comparison of rates calculated from the PPG clade supports the concept of a rate slowdown in anthropoid lineages, especially in the catarrhines (Table 1). The UPG rates also show the hominoid rate being slower than the Old World monkey rate.

Table 1.

Evolutionary rates in selectively neutral DNA

Time period/lineage    Total changes*    Nucleotide substitution rate†
Rates from PPG data
57.6–40 mya            95.1              4.7
40 mya, marmoset       100.6             2.2
40 mya, capuchin       77.1              1.7
40 mya, rhesus         59.1              1.3
40 mya, orangutan      61.2              1.3
40 mya, chimpanzee     53.1              1.2
40 mya, human          56.3              1.2
25 mya, rhesus         36.2              1.3
25 mya, orangutan      38.3              1.3
25 mya, chimpanzee     30.2              1.1
25 mya, human          33.4              1.2
14 mya, orangutan      17.8              1.1
14 mya, chimpanzee     9.7               0.6
14 mya, human          12.9              0.8
6 mya, chimpanzee      2.1               0.3
6 mya, human           5.3               0.8
Rates from UPG data‡
To rhesus UPG          53.7              1.9
To orangutan UPG       26.9              0.9
To human UPG           30.6              1.1

*Nonsynonymous and synonymous changes.

†Reported as nucleotide substitutions per million years per 1,000 nt positions.

‡Twenty-five million years ago to present.
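The rate column of Table 1 can be reproduced from the total changes and the elapsed time. The sketch below assumes that the per-1,000-nt normalization uses the 1,146 sites (827.4 nonsynonymous plus 318.6 synonymous) reported in the Fig. 1 legend; that denominator is our inference, adopted because it recovers the tabulated values, and the function name `substitution_rate` is ours.

```python
N_SITES = 827.4
S_SITES = 318.6
TOTAL_KB = (N_SITES + S_SITES) / 1000.0  # assumed alignment length in kb


def substitution_rate(total_changes, start_mya, end_mya=0.0):
    """Substitutions per million years per 1,000 nt positions."""
    duration = start_mya - end_mya
    if duration <= 0:
        raise ValueError("start_mya must be older than end_mya")
    return total_changes / duration / TOTAL_KB


# (label, total changes, start in mya, end in mya) taken from Table 1.
rows = [
    ("57.6-40 mya", 95.1, 57.6, 40.0),
    ("40 mya, marmoset", 100.6, 40.0, 0.0),
    ("25 mya, human", 33.4, 25.0, 0.0),
    ("6 mya, chimpanzee", 2.1, 6.0, 0.0),
    ("To rhesus UPG", 53.7, 25.0, 0.0),
]

for label, changes, start, end in rows:
    print(label, round(substitution_rate(changes, start, end), 1))
```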

Discussion

From our results, we can infer that, in noncatarrhine mammals, the GGTA1 gene functions under strong physiological constraints. The reconstructed evolutionary history of this gene (Fig. 2) reveals that, in the stem lineage to anthropoid primates, near the time of the tarsier–anthropoid LCA (≈57–58 mya), GGTA1 gave rise to a processed, retrotranscribed pseudogene. The two paralogously related loci (GGTA1 and PPG) then coexisted in the same anthropoid lineages but evolved independently of each other. The retrotranscribed PPG locus freely accumulated mutations that escaped the scrutiny of natural selection, whereas the parent GGTA1 gene did not escape such scrutiny in the anthropoid stem and platyrrhine lineages but, nevertheless, did become a UPG in the late catarrhine stem lineage (≈28 mya). Thereafter, the UPG sequences accumulated mutations freely in catarrhine lineages.

Fig. 2.

Phylogeny of the GGTA1 locus. Shown in black are the lineages in which GGTA1 remained active. Shown in red are the lineages in which GGTA1 was inactivated. Blue and red arrows indicate, respectively, the origin of the processed pseudogene (PPG) and the origin of the unprocessed pseudogene (UPG). The latter event corresponds to the time of GGTA1 inactivation that occurs in the stem-catarrhine lineage and precedes the last common ancestor of living catarrhines by only a few million years. The vertical line at the left indicates time (millions of years ago).

Despite being freed of physiological constraints, both the PPG and UPG sequences have nucleotide substitution rates that slowed in hominoid lineages. This slowdown indicates that rates of occurrence of new mutations decreased during the descent of hominoids. In turn, the decreased de novo mutation rates probably resulted directly from lengthened generation times and indirectly from the whole suite of adaptive changes that shaped catarrhine origins and evolution. We suspect that among the adaptive changes were those that brought about a new pattern of sugar chain expression that then permitted inactivation of the GGTA1 gene.

If the production of anti-αGal antibodies was the crucial survival advantage worth loss of the α1,3GT enzyme, inactivation of the GGTA1 gene should have occurred in many mammalian species. Yet over the past 100 or so million years, no examples of αGal-negative lineages are known to have arisen among noncatarrhine mammals (13–20, 28, 43, and this study). Indeed, strong purifying selection with preservation of the functional GGTA1 locus in noncatarrhine mammals is evident from our finding that the αGal-positive lineages accumulated nonsynonymous substitutions at this locus much more slowly than synonymous substitutions (Fig. 1).

That loss of the GGTA1 gene can be deleterious has become evident by the production of GGTA1 KO mice and pigs for xenotransplantation research (21, 22, 44). Animals with GGTA1 KO have both subtle and conspicuous health abnormalities. For example, the KO mice develop early-onset cataracts (21). The founder KO pigs were only a few months old and appeared to be healthy at the time they were reported (22). However, they proved to be small, frail, and difficult to breed (unpublished observations of C.K., M.T., and T.E.S., who also are coauthors of ref. 22). Such findings are consistent with the view that the GGTA1 gene plays an important role in the endogenous metabolic homeostasis of αGal-positive species.

However, the fact that the KO is not lethal suggests that the activities of the α1,3GT enzyme can be substituted for (albeit incompletely) by another glycosyltransferase or by multiple glycosyltransferases. We propose that adaptive coevolution of other glycosyltransferase genes made such a substitution possible in the late stem-catarrhines. If the functional consequences of the replacement glycosyltransferase activity were more beneficial than the original α1,3GT activity, that could have provided the major impetus for GGTA1 gene inactivation. However, the inactivation would still leave open the possibility that pathogen resistance played a substantial role in GGTA1 evolution.

No doubt, the evolution of glycosyltransferase gene families (26, 45–48) involved not only changes in enzyme encoding sequences but also in promoter and other regulatory sequences. Such changes would be expected to result in species-specific patterns of developmental and tissue expression. For example, the ABH epitopes are expressed in the digestive mucosa of all mammalian species but are not expressed on the vascular endothelial cells (VEC) of noncatarrhine mammals (49). All catarrhines express ABH epitopes on VEC. Apes also express ABH epitopes on RBC but Old World monkeys do not (49). Another example of species differences in oligosaccharide tissue expression patterns involves the sialic acid moiety. This moiety is found at higher concentrations in the brains of apes than in other mammals and, among apes, is more abundant in human than in chimpanzee brains (50). A striking finding is the loss of the sialic acid moiety N-glycolylneuraminic acid (Neu5Gc) in humans and its replacement by a greater abundance of its precursor, N-acetylneuraminic acid (Neu5Ac) (51). This human-specific loss resulted from an inactivating mutation in the gene encoding cytidine-monophosphate N-acetylneuraminic acid hydroxylase (CMAH) (51). Siglecs, i.e., receptors that can interact with the cell-surface sialic acid oligosaccharides, also show striking evolutionary changes in descent of catarrhine lineages (51, 52).

Clearly, the changes associated first with catarrhine primate emergence and then with later ape lineages involved striking shifts in the expression of sugar chains. The evidence from our study indicates that maintenance of the GGTA1 gene was an obligatory condition for survival of noncatarrhine mammals in the wild. In our metabolic hypothesis, the αGal-negative state could not be established until the activity of the α1,3GT enzyme was replaced by alternative and/or more beneficial glycosyltransferase activity, i.e., by a major adaptive change in sugar chain metabolism. The resulting production of anti-αGal antibodies may well have enhanced protection against αGal-positive pathogens (26, 27), but this added benefit from the loss of αGal would have required first the alterations in sugar-chain metabolism.

Loss of αGal in the late stem-catarrhines may have been implicated in the decreased reliance on olfaction that accompanied the greater reliance on vision. Observations made on the sensory epithelium of the vomeronasal organ (VNO) of rats (10) show that αGal epitopes are abundantly expressed in this sensory epithelium and thus presumably engaged in the VNO's important olfactory function of detecting pheromones. Catarrhines have atrophic VNO tissue, loss of olfactory receptor genes, and decreased pheromone detection abilities (3, 53, 54). It may be significant that the only noncatarrhine lineage in our GGTA1 analysis showing an increased dN/dS ratio is the platyrrhine lineage to the howler monkey, suggesting that purifying selection for maintenance of α1,3GT function acted less strongly on this lineage than on all other noncatarrhine lineages. Possibly, the howler monkey relies less on olfaction than on vision as suggested by Gilad et al. (54). The howler monkey is the only primate other than catarrhines that is known to have full trichromatic vision (4, 55–58). Examination of many other primates, especially other members of the Atelidae (the family to which howler monkeys belong), is needed to determine whether, among noncatarrhine primates, the decreased purifying selection acting on the GGTA1 gene is specific to the howler monkey.

Finally, however, we emphasize that studies of the GGTA1 gene cannot be interpreted in isolation. Since discovery of the ABO (H) histo-blood groups, such systems have been seen mainly from the perspective of immunological reactions to cell-surface antigens (26, 45, 46). In place of the hypothesis that loss of αGal was primarily driven by an immunologic advantage against pathogens, we propose that GGTA1 inactivation could not be established until alternative glycosyltransferase activity replaced α1,3GT activity. This replacement involved striking shifts in sugar-chain expression patterns. In subjecting the many glycosyltransferase genes to detailed phylogenetic studies, the species chosen should reflect the great adaptive diversity that exists among primates and other mammals. The results of such studies should help test our hypothesis that positively selected changes in sugar chain metabolism coincided with and helped bring about emergence and evolution of the catarrhine primates.

Materials and Methods

Materials.

Whole blood from the ring-tailed lemur (Lemur catta), slender loris (Loris tardigradus), Philippine tarsier (Tarsius syrichta), common marmoset (Callithrix jacchus), white-throated capuchin (Cebus capucinus), black-and-gold howler (Alouatta caraya), rhesus macaque (Macaca mulatta), orangutan (Pongo pygmaeus), and chimpanzee (Pan troglodytes) was kindly provided by the Pittsburgh Zoo (Pittsburgh, PA), University of Wisconsin (Madison, WI), University of Pittsburgh, Wayne State University, Yerkes National Primate Research Center, Emory University (Atlanta, GA), or the Duke University Primate Research Center (Durham, NC).

Methods.

Standard methods were used to isolate high-molecular-weight genomic DNA from the various samples. Total RNA was extracted from the samples with TRIzol reagent (GIBCO, Carlsbad, CA). Poly(A)+ RNA was separated from total RNA by using the Dynabeads mRNA Purification Kit (Dynal, Oslo, Norway) according to the manufacturer's instructions.

DNA Amplification and Sequencing.

GenomeWalker libraries for the respective species were constructed by using the Universal GenomeWalker Library kit (Clontech, Palo Alto, CA). Human PPG amplification was obtained with GenomeWalker-PCR. Species- or clade-specific PPG primers were designed from sequences initially obtained by using the human PPG primer set. For the lemur GGTA1 gene, primers were designed based on sequence obtained by using the previously published (28) human UPG primer set; these newly designed primers were also used to obtain loris and tarsier gene sequences. Primer sequences used to identify the various genes are as follows: Lemur GGTA1 gene: L9A, 5′-CATCATGCTGGACGACATCTCCAAGATGC-3′; L9B, 5′-CAAGCCTGAGAAGAGGTGGCAGGACATC-3′; L9P, 5′-GTATGCTGACTTTACGCCTCTCATAGG-3′; L9Q, 5′-GTAGCTGAGCCACTGACTGGCCCAG-3′. New World monkey PPG: M4a, 5′-AGGAGAAAATAATGAATGTCAAAGGAAACGT-3′; M6a, 5′-ACACCCAGAAGTTGTTGACAGCGGCAC-3′; M8p, 5′-TCTTTCCACAGCAAACCCATCAACCCCA-3′; M8q, 5′-GCCTTCCCACATAACCGGCACATTCCA-3′. Rhesus PPG: Rpa, 5′-GCTGAGTGGATGGATGATGGGGAGGAG-3′; Rpq, 5′-CAAGCTGATCTCGAACTCCTGACCTCACGTG-3′. Orangutan PPG: Upa, 5′-GTCAAAGCCGATACGTTTTCCCGGCAG-3′; Upq, 5′-ACCATAGATTCATTCTCTCATATTACAGTGCTC-3′.

TaKaRa LA Taq (Takara Shuzo, Shiga, Japan) and Titanium Taq (Clontech) enzymes were used for all PCR experiments. Thermal cycling followed the conditions recommended by the manufacturer and was performed on a Gene Amp System 9600 or 9700 thermocycler (PerkinElmer, Wellesley, MA).

To identify the 5′ and 3′ ends of the GGTA1 gene or UPG transcripts of the lemur, loris, tarsier, marmoset, capuchin, rhesus, orangutan, and chimpanzee, the Marathon RACE (rapid amplification of cDNA end) libraries (Clontech) were constructed from total RNA of the different species in accordance with the manufacturer's specified protocol.

PCR products amplified by the GW-PCR, RACE-PCR, and RT-PCR were subcloned into the pCR II vector provided with the Original TA Cloning kit (Invitrogen, Carlsbad, CA). Automated fluorescent sequencing of cloned inserts was performed by using an ABI 377 DNA Sequence Analyzer (Applied Biosystems, Foster City, CA).

Sequence Analyses.

Newly obtained and previously reported DNA sequences (SI Table 2) were manually aligned in MacClade 4.08 (59). Deduced amino acid sequences of GGTA1 genes were checked for accuracy against protein sequences associated with the nucleotide GenBank accession numbers. Gene trees were inferred by using MP and ML methods as implemented in PAUP* 4.0b10 (60) and by using Bayesian methods as implemented by MrBayes 3.1 (61). For the latter two methods, the optimal model of sequence evolution was determined by using ModelTest 3.7 (62) and MrModeltest2.2 (63), respectively, before inferring optimal gene trees. ModelTest 3.7 selected TVM+Γ as the best-fit model (−lnL = 7125.71) according to the Akaike Information Criterion (AIC) (64), with the following associated parameters: Lset Base = (0.2876 0.2117 0.2446), Nst = 6, Rmat = (1.7559 5.6656 0.7355 1.7448 5.6656), Rates = γ, Shape = 1.1091, Pinvar = 0. MrModelTest2.2 selected a GTR+ Γ as the best-fit model according to the AIC, with the following model form: Prset statefreqpr = dirichlet (1, 1, 1, 1); Lset nst = 6 rates = γ.

Patterns of selection across the inferred tree for the entire data set were estimated by using PAML 3.15 (33). In addition, ancestral sequences were reconstructed by using MP (with the DELTRAN algorithm) and ML (by using the TVM+Γ model) methods for each node in the tree depicted in Fig. 1; dN/dS for each node and its descendent branch was then calculated by using MEGA 3.1 (65) with the Pamilo–Bianchi–Li method (66, 67). The existence of a molecular clock was tested by comparing the M0 model, which estimates a single ω (dN/dS) ratio for all links contained in the tree, with the M1 model, which is allowed to estimate a different ω ratio for each link contained in the tree. In addition, to test whether the appearance of color vision correlates with a pattern of selection that suggests a reduction of functional constraints, a data set containing only the active genes was analyzed by comparing the M0 model with an M2 model, which allowed one rate for the howler lineage and another for the rest of the tree. In all cases, the existence of multiple local optima was evaluated by running each model three times, with three different starting ω values (0.5, 1, and 2). Where differences existed, the value with the highest likelihood score was used to compare models by using the likelihood ratio test.

Estimating the Time of PPG and UPG Origin.

The origin of the PPG and UPG sequences was estimated by using both the topology determined by the gene tree analyses and the nonsynonymous (N) and synonymous (S) values depicted in Fig. 1. As in Goodman et al. (5), Goodman (68), and Wildman and Goodman (69), 63 mya was taken as the date for the LCA of the living primates, 40 mya for the LCA of living anthropoids (platyrrhines and catarrhines), 26 mya for the LCA of living platyrrhines, 25 mya for the LCA of living catarrhines, and 14 mya for the orangutan–human LCA.

PPG Origin.

By using the N and S changes determined for each lineage depicted in Fig. 1, the distance (number of N plus number of S) between the primate LCA and the PPG-stem anthropoid LCA added to the distance between PPG-stem anthropoid LCA and crown anthropoid LCA was equated to 23 million years (63–40 mya); this value totaled 41.5. Then, by proportionality, the number of million years (X) was determined between: (i) primate LCA and PPG-stem anthropoid LCA; 23 million years/41.5 = X/9.8, where X = 5.4 million years, indicating a time of PPG origin at (63 − 5.4) or 57.6 mya; and (ii) PPG-stem anthropoid LCA and crown anthropoid LCA; 23 million years/41.5 = X/31.7, where X = 17.6 million years, indicating a time of PPG origin at (40 + 17.6) or 57.6 mya.
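The proportionality can be worked through numerically. In this minimal sketch the function and variable names are ours; the distances (9.8 and 31.7 changes) and calibration dates (63 and 40 mya) are those stated in the text. Both steps date the same node, the origin of the PPG, once counting down from the primate LCA and once counting up from the crown anthropoid LCA.

```python
def proportional_time(total_time_my, total_distance, branch_distance):
    """Scale a branch's share of the changes to its share of the time."""
    return total_time_my * branch_distance / total_distance


PRIMATE_LCA_MYA = 63.0
CROWN_ANTHROPOID_LCA_MYA = 40.0
SPAN_MY = PRIMATE_LCA_MYA - CROWN_ANTHROPOID_LCA_MYA  # 23 million years

upper_distance = 9.8   # primate LCA to PPG origin (N + S changes)
lower_distance = 31.7  # PPG origin to crown anthropoid LCA (N + S changes)
total_distance = upper_distance + lower_distance  # 41.5

# (i) count down from the primate LCA
x_upper = proportional_time(SPAN_MY, total_distance, upper_distance)
ppg_origin_from_top = PRIMATE_LCA_MYA - x_upper

# (ii) count up from the crown anthropoid LCA
x_lower = proportional_time(SPAN_MY, total_distance, lower_distance)
ppg_origin_from_bottom = CROWN_ANTHROPOID_LCA_MYA + x_lower

print(round(x_upper, 1), round(ppg_origin_from_top, 1))
print(round(x_lower, 1), round(ppg_origin_from_bottom, 1))
```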

UPG Origin.

Next, the conversion of the active GGTA1 gene to the UPG on the catarrhine stem was determined by assuming that, at the time of pseudogenization, a slow rate for N functional substitutions became a fast rate for N pseudogene substitutions. By using the N changes depicted in Fig. 1, a functional N rate was estimated from the platyrrhine stem (e.g., 3.5/14 = 0.25 N per million years) and a pseudogene N rate was estimated from the UPG branch to the rhesus monkey (e.g., 37.1/25 = 1.48 N per million years). With these two rates and the total number of N changes on the 15 million years represented by the catarrhine stem (i.e., 7.6), the number of millions of years that the stem catarrhine GGTA1 locus existed before it became the UPG locus (i.e., X) was calculated in the following manner: 0.25X + 1.48 (15 − X) = 7.6. Thus, X = 11.87 million years, and the UPG locus originated ≈28.1 mya (40 − 11.87).
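The two-rate equation can be solved directly. This is a sketch under the assumptions stated in the text (a functional rate of 0.25 and a pseudogene rate of 1.48 nonsynonymous changes per million years, 7.6 changes on a 15-million-year stem that begins at 40 mya); the function name is ours.

```python
def years_before_inactivation(functional_rate, pseudogene_rate,
                              stem_length_my, stem_n_changes):
    """Solve f*X + p*(L - X) = changes for X, the time spent as an active gene."""
    if pseudogene_rate == functional_rate:
        raise ValueError("rates must differ to locate the switch point")
    return ((pseudogene_rate * stem_length_my - stem_n_changes)
            / (pseudogene_rate - functional_rate))


STEM_START_MYA = 40.0  # crown anthropoid LCA, start of the catarrhine stem
STEM_LENGTH_MY = 15.0  # 40 mya to the crown catarrhine LCA at 25 mya

functional_rate = 0.25  # 3.5 N changes / 14 million years, platyrrhine stem
pseudogene_rate = 1.48  # 37.1 N changes / 25 million years, rhesus UPG branch

x = years_before_inactivation(functional_rate, pseudogene_rate,
                              STEM_LENGTH_MY, 7.6)
upg_origin_mya = STEM_START_MYA - x

print(round(x, 2), round(upg_origin_mya, 1))
```

The estimate is sensitive to the two assumed rates; for instance, a faster pseudogene rate would place the inactivation closer to the crown catarrhine LCA.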

Finally, overall evolutionary rates were calculated for selectively neutral DNA as represented by the PPG clade and for PPG and UPG rates from the crown catarrhine LCA until the present day.

Supplementary Material

Supporting Information

Acknowledgments

We thank Ms. Terese Libert for technical assistance; Mss. Terry Mangan, Angela Alexander, Jenni Lajzerowicz, Madge Swindells, and Rexanne Struve for manuscript preparation; and Ajit Varki, Joseph Hacia, Nathaniel Dominy, Baruch Blumberg, and Nandor Gabor Than for insightful discussion. This work was supported by National Institutes of Health Grant R01DK64207 and in part by the Intramural Research Program of the National Institute of Child Health and Human Development, National Institutes of Health, Department of Health and Human Services and National Science Foundation Grant BCS-0550209.

Abbreviations

dS: synonymous substitutions per synonymous site
dN: nonsynonymous substitutions per nonsynonymous site
LCA: last common ancestor
mya: million years ago
ML: maximum likelihood
MP: maximum parsimony
PPG: processed pseudogene
UPG: unprocessed pseudogene

Footnotes

The authors declare no conflict of interest.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. DQ985356–DQ985360, AF378671, AF521020, AY126667, and AY570802).

This article contains supporting information online at www.pnas.org/cgi/content/full/0610012104/DC1.

References

  • 1. Martin RD. Primate Origins and Evolution: A Phylogenetic Reconstruction. Princeton: Princeton Univ Press; 1990.
  • 2. Harvey PH, Martin RD, Clutton-Brock TH. In: Primate Societies. Smuts BB, Cheney DL, Seyfarth RM, Wrangham RW, Struhsaker TT, editors. Chicago: Univ of Chicago Press; 1987. pp. 181–196.
  • 3. Zhang J, Webb DM. Proc Natl Acad Sci USA. 2003;100:8337–8341. doi: 10.1073/pnas.1331721100.
  • 4. Dominy NJ, Lucas PW. Nature. 2001;410:363–366. doi: 10.1038/35066567.
  • 5. Goodman M, Porter CA, Czelusniak J, Page SL, Schneider H, Shoshani J, Gunnell G, Groves CP. Mol Phylogenet Evol. 1998;9:585–598. doi: 10.1006/mpev.1998.0495.
  • 6. Blanken WM, Van den Eijnden DH. J Biol Chem. 1985;260:12927–12934.
  • 7. Galili U, Shohet SB, Kobrin E, Stults CLM, Macher BA. J Biol Chem. 1988;263:17755–17762.
  • 8. Joziasse DH, Oriol R. Biochim Biophys Acta. 1999;1455:403–418. doi: 10.1016/s0925-4439(99)00056-3.
  • 9. McKenzie IF, Xing PX, Vaughan HA, Prenzoska J, Dabkowski PL, Sandrin MS. Transplant Immunol. 1994;2:81–86. doi: 10.1016/0966-3274(94)90032-9.
  • 10. Takami S, Getchell ML, Getchell TV. Cell Tissue Res. 1995;280:211–216. doi: 10.1007/BF00307791.
  • 11. Galili U, Clark MR, Shohet SB, Buehler J, Macher BA. Proc Natl Acad Sci USA. 1987;84:1369–1373. doi: 10.1073/pnas.84.5.1369.
  • 12. Good AH, Cooper DKC, Malcolm AJ, Ippolito RM, Koren E, Neethling FA, Ye Y, Zuhdi N, Lamontagne LR. Transplant Proc. 1992;24:559–562.
  • 13. Joziasse DH, Shaper JH, Van den Eijnden DH, Van Tunen AJ, Shaper NL. J Biol Chem. 1989;264:14290–14297.
  • 14. Larsen RD, Rajan VP, Ruff MM, Kukowska-Latallo J, Cummings RD, Lowe JB. Proc Natl Acad Sci USA. 1989;86:8227–8231. doi: 10.1073/pnas.86.21.8227.
  • 15. Sandrin MS, Dabkowski PL, Henning MM, Mouhtouris E, McKenzie IFC. Xenotransplant. 1994;1:81–85.
  • 16. Joziasse DH, Shaper NL, Kim D, Van den Eijnden DH, Shaper JH. J Biol Chem. 1992;267:5534–5541.
  • 17. Strahan KM, Gu F, Preece AF, Gustavsson I, Andersson L, Gustafsson K. Immunogenetics. 1995;41:101–105. doi: 10.1007/BF00182319.
  • 18. Vanhove B, Goret F, Soulillou JP, Pourcel C. Biochim Biophys Acta. 1997;1356:1–11. doi: 10.1016/s0167-4889(96)00151-6.
  • 19. Katayama A, Ogawa H, Kadomatsu K, Kurosawa N, Kobayashi T, Kaneda N, Uchimura K, Yokoyama I, Muramatsu T, Takagi H. Glycoconj J. 1998;6:583–589. doi: 10.1023/a:1006963809894.
  • 20. Koike C, Friday RP, Nakashima I, Luppi P, Fung JJ, Rao AS, Starzl TE, Trucco M. Transplant. 2000;70:1275–1283. doi: 10.1097/00007890-200011150-00004.
  • 21. Tearle RG, Tange MJ, Zannettino ZL, Katerelos M, Shinkel TA, Van Denderen BJW, Lonie AJ, Lyons I, Nottle MB, Cox T, et al. Transplant. 1996;61:13–19. doi: 10.1097/00007890-199601150-00004.
  • 22. Phelps CJ, Koike C, Vaught TD, Boone J, Wells KD, Chen SH, Ball S, Specht SM, Polejaeva IA, Monahan JA, et al. Science. 2003;299:411–414. doi: 10.1126/science.1078942.
  • 23. Joziasse DH, Shaper JH, Jabs EW, Shaper NL. J Biol Chem. 1991;266:6991–6998.
  • 24. Shaper NL, Lin SP, Joziasse DH, Kim DY, Yang-Feng TL. Genomics. 1992;12:613–615. doi: 10.1016/0888-7543(92)90458-5.
  • 25. Galili U, Swanson K. Proc Natl Acad Sci USA. 1991;88:7401–7404. doi: 10.1073/pnas.88.16.7401.
  • 26. Gagneux P, Varki A. Glycobiology. 1999;9:747–755. doi: 10.1093/glycob/9.8.747.
  • 27. Varki A. Cell. 2006;126:841–845. doi: 10.1016/j.cell.2006.08.022.
  • 28. Koike C, Fung JJ, Geller DA, Kannagi R, Libert T, Luppi P, Nakashima I, Profozich J, Rudert W, Sharma SB, et al. J Biol Chem. 2002;277:10114–10120. doi: 10.1074/jbc.M110527200.
  • 29. Gastinel LN, Bignon C, Misra AK, Hindsgaul O, Shaper JH, Joziasse DH. EMBO J. 2001;20:638–649. doi: 10.1093/emboj/20.4.638.
  • 30. Boix E, Swaminathan GJ, Zhang Y, Natesh R, Brew K, Acharya KR. J Biol Chem. 2001;276:48608–48614. doi: 10.1074/jbc.M108828200.
  • 31. Shetterly S, Tom I, Yen TY, Joshi R, Lee L, Wang PG, Macher BA. Glycobiology. 2001;11:645–653. doi: 10.1093/glycob/11.8.645.
  • 32. Lazarus BD, Milland J, Ramsland PA, Mouhtouris E, Sandrin MS. Glycobiology. 2002;12:793–802. doi: 10.1093/glycob/cwf092.
  • 33. Yang Z. Comput Appl Biosci. 1997;13:555–556. doi: 10.1093/bioinformatics/13.5.555.
  • 34. Goodman M. Hum Biol. 1961;33:131–162.
  • 35. Goodman M. Hum Biol. 1962;34:104–150.
  • 36. Goodman M. In: Classification and Human Evolution. Washburn SL, editor. Chicago: Aldine; 1963. pp. 204–234.
  • 37. Kim S-H, Elango N, Warden C, Vigoda E, Yi SV. PLoS Genet. 2006;2:e163. doi: 10.1371/journal.pgen.0020163.
  • 38. Steiper ME, Young NM, Sukarna TY. Proc Natl Acad Sci USA. 2004;101:17021–17026. doi: 10.1073/pnas.0407270101.
  • 39. Bailey WJ, Fitch DH, Czelusniak J, Slightom JL, Goodman M. Mol Biol Evol. 1991;8:155–184. doi: 10.1093/oxfordjournals.molbev.a040641.
  • 40. Seino S, Bell GI, Li WH. Mol Biol Evol. 1992;9:193–203. doi: 10.1093/oxfordjournals.molbev.a040713.
  • 41. Ellsworth DL, Hewett-Emmett D, Li WH. Mol Phylogenet Evol. 1993;2:315–321. doi: 10.1006/mpev.1993.1030.
  • 42. NISC Comparative Sequencing Program. Elango N, Thomas JW, Yi SV. Proc Natl Acad Sci USA. 2006;103:1370–1375. doi: 10.1073/pnas.0510716103.
  • 43. Henion TR, Galili U. Subcell Biochem. 1999;32:49–77.
  • 44. Thall AD, Malys P, Lowe JB. J Biol Chem. 1995;270:21437–21440. doi: 10.1074/jbc.270.37.21437.
  • 45. Hennet T. Cell Mol Life Sci. 2002;59:1081–1095. doi: 10.1007/s00018-002-8489-4.
  • 46. Costache M, Apoil P-A, Cailleau A, Elmgren A, Larson G, Henry S, Blancher A, Iordachescu D, Oriol R, Mollicone R. J Biol Chem. 1997;272:29721–29728. doi: 10.1074/jbc.272.47.29721.
  • 47. Harduin-Lepers A, Vallejo-Ruiz V, Krzewinski-Recchi MA, Samyn-Petit B, Julien S, Delannoy P. Biochimie. 2001;83:727–737. doi: 10.1016/s0300-9084(01)01301-3.
  • 48. Saito N, Yamamoto F. Mol Biol Evol. 1997;14:399–411. doi: 10.1093/oxfordjournals.molbev.a025776.
  • 49. Oriol R, Le Pendu J, Mollicone R. Vox Sang. 1986;51:161–171. doi: 10.1111/j.1423-0410.1986.tb01946.x.
  • 50. Wang B, Brand-Miller J. Eur J Clin Nutr. 2003;57:1351–1369. doi: 10.1038/sj.ejcn.1601704.
  • 51. Varki A. Am J Phys Anthropol Suppl. 2001;33:54–69. doi: 10.1002/ajpa.10018.
  • 52. Angata T, Margulies EH, Green ED, Varki A. Proc Natl Acad Sci USA. 2004;101:13251–13256. doi: 10.1073/pnas.0404833101.
  • 53. Liman ER, Innan H. Proc Natl Acad Sci USA. 2003;100:3328–3332. doi: 10.1073/pnas.0636123100.
  • 54. Gilad Y, Wiebe V, Przeworski M, Lancet D, Pääbo S. PLOS Biol. 2004;2:120–125. doi: 10.1371/journal.pbio.0020005.
  • 55. Jacobs GH, Neitz M, Deegan JF, Neitz J. Nature. 1996;382:156–158. doi: 10.1038/382156a0.
  • 56. Boissinot S, Zhou Y-H, Qui L, Dulai KS, Neiswanger K, Schneider H, Sampaio I, Hunt DM, Hewett-Emmett D, Li WH. Zool Stud. 1997;36:360–369.
  • 57. Hunt DM, Dulai KS, Cowing JA, Julliot C, Mollon JD, Bowmaker JK, Li WH, Hewett-Emmett D. Vision Res. 1998;38:3299–3306. doi: 10.1016/s0042-6989(97)00443-4.
  • 58. Kainz PM, Neitz J, Neitz M. Vision Res. 1998;38:3315–3320. doi: 10.1016/s0042-6989(98)00078-9.
  • 59. Maddison DR, Maddison WP. MacClade 4: Version 4.0. Sunderland, MA: Sinauer; 2000.
  • 60. Swofford DL. PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods) Version 4. Sunderland, MA: Sinauer; 2002.
  • 61. Ronquist F, Huelsenbeck JP. Bioinformatics. 2003;19:1572–1574. doi: 10.1093/bioinformatics/btg180.
  • 62. Posada D, Crandall KA. Bioinformatics. 1998;14:817–818. doi: 10.1093/bioinformatics/14.9.817.
  • 63. Nylander JAA. MrModeltest v2. Uppsala, Sweden: Evolutionary Biology Centre, Uppsala Univ; 2004.
  • 64. Akaike H. IEEE Trans Autom Control. 1974;19:716–722.
  • 65. Kumar S, Tamura K, Nei M. Brief Bioinform. 2004;5:150–163. doi: 10.1093/bib/5.2.150.
  • 66. Pamilo P, Bianchi NO. Mol Biol Evol. 1993;10:271–281. doi: 10.1093/oxfordjournals.molbev.a040003.
  • 67. Li W-H. J Mol Evol. 1993;36:96–99. doi: 10.1007/BF02407308.
  • 68. Goodman M. Am J Hum Genet. 1999;64:31–39. doi: 10.1086/302218.
  • 69. Wildman DE, Goodman M. In: Evolutionary Theory and Processes: Modern Horizons. Wasser SP, editor. The Netherlands: Kluwer, Dordrecht; 2004. pp. 293–311.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
