Abstract
To study the evolutionary history of Papio cynocephalus endogenous retrovirus (PcEV), we analyzed the distribution and genetic characteristics of PcEV among 17 different species of primates. The viral pol-env and long terminal repeat and untranslated region (LTR-UTR) sequences could be recovered from all Old World species of the papionin tribe, which includes baboons, macaques, geladas, and mangabeys, but not from the New World monkeys and hominoids we tested. The Old World genera Cercopithecus and Miopithecus hosted either a PcEV variant with an incomplete genome or a virus with substantial mismatches in the LTR-UTR. A complete PcEV was found in the genome of Colobus guereza—but not in Colobus badius—with a copy number of 44 to 61 per diploid genome, comparable to that seen in papionins, and with a sequence most closely related to a virus of the papionin tribe. Analysis of evolutionary distances among PcEV sequences for synonymous and nonsynonymous sites indicated that purifying selection was operational during PcEV evolution. Phylogenetic analysis suggested that possibly two subtypes of PcEV entered the germ line of a common ancestor of the papionins and subsequently coevolved with their hosts. One strain of PcEV was apparently transmitted from a papionin ancestor to an ancestor of the central African lowland C. guereza.
One of the main characteristics of the retrovirus life cycle is the integration of full-length viral DNA into the host genome, forming the provirus. Once integrated into the genome of a germ cell, a retrovirus genome can be inherited in a Mendelian fashion and propagated as an endogenous retrovirus (1). Endogenous retrovirus genomes have been found in all vertebrates, including primates (6). From Old World monkeys, three complete endogenous proviral genomes have been recovered: Baboon endogenous virus (BaEV) (7, 18), Simian endogenous retrovirus (SERV) (21), and Papio cynocephalus endogenous retrovirus (PcEV), which was recently isolated from the genomic library of a yellow baboon (12). SERV (a type D retrovirus) and PcEV (a type C retrovirus) are most likely the parents of the relatively young recombinant virus BaEV (7). BaEV has type C gag and pol genes with extensive homology to PcEV gag and pol genes and a type D env gene with extensive homology to the SERV env gene. BaEV has repeatedly infected the germ line of most papionin species and of Cercopithecus aethiops (17) (Fig. 1). SERV sequences could be amplified by PCR from all Old World monkeys of the subfamily Cercopithecinae. The virus has an estimated copy number of 142 to 235 integrations in the baboon genome, whereas BaEV is present at only 10 to 30 copies (Table 1). Because SERV is present in all members of the subfamily Cercopithecinae, but not in the subfamily Colobinae, it most likely entered the germ line of a common ancestor after the divergence of these two subfamilies, an event estimated to have occurred 9 million years ago (13, 21) (Fig. 1).
TABLE 1.
Retrovirus | Copy no. (per diploid genome) in P. hamadryas | Copy no. (per diploid genome) in C. guereza
---|---|---
SERV | 142–235 | Not present
PcEV | 39–58 | 44–61
BaEV | 10–30 | Not present
Our recent isolation and sequencing of PcEV from the baboon genome revealed a proviral genome 8,572 nucleotides (nt) in length that is present in 39 to 58 proviral copies per baboon genome (Table 1) (12). Earlier analysis showed that the gag, pol, and env genes of PcEV are homologous to those of other type C retroviruses, including Gibbon ape leukemia virus (GaLV) and Porcine endogenous retrovirus (PERV). The most extensive homology was observed with the gag and pol genes of BaEV (81 and 92% at the nucleotide level, respectively), indicating that PcEV is the type C ancestor of BaEV (12). The present study was designed to gain insight into PcEV distribution and evolution among primates.
Total DNA was extracted with silica and guanidinium thiocyanate (2) from peripheral blood mononuclear cells, spleen tissue, serum, or plasma obtained from 17 species of primates (Table 2). The origin of the samples was described previously (20). Two sets of primers were synthesized based on the PcEV proviral sequence (GenBank accession no. AF142988). For the first set, the upstream primer, POLm, was located near the 3′ end of the pol gene (5′-CGCACTCAAGGACTAGAGCC-3′); the downstream primer, ENVu, was located in the gp70-encoding region of the env gene (5′-CTTGATGCGGACCAGGTTGC-3′). POLm and ENVu amplify 728 bp (from nt 5959 to 6686) of the PcEV pol-env region. The second primer set was located in the 5′ long terminal repeat (LTR) and the 5′ untranslated region (UTR) of the gag gene of PcEV. The upstream primer, LTRm, was in the U3 region of the 5′ LTR (5′-TTCCCGGAATCAACAACTCC-3′); the downstream primer, UTRu (5′-TAAGTGAGAAGGTGCCGGAC-3′), was located 196 nt downstream of the primer binding site (PBS) sequence. LTRm and UTRu amplify 555 bp (from nt 192 to 746) of the PcEV sequence. PCR amplifications of these two fragments were performed for 35 to 40 cycles under the following conditions: 1 min at 95°C, 1 min at 55°C, and 2 min at 72°C, followed by a final extension of 10 min at 72°C. Fragments were cloned into the pCRII-TOPO vector (Invitrogen, Carlsbad, Calif.). Clones were sequenced in both directions with SP6/T7 dye primers on an Applied Biosystems 373A automated sequencer, following the manufacturer's protocols. The PcEV sequences were aligned manually, as the alignment was self-evident. The PCR results obtained from 17 species of primates with primer sets POLm-ENVu and LTRm-UTRu are summarized in Table 2, and the sequence alignments are shown in Fig. 2 and 3, respectively.
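The fragment sizes quoted for the two primer sets follow directly from the nucleotide coordinates. The short sketch below checks that arithmetic and prints the top-strand sequence to which each downstream primer would anneal. The primer sequences and coordinates are taken from the text; the helper names and the assumption that coordinates are 1-based and inclusive are illustrative, not part of the original protocol.

```python
# Illustrative check of the primer coordinates quoted in the text.
# Assumption: coordinates are 1-based and inclusive on the PcEV provirus.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(start: int, end: int) -> int:
    """Length of a fragment spanning start..end, both ends included."""
    return end - start + 1


# Primer sets as given in the text (sequences written 5' to 3').
PRIMER_SETS = {
    "POLm-ENVu": {
        "upstream": "CGCACTCAAGGACTAGAGCC",
        "downstream": "CTTGATGCGGACCAGGTTGC",
        "start": 5959,
        "end": 6686,
    },
    "LTRm-UTRu": {
        "upstream": "TTCCCGGAATCAACAACTCC",
        "downstream": "TAAGTGAGAAGGTGCCGGAC",
        "start": 192,
        "end": 746,
    },
}

for name, primers in PRIMER_SETS.items():
    size = amplicon_length(primers["start"], primers["end"])
    site = reverse_complement(primers["downstream"])
    print(f"{name}: {size} bp; downstream primer matches top strand {site}")
```

Under these assumptions the computed sizes agree with the 728-bp and 555-bp fragments stated above.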
PCR fragments of the expected size could be generated with both primer sets from all samples of the papionin species, which includes baboons, geladas, mangabeys, and macaques, indicating that PcEV is present in the genomes of all species of the papionin tribe. From the sequence alignment of the PcEV pol-env fragments, it can be noted that the pol region is much more conserved than the env region, but the initiation codon of env (ATG) and the stop codon of pol [TA(G/A)] were present in all clones. Several in-frame deletions or insertions, encompassing 3, 6, 9, 12, and 48 nt, were present in the env open reading frame of PcEV in some species (Fig. 2). The sequence variability of PcEV among species is slightly lower in the LTR-UTR than in the pol-env fragment (Fig. 2 and 3). The R and U5 regions are more conserved than the U3 region and the UTR. The regulatory sequences, such as the poly(A) signal, TATA box, and CAT box, were conserved among all clones, and all PBS sequences were complementary to the 3′ end of tRNAGly (Fig. 3). PcEV pol-env fragments could also be generated from Cercopithecus nictitans and Miopithecus talapoin, but even with 40 cycles of amplification, the PCR bands were much weaker than those of the papionin samples, suggesting low amplification efficiency due to mismatches in the POLm and/or ENVu primer. Sequences from other species of the genus Cercopithecus were more difficult to amplify, probably due to primer mismatches. Sequence analysis showed that the pol-env fragments of C. nictitans and M. talapoin were 94% homologous to each other and slightly divergent (>10%) from the pol-env fragments amplified from the papionins (Fig. 2). No positive PCR result could be obtained from these two samples with the LTRm-UTRu primer set, even when the annealing temperature was lowered from 55 to 40°C. 
This suggests that in the genome of these two species, the LTR-UTRs of PcEV are so divergent as to cause substantial primer mismatches or that this region of viral genome has been totally lost during evolution. No PcEV sequences were recovered from hominoids, colobines (except for Colobus guereza), or New World monkeys with any PcEV primer set, with either 40 or 55°C as the annealing temperature in the PCRs.
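The deletions and insertions noted above in the env open reading frame are described as in frame, which simply means that each length is a multiple of three and therefore adds or removes whole codons. A minimal sketch of that check, using the lengths quoted in the text (the function names are illustrative):

```python
# Indel lengths (nt) reported for the env open reading frame.
INDEL_LENGTHS_NT = [3, 6, 9, 12, 48]


def is_in_frame(length_nt: int) -> bool:
    """An indel preserves the reading frame if its length is a multiple of 3."""
    return length_nt > 0 and length_nt % 3 == 0


def codons_affected(length_nt: int) -> int:
    """Number of whole codons added or removed by an in-frame indel."""
    if not is_in_frame(length_nt):
        raise ValueError(f"{length_nt} nt is not an in-frame length")
    return length_nt // 3


for length in INDEL_LENGTHS_NT:
    print(f"{length} nt -> in frame: {is_in_frame(length)}, "
          f"codons: {codons_affected(length)}")
```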
TABLE 2.
Species (vernacular name) | PCR result, 5′ LTR-UTR | PCR result, pol-env
---|---|---
Old World primate | |
P. hamadryas hamadryas (sacred baboon) | + | +
P. hamadryas cynocephalus (yellow baboon) | + | +
P. hamadryas ursinus (chacma baboon) | + | +
T. gelada (gelada) | + | +
L. aterrimus (black mangabey) | + | +
M. mulatta (rhesus macaque) | + | +
M. nemestrina (pig-tailed macaque) | + | +
C. nictitans (spot-nosed guenon) | − | +
M. talapoin (talapoin monkey) | − | +
C. guereza (Abyssinian black-and-white colobus) | + | +
C. badius (Western red colobus) | − | −
Pan paniscus (pygmy chimpanzee) | − | −
Pongo abelii (Bornean orangutan) | − | −
Homo sapiens (human) | − | −
New World monkey | |
Lagothrix lagotricha (woolly monkey) | − | −
Cebus apella (tufted or brown capuchin) | − | −
Saguinus spp. (tamarin) | − | −
+, PCR fragments generated; −, no PCR fragments generated.
Surprisingly, both LTR-UTR and pol-env PCR fragments could be obtained from a spleen tissue sample of C. guereza (sequence C. guereza-1 in Fig. 2 and 3) at high annealing temperature. Sequences of these two fragments showed high homology to papionin PcEV fragments. To confirm these results, we amplified a spleen sample and a serum sample from two other individuals of the same species (sequences C. guereza-3 and C. guereza-2 in Fig. 2 and 3, respectively). Fragments almost identical to those of C. guereza-1 were obtained, suggesting that the previous results were not due to laboratory contamination. The sequences of each fragment recovered from the three individuals show more than 99% homology to each other and approximately 97% homology to the sequences obtained from papionin species (Fig. 2 and 3). To determine whether PcEV is present in other Colobus species, we tested a genomic DNA sample of Colobus badius (the western red colobus), kindly donated by Ronald Noe (TAI National Park Monkey Project, Tai, Ivory Coast, and Max-Planck-Institut für Verhaltensphysiologie, Seewiesen, Germany). No specific fragment could be amplified from this sample with either of the two primer sets, even when the annealing temperature was lowered to 40°C and/or when template DNA was serially diluted to optimize the PCRs. These results suggest that PcEV was not inherited from a common ancestor of extant Colobus monkeys but became part of the C. guereza genome after a cross-species transmission of exogenous PcEV.
To gain more insight into the evolution of PcEV and compare the virus and host trees, phylogenetic analyses based on the pol-env and LTR-UTR sequence alignments shown in Fig. 2 and 3 were performed with the neighbor-joining (NJ) (14) option of the MEGA analysis package (10), while maximum parsimony analysis was performed with PAUP4 (version 4.0.0.d55 for Unix) (15). In the NJ tree, evolutionary distances were estimated by Kimura's two-parameter method (8), and 100 bootstrap replicates were analyzed. The NJ and maximum parsimony methods generated identical trees. For both the pol-env and LTR-UTR analyses, the corresponding region of the complete proviral sequence of PcEV was included; for pol-env, the corresponding fragment of PERV (GenBank accession no. Y17013), which possesses pol and env genes closely related to those of PcEV, served as an outgroup. Two main clusters could be distinguished in the pol-env tree; the first cluster comprised the clones from the papionin species and C. guereza, and the second cluster comprised the clones from C. nictitans and M. talapoin, two sister group genera according to their mitochondrial sequences (19). An interesting finding is that the five clones from three different C. guereza monkeys clustered with a sequence of Papio hamadryas ursinus (Fig. 4). In a host NJ tree based on mitochondrial 12S rRNA sequences, colobines cluster apart from all cercopithecoid monkeys (20). The original proviral sequence of PcEV was closely related to clones obtained from P. hamadryas ursinus and Theropithecus gelada. Within the papionin cluster, we observed two groups separated by sequences P. ursinus-2 and P. hamadryas-1 (Fig. 4). These two clones probably represent recombinant sequences, possibly generated in the PCR.
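For readers unfamiliar with the distance measure used for the NJ trees, Kimura's two-parameter distance between two aligned sequences is d = −(1/2) ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The sketch below is a minimal stand-alone illustration of that formula, not the MEGA implementation; the function name and the choice to skip gapped or ambiguous sites are assumptions made here for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption: sites with a gap or ambiguous base in either sequence
    are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    argument = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if argument <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(argument)


# Toy example: one transition (A->G) in ten aligned sites.
print(round(k2p_distance("AAAAACCCCC", "GAAAACCCCC"), 4))
```

A pairwise matrix of such distances is the input that a neighbor-joining procedure clusters into a tree.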
Similar results were obtained for the LTR-UTR sequences, although Cercopithecus and Miopithecus could not be included. In the LTR-UTR tree, two main clusters were distinguishable (Fig. 5). The first cluster contained PcEV sequences from baboons (Papio hamadryas hamadryas and P. hamadryas ursinus) and geladas. As in the pol-env tree, all C. guereza clones clustered closely with sequences from baboon species. A second cluster comprised viral clones from the black mangabey (Lophocebus aterrimus) and from macaques (Macaca mulatta and Macaca nemestrina).
The split between the Colobinae and Cercopithecinae subfamilies is estimated to have occurred approximately 9 million years ago (Fig. 1) (13). In a host species analysis based on mitochondrial DNA, these two subfamilies formed two clearly separated and distantly related clusters (20). However, in the phylogenetic tree based on both pol-env and LTR-UTR fragments of PcEV, the C. guereza clones always clustered together with clones from the papionin tribe (Fig. 4 and 5). Since viral phylogeny does not follow host phylogeny, a PcEV cross-species transmission is the most likely explanation for the homology observed between papionin PcEV and C. guereza PcEV. Because neither pol-env nor LTR-UTR PcEV fragments could be obtained from another colobine species, C. badius, the transmission most likely occurred from a papionin ancestor to an ancestor of extant C. guereza. The C. guereza samples from which we could amplify PcEV all represented the lowland form of C. guereza, since all samples originated from the Central African Republic. C. badius, which was negative for PcEV, is a species presently inhabiting the tropical forests of western Africa, thousands of kilometers away from the modern habitat of C. guereza (9).
Cross-species transmissions are the rule rather than the exception for retroviruses. For exogenous viruses such as Human immunodeficiency virus and Simian immunodeficiency virus, and Human T-cell leukemia virus type 1 and Simian T-cell leukemia virus type 1, several transmission events among primate species have been documented (3, 16). Because retrovirus infection requires intimate contact, such as sexual contact or biting, transmissions are expected to occur between species sharing the same habitat. Colobus monkeys currently share habitats with Cercopithecus species and, in some instances, with mangabeys.
Generally, endogenous viruses increase in copy number with time (1). The copy number of PcEV in the genome of C. guereza was estimated as described previously (12) and was found to be in the range of 44 to 61 proviral copies per diploid genome, which is comparable to the PcEV copy number in the baboon genome (Table 1). It is therefore likely that the PcEV in papionin species and in C. guereza is of comparable age but was transmitted well after the separation of the two subfamilies, setting the upper limit for the spread of exogenous PcEV in Africa at around 9 million years ago (Fig. 1). The lower limit is much more difficult to estimate. At present, Macaca sylvanus (the Barbary macaque) is the only African macaque species, and this species is ancestral to all Asian macaque species (5). The first Asian macaque fossils date back approximately 3 million years (4), suggesting that the separation of African and Asian species occurred some time before that date. Because Asian macaque species contained PcEV proviral sequences closely related to those seen in African PcEV, they most likely inherited the virus from a common ancestor before they moved into Asia, suggesting that PcEV was firmly established in the African monkey germ lines by then. It is not unlikely that endogenous PcEV has or had the capability to produce progeny since its genome is complete, like that of BaEV in baboons. Viral transcription and particle formation seen in a retrovirus transmitted through the germ line could then be defined as a semiendogenous phase. Thus, integration of PcEV occurred earlier than integration of BaEV (estimated to have happened less than 1 million years ago) but probably later than the integration of SERV, which has the highest copy number (Table 1).
PcEV in Cercopithecus and Miopithecus appears to have a more complicated history. The viral sequences seem to follow the host phylogenetic tree, suggesting coevolution of virus and host (papionins and cercopithecini are the two tribes making up the subfamily Cercopithecinae). However, it cannot be ruled out that when these two tribes separated, closely related PcEV subtypes were circulating as exogenous virus. The separation of the two tribes is suggested to have occurred in the early Pliocene (5.6 to 3.5 million years ago), since the first fossilized remains of Cercopithecus date to 2.9 million years ago (4). PcEV subtypes could have preferentially infected certain monkey species and integrated separately into their germ lines.
In the phylogenetic tree of pol-env genes, two subclusters could be distinguished within the papionins and the C. guereza cluster (Fig. 4). The finding that sequences of viral clones obtained from the same individual fell into two subclusters can be explained by separate integrations of two PcEV subtypes, suggesting that infection with one subtype did not protect from infection with another PcEV subtype. Close inspection of the env sequences showed that the subtypes differed mainly by two in-frame stretches of 12 and 48 nt in the env gene. These stretches are absent from the PcEV env genes of some papionins, all those of C. guereza, and all genes from the Cercopithecus and Miopithecus samples, suggesting that the ancestral (most dispersed) virus did not contain these sequences and that they should thus be regarded as insertions. The changing of the env gene by insertions of 3 and 16 amino acids in the two in-frame segments and by additional amino acid changes possibly opens the way for another subtype of virus to reinfect a cell already harboring PcEV.
Subsequently, we analyzed evolutionary distances among PcEV sequences for synonymous and nonsynonymous sites. Such an analysis is a powerful tool for understanding the forces of viral evolution (11, 22, 23). Since those forces could, in general, differ for different genes, we performed the analysis for pol and env regions separately. In both regions, the number of synonymous substitutions per synonymous site (ds) was higher than the number of nonsynonymous substitutions per nonsynonymous site (da): 6.3 and 4.5% for the pol region, respectively, and 20.0 and 9.6% for the env region, respectively. This high variation in the env region compared to the pol region of PcEV conformed to the general observation that the pol gene is more conserved than the env gene. For both gene fragments, the mean ds/da ratios were above 1.0 (1.65 and 2.11 for pol and env, respectively), suggesting that purifying selection was operational. However, the ds/da ratios of PcEV were low compared to those found in other endogenous retroviruses, for example in members of the Human endogenous retrovirus (HERV) K10 family (22, 23), indicating that PcEV has probably not been active for a long period.
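The relation between the mean distances and the ratios quoted above can be made explicit with a few lines of arithmetic. The sketch below uses only the values given in the text; the function names are illustrative. Note that dividing the mean ds by the mean da does not reproduce the reported mean ds/da ratios exactly, presumably because those were averaged over individual pairwise comparisons (an assumption, since the averaging procedure is not described here).

```python
# Mean distances (in %) and mean ds/da ratios as quoted in the text.
REPORTED = {
    "pol": {"ds": 6.3, "da": 4.5, "mean_ratio": 1.65},
    "env": {"ds": 20.0, "da": 9.6, "mean_ratio": 2.11},
}


def ratio_of_means(ds: float, da: float) -> float:
    """ds/da computed from the two mean distances."""
    if da <= 0:
        raise ValueError("da must be positive")
    return ds / da


def selection_regime(ratio: float) -> str:
    """Conventional reading of a ds/da ratio."""
    if ratio > 1.0:
        return "purifying"
    if ratio < 1.0:
        return "positive"
    return "neutral"


for region, values in REPORTED.items():
    r = ratio_of_means(values["ds"], values["da"])
    print(f"{region}: ratio of means = {r:.2f}, "
          f"reported mean ratio = {values['mean_ratio']:.2f}, "
          f"consistent with {selection_regime(values['mean_ratio'])} selection")
```

Either way of summarizing the data gives ratios above 1.0 for both regions, which is the basis for the inference of purifying selection.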
In conclusion, PcEV was amplified from all species of the papionin tribe, including baboons, macaques, geladas, and mangabeys, and a variant, possibly another subtype or strain, was amplified from Cercopithecus and Miopithecus monkeys. Evidence was obtained that PcEV was transmitted from papionins to at least one species of Colobus monkey (C. guereza).
Acknowledgments
We thank Vladimir Lukashov and Ben Berkhout for stimulating discussions, Merlijn van der Mee for help with the PAUP program, John Dekker for technical support, and Lucy Phillips for editorial review.
This study was partly supported by Amsterdam Support Diagnostics, Inc.
REFERENCES
- 1. Boeke J D, Stoye J P. Retrotransposons, endogenous retroviruses, and the evolution of retroelements. In: Coffin J M, Hughes S H, Varmus H E, editors. Retroviruses. Cold Spring Harbor Laboratory, N.Y: Cold Spring Harbor Laboratory Press; 1997. pp. 343–435.
- 2. Boom R, Sol C J A, Salimans M M M, Jansen C L, Wertheim-van Dillen P M E, van der Noordaa J. Rapid and simple method for purification of nucleic acids. J Clin Microbiol. 1990;28:495–503. doi: 10.1128/jcm.28.3.495-503.1990.
- 3. Chen Z, Telfier P, Gettie A, Reed P, Zhang L, Ho D D, Marx P A. Genetic characterization of new West African simian immunodeficiency virus SIVsm: geographic clustering of household-derived SIV strains with human immunodeficiency virus type 2 subtypes and genetically diverse viruses from a single feral sooty mangabey troop. J Virol. 1996;70:3617–3627. doi: 10.1128/jvi.70.6.3617-3627.1996.
- 4. Conroy G C. Primate evolution. New York, N.Y: W. W. Norton & Company, Inc.; 1990.
- 5. Fa J E, Lindburg D G. Evolution and ecology of macaque societies. Cambridge, England: Cambridge University Press; 1996.
- 6. Herniou E, Martin J, Miller K, Cook J, Wilkinson M, Tristem M. Retroviral diversity and distribution in vertebrates. J Virol. 1998;72:5955–5966. doi: 10.1128/jvi.72.7.5955-5966.1998.
- 7. Kato S, Matsuo K, Nishimura N, Takahashi N, Takano T. The entire nucleotide sequence of baboon endogenous virus DNA: a chimeric genome structure of murine type C and simian type D retroviruses. Jpn J Genet. 1987;62:127–137.
- 8. Kimura M A. A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. J Mol Evol. 1980;16:111–120. doi: 10.1007/BF01731581.
- 9. Kingdon J. The Kingdon field guide to African mammals. San Diego, Calif: Academic Press; 1997.
- 10. Kumar S, Tamura K, Nei M. MEGA: molecular evolutionary genetics analysis, version 1.01. University Park, Pa: The Pennsylvania State University; 1993.
- 11. Lukashov V V, Kuiken C L, Goudsmit J. Intrahost human immunodeficiency virus type 1 evolution is related to length of the immunocompetent period. J Virol. 1995;69:6911–6916. doi: 10.1128/jvi.69.11.6911-6916.1995.
- 12. Mang R, Goudsmit J, van der Kuyl A C. Novel endogenous type C retrovirus in baboons: complete sequence, providing evidence for baboon endogenous virus gag-pol ancestry. J Virol. 1999;73:7021–7026. doi: 10.1128/jvi.73.8.7021-7026.1999.
- 13. Martin R D. Primate origins: plugging the gaps. Nature. 1993;363:223–234. doi: 10.1038/363223a0.
- 14. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–425. doi: 10.1093/oxfordjournals.molbev.a040454.
- 15. Swofford D L. PAUP. Phylogenetic analysis using parsimony (and other methods), version 4. Sunderland, Mass: Sinauer Associates; 1998.
- 16. Vandamme A M, Salemi M, Desmyter J. The simian origins of the pathogenic human T-cell lymphotropic virus type I. Trends Microbiol. 1998;6:477–483. doi: 10.1016/s0966-842x(98)01406-1.
- 17. van der Kuyl A C, Dekker J T, Goudsmit J. Distribution of baboon endogenous virus among species of African monkeys suggests multiple ancient cross-species transmissions in shared habitats. J Virol. 1995;69:7877–7887. doi: 10.1128/jvi.69.12.7877-7887.1995.
- 18. van der Kuyl A C, Dekker J T, Goudsmit J. Full-length proviruses of baboon endogenous virus (BaEV) and dispersed BaEV reverse transcriptase retroelements in the genome of baboon species. J Virol. 1995;69:5917–5924. doi: 10.1128/jvi.69.9.5917-5924.1995.
- 19. van der Kuyl A C, Dekker J T, Goudsmit J. Primate genus Miopithecus: evidence for the existence of species and subspecies of dwarf guenons based on cellular and endogenous viral sequences. Mol Phylogenet Evol, in press.
- 20. van der Kuyl A C, Kuiken C L, Dekker J T, Goudsmit J. Phylogeny of African monkeys based upon mitochondrial 12S rRNA sequences. J Mol Evol. 1995;40:173–180. doi: 10.1007/BF00167111.
- 21. van der Kuyl A C, Mang R, Dekker J T, Goudsmit J. Complete nucleotide sequence of simian endogenous type D retrovirus with intact genome organization: evidence for ancestry to simian retrovirus and baboon endogenous virus. J Virol. 1997;71:3666–3676. doi: 10.1128/jvi.71.5.3666-3676.1997.
- 22. Zsiros J, Jebbink M F, Lukashov V V, Voute P A, Berkhout B. Evolutionary relationships within a subgroup of HERV-K-related human endogenous retroviruses. J Gen Virol. 1998;79:61–70. doi: 10.1099/0022-1317-79-1-61.
- 23. Zsiros J, Jebbink M F, Lukashov V V, Voute P A, Berkhout B. Biased nucleotide composition of the genome of HERV-K related endogenous retroviruses and its evolutionary implications. J Mol Evol. 1999;48:102–111. doi: 10.1007/pl00006437.