Abstract
Human immunodeficiency virus type 1 (HIV-1) originated from three independent cross-species transmissions of simian immunodeficiency virus (SIVcpzPtt) infecting chimpanzees (Pan troglodytes troglodytes) in west central Africa, giving rise to pandemic (group M) and non-pandemic (groups N and O) clades of HIV-1. To identify host-specific adaptations in HIV-1 we compared the inferred ancestral sequences of HIV-1 groups M, N and O to 12 full length genome sequences of SIVcpzPtt and four of the outlying but closely related SIVcpzPts (from P. t. schweinfurthii). This analysis revealed a single site that was completely conserved among SIVcpzPtt strains but different (due to the same change) in all three groups of HIV-1. This site, Gag-30, lies within p17, the gag-encoded matrix protein. It is Met in SIVcpzPtt, underwent a conservative replacement by Leu in one lineage of SIVcpzPts but changed radically to Arg on all three lineages leading to HIV-1. During subsequent diversification this site has been conserved as a basic residue (Arg or Lys) in most lineages of HIV-1. Retrospective analysis revealed that Gag-30 had reverted to Met in a previous experiment in which HIV-1 was passaged through chimpanzees. To examine whether this substitution conferred a species specific growth advantage, we used site-directed mutagenesis to generate variants of these chimpanzee-adapted HIV-1 strains with Lys at Gag-30, and tested their replication in both human and chimpanzee CD4+ T lymphocytes. Remarkably, viruses encoding Met replicated to higher titers than viruses encoding Lys in chimpanzee T cells, but the opposite was found in human T cells. Taken together, these observations provide compelling evidence for host-specific adaptation during the emergence of HIV-1 and identify the viral matrix protein as a modulator of viral fitness following transmission to the new human host.
Keywords: HIV-1, SIV, matrix protein, cross-species transmission, host-specific adaptation
Introduction
Human immunodeficiency viruses types 1 and 2 (HIV-1 and HIV-2) are the cause of HIV-AIDS. HIV-1 and HIV-2 are classified in the genus Lentivirus of the family Retroviridae. Related lentiviruses have been found infecting more than 30 species of non-human primates in sub-Saharan Africa (Bibollet-Ruche et al. 2004a). These simian immunodeficiency viruses (SIVs) form host-species specific clades, and in their natural hosts appear to be non-pathogenic. HIV-AIDS emerged in the 20th century after humans acquired SIVs from two different species, chimpanzees (Pan troglodytes) and sooty mangabeys (Cercocebus atys) (Sharp et al. 2001).
The vast majority of HIV-AIDS cases around the world are due to human immunodeficiency virus type 1 (HIV-1). HIV-1 originated from cross-species transmissions of SIVcpzPtt infecting chimpanzees (P. t. troglodytes) in west central Africa (Gao et al. 1999). HIV-1 strains are classified into three distinct groups (M, N and O) which are interspersed among SIVcpzPtt lineages in phylogenetic trees (Fig. 1), indicating that they arose from three independent ape-to-human transmissions (Hahn et al. 2000). The three transmissions have had very different outcomes: group N strains have been found in only a limited number of individuals in Cameroon, group O strains are more widespread but mainly restricted to individuals from Cameroon and surrounding countries, while group M strains have spread throughout Africa and the rest of the world. Group M strains have been diversifying since around 1930 (Korber et al. 2000), and are classified into numerous subtypes. The protein sequences encoded by contemporary strains of HIV-1 group M differ substantially from those of their closest known SIVcpz relatives (e.g., by 14-35% in the major polyproteins Gag, Pol and Env), but it is not known which (if any) of these differences are relevant to the biology of these viruses in the two different hosts.
FIG. 1.
The three independent origins of HIV-1. Red arrows indicate the branches on which ape-to-human transmissions can be inferred to have occurred, and black circles denote the ancestors of HIV-1 groups M, N and O. Colours indicate SIVcpz strains from Pan troglodytes troglodytes (red) and P. t. schweinfurthii (blue). The tree represents a strict consensus of phylogenies derived from four major regions of the proteome (Keele et al. 2006). The evolutionary relationships among SIVcpzPtt strains, and among SIVcpzPts strains, vary depending on the region of the genome used for analysis, presumably reflecting recombination events in the past. However, the distinction between the SIVcpzPtt and SIVcpzPts lineages, and the positions of HIV-1 groups M (close to SIVcpzPtt strains LB7, MB66 and MB897), N (close to SIVcpzPtt strain EK505) and O (as an outgroup to the known SIVcpzPtt strains) are constant. The dashed line indicates the position of SIVgor based on partial sequences (Van Heuverswyn et al. 2006); note that HIV-1 group O may have been transmitted from chimpanzees to humans via gorillas.
Although the first SIVcpz isolate was described more than 15 years ago (Peeters et al. 1989), by 2003 only four full-length sequences of SIVcpzPtt had been determined because captive chimpanzees are only rarely infected with this virus (Sharp et al. 2005). We have since sequenced eight additional SIVcpzPtt strains, including viruses that represent the closest known relatives of HIV-1 groups M and N (Bibollet-Ruche et al. 2004b; Nerrienet et al. 2005; Keele et al. 2006; Van Heuverswyn F, unpublished data). Thus, there are now twelve full-length SIVcpzPtt sequences available, as well as four of SIVcpzPts from P. t. schweinfurthii which form an outgroup to the SIVcpzPtt/HIV-1 clade. This provides an opportunity to look for viral genetic changes associated with cross-species transmission that may represent adaptations of SIVcpz to its new human host. Here we identify one such site, in the p17 matrix protein encoded by the gag gene, which shows clear evidence of having evolved under host-specific selection pressure. Furthermore, we demonstrate that replacements at this site have different effects on virus replication in CD4+ T cells from humans and chimpanzees.
Materials and Methods
Sequence Analyses
Protein sequences were aligned using ClustalW (Thompson et al. 1994), with minor manual adjustment. Phylogenetic relationships were inferred by the Bayesian method (Yang and Rannala 1997) implemented in MrBayes (Huelsenbeck and Ronquist 2001), using the JTT model of amino acid substitution (Jones et al. 1992) and a gamma distribution of among-site rate variation with four categories (Yang 1994); analyses were run for 1 million generations.
The HIV-1 group M ancestral sequence was obtained from the HIV Sequence Database (www.hiv.lanl.gov). The group O ancestral sequence was inferred from 13 full length genome sequences; sequences previously shown to be recombinant (Yamaguchi et al. 2003) were excluded from this analysis. (Accession numbers of all sequences used are provided in the Supplement.) A phylogeny estimated from concatenated Gag, Pol and Env sequences, rooted by reference to SIVcpzPtt sequences, was used to find the maximum likelihood estimate of the ancestral group O sequence using Codeml from the PAML package (Yang 1997), under the assumption of a molecular clock using the JTT matrix and gamma distributed rates at sites. The HIV-1 group N ancestral sequence was obtained by a similar analysis of five genome sequences, with SIVcpzPtt/EK505 used to root the phylogeny.
We searched for species-specific signatures in alignments of Gag, Pol, Env, Vif, Vpr, Vpu, Tat, Rev and Nef protein sequences of 19 viral strains (twelve SIVcpzPtt, four SIVcpzPts, and the three HIV-1 group ancestral sequences). We were looking for sites that are highly conserved among SIVcpzPtt strains, but since any of the SIVcpz sequences may have been derived from a defective viral genome, we allowed for some deviation from perfect conservation among all sequences; this same concern does not apply to HIV-1 group ancestor sequences. Thus, we searched for sites where the amino acid was conserved among at least nine of the twelve SIVcpzPtt sequences, and identical among all three HIV-1 group ancestors, but differed between HIV-1 and the consensus sequence of SIVcpzPtt.
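The screening rule in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the analysis: the function name `signature_sites`, the toy four-residue sequences, the 0-based column indices, and the decision to skip gapped columns are all assumptions made here.

```python
from collections import Counter

def signature_sites(ptt, hiv, min_conserved=9):
    """Return (column, SIVcpzPtt consensus, HIV-1 residue) for columns where
    the consensus is shared by >= min_conserved SIVcpzPtt sequences, all
    HIV-1 ancestors agree, and the HIV-1 residue differs from the consensus.
    Columns are 0-based; inputs are equal-length aligned protein strings."""
    hits = []
    for i in range(len(ptt[0])):
        column = [seq[i] for seq in ptt]
        hiv_residues = {seq[i] for seq in hiv}
        if "-" in column or "-" in hiv_residues:
            continue  # assumption of this sketch: ignore gapped columns
        consensus, count = Counter(column).most_common(1)[0]
        if count >= min_conserved and len(hiv_residues) == 1:
            (residue,) = hiv_residues
            if residue != consensus:
                hits.append((i, consensus, residue))
    return hits

# Hypothetical toy alignment: 12 "SIVcpzPtt" and 3 "HIV-1 ancestor" sequences.
toy_ptt = ["KMLA"] * 9 + ["KMLS"] * 3
toy_hiv = ["KRLS"] * 3

relaxed = signature_sites(toy_ptt, toy_hiv, min_conserved=9)
strict = signature_sites(toy_ptt, toy_hiv, min_conserved=12)
print(relaxed, strict)
```

Raising `min_conserved` to 12 keeps only columns that are perfectly conserved among the SIVcpzPtt sequences, which is the stricter condition that singles out one site in the Results.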
The degree of biochemical dissimilarity among amino acids was assessed by reference to Grantham’s chemical distances. These distances combine information on chemical composition, polarity and molecular volume, and are highly correlated with amino acid replacement frequencies (Grantham 1974). The values for the 190 pair-wise comparisons among the 20 amino acids range from 5 to 215, with a mean of 100.
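As a small illustration of how such distances are used, the sketch below stores only the Grantham values quoted in this paper in a symmetric lookup. It is a partial table, not the full 190-pair matrix, and the names `GRANTHAM` and `grantham` are hypothetical helpers introduced here.

```python
# Partial lookup: only the Grantham distances quoted in this paper.
GRANTHAM = {
    frozenset(("Met", "Leu")): 15,
    frozenset(("Met", "Arg")): 91,
    frozenset(("Arg", "Lys")): 26,
    frozenset(("Val", "Ile")): 29,
    frozenset(("Asp", "Glu")): 45,
    frozenset(("Ala", "Pro")): 27,
    frozenset(("Arg", "Gly")): 125,
}

def grantham(a, b):
    """Distance between two residues (three-letter codes); order is irrelevant.
    Raises KeyError for pairs that are not quoted in the paper."""
    if a == b:
        return 0
    return GRANTHAM[frozenset((a, b))]

# Replacements discussed for the six Gag sites in the three-versus-twelve screen.
replacements = {
    "Gag-19": ("Ile", "Val"),
    "Gag-28": ("Lys", "Arg"),
    "Gag-30": ("Met", "Arg"),
    "Gag-52": ("Glu", "Asp"),
    "Gag-231": ("Pro", "Ala"),
    "Gag-380": ("Arg", "Lys"),
}
most_radical = max(replacements, key=lambda site: grantham(*replacements[site]))
print(most_radical, grantham(*replacements[most_radical]))
```

Keying the dictionary on `frozenset` pairs makes the lookup symmetric, so Met-to-Arg and Arg-to-Met return the same value.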
Cell Lines, Viruses and Virus Stocks
293T and JC53BL13 cells were maintained in Dulbecco’s modified Eagle’s medium (DMEM) + 10% heat inactivated fetal bovine serum (FBS). Replication competent molecular clones of chimpanzee-passaged HIV-1 (JC16 and NC7) were obtained from Dr. Novembre (Mwaengo and Novembre 1998). Viral stocks of JC16 and NC7 were generated by transfecting full-length proviral plasmids into 293T cells using FuGene 6 (Roche Applied Science, Indianapolis, IN). Seventy-two hours post-transfection, virus-containing supernatants were clarified by low speed centrifugation and frozen in aliquots at −70°C. The infectivity of virus stocks was determined using the JC53BL13 assay as described previously (Derdeyn et al. 2000; Wei et al. 2002).
To change the amino acid at Gag-30 in JC16 and NC7 from a Met to a Lys, site-directed mutagenesis was performed using the QuickChange II XL kit (Stratagene, La Jolla, CA) and the primers JCNC-M30K-F (5′-GGAAAGAAAAAATATAAGTTAAAACATATAGTATGG-3′) and JCNC-M30K-R (5′-CCATACTATATGTTTTAACTTATATTTTTTCTTTCC-3′).
Plasmids were grown in STBL2 bacteria (Invitrogen, Carlsbad, CA) as described by Takehisa et al. (2007).
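A quick consistency check on the two mutagenic primers given above can be written in plain Python. The sketch confirms that the reverse primer is the exact reverse complement of the forward primer and translates the forward primer with the standard genetic code. Reading the forward primer in frame from its first base is an assumption of this sketch; under that assumption the sixth codon is AAG (Lys), which is consistent with the primer name M30K. The helper names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons, starting at the first base (assumed frame)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

FORWARD = "GGAAAGAAAAAATATAAGTTAAAACATATAGTATGG"  # JCNC-M30K-F
REVERSE = "CCATACTATATGTTTTAACTTATATTTTTTCTTTCC"  # JCNC-M30K-R

print(reverse_complement(FORWARD) == REVERSE)
print(translate(FORWARD))
```

Complementary primer pairs of this kind are what a QuickChange-style protocol expects, so a mismatch in this check would point to a transcription error in one of the two sequences.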
Isolation and Activation of Chimpanzee and Human PBMCs
Blood was obtained from healthy HIV-1-negative human volunteers (Research Blood Components, Boston, MA), as well as HIV/SIV-uninfected chimpanzees housed at the Yerkes Primate Research Center in accordance with Animal Welfare Act guidelines. Chimpanzee blood samples were collected in tubes containing acid citrate dextrose from anaesthetized individuals during their annual health survey (this procedure was approved by the Emory Institutional Animal Care and Use Committee). Both human and chimpanzee peripheral blood mononuclear cells (PBMCs) were isolated by density separation using Ficoll-hypaque plus (GE Healthcare, Piscataway, NJ) and centrifuged at 1800 rpm for 25 min at 22°C. Buffy coat cells were washed once at room temperature in Hanks Balanced Saline Solution (HBSS) + 4 mM EDTA and once at 4°C in HBSS + 1% FBS. CD4+ T lymphocytes were enriched using CD4-containing microbeads by magnetic cell sorting according to the manufacturer’s protocols (Miltenyi Biotec, Auburn, CA). For activation, CD4+ T lymphocytes were allowed to adhere for 30 minutes and then stimulated for 12-15 hours with 3 μg/ml of Staphylococcal Enterotoxin B (Sigma-Aldrich, St. Louis, MO) in RPMI + 15% FBS. Suspension cells were washed once with HBSS, resuspended in DMEM with 10% GCT conditioned media (BioVeris Corp., Gaithersburg, MD) and 10% Human AB serum (Fisher Bioreagents, Fair Lawn, NJ), returned to the same well in which they were activated, and incubated for 5-6 days at 37°C in a humidified 5% CO2 incubator. Suspension cells were then resuspended at 1 × 10^6 cells/ml in DMEM with 10% FBS and 30 IU of IL-2/ml (Roche Applied Science, Indianapolis, IN) for 24 hours prior to infection.
Replication Kinetics
Approximately 500,000 activated human or chimpanzee CD4+ T lymphocytes were infected overnight at a multiplicity of infection of 0.01 (JC53BL titer) in 300 μl of DMEM + 10% FBS + 30 units IL-2/ml. The following morning cells were washed three times with HBSS and plated in 24 well plates in 2 ml of DMEM + 10% FBS + 30 units of IL-2/ml. Forty microliters of supernatant were collected every other day and stored at −70°C. Cultures were maintained for 9 days. Virus replication was assessed by quantifying the amount of p24 core protein released into the culture supernatant using the HIV-1 p24 antigen EIA kit (Beckman Coulter, Fullerton, CA) according to the manufacturer’s protocol.
Experiments involving pairs of viruses, with either Met or Lys at Gag-30, were performed in cells from the same donors. Each experiment was repeated four times. The significance of the higher growth rate of Met-encoding viruses in chimpanzee cells, or of Lys-encoding viruses in human cells, was assessed in a one-tailed paired t-test of log-transformed data from day 9.
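The test described above can be illustrated with a short standard-library sketch. The day 9 values below are invented purely for illustration and are not data from this study; the function names are hypothetical, and the p-value is obtained by numerically integrating the Student t density rather than by calling a statistics package.

```python
import math

def t_upper_tail(t, df, steps=20000):
    """One-tailed P(T > t) for Student's t, by Simpson integration of the
    density from 0 to |t| (steps must be even)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

def paired_t_one_tailed(higher, lower):
    """Paired t-test on log10-transformed values, testing mean(higher) > mean(lower).
    Returns (t statistic, degrees of freedom, one-tailed p-value)."""
    d = [math.log10(x) - math.log10(y) for x, y in zip(higher, lower)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)
    t = mean / math.sqrt(var / n)
    return t, n - 1, t_upper_tail(t, n - 1)

# Hypothetical day 9 p24 values (arbitrary units), four paired donors.
met_day9 = [120.0, 45.0, 300.0, 80.0]
lys_day9 = [60.0, 30.0, 90.0, 50.0]

t_stat, df, p_value = paired_t_one_tailed(met_day9, lys_day9)
print(round(t_stat, 2), df, round(p_value, 3))
```

Pairing by donor removes donor-to-donor variation from the comparison, and the log transformation means the test is applied to fold differences in titer rather than absolute differences.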
Results
Identification of a Host-Specific Substitution in HIV-1
We compared the inferred ancestral sequences at the root of each HIV-1 group to SIVcpz sequences, seeking changes that had occurred in parallel on each branch of the tree where cross-species transmission from chimpanzees to humans occurred (fig. 1). SIVcpz/HIV-1 genomes encode nine proteins. We searched alignments of all nine proteins for sites that are conserved among SIVcpzPtt strains but differ (with the same change) in all three ancestral HIV-1 sequences. An analysis allowing up to three of the 12 SIVcpzPtt sequences to differ from the consensus sequence identified seven sites from four different proteins (table 1). Only one of these sites (Gag-30) was completely conserved among the 12 SIVcpzPtt sequences. At all six other sites the amino acid found in HIV-1 was present as a minor variant in SIVcpzPtt and was also present in SIVcpzPts; at all six sites the amino acid found in HIV-1 was present in at least two SIVcpz sequences (SIVcpzPtt or SIVcpzPts). Thus, there was only one site which was highly conserved among SIVcpz sequences and yet contained the same replacement in all three HIV-1 group ancestors, with an amino acid not found in any of the SIVcpz sequences. Gag-30 is Met in all 12 SIVcpzPtt sequences and in one SIVcpzPts, Leu in three (closely related) SIVcpzPts, and Arg in all three HIV-1 group ancestors (fig. 2). This implies that the ancestor of the entire SIVcpz clade had Met at this position, which has subsequently been highly conserved among chimpanzee viruses, undergoing a biochemically conservative change to Leu (Grantham’s amino acid distance, D = 15) only in one lineage within SIVcpzPts. However, on all three branches leading to the HIV-1 groups (fig. 1), Met has been replaced by the biochemically dissimilar, basic amino acid Arg (D = 91).
Table 1. Sites Differing Between HIV-1 and SIVcpzPtt.
Site^a | HIV-1 | SIVcpzPtt | SIVcpzPts | HIV-1/cpz |
---|---|---|---|---|
Gag-30 | Arg^b | Met(12)^c | Leu(3), Met(1) | Met |
Gag-224 | Pro | Ala(9), Pro(3) | Gln(3), Pro(1) | Ala |
Gag-335 | Lys | Arg(10), Lys(2) | Lys(4) | Lys |
Pol-175 | Lys | Arg(9), Lys(3) | Lys(3), Arg(1) | Lys |
Pol-497 | Tyr | Phe(9), Tyr(3) | Tyr(4) | Tyr |
Vif-185 | Gly | Glu(9), Gly(3) | Glu(2), Gly(2) | Gly |
Nef-46 | Ser | Arg(11), Ser(1) | Asn(3), Ser(1) | Ser |
Note.— HIV-1 refers to the inferred ancestral sequences of the three groups of HIV-1. HIV-1/cpz refers to the sequence of HIV-1 strains after passage in chimpanzees for more than 10 years.
^a Sites numbered according to the reference strain HIV-1/HXB2.
^b Amino acid found in HIV-1 in bold.
^c Numbers of occurrences among SIVcpz sequences in brackets.
FIG. 2.
Species-specific adaptive changes in the matrix protein (Gag p17). Sequences from a region in the N-terminal basic domain of HIV-1/SIVcpz matrix proteins reveal a site (boxed) that differs between the HIV-1 group ancestors (R, Arg) and chimpanzee viruses (M, Met/L, Leu). Dashes indicate amino acid identity to the sequence at the top. Clones JC16 and NC7 were isolated from two chimpanzees experimentally infected with HIV-1 (Mwaengo and Novembre 1998).
The available strains of SIVcpzPtt include three very closely related to HIV-1 group M, as well as one very closely related to group N (fig. 3; see also Keele et al. 2006). These strains serve to locate two of the Met-to-Arg changes on relatively short branches immediately prior to the ancestral nodes in groups M and N. As yet no SIVcpzPtt strain has been found that is very closely related to HIV-1 group O (fig. 3). However, we recently reported the discovery of SIV in gorillas (Van Heuverswyn et al. 2006); on the basis of partial pol and env sequences, these SIVgor viruses are closely related to HIV-1 group O (fig. 1). We have subsequently obtained a partial gag sequence for one SIVgor strain (F. Van Heuverswyn, unpublished). Like the SIVcpzPtt sequences, this SIVgor sequence encodes Met at Gag-30, placing the third Met-to-Arg change on the branch after the split of the ancestors of SIVgor and HIV-1 group O.
FIG. 3.
Phylogeny of the SIVcpz/HIV-1 clade derived from analysis of Gag protein sequences. SIVcpzPtt and SIVcpzPts strains are in red and blue, respectively. The representative HIV-1 sequences used were U455 and LAI (group M), YBF30 and YBF106 (group N) and ANT70 and MVP1580 (group O). A posteriori probabilities (expressed as percent) are shown above branches. The scale bar represents 0.05 amino acid replacements per site.
To assess how unusual the observation at Gag-30 is, we searched for any sites in the Gag alignment where, within the SIVcpzPtt/HIV-1 clade, any three sequences shared one residue while all 12 other sequences shared another residue; in addition, we required that the residue found in the three sequences should not be present in any of the SIVcpzPts sequences. After excluding sites with a gap in any sequence, the alignment contained 450 sites. This analysis identified six sites, including Gag-30, the site already found (table 2). However, consideration of the phylogenetic relationships among the gag genes of these viral strains (fig. 3) indicates that four of these sites require fewer than three independent substitutions. SIVcpzPtt strains GAB1 and CAM13 form a monophyletic pair, and replacements at Gag-19 and Gag-380 likely occurred in their common ancestor. Similarly, a replacement at Gag-52 likely occurred in the common ancestor of SIVcpzPtt strain EK505 and HIV-1 group N. Thus, these three sites require only two independent substitutions. Only one replacement is required at site Gag-231, since SIVcpzPtt strains CAM3, CAM5 and DP943 form a monophyletic clade (fig. 3). The remaining site, Gag-28, would require three independent replacements, although it is equally parsimonious to invoke three independent Lys-to-Arg changes on terminal branches of the tree, or two Lys-to-Arg changes and a reversion of Arg-to-Lys on the terminal branch to CAM13. Thus only one other site, in addition to Gag-30, was found that may have undergone the same replacement on three terminal branches of the tree. These replacements involving basic residues at Gag-28 are much more conservative (D = 26) than the Met-to-Arg change at Gag-30, and likely reflect recurrent neutral switches. The other sites identified in this analysis also involve amino acids with much smaller chemical distances than the change at Gag-30 (table 2).
In light of these data, the finding of a non-conservative Met-to-Arg change on all three (and only those three) branches leading to the human viruses is quite remarkable and highly unlikely to represent a chance occurrence.
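The three-versus-twelve screen described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the analysis code: the function name `three_vs_rest`, the strain labels `ptt0` to `ptt11`, the three-residue toy sequences and the 0-based column indices are all hypothetical.

```python
from collections import Counter

def three_vs_rest(clade, outgroup, minority=3):
    """Find ungapped columns where exactly `minority` clade sequences share one
    residue, all remaining clade sequences share another, and the minority
    residue is absent from every outgroup sequence.
    clade: dict of name -> aligned sequence; outgroup: list of aligned sequences."""
    names = list(clade)
    hits = []
    for i in range(len(clade[names[0]])):
        column = {name: clade[name][i] for name in names}
        out = {seq[i] for seq in outgroup}
        if "-" in column.values() or "-" in out:
            continue  # sites with a gap in any sequence are excluded
        counts = Counter(column.values())
        if len(counts) != 2:
            continue
        (_, _), (minor, n_minor) = counts.most_common(2)
        if n_minor == minority and minor not in out:
            carriers = sorted(n for n in names if column[n] == minor)
            hits.append((i, minor, carriers))
    return hits

# Hypothetical toy data: 12 "SIVcpzPtt" strains plus three "HIV-1 ancestors".
toy_clade = {f"ptt{k}": ("KMV" if k < 3 else "KMI") for k in range(12)}
toy_clade.update({"M": "KRI", "N": "KRI", "O": "KRI"})
toy_outgroup = ["KMV", "KLI"]  # stand-ins for SIVcpzPts sequences

toy_hits = three_vs_rest(toy_clade, toy_outgroup)
print(toy_hits)
```

A screen of this kind only counts residues. Deciding how many independent substitutions each hit requires still depends on mapping the carrier sequences onto the phylogeny, as done in the text for the six Gag sites.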
Table 2. Sites in Gag Differing between Three HIV-1/SIVcpzPtt Sequences and All Others.
Site^a | Set 1^b | AA | Set 2^b | D1^c | SIVcpzPts | D2^c |
---|---|---|---|---|---|---|
Gag-19 | GAB1, CAM13, LB7 | Val | Ile | 29 | Ile | - |
Gag-28 | GAB1, GAB2, LB7 | Arg | Lys | 26 | Lys | - |
Gag-30 | M, N, O | Arg | Met | 91 | Met/Leu | 15 |
Gag-52 | EK505, MB897, N | Asp | Glu | 45 | Glu | - |
Gag-231 | CAM3, CAM5, DP943 | Ala | Pro | 27 | Pro | - |
Gag-380 | GAB1, CAM13, US | Lys | Arg | 26 | Arg/Gly | 125 |
^a Sites numbered according to the reference strain HIV-1/HXB2.
^b Set 1 are the three sequences from the SIVcpzPtt/HIV-1 clade sharing the unusual residue; M, N and O refer to HIV-1 groups, while the other sequences are all SIVcpzPtt strains. Set 2 shows the amino acid found in the remaining 12 sequences from the SIVcpzPtt/HIV-1 clade.
^c D values are Grantham’s amino acid distances. D1 is the distance between the residues found in Set 1 and Set 2. D2 is the distance between the two residues found in SIVcpzPts.
Evolution of HIV-1 in Experimentally Infected Chimpanzees
To investigate further whether the observed replacement at Gag-30 was indicative of host-specific adaptation, we examined the sequences of two molecular clones (JC16 and NC7) that were recovered from chimpanzees experimentally inoculated with HIV-1 (Mwaengo and Novembre 1998). The rationale for these chimpanzee infections was to generate pathogenic variants for vaccine and pathogenesis studies (Novembre et al. 1997; Wei and Fultz 1998). Although chimpanzees were subsequently found not to represent a viable animal model, the viral isolates and molecular clones derived from these experiments are of value because they represent chimpanzee-adapted strains of HIV-1. Clone JC16 was recovered from a chimpanzee 10 years after his initial infection; clone NC7 was recovered from a second chimpanzee one month after receiving blood from the first chimpanzee (Mwaengo and Novembre 1998). The two clones differ at 1.4% of nucleotides; given the rate of HIV-1 evolution (Li et al. 1988), this indicates that they may have diverged about 1-2 years before they were isolated. These nucleotide substitutions cause 51 amino acid differences across the viral proteome. To determine whether prolonged propagation in chimpanzees had exerted selection pressure on the Gag-30 residue, we compared the sequences of JC16 and NC7 to their parental strains, which represented a mixture of group M subtype B strains (SF2 and LAV) and thus contained a conservative replacement of Arg with Lys at Gag-30 (Wei and Fultz 1998). Remarkably, inspection of JC16 and NC7 revealed Met at Gag-30 in both clones, indicating selection pressure for this residue in the chimpanzee host (fig. 2).
The six other sites initially identified as differing between HIV-1 and the majority of SIVcpzPtt strains all retained the HIV-1 residue after passage in chimpanzees (table 1). At one site, Gag-224, the residue in the chimpanzee-adapted virus (Ala) differed from that in the three HIV-1 group ancestor sequences (Pro). However, this site is Ala in most HIV-1 subtype B sequences, including the viruses used in the chimpanzee passage experiment.
Replication of JC16 and NC7 in Chimpanzee and Human T Cells
To assess whether replacements at Gag-30 confer a host species-specific growth advantage, we applied site-directed mutagenesis to the chimpanzee-adapted JC16 and NC7 clones to change the Met back to Lys, and assessed their replication potential in CD4+ T lymphocytes from multiple chimpanzee and human donors. These experiments revealed a clear and reproducible correlation between the identity of the amino acid at Gag-30 and the ability of viruses to replicate in cells from the two different species (fig. 4). In chimpanzee cells, JC16 and NC7 with Met at Gag-30 (blue) grew, on average, to much higher titers than the same viruses with Gag-30 mutated to Lys (red). However, there was considerable variation among replicates: in tests of the day 9 titers, NC7 with Met was significantly higher than NC7 with Lys (P < 0.05), but for JC16 the comparison only bordered on significance (P = 0.054). In human cells, these chimpanzee-adapted viruses exhibited low replicative fitness (compare the scales in the left and right panels of fig. 4), but both JC16 and NC7 viruses with Lys at Gag-30 replicated considerably more efficiently than those with Met (P < 0.005, in both cases).
FIG. 4.
Replication of variants of JC16 (upper) and NC7 (lower) strains in chimpanzee (left) and human (right) CD4+ T lymphocytes. For each virus, strains encoding Met (blue) or Lys (red) at Gag-30 are compared. Virus replication was monitored by measuring the level of the p24 antigen in culture supernatants (note that the scale varies between left and right panels). Replication curves are shown as the average of four independent experiments in CD4+ T lymphocytes from different chimpanzee and human donors; the error bars represent one standard deviation calculated for each time point.
Evolution of Gag-30 in HIV-1
Arg at Gag-30 has remained highly conserved during the diversification of HIV-1 groups M, N and O, typically only being conservatively replaced by another basic amino acid, Lys (D = 26). Approximately 80% of available group O sequences have Arg, while in the other 20% it is Lys. Only six group N virus Gag sequences have thus far been characterized: five have Arg, and one has Lys. Among the group M subtype consensus sequences compiled at the HIV Sequence Database (www.hiv.lanl.gov), Arg is present in subtypes A1, A2, D, F1, G, H and K, while the subtype B consensus sequence has Lys. Interestingly, the only HIV-1 group M subtype that does not contain a basic residue at Gag-30 is subtype C. Met is found in both the consensus sequence and the inferred ancestral sequence of subtype C. Since subtype C is not the basal lineage in the group M radiation (Korber et al. 2000; Leitner et al. 2005), this indicates that reversion of Arg to Met occurred subsequent to the divergence of the subtype C ancestor from the other group M lineages.
Discussion
Comparisons of the inferred sequences of the ancestors of the three independent groups of HIV-1 with those of SIVcpz strains revealed a single site in the HIV-1 proteome which appears to have undergone identical changes on each of the three occasions when virus was transmitted from apes to humans. All available sequences of SIVcpzPtt, the clade from within which HIV-1 originated, have Met at Gag-30, while the ancestors of all three HIV-1 groups had Arg. Gag-30 was the only site, among 450 in the Gag alignment, to show the same non-conservative amino acid replacement on three independent terminal branches within the SIVcpzPtt/HIV-1 clade, and so it is remarkable that these changes were found specifically on the branches leading to the three HIV-1 groups. The finding of this non-conservative amino acid replacement on the branches of the tree immediately prior to each of the three clades of human viruses, at a site which is highly conserved among chimpanzee viruses, strongly suggests host-species specific adaptation. During the subsequent spread and evolution of HIV-1 in humans, Gag-30 has been conserved as Arg, or conservatively replaced by another basic amino acid (Lys), in all lineages except group M subtype C.
Retrospective analysis of the result of passaging HIV-1 in chimpanzees for more than 10 years (Mwaengo and Novembre 1998) revealed that the basic residue at Gag-30 had reverted to Met, providing independent evidence of host-species specific selection on this site. Thus, while the Met-to-Arg change on the lineages leading to HIV-1 could have occurred prior to cross-species transmission, possibly predisposing particular chimpanzee viruses to greater fitness in humans, the reversion to Met in chimpanzees infected with HIV-1 argues that the adaptive changes occurred after the cross-species transmission events.
Previously it has been shown that mutation of HIV-1 subtype B Gag-30 to a non-basic residue (Thr), in combination with a similar mutation at Gag-32, greatly reduced viral replication (Freed et al. 1995). Here we have found that when the Met at Gag-30 in two chimpanzee-adapted HIV-1 strains was replaced by Lys, the growth rate of these viruses in chimpanzee CD4+ T cells was reduced by half. In contrast, in human cells these same strains with Lys grew twice as efficiently as strains with Met. These reciprocal differences in replication capacity confirm that the nature of the amino acid at Gag-30 can have a major effect on viral fitness in cells from different host species. However, the greatly reduced growth rate of the chimpanzee-adapted viruses in human T cells, even after the replacement of Met with Lys at Gag-30, indicates that additional sites have evolved away from being human-adapted.
In this context, it is intriguing that in subtype C, a major sublineage within HIV-1 group M, Gag-30 has reverted to Met. Subtype C is very widespread: it is the predominant form of HIV-1 in southern Africa and also responsible for AIDS epidemics in China and India (Hemelaar et al. 2006). It is thus clear that HIV-1 strains with Met at Gag-30 can be epidemiologically successful. However, subtype C isolates have been shown to replicate less efficiently than other subtypes in head-to-head ex vivo competition experiments in primary CD4+ T cells and macrophages (Ball et al. 2003; Arien et al. 2005). This finding has been interpreted to indicate that disease progression may be slower in subtype C infections, providing a longer period when infected individuals may transmit virus to others and thus explaining why subtype C is so common globally (Arien et al. 2007). Whether there is indeed a causal link between the observed in vitro phenotype of subtype C and its ability to spread in the human population requires further studies, especially in light of data indicating that in vivo viral loads are higher in subtype C infections (Dyer et al. 1998). However, it is tempting to speculate that the reduced replication fitness of subtype C viruses in primary lymphocyte cultures may be due, at least in part, to the Met residue at Gag-30. Since the host-specificity at this site is otherwise so consistent, it is also possible that the ancestral subtype C lineage underwent changes at other sites which removed the need for, and perhaps even selected against, a basic residue at Gag-30 during replication in human cells. A number of subtype C viruses have Arg at Gag-30, but we could not find any meaningful correlation between the presence or absence of Met at Gag-30 and any polymorphisms at other sites. Site directed mutagenesis of prototypic subtype C strains, as reported here for chimpanzee adapted subtype B strains, may shed further light on these possibilities.
The residue present at Gag-30 varies among SIVs infecting different primate species. Interestingly, it is also Met in SIVsmm from sooty mangabeys, the progenitor of HIV-2. Eight different groups of HIV-2 have been described, which each seem to reflect an independent cross-species transmission event (Hahn et al. 2000; Damond et al. 2004; Santiago et al. 2005), although only two of these groups (A and B) are known to have spread in the human population. Remarkably, this residue changed to Arg in the inferred ancestor of the most widespread form of HIV-2, group A. However, it has been conserved as Met in HIV-2 group B, and was not found to have changed in any of the sporadic forms of HIV-2. These observations provide additional evidence that a basic residue at Gag-30 enhances the replication potential and possibly the secondary spread of primary lentiviruses after transmission to humans.
Gag-30 lies within the basic N-terminal domain of the gag-encoded matrix protein (p17). This domain is critical for targeting the Gag precursor to the plasma membrane during virus assembly (Hill et al. 1996). Several cellular factors are known (or believed) to bind this same region. These include PI(4,5)P2, a plasma membrane component (Ono et al. 2004), TIP47, a recently discovered Gag/Env connector protein complex (Lopez-Verges et al. 2006), and AP-3, an adaptor protein complex that is involved in cargo protein sorting to specific membrane components (Dong et al. 2005). NMR studies have shown that Gag-30 is not a part of the hydrophobic cleft that binds PI(4,5)P2 (Saad et al. 2006). The TIP47 binding site on the HIV-1 matrix protein has not yet been mapped; however, mutagenesis studies suggest that two highly conserved matrix regions (S6-S9 and D14-E17) are likely involved (Lopez-Verges et al. 2006). Finally, the δ subunit of the AP-3 complex is believed to interact with the amino-terminal α-helical segment of the matrix protein (Dong et al. 2005). Thus, there are a number of candidate human proteins that could bind the SIVcpz matrix proteins with reduced affinity due to species-specific differences in their protein sequence, and the same could be true for chimpanzee proteins and their interaction with the HIV-1 matrix protein.
In conclusion, although the specific function of Gag-30 remains to be determined, it is clear that adaptive changes at this residue in the SIVcpz matrix protein were required for efficient replication in the human host. These findings represent the first demonstration of host species-specific selection pressure in primate lentiviruses and identify the Gag matrix protein as a modulator of virus fitness following cross-species transmission to a new primate host.
Acknowledgments
We thank Wes Sundquist, Eric Freed, Mike Emerman, Heinrich Gottlinger and Mike Summers for unpublished information, and Mike Worobey and John Brookfield for discussion. This work was supported by the National Institutes of Health (R01 AI50529, R01 AI58715, P30 AI 27767, P30 CA 13148), the Yerkes Regional Primate Research Center (RR-00165), the Bristol Myers Freedom to Discover Program, and the Institut de Recherche pour le Développement (IRD).
Footnotes
A list of the GenBank accession numbers of all sequences analyzed is available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Literature Cited
- Arien KK, Abraha A, Quinones-Mateu ME, Kestens L, Vanham G, Arts EJ. The replicative fitness of primary human immunodeficiency virus type 1 (HIV-1) group M, HIV-1 group O, and HIV-2 isolates. J Virol. 2005;79:8979–8990. doi: 10.1128/JVI.79.14.8979-8990.2005.
- Arien KK, Vanham G, Arts EJ. Is HIV-1 evolving to a less virulent form in humans? Nature Rev Microbiol. 2007;5:141–151. doi: 10.1038/nrmicro1594.
- Ball SC, Abraha A, Collins KR, Marozsan AJ, Baird H, Quinones-Mateu ME, Penn-Nicholson A, Murray M, Richard N, Lobritz M, Zimmerman PA, Kawamura T, Blauvelt A, Arts EJ. Comparing the ex vivo fitness of CCR5-tropic human immunodeficiency virus type 1 isolates of subtype B and C. J Virol. 2003;77:1021–1038. doi: 10.1128/JVI.77.2.1021-1038.2003.
- Bibollet-Ruche F, Bailes E, Gao F, Pourrut X, Barlow K, Clewley JP, Mwenda JM, Langat MK, Chege GK, McClure HM, Mpoudi-Ngole E, Delaporte E, Peeters M, Shaw GM, Sharp PM, Hahn BH. New simian immunodeficiency virus infecting De Brazza’s monkeys (Cercopithecus neglectus): evidence for a Cercopithecus monkey clade. J Virol. 2004a;78:7748–7762. doi: 10.1128/JVI.78.14.7748-7762.2004.
- Bibollet-Ruche F, Gao F, Bailes E, Saragosti S, Delaporte E, Peeters M, Shaw GM, Hahn BH, Sharp PM. Complete genome analysis of one of the earliest SIVcpzPtt strains from Gabon (SIVcpzGAB2). AIDS Res Human Retrovir. 2004b;20:1377–1381. doi: 10.1089/aid.2004.20.1377.
- Damond F, Worobey M, Campa P, Farfara I, Colin G, Matheron S, Brun-Vezinet F, Robertson DL, Simon F. Identification of a highly divergent HIV type 2 and proposal for a change in HIV type 2 classification. AIDS Res Human Retrovir. 2004;20:666–672. doi: 10.1089/0889222041217392.
- Derdeyn CA, Decker JM, Sfakianos JN, Wu X, O’Brien WA, Ratner L, Kappes JC, Shaw GM, Hunter E. Sensitivity of human immunodeficiency virus type 1 to the fusion inhibitor T-20 is modulated by coreceptor specificity defined by the V3 loop of gp120. J Virol. 2000;74:8358–8367. doi: 10.1128/jvi.74.18.8358-8367.2000.
- Dong X, Li H, Derdowski A, Ding L, Burnett A, Chen X, Peters TR, Dermody TS, Woodruff E, Wang J-J, Spearman P. AP-3 directs the intracellular trafficking of HIV-1 Gag and plays a key role in particle assembly. Cell. 2005;120:663–674. doi: 10.1016/j.cell.2004.12.023.
- Dyer JR, Kazembe P, Vernazza PL, Gilliam BL, Maida M, Zimba D, Hoffman IF, Royce RA, Schock JL, Fiscus SA, Cohen MS, Eron JJ. High levels of human immunodeficiency virus type 1 in blood and semen of seropositive men in sub-Saharan Africa. J Infect Dis. 1998;177:1742–1746. doi: 10.1086/517436.
- Freed EO, Englund G, Martin MA. Role of the basic domain of human immunodeficiency virus type 1 matrix in macrophage infection. J Virol. 1995;69:3949–3954. doi: 10.1128/jvi.69.6.3949-3954.1995.
- Gao F, Bailes E, Robertson DL, Chen Y, Rodenburg CM, Michael SF, Cummins LB, Arthur LO, Peeters M, Shaw GM, Sharp PM, Hahn BH. Origin of HIV-1 in the chimpanzee Pan troglodytes troglodytes. Nature. 1999;397:436–441. doi: 10.1038/17130.
- Grantham R. Amino acid difference formula to help explain protein evolution. Science. 1974;185:862–864. doi: 10.1126/science.185.4154.862.
- Hahn BH, Shaw GM, De Cock KM, Sharp PM. AIDS as a zoonosis: scientific and public health implications. Science. 2000;287:607–614. doi: 10.1126/science.287.5453.607.
- Hemelaar J, Gouws E, Ghys PD, Osmanov S. Global and regional distribution of HIV-1 genetic subtypes and recombinants in 2004. AIDS. 2006;20:W13–W23. doi: 10.1097/01.aids.0000247564.73009.bc.
- Hill CP, Worthylake D, Bancroft DP, Christensen AM, Sundquist WI. Crystal structures of the trimeric human immunodeficiency virus type 1 matrix protein: implications for membrane association and assembly. Proc Natl Acad Sci USA. 1996;93:3099–3104. doi: 10.1073/pnas.93.7.3099.
- Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001;17:754–755. doi: 10.1093/bioinformatics/17.8.754.
- Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comp Appl Biosci. 1992;8:275–282. doi: 10.1093/bioinformatics/8.3.275.
- Keele BF, Van Heuverswyn F, Li Y, Bailes E, Takehisa J, Santiago ML, Bibollet-Ruche F, Chen Y, Wain LV, Liegeois F, Loul S, Mpoudi Ngole E, Bienvenue Y, Delaporte E, Brookfield JFY, Sharp PM, Shaw GM, Peeters M, Hahn BH. Chimpanzee reservoirs of pandemic and non-pandemic HIV-1. Science. 2006;313:523–526. doi: 10.1126/science.1126531.
- Korber B, Muldoon M, Theiler J, Gao F, Gupta R, Lapedes A, Hahn BH, Wolinsky S, Bhattacharya T. Timing the ancestor of the HIV-1 pandemic. Science. 2000;288:1789–1796. doi: 10.1126/science.288.5472.1789.
- Leitner T, Korber B, Daniels M, Calef C, Foley B. HIV-1 subtype and circulating recombinant form (CRF) reference sequences, 2005. In: Leitner T, Foley B, Hahn B, Marx P, McCutchan F, Mellors J, Wolinsky S, Korber B, editors. HIV sequence compendium. Los Alamos National Laboratory; New Mexico: 2005. pp. 41–48.
- Li W-H, Tanimura M, Sharp PM. Rates and dates of divergence between AIDS virus nucleotide sequences. Mol Biol Evol. 1988;5:313–330. doi: 10.1093/oxfordjournals.molbev.a040503.
- Lopez-Verges S, Camus G, Blot G, Beauvoir R, Benarous R, Berlioz-Torrent C. Tail-interacting protein TIP47 is a connector between Gag and Env and is required for Env incorporation into HIV-1 virions. Proc Natl Acad Sci USA. 2006;103:14947–14952. doi: 10.1073/pnas.0602941103.
- Mwaengo DM, Novembre FJ. Molecular cloning and characterization of viruses isolated from chimpanzees with pathogenic human immunodeficiency virus type 1 infections. J Virol. 1998;72:8976–8987. doi: 10.1128/jvi.72.11.8976-8987.1998.
- Nerrienet E, Santiago ML, Foupouapouognigni Y, Bailes E, Mundy NI, Njinku B, Kfutwah A, Muller-Trutwin MC, Barre-Sinoussi F, Shaw GM, Sharp PM, Hahn BH, Ayouba A. Simian immunodeficiency virus infection in wild-caught chimpanzees from Cameroon. J Virol. 2005;79:1312–1319. doi: 10.1128/JVI.79.2.1312-1319.2005.
- Novembre FJ, Saucier M, Anderson DC, Klumpp SA, O’Neill SP, Brown CR, Hart CE, Guenthner PC, Swenson RB, McClure HM. Development of AIDS in a chimpanzee infected with human immunodeficiency virus type 1. J Virol. 1997;71:4086–4091. doi: 10.1128/jvi.71.5.4086-4091.1997.
- Ono A, Ablan SD, Lockett SJ, Nagashima K, Freed EO. Phosphatidylinositol (4,5) bisphosphate regulates HIV-1 Gag targeting to the plasma membrane. Proc Natl Acad Sci USA. 2004;101:14889–14894. doi: 10.1073/pnas.0405596101.
- Peeters M, Honore C, Huet T, Bedjabaga L, Ossari S, Bussi P, Cooper RW, Delaporte E. Isolation and partial characterization of an HIV-related virus occurring naturally in chimpanzees in Gabon. AIDS. 1989;3:625–630. doi: 10.1097/00002030-198910000-00001.
- Saad JS, Miller J, Tai J, Kim A, Ghanam RH, Summers MF. Structural basis for targeting HIV-1 Gag proteins to the plasma membrane for virus assembly. Proc Natl Acad Sci USA. 2006;103:11364–11369. doi: 10.1073/pnas.0602818103.
- Santiago M, Range F, Keele BF, Li Y, Bailes E, Bibollet-Ruche F, Fruteau C, Noe R, Peeters M, Brookfield JFY, Shaw GM, Sharp PM, Hahn BH. Simian immunodeficiency virus infection in free-ranging sooty mangabeys (Cercocebus atys atys) from the Tai Forest, Cote d’Ivoire: implications for the origin of epidemic human immunodeficiency virus type 2. J Virol. 2005;79:12515–12527. doi: 10.1128/JVI.79.19.12515-12527.2005.
- Sharp PM, Bailes E, Chaudhuri RR, Rodenburg CM, Santiago MO, Hahn BH. The origins of AIDS viruses: where and when? Phil Trans Roy Soc London B. 2001;356:867–876. doi: 10.1098/rstb.2001.0863.
- Sharp PM, Shaw GM, Hahn BH. Simian immunodeficiency virus infection of chimpanzees. J Virol. 2005;79:3891–3902. doi: 10.1128/JVI.79.7.3891-3902.2005.
- Takehisa J, Kraus MH, Decker JM, Li Y, Keele BF, Bibollet-Ruche F, Zammit KP, Weng Z, Santiago ML, Kamenya S, Wilson ML, Pusey AE, Bailes E, Sharp PM, Shaw GM, Hahn BH. Generation of infectious molecular clones of simian immunodeficiency virus by synthesis of fecal viral consensus sequences. J Virol. 2007; in press. doi: 10.1128/JVI.00551-07.
- Thompson JD, Higgins DG, Gibson TJ. ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673.
- Van Heuverswyn F, Li Y, Neel C, Bailes E, Keele BF, Liu W, Loul S, Butel C, Liegeois F, Bienvenue Y, Ngole EM, Sharp PM, Shaw GM, Delaporte E, Hahn BH, Peeters M. SIV infection in wild gorillas. Nature. 2006;444:164. doi: 10.1038/444164a.
- Wei Q, Fultz PN. Extensive diversification of human immunodeficiency virus type 1 subtype B strains during dual infection of a chimpanzee that progressed to AIDS. J Virol. 1998;72:3005–3017. doi: 10.1128/jvi.72.4.3005-3017.1998.
- Wei X, Decker JM, Liu H, Zhang Z, Arani RB, Kilby JM, Saag MS, Wu X, Shaw GM, Kappes JC. Emergence of resistant human immunodeficiency virus type 1 in patients receiving fusion inhibitor (T-20) monotherapy. Antimicrob Agents Chemother. 2002;46:1896–1905. doi: 10.1128/AAC.46.6.1896-1905.2002.
- Yamaguchi J, Bodelle P, Kaptue L, Zekeng L, Gurtler LG, Devare SG, Brennan CA. Near full-length genomes of 15 HIV type 1 group O isolates. AIDS Res Human Retrovir. 2003;19:979–988. doi: 10.1089/088922203322588332.
- Yang Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol. 1994;39:306–314. doi: 10.1007/BF00160154.
- Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comp Appl Biosci. 1997;13:555–556. doi: 10.1093/bioinformatics/13.5.555.
- Yang Z, Rannala B. Bayesian phylogenetic inference using DNA sequences: a Markov Chain Monte Carlo Method. Mol Biol Evol. 1997;14:717–724. doi: 10.1093/oxfordjournals.molbev.a025811.