Summary
Chemokine receptor CCR2 and stromal-derived factor (SDF-1) are involved in HIV infection and AIDS symptom onset. Recent cohort studies showed that point mutations in these two genes, CCR2-64I and SDF1-3′A, can delay AIDS onset ⩾16 years after seroconversion. The protective effect of CCR2-64I is dominant, whereas that of SDF1-3′A is recessive. SDF1-3′A homozygotes also showed possible protection against HIV-1 infection. In this study, we surveyed the frequency distributions of the two alleles at both loci in world populations, with emphasis on those in east Asia. The CCR2-64I frequencies do not vary significantly among continents, having a range of 0.1–0.2 in most populations. A decreasing cline of the CCR2-64I frequency from north to south was observed in east Asia. In contrast, the distribution of SDF1-3′A in world populations varies substantially, and the highest frequency was observed in Oceanian populations. Moreover, an increasing cline of the SDF1-3′A frequency from north to south was observed in east Asia. The relative hazard values were computed to evaluate the risk of AIDS onset on the basis of two-locus genotypes in the east Asian and world populations.
Introduction
The discoveries that chemokines can block HIV replication and that their receptors play an important role in fusion of HIV to target cells raised expectations that chemokines might hold the key to understanding HIV-mediated pathogenesis (Premack and Clapham 1996; Littman 1998). Two chemokine receptors, CXCR4 and CCR5, are major coreceptors required for entry of, respectively, T cell line–tropic and macrophage-tropic strains of HIV-1 into CD4 cells (for review, see Littman 1998, and references therein). In addition, the expression of the stromal-derived factor (SDF1), the only known ligand of CXCR4, may inhibit transmission of T cell line–tropic HIV strains (Bleul et al. 1996; Feng et al. 1996). Other chemokine receptors, such as CCR2, may also play a role in HIV infection, although their mechanisms are less clear (Littman 1998; Mummidi et al. 1998).
Recently, mutations in the genes encoding the aforementioned chemokine receptors (CCR5 and CCR2) and ligand (SDF1) in natural populations were linked to HIV-1 resistance in cohort studies (Dean et al. 1996; Fauci 1996; Hill and Littman 1996; Liu et al. 1996; Paxton et al. 1996; Samson et al. 1996; Smith et al. 1997; Mummidi et al. 1998; Winkler et al. 1998). Homozygous individuals with a 32-bp deletion in CCR5 (CCR5-Δ32) were found to be resistant to HIV infection (Dean et al. 1996; Liu et al. 1996). Heterozygotes for the CCR5-Δ32 mutation exhibit a delayed progression to disease. The CCR5-Δ32 mutation results in a truncated protein, which removes the receptor from cells (Liu et al. 1996). The benefit of carrying the deletion mutation was thought to result from decreased expression levels of the receptor in the patients, which led either to resistance to virus infection for the homozygotes or to delayed disease progression for the heterozygotes (Dean et al. 1996).
The mutant alleles of CCR2 and SDF1 (CCR2-64I and SDF1-3′A) carry point mutations in sequence-conserved regions, and the functional constraints of the regions were suggested to explain their protective effects of delaying AIDS onset in the cohort studies (Smith et al. 1997; Winkler et al. 1998). The SDF1-3′A mutation is located in the untranslated region of the transcript. The molecular mechanisms of the effect of the mutations in the CCR2 gene and SDF1 gene are not yet understood (Littman 1998; Mummidi et al. 1998), but a recent study has affirmed the protective effects of the mutations in these two genes (Martin et al. 1998). The protective effects of CCR5-Δ32 and CCR2-64I were shown to be dominant, whereas the protective effect of SDF1-3′A is recessive (Dean et al. 1996; Liu et al. 1996; Smith et al. 1997; Winkler et al. 1998).
The deletion mutant, CCR5-Δ32, was found primarily (∼10%) in populations of European descent, and no mutant alleles were reported in indigenous non-European populations (Martinson et al. 1997). In contrast, CCR2-64I and SDF1-3′A were found in different ethnic populations in America, and the relative hazard (RH) indices for all genotype combinations at these three loci were estimated in the cohort studies (Smith et al. 1997; Winkler et al. 1998).
In the present study, we investigated the distributions of CCR2-64I and SDF1-3′A in worldwide populations, with emphasis on east Asian populations. The RH index was estimated for each population studied to evaluate natural resistance, and, therefore, these indices are of special importance for the prevention of AIDS in those populations.
Methods
Genotyping
A PCR-RFLP assay was used for genotyping. The published primer sequences and PCR conditions were adapted to amplify SDF1 and CCR2 gene fragments covering the polymorphic sites (Smith et al. 1997; Winkler et al. 1998). PCR products were then subjected to restriction enzyme digestion (Msp I for SDF1 and BsaB I for CCR2) for 4 h. After digestion, the products were genotyped by means of agarose gel electrophoresis.
RH Evaluation
To evaluate the risk of AIDS onset for the populations screened, the RH value was computed for each population on the basis of the two-locus genotype of each individual. The RH value for each of the nine possible two-locus genotypes was adapted from the published data of the cohort studies (Winkler et al. 1998). Three AIDS definitions were considered in the RH evaluations: AIDS-1993, AIDS-1987, and Death (Dean et al. 1996). The RH indices used in this report for each two-locus genotype (for AIDS-1993, AIDS-1987, and Death, respectively) are as follows: for [CCR2-64I/+, SDF1-3′A/SDF1-3′A] and [CCR2-64I/CCR2-64I, SDF1-3′A/SDF1-3′A], 0.55, 0.31, and 0.0; for [+/+, SDF1-3′A/SDF1-3′A], 0.63, 0.35, and 0.23; and for [CCR2-64I/+, +/+], [CCR2-64I/+, SDF1-3′A/+], [CCR2-64I/CCR2-64I, +/+], and [CCR2-64I/CCR2-64I, SDF1-3′A/+], 0.65, 0.66, and 0.60. The remaining two genotypes, [+/+, +/+] and [+/+, SDF1-3′A/+], are taken as the reference, with an RH of 1.0, which is consistent with the RH values of 1.00 in table 1 for populations lacking CCR2-64I.
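The population-level calculation described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function names, the genotype encoding (number of mutant copies at each locus), and the example sample are hypothetical, and the reference genotypes [+/+, +/+] and [+/+, SDF1-3′A/+] are assigned RH = 1.0 as an assumption consistent with the values of 1.00 in table 1.

```python
from math import sqrt
from statistics import mean, stdev

# RH per two-locus genotype class for (AIDS-1993, AIDS-1987, Death).
# CCR2 class: "carrier" = 64I/+ or 64I/64I (dominant effect); "wt" = +/+.
# SDF1 class: "hom" = 3'A/3'A (recessive effect); "other" = 3'A/+ or +/+.
RH = {
    ("carrier", "hom"): (0.55, 0.31, 0.0),
    ("wt", "hom"): (0.63, 0.35, 0.23),
    ("carrier", "other"): (0.65, 0.66, 0.60),
    ("wt", "other"): (1.0, 1.0, 1.0),  # assumption: reference genotypes
}


def classify(ccr2_copies, sdf1_copies):
    """Map mutant-allele counts (0, 1, or 2 per locus) to a genotype class."""
    ccr2 = "carrier" if ccr2_copies >= 1 else "wt"
    sdf1 = "hom" if sdf1_copies == 2 else "other"
    return ccr2, sdf1


def population_rh(genotypes, endpoint=0):
    """Mean RH and its standard error, treating each individual's RH as a constant."""
    values = [RH[classify(c, s)][endpoint] for c, s in genotypes]
    m = mean(values)
    se = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
    return m, se


# Hypothetical sample of 10 individuals: (CCR2-64I copies, SDF1-3'A copies).
sample = [(0, 0)] * 5 + [(0, 1)] * 2 + [(1, 0)] * 2 + [(0, 2)]
rh1, se1 = population_rh(sample, endpoint=0)
```

For this hypothetical sample, seven individuals carry a reference genotype, two carry CCR2-64I, and one is an SDF1-3′A homozygote, so the mean RH under the AIDS-1993 definition is (7 × 1.0 + 2 × 0.65 + 0.63)/10 = 0.893.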
Results and Discussion
To survey the distribution of the two HIV-1–resistant alleles, we genotyped 1,768 individuals representing presumably unaffected populations from five continents—Africa, Europe, Oceania, Asia, and America (table 1). Special efforts were made to study the populations in east Asia where AIDS prevalence is rapidly increasing, including the Han Chinese from 28 provinces (of the 30 provinces in the mainland) and 22 Chinese ethnic minority groups. The Han Chinese from the 28 provincial regions were grouped into eight geographical regions (not official administrative regions) on the basis of the frequencies of the alleles studied. The populations from neighboring provinces that showed similar allele frequencies at both loci studied were grouped together. All the samples were collected before AIDS became an epidemic in the corresponding regions.
Table 1.
Frequency Distribution of SDF1-3′A and CCR2-64I and RH Indices in 48 World Populations
| Continent | Population | Sample size | SDF1-3′A frequency (SE) | CCR2-64I frequency (SE) | RH1 (SE) | RH2 (SE) | RH3 (SE) |
|---|---|---|---|---|---|---|---|
| Africa | Biaka pygmy | 35 | .03 (.02) | .11 (.04) | .92 (.02) | .92 (.02) | .91 (.03) |
| | Mbuti pygmy | 20 | .08 (.04) | .10 (.05) | .93 (.03) | .93 (.03) | .92 (.04) |
| | Lissongo | 11 | .09 (.06) | .14 (.07) | .90 (.05) | .91 (.05) | .89 (.05) |
| America | Karitiana | 36 | .06 (.03) | .07 (.03) | .93 (.03) | .92 (.05) | .90 (.05) |
| | Mayan | 40 | .15 (.04) | .33 (.05) | .78 (.03) | .78 (.03) | .74 (.03) |
| | Surui | 20 | .25 (.07) | .03 (.03) | .96 (.02) | .95 (.02) | .95 (.03) |
| Europe | Italian | 37 | .15 (.04) | .13 (.04) | .88 (.03) | .86 (.03) | .84 (.04) |
| | Northern European | 23 | .22 (.06) | .16 (.05) | .84 (.04) | .83 (.04) | .80 (.05) |
| Oceania | Australian Aborigine | 14 | .54 (.09) | .10 (.06) | .84 (.05) | .74 (.08) | .68 (.10) |
| | New Guinean1 | 69 | .66 (.04) | .18 (.03) | .76 (.04) | .63 (.07) | .53 (.09) |
| | Nasioi Melanesian | 12 | .67 (.10) | 0 | .88 (.05) | .78 (.09) | .74 (.10) |
| | New Guinean2 | 21 | .71 (.07) | .17 (.06) | .76 (.02) | .64 (.04) | .55 (.05) |
| Asia (North) | Ewenki | 17 | .09 (.05) | .28 (.08) | .84 (.04) | .84 (.04) | .81 (.05) |
| | Buryat | 5 | .10 (.09) | .57 (.16) | .72 (.06) | .73 (.06) | .68 (.07) |
| | Hui | 19 | .11 (.05) | .20 (.06) | .87 (.04) | .87 (.04) | .85 (.04) |
| | Tibetan | 25 | .12 (.05) | .28 (.06) | .83 (.03) | .84 (.03) | .81 (.04) |
| | Uyghur | 10 | .20 (.09) | .20 (.09) | .85 (.05) | .86 (.05) | .84 (.06) |
| | Korean | 28 | .27 (.06) | .26 (.06) | .82 (.03) | .82 (.04) | .78 (.05) |
| | Tamir | 7 | .29 (.12) | .07 (.07) | .95 (.05) | .95 (.04) | .94 (.05) |
| | Japanese | 15 | .37 (.09) | .11 (.06) | .81 (.05) | .74 (.07) | .69 (.08) |
| | Manchurian | 33 | .38 (.06) | .12 (.04) | .89 (.03) | .86 (.04) | .82 (.05) |
| Asia (South) | Wa | 35 | .06 (.03) | .16 (.04) | .89 (.03) | .89 (.03) | .87 (.03) |
| | Ami | 10 | .10 (.07) | 0 | 1.00 (.00) | 1.00 (.00) | 1.00 (.00) |
| | Anni | 21 | .12 (.05) | 0 | 1.00 (.00) | 1.00 (.00) | 1.00 (.00) |
| | Lahu | 11 | .14 (.07) | 0 | 1.00 (.00) | 1.00 (.00) | 1.00 (.00) |
| | Bulang | 26 | .15 (.05) | 0 | 1.00 (.00) | 1.00 (.00) | 1.00 (.00) |
| | Yi | 19 | .16 (.06) | .24 (.07) | .83 (.04) | .82 (.05) | .79 (.05) |
| | Atayal | 32 | .17 (.05) | .11 (.04) | .93 (.02) | .94 (.02) | .93 (.03) |
| | Deang | 6 | .17 (.11) | .17 (.11) | .88 (.07) | .89 (.07) | .87 (.08) |
| | Jingpo | 15 | .23 (.08) | .07 (.05) | .92 (.04) | .91 (.05) | .89 (.06) |
| | Tujia | 20 | .28 (.07) | .25 (.07) | .80 (.04) | .76 (.05) | .71 (.07) |
| | Yao-Nandan | 20 | .28 (.07) | 0 | .98 (.02) | .97 (.03) | .96 (.04) |
| | Dong | 20 | .30 (.07) | .15 (.06) | .89 (.04) | .86 (.05) | .82 (.07) |
| | She | 20 | .30 (.07) | .15 (.06) | .90 (.04) | .90 (.03) | .88 (.04) |
| | Yao-Jinxiu | 20 | .30 (.07) | .08 (.04) | .93 (.03) | .92 (.04) | .90 (.05) |
| | Cambodian | 28 | .32 (.06) | .15 (.05) | .87 (.03) | .86 (.04) | .82 (.05) |
| | Paiwan | 20 | .33 (.07) | .03 (.03) | .96 (.02) | .95 (.03) | .94 (.04) |
| | Dai | 12 | .33 (.10) | .07 (.05) | .92 (.05) | .92 (.05) | .91 (.06) |
| | Li | 20 | .35 (.08) | .05 (.03) | .90 (.05) | .87 (.06) | .85 (.07) |
| | Yami | 20 | .43 (.08) | .05 (.03) | .90 (.04) | .84 (.06) | .81 (.08) |
| Han Chinese | Northwest Han | 24 | .19 (.06) | .23 (.06) | .84 (.04) | .83 (.04) | .80 (.05) |
| | North Han | 138 | .24 (.03) | .21 (.02) | .83 (.01) | .82 (.02) | .79 (.02) |
| | Northeast Han | 37 | .26 (.05) | .24 (.05) | .91 (.03) | .89 (.03) | .88 (.04) |
| | Southwest Han | 48 | .22 (.04) | .18 (.04) | .84 (.03) | .83 (.03) | .79 (.04) |
| | East Han | 407 | .25 (.02) | .20 (.01) | .85 (.01) | .82 (.01) | .78 (.02) |
| | Central Han | 168 | .26 (.02) | .22 (.02) | .84 (.02) | .84 (.02) | .80 (.02) |
| | Southeast Han | 40 | .26 (.05) | .22 (.05) | .85 (.03) | .82 (.03) | .78 (.04) |
| | South Han | 34 | .34 (.06) | .11 (.04) | .86 (.04) | .84 (.05) | .80 (.06) |
| Total | | 1,768 | | | | | |
Note.— RH1, RH2, and RH3 were calculated under three different AIDS definitions—AIDS-1993, AIDS-1987, and Death, respectively.
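The allele frequencies and standard errors in table 1 are of the size expected from simple gene counting with a binomial sampling error over 2N chromosomes. The sketch below illustrates that calculation. It is an assumption about how such values are usually obtained, not a description taken from the study, and the function name and the genotype counts in the example are hypothetical.

```python
from math import sqrt


def allele_frequency(n_het, n_hom, n_total):
    """Mutant-allele frequency and binomial SE from genotype counts.

    n_het   -- number of heterozygotes
    n_hom   -- number of mutant homozygotes
    n_total -- number of individuals genotyped
    """
    chromosomes = 2 * n_total
    p = (n_het + 2 * n_hom) / chromosomes
    se = sqrt(p * (1 - p) / chromosomes)
    return p, se


# Hypothetical counts: 2 heterozygotes and no homozygotes among 35 individuals.
p, se = allele_frequency(n_het=2, n_hom=0, n_total=35)
```

With these hypothetical counts, the frequency is 2/70 ≈ .029 with a standard error of about .02, which shows why estimates from samples of only a few dozen individuals carry wide uncertainty.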
Table 1 shows the frequencies of SDF1-3′A and CCR2-64I in the 48 populations studied. The variation of SDF1-3′A frequency is quite extensive, and the frequency is exceptionally high in Oceania, especially in the New Guinean Highlanders, as we have previously reported (Su et al. 1998). The frequency of SDF1-3′A in the other populations ranges from .029 to .366. In contrast, the frequency of the CCR2-64I allele varies moderately, except that it is absent in the Nasioi Melanesians. Among the three American Indian populations, however, there is a significant frequency difference. The frequency in the Mayans is >10 times that in the Surui, which could be the consequence of genetic drift in the Surui population and extensive admixture in the Mayans. Analysis of genotype frequency data did not show any significant deviation from Hardy-Weinberg expectation in any population, implying the absence of detectable selection differentials between individuals with and without the mutations in unaffected populations.
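A test of Hardy-Weinberg expectation of the kind mentioned above can be sketched with a one-degree-of-freedom chi-square statistic. The text does not state which test was used, so this is only an illustration; the function name and the genotype counts are hypothetical, and for the small samples in table 1 an exact test would usually be preferred.

```python
def hwe_chi_square(n_wt, n_het, n_hom):
    """Chi-square statistic (1 df) for deviation from Hardy-Weinberg proportions."""
    n = n_wt + n_het + n_hom
    p = (n_het + 2 * n_hom) / (2 * n)  # mutant-allele frequency by gene counting
    q = 1 - p
    expected = (q * q * n, 2 * p * q * n, p * p * n)
    observed = (n_wt, n_het, n_hom)
    # Skip classes with zero expectation (monomorphic samples).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


CRITICAL_5_PERCENT_1DF = 3.841  # chi-square critical value, alpha = 0.05, 1 df

# Hypothetical genotype counts: 60 +/+, 34 heterozygotes, 6 mutant homozygotes.
stat = hwe_chi_square(60, 34, 6)
deviates = stat > CRITICAL_5_PERCENT_1DF
```

For these hypothetical counts the statistic is about 0.16, well below 3.841, so Hardy-Weinberg proportions would not be rejected at the 5% level.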
The frequencies of SDF1-3′A and CCR2-64I were plotted against the geographical distributions of the populations in east Asia (see figs. 1 and 2). In general, frequencies of both alleles in the east Asian populations are comparable to those reported in previous studies of Asians residing in the United States (Smith et al. 1997; Winkler et al. 1998). Interestingly, the frequency distributions of CCR2-64I and SDF1-3′A revealed opposite patterns: the frequencies of CCR2-64I increase from south to north, whereas the clinal trend is reversed for the SDF1-3′A allele in east Asian populations. The sample sizes of the Han Chinese populations are generally large; therefore, it is unlikely that this observation is simply due to sampling error. This observation is supported by the result of a similar study in Southeast Asian populations, where the frequency of CCR2-64I is relatively low and that of SDF1-3′A is relatively high compared with Chinese populations (R. Deka, personal communication). The north-south differences for both alleles in the east Asian populations are generally consistent with the observations made with other genetic markers (see Chu et al. 1998 and references therein). It was hypothesized that modern humans entered east Asia from the south, possibly from Southeast Asia, ∼50,000–70,000 years ago, and that populations later expanded from the south to the north of Asia (Chu et al. 1998). It is well recorded that, in the past 7,000 years, migrations in China have been predominantly from north to south. Therefore, two-way migrations could provide a mechanism to generate and sustain the two opposing north-to-south clines that we are currently observing. However, the minority populations studied are relatively isolated and small in size; therefore, genetic drift can play a significant role in influencing their allele frequencies. The frequencies of the two alleles in those populations may or may not be consistent with those in the nearby Han Chinese populations, depending largely on the level of gene flow between them and their respective ethnohistories.
Figure 1.
Allele frequency distribution of SDF1-3′A in the east Asian populations. The Chinese Han, from 28 provincial regions, were grouped into eight geographical regions, as shown in table 1: south (Guangdong, Guangxi, and Hainan), southeast (Fujian and Taiwan), southwest (Guizhou, Sichuan, and Yunnan), central (Anhui, Jiangxi, Hubei, and Hunan), east (Jiangsu, Shanghai, and Zhejiang), north (Shandong, Beijing, Tianjin, Hebei, and Henan), northeast (Inner Mongolia, Liaoning, Jilin, and Heilongjiang), and northwest (Shanxi, Shaanxi, Gansu, Ningxia, and Xinjiang). The frequencies of these populations are presented as color patches. The colored circles refer to the Chinese ethnic minority groups and other east Asian populations, defined as follows: 1 = Wa, 2 = Ewenki, 3 = Ami, 4 = Buryat, 5 = Hui, 6 = Anni, 7 = Tibetan, 8 = Lahu, 9 = Bulang, 10 = Yi, 11 = Deang, 12 = Atayal, 13 = Uyghur, 14 = Jingpo, 15 = Korean, 16 = Yao (Nandan), 17 = Tujia, 18 = Tamir, 19 = Dong, 20 = She, 21 = Yao (Jinxiu), 22 = Cambodian, 23 = Paiwan, 24 = Dai, 25 = Li, 26 = Japanese, 27 = Manchurian, and 28 = Yami.
Figure 2.
Allele frequency distribution of CCR2-64I in the east Asian populations. The description of the populations follows figure 1.
Unlike CCR5-Δ32, which is present only in the white population, SDF1-3′A and CCR2-64I are distributed in all populations of the world. Stephens et al. (1998) estimated, on the basis of the coalescence of haplotypes of two microsatellite loci, that the origin of the CCR5-Δ32–containing ancestral haplotype is very recent, occurring only ∼700 years ago. If this estimate is correct, the origins of SDF1-3′A and CCR2-64I could be much more ancient and probably predate the out-of-Africa migrations of modern humans, as reflected by their ubiquitous distributions in world populations.
Evaluating the risk of AIDS onset in different world populations provides important information for AIDS epidemiology. Using the RH indices from the cohort studies (see Methods for details), we estimated, on the basis of the two-locus genotype distributions, the RH indices for each population (table 1). In general, RH values vary from population to population. Having a high frequency of SDF1-3′A homozygotes, Oceanian populations have the lowest RH values, which possibly indicates the highest protection from AIDS onset or even HIV infection in those populations. African populations, in contrast, exhibit very high RH values. The RH values in the three American Indian populations are quite variable because of the large allele frequency differences. There is little variation of RH indices among Chinese Han populations, as shown in figure 3, largely because of the opposite clines of SDF1-3′A and CCR2-64I. However, high RH indices were observed in most of the southern Chinese ethnic groups, in contrast with the relatively low RH in northern Chinese ethnic groups and in the Japanese and Korean populations. Most of those southern minority groups are in Yunnan (Wa, Deang, Jingpo, Dai, Anni, Bulang, and Lahu), where the prevalence of AIDS is highest in China.
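The dependence of the population RH on the two allele frequencies can be made explicit with a rough cross-check. The sketch below computes the expected RH under the AIDS-1993 definition directly from allele frequencies, assuming Hardy-Weinberg proportions at each locus and independence between the two loci. These are simplifying assumptions of this illustration, since the study averaged per-individual values instead; the function name is hypothetical, and the reference genotypes are assigned RH = 1.0 as an assumption.

```python
# RH under the AIDS-1993 definition for the four genotype classes.
RH_1993 = {
    ("carrier", "hom"): 0.55,    # CCR2-64I carrier, SDF1-3'A homozygote
    ("wt", "hom"): 0.63,         # CCR2 +/+, SDF1-3'A homozygote
    ("carrier", "other"): 0.65,  # CCR2-64I carrier, not SDF1-3'A homozygote
    ("wt", "other"): 1.0,        # assumption: reference genotypes
}


def expected_rh(ccr2_64i_freq, sdf1_3a_freq):
    """Expected RH from allele frequencies, assuming HWE and independent loci."""
    carrier = 1 - (1 - ccr2_64i_freq) ** 2  # dominant: at least one 64I copy
    hom = sdf1_3a_freq ** 2                 # recessive: two 3'A copies
    weights = {
        ("carrier", "hom"): carrier * hom,
        ("wt", "hom"): (1 - carrier) * hom,
        ("carrier", "other"): carrier * (1 - hom),
        ("wt", "other"): (1 - carrier) * (1 - hom),
    }
    return sum(weights[g] * RH_1993[g] for g in weights)


# Allele frequencies taken from table 1 (CCR2-64I, SDF1-3'A).
rh_new_guinean1 = expected_rh(0.18, 0.66)
rh_biaka = expected_rh(0.11, 0.03)
```

With the table 1 frequencies, this gives approximately .76 for New Guinean1 and .93 for the Biaka, close to the reported RH1 values of .76 and .92. The high SDF1-3′A frequency in Oceania lowers the RH mainly through the squared homozygote term.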
Figure 3.
RH2 distribution in the east Asian populations. The description of the populations follows figure 1.
It is important to note that the estimates of the RH indices may not be accurate, given the width of the 95% confidence intervals of the underlying RH values (see table 1 in Winkler et al. 1998). Furthermore, the allele frequency estimation may introduce an additional source of error, especially when the sample sizes are small. In this report, we treated the RH indices for individuals as constants, and the standard errors of sampling are presented in table 1. Such standard errors are surprisingly small, even when the sample size of the population is small.
Several assumptions were made in the above evaluation. First, the protective effect of CCR5 is ignored. Given the rarity of the CCR5-Δ32 allele in nonwhite populations, the RH values of those populations are generally accurate, whereas in white populations they are overestimated. The overestimation is introduced, in particular, for individuals with the following groups of genotypes: [CCR5-Δ32/CCR5-Δ32, +/+, SDF1-3′A/SDF1-3′A], [CCR5-Δ32/+, +/+, SDF1-3′A/SDF1-3′A], [CCR5-Δ32/CCR5-Δ32, +/+, SDF1-3′A/+], [CCR5-Δ32/CCR5-Δ32, +/+, +/+], [CCR5-Δ32/+, +/+, SDF1-3′A/+], and [CCR5-Δ32/+, +/+, +/+]. However, the overestimation of RH indices for whites is <0.1%. Second, the difference of RH indices among racial groups is ignored. The errors introduced by this assumption should be very small, if not negligible, since the RH index differences between racial groups are generally small (see table 1, Winkler et al. 1998).
Acknowledgments
B.S. and R.C. are supported by National Institutes of Health (NIH) grant R01GM41399. L.J. is supported by the TOKTEN project of the United Nations, the Li Foundation, NIH grant R01GM41399, and NIH grant R01GM55759.
References
- Bleul CC, Farzan M, Choe H, Parolin C, Clark-Lewis I, Sodroski J, Springer TA (1996) The lymphocyte chemoattractant SDF-1 is a ligand for LESTR/fusin and blocks HIV-1 entry. Nature 382:829–833
- Chu JY, Jin L, Kuang SQ, Huang W, Wang JM, Xu JJ, Wu M, et al (1998) Genetic relationship of populations in China. Proc Natl Acad Sci USA 95:11763–11768
- Dean M, Carrington M, Winkler C, Huttley GA, Smith MW, Allikmets R, Goedert JJ, et al (1996) Genetic restriction of HIV-1 infection and progression to AIDS by a deletion allele of the CKR5 structural gene. Science 273:1856–1862
- Fauci AS (1996) Resistance to HIV-1 infection: it's in the genes. Nat Med 2:966–967
- Feng Y, Broder CC, Kennedy PE, Berger EA (1996) HIV-1 entry cofactor: functional cDNA cloning of a seven-transmembrane G protein-coupled receptor. Science 272:872–877
- Hill CM, Littman DR (1996) Natural resistance to HIV? Nature 382:668–669
- Littman DR (1998) Chemokine receptors: keys to AIDS pathogenesis? Cell 93:677–680
- Liu R, Paxton WA, Choe S, Ceradini D, Martin SR, Horuk R, MacDonald ME, et al (1996) Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell 86:367–377
- Martin MP, Dean M, Smith MW, Winkler C, Gerrard B, Michael NL, Lee B, et al (1998) Genetic acceleration of AIDS progression by a promoter variant of CCR5. Science 282:1907–1911
- Martinson JJ, Chapman NH, Rees DC, Liu YT, Clegg JB (1997) Global distribution of the CCR5 gene 32-basepair deletion. Nat Genet 16:100–103
- Mummidi S, Ahuja SS, Gonzalez E, Anderson SA, Santiago EN, Stephan KT, Craig FE, et al (1998) Genealogy of the CCR5 locus and chemokine system gene variants associated with altered rates of HIV-1 disease progression. Nat Med 4:786
- Paxton WA, Martin SR, Tse D, O'Brien TR, Skurnick J, VanDevanter NL, Padian N, et al (1996) Relative resistance to HIV-1 infection of CD4 lymphocytes from persons who remain uninfected despite multiple high-risk sexual exposure. Nat Med 2:412–417
- Premack BA, Clapham PR (1996) Chemokine receptors: gateways to inflammation and infection. Nat Med 2:1174–1178
- Samson M, Libert F, Doranz BJ, Rucker J, Liesnard C, Farber CM, Saragosti S, et al (1996) Resistance to HIV-1 infection in Caucasian individuals bearing mutant alleles of the CCR-5 chemokine receptor. Nature 382:722–725
- Smith MW, Dean M, Carrington M, Winkler C, Huttley GA, Lomb DA, Goedert JJ, et al (1997) Contrasting genetic influence of CCR2 and CCR5 variants on HIV-1 infection and disease progression. Science 277:959–965
- Stephens JC, Reich DE, Goldstein DB, Shin HD, Smith MW, Carrington M, Winkler C, et al (1998) Dating the origin of the CCR5-Delta32 AIDS-resistance allele by the coalescence of haplotypes. Am J Hum Genet 62:1507–1515
- Su B, Chakraborty R, Jin L, Xiao JH, Lu DR (1998) Frequency of an HIV-resistant allele is exceptionally high in New Guinean Highlanders. JAMA 280:1830
- Winkler C, Modi W, Smith MW, Nelson GW, Wu X, Carrington M, Dean M, et al (1998) Genetic restriction of AIDS pathogenesis by an SDF-1 chemokine gene variant. Science 279:389–393



