Genetics. 2008 Sep;180(1):445–457. doi: 10.1534/genetics.108.090340

Human Endogenous Retrovirus (HERVK9) Structural Polymorphism With Haplotypic HLA-A Allelic Associations

Jerzy K Kulski, Atsuko Shigenari, Takashi Shiina, Masao Ota, Kazuyoshi Hosomichi, Ian James, Hidetoshi Inoko
PMCID: PMC2535695  PMID: 18757922

Abstract

The frequency and HLA-A allelic associations of a HERVK9 DNA structural polymorphism located in close proximity to the highly polymorphic HLA-A gene within the major histocompatibility complex (MHC) genomic region were determined in Japanese, African Americans, and Australian Caucasians to better understand its human population evolutionary history. The HERVK9 insertion or deletion was detected as a 3′ LTR or a solo LTR, respectively, by separate PCR assays. The average insertion frequency of the HERVK9.HG was significantly different (P < 1.083e−6) between the Japanese (0.59) and the African Americans (0.34) or Australian Caucasians (0.37). LD analysis predicted a highly significant (P < 1.0e−5) linkage between the HLA-A and HERVK9 alleles, probably as a result of hitchhiking (linkage). Evolutionary time estimates of the solo, 5′ and 3′ LTR nucleotide sequence divergences suggest that the HERVK9 was inserted 17.3 MYA with the first structural deletion occurring 15.1 MYA. The LTR/HLA-A haplotypes appear to have been formed mostly during the past 3.9 MY. The HERVK9 insertion and deletion, detected by a simple and economical PCR method, is an informative genetic and evolutionary marker for the study of HLA-A haplotype variations, human migration, the origins of contemporary populations, and the possibility of disease associations.


The major histocompatibility complex (MHC) genomic region on human chromosome 6p21.3 is characterized by extensive nucleotide and indel polymorphisms and multicopy gene families, such as the HLA class I, class II, and C4 class III genes (Dawkins et al. 1999; Shiina et al. 2004; Stewart et al. 2004). Many of the MHC genes are involved with the regulation of the immune system against infection and the MHC class I and class II genes have a central role in the immune response via antigen recognition and presentation to T cells (Kulski and Inoko 2003; Prugnolle et al. 2005). The MHC is also a human endogenous retrovirus (HERV)-rich region consisting of at least 12 different HERV family members, including 16 duplicated copies of the LTR16/HERV-16 sequences within the class I region (Kulski et al. 1999) and the HERVK(C4) structural polymorphism (absent or present) within intron 9 of the duplicated complement C4 genes within the class III region (Dangel et al. 1994; Tassabehji et al. 1994; Schneider et al. 2001a). The HERVK(C4) insertion that contributes to the long form of the C4 gene in humans and nonhuman primates (Schneider et al. 2001a) expresses antisense transcripts that may act against exogenous retroviral infections (Schneider et al. 2001b; Mack et al. 2004).

A recent comparative study of the genomic sequences of two different MHC haplotypes has shown that another HERVK sequence is potentially a common structural polymorphism within the MHC class I region where it is present in the PGF cell line with the gene haplotype HLA-A3-B7-Cw7 and deleted from the COX cell line with the gene haplotype HLA-A1-B8-Cw7 (Stewart et al. 2004). This HERVK polymorphic sequence is a member of the HERVK9 (alias HERV-K HML-3) family (Mager and Medstrand 2003; Mayer and Meese 2005) and it is located between the HLA-H and -G genes (HG locus) ∼62.6 kb telomeric of the HLA-A gene (Stewart et al. 2004). The family of HERVK9 endogenous retroviruses has ∼150 full-length copies distributed in the human genome (Mayer and Meese 2005) and it is transcriptionally active in different normal and diseased tissues (Medstrand and Blomberg 1993; Seifarth et al. 2005). The HERVK9 internal sequence is flanked by a 5′ and a 3′ LTR sequence, called the MER9 element (Kulski et al. 1999; Kapitonov et al. 2004), and single copies of the MER9 sequence, a solo LTR, are found more frequently in the genome than the internal proviral sequence (Mager and Medstrand 2003). The deletion of HERV internal sequences from the genome usually generates solitary LTR sequences at the deletion loci as a consequence of homologous recombination between the two LTRs flanking the provirus (Hughes and Coffin 2004). HERVK9 sequences appear to have been first fixed in the genome ∼35 MYA, before the emergence of the rhesus macaque (Mayer and Meese 2005). Because of its presence in apes, the HERVK9.HG structural polymorphism located between HLA-H and -G (the HG locus) is considered to be a deletion polymorphism (Kulski et al. 2004, 2005) rather than an insertion polymorphism (Barbulescu et al. 1999; Turner et al. 2001).

To better understand the role and evolutionary history of the HERVK9.HG structural polymorphism in the MHC, information is needed about the population frequencies and characteristics of the HERVK9.HG structural polymorphism, particularly as none has yet been compiled. The aim of this study was to determine the frequency of the deletion polymorphism of the HERVK9.HG retroviral sequence within the α-block of the MHC class I region and its association with HLA-A alleles in the DNA samples of 100 Japanese, 100 African Americans, 174 Australian Caucasians, and 50 homozygous B-lymphoblastoid cell lines of different ethnic origins.

MATERIALS AND METHODS

DNA samples:

A reference set of 100 Japanese DNA samples genotyped for HLA alleles at the HLA-A, -B, and -DR loci by DNA sequencing was obtained from the Department of Legal Medicine, Shinshu University School of Medicine, Matsumoto, Nagano, Japan. This reference set of DNA samples represents a Japanese population of registered donors from the Nagano region in the Japanese unrelated bone marrow donor registry (Moriyama et al. 2006). A reference set of 174 Australian–Caucasian DNA samples genotyped for HLA alleles at the HLA class I gene loci by DNA sequencing was obtained from The Department of Clinical Immunology and Biochemical Genetics, Royal Perth Hospital, Perth, Western Australia. This reference set of samples represents a predominantly Caucasian (99.6%) population from the seaside town of Busselton in Western Australia (http://www.busseltonhealthstudy.com/). A panel of 100 African–American DNA samples was purchased from Coriell Cell Repositories as Human Variation panel HD100AA (http://ccr.coriell.org/nigms/nigms_cgi/panel.cgi?id=2&query=HD100AA). Another 50 DNA samples, extracted from B-lymphoblastoid cell lines of different ethnic origins and genotyped and/or serotyped for homozygous HLA alleles at the HLA-A, -B and -DR loci, were purchased from the European Collection of Cell Cultures (http://www.ecacc.org.uk/). Additional information about these homozygous cell lines (Table 1) can be obtained at http://www.ebi.ac.uk/imgt/hla/help/cell_help.html. Following HERVK9 PCR and HLA typing (as described below), we renamed the cell lines nos. 32 and 42 in Table 1 from TISI [International Histocompatibility Workshop (IHW) no. 9042] to PMA–TISI and from SSTO (IHW no. 9302) to PMA–SSTO, respectively, because we found that these cell-line DNA products were originally mislabeled. Ethics approval for the use of the human DNA samples in this study was obtained from the Tokai University Ethics Committee as ethics approval no. 07I-38.

TABLE 1.

HERVK9.HG insertion or deletion in 50 cell lines with homozygous HLA-A alleles

HLA gene alleles
HERVK9.HG
No. Cell-line name IHW no. Ethnic origin A* B* DRB1* Insertion Deletion
1 WAL, FD 9129 Caucasoid 3 7 1501 +
2 HO104 9082 French 3 7 1501 +
3 SCHU 9013 French 0301 0702 1501 +
4 EA 9081 Scandinavian 0301 0702 1501 +
5 WT100BIS 9006 Italian 1101 3501 0101 +
6 LBF (LBUF) 9048 Caucasoid 3001 1302 070101 +
7 SPL SPACH 9101 South American Indian 3101 1501 08021 +
8 LWAGS 9079 Ashkenasi Jewish 3301 1402 0102 +
9 WON, PY 9156 Oriental 33 58 0301 +
10 HAU, ML 9157 Oriental 33 58 0301 +
11 YAR 9026 Ashkenasi Jewish 2601 3801 0402 +
12 IBW9 9049 Sardinian 3301 1402 0701 +
13 WATANABE 9126 Oriental 2 46 08032 +
14 SPO010 SPO 9036 Italian 0201 4402 1101 +
15 AWELLS WEL 9090 Australian Caucasian 0201 4402 0401 +
16 EK 9054 Scandinavian 0201 4402 1401 +
17 BM16 9038 Italian 0201 1801 1201 +
18 EJ32B 9085 Australian Caucasian 3002 1801 3 +
19 BSM 9032 Dutch 0201 1501 4 +
20 BOLETH BO 9031 Swedish 0201 1501 0401 +
21 WT9 9061 Italian 0201 1801 1401 +
22 KOSE 9056 German 0201 3503 1302/1401 +
23 BER 9093 German 0201 1302 0701 +
24 E4181324 9011 Australian Caucasian 0101 5201 15021 +
25 DRI, SM 9128 3 7/35 0101/1501 +
26 TAB089 9066 Japanese 0207 4601 08031 +
27 J0528239 9041 Italian 0101 3502 1104 +
28 HAM 013 9178 South African Caucasian 1 8 1201/0301 +
29 VAVY 9023 French 0101 0801 0301 +
30 LO541265 9086 Australian Caucasian 0101 0801 0301 +
31 PF04015 9088 French 0101 0801 0301 +
32 PMA-TISI 1 57 7 +
33 SA 9001 Japanese 2402 0702 01 +
34 HOSONUM 9130 Oriental 24 7 0101 +
35 KUROIWA 9131 Oriental 24 7 0101 +
36 WBD001816 9154 1 17 7 +
37 DBB 9052 Amish 0201 5701 0701 +
38 MOU MANN 9050 Danish 2902 4403 0701 +
39 PLH 9047 Scandinavian 0301 4701 0701 +
40 PGF 9318 English 0301 0702 1501 +
41 APD 9291 1 60 0402 +
42 PMA-SSTO 31 15 8 +
43 QBL 9020 Dutch 2601 1801 0301 +
44 AKIBA 9286 Japanese 2402 5201 1502 +
45 LKT3 9107 Japanese 2402 5401 0405 +
46 HID 9074 Japanese 0201 4001 9/4006 +
47 HAY, KJ 9196 Australian Aborigine 2 15 1301/14 +
48 WON, C 9195 Australian Aborigine 34 40/15 1201/0803 +
49 WON, I 9194 Australian Aborigine 34 40/5601 0803/1405 +
50 COX 9022 South African Caucasian 0101 0801 0301 +

HLA-A genotyping:

The Japanese and Australian–Caucasian DNA samples were previously genotyped for HLA-A alleles to two or four digits by direct sequencing (Moriyama et al. 2006). The African–American DNA samples were genotyped for HLA-A alleles to two or four digits by the PCR–SSOP–Luminex method as previously described (Itoh et al. 2005).

HERVK9 PCR primers:

Two sets of PCR primer pairs were designed for the detection of the HERVK9 deletion and insertion, respectively (Figures 1 and 2). One primer pair (1Se1/3ASe2) was for the detection of the HERVK9 deletion as a solo LTR (MER9) sequence using the sense primer 1Se1 (5′-GTCACCCCCTAGAAGGAGACC-3′) and antisense primer 3ASe2 (5′-CAGAAGACTCAGGATGGAGTCTCC-3′). The other primer pair (3Si2/3ASe2) was for the detection of the HERVK9.HG insertion as a HERVK9-linked-3′ LTR (3′ MER9) sequence using the sense primer 3Si2 (5′-AGATGCAGATCCCGATTCCTGC-3′) and the antisense primer 3ASe2 (5′-CAGAAGACTCAGGATGGAGTCTCC-3′). The PCR primer sets were designed using the MHC genomic sequences that were determined for the COX and PGF cell lines (Stewart et al. 2004). The deletion PCR assay produced an amplified product size of 556 bp, whereas the insertion PCR assay produced an amplified product size of 625 bp. The HERVK9.HG insertion at 6.2 kb is too large to be amplified by the deletion PCR assay.

Figure 1.—

Genomic map of the HLA class I gene and pseudogene cluster from HLA-J to HLA-G within the α-block of the MHC shows the location of the HERVK9.HG deletion and insertion polymorphism in haplotypes HAPL A and HAPL B, respectively. HAPL B shows the HERVK9 insertion and not the orthologous HLA class I genes for convenience. The centromeric end of the sequences is on the left and the telomeric end is on the right of the sequences. The locations of the HLA-90, HLA-75, and HLA-F genes telomeric of HLA-G are not shown. The DNA regions amplified by the PCR reactions shown in Figure 2 are indicated by the double horizontal arrows labeled “PCR-A del” and “PCR-B ins.” The double horizontal arrow at the top indicates the approximate distance between the HERVK9.HG deletion and the HLA-A locus. The genes and genomic regions of the PAC clones 544A6 and 779F20 that were used as positive and negative DNA samples for the PCR reactions to detect the HERVK9.HG polymorphism are indicated by the horizontal lines at the bottom (Shiina et al. 1999). The genomic map is based on the MHC class I genomic sequence reported by Hampe et al. (1999) and submitted to GenBank as accession no. AF055066.

Figure 2.—

PCR detection of the HERVK9.HG insertion or deletion. (A) The PCR strategy used to detect the HERVK9i insertion (top) and the solo LTR (M9) sequence as the remnant of the HERVK9 deletion (bottom). The labeled vertical arrows indicate the relative locations of the PCR primers. The dotted lines indicate the homologous recombination between the 5′ LTR (M9) and the 3′ LTR (M9) involved in the HERVK9.HG deletion and the generation of the solo LTR (M9) sequence. “M9” is an abbreviation for MER9. (B) PCR products with the HERVK9 insertion (625 bp) primers and deletion (556 bp) primers. The homozygous insertion products are in lanes 1, 6, 7, and 9 and the PGF cell line. The homozygous deletion products are in lane 2 and the COX cell line. The heterozygous products are in lanes 3, 4, 5, 8, and 10.

HERVK9 PCR genotyping:

Each PCR assay was performed in 10-μl aliquots using 2 pmol of each primer (200 nmol/liter), 1 ng of genomic DNA, 0.25 units of TaKaRa LA Taq polymerase, 0.8 μl of dNTP mixture (2.5 mM each), and 5 μl of 2× GC reaction buffer I with 5 mM MgCl2 purchased from TaKaRa, Shiga, Japan. The PCR was performed in eight strips of 0.2-ml thin-walled PCR tubes (QSP) using a GeneAmp 9700 thermal cycler (Applied Biosystems, Foster City, CA) programmed for 35 cycles with a denaturation step (96° for 30 sec) and an annealing step (62° for 3 min) at each cycle. The reaction products were separated by horizontal gel electrophoresis in 2% agarose using tris–borate–EDTA running buffer, stained with ethidium bromide, and sized against molecular size markers (Figure 2). Control samples (without DNA template) were run to ensure that there was no amplification of contaminating DNA. Reference control DNA from the COX and PGF cell lines was used to verify the identified polymorphisms.
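The scoring rule implied by the two assays can be sketched as follows. This is a minimal illustration, not the authors' software: the function name `call_genotype` and the genotype labels are hypothetical, and the only inputs taken from the text are the two product sizes (625 bp for the insertion assay, 556 bp for the deletion assay).

```python
from typing import Optional

# Product sizes stated in the text for the two separate PCR assays.
INSERTION_BP = 625  # 3Si2/3ASe2 product (HERVK9-linked 3' LTR)
DELETION_BP = 556   # 1Se1/3ASe2 product (solo LTR)


def call_genotype(insertion_band: bool, deletion_band: bool) -> Optional[str]:
    """Score one sample from the presence/absence of each assay's band.

    Returns "I/I", "I/D", or "D/D" (hypothetical labels), or None when
    neither assay amplified and the sample should be excluded.
    """
    if insertion_band and deletion_band:
        return "I/D"  # heterozygote: both products present
    if insertion_band:
        return "I/I"  # homozygous insertion: only the 625-bp product
    if deletion_band:
        return "D/D"  # homozygous deletion: only the 556-bp product
    return None       # no amplification in either assay


if __name__ == "__main__":
    for bands in [(True, False), (True, True), (False, True), (False, False)]:
        print(bands, "->", call_genotype(*bands))
```

Returning `None` for a double-negative sample keeps failed amplifications out of the frequency counts rather than forcing them into a genotype class.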

Sequencing of LTR (MER9) PCR products:

Homozygous PCR products of HERVK9 deletions (solo LTR) and HERVK9 insertions (3′ LTR and flanking HERVK9 sequence) were amplified from selected cell lines and sequenced directly with BigDye terminator Cycle Sequencing FS ready reaction kit, Version 3.1 (Applied Biosystems) according to the instructions provided by the manufacturer using the sense and antisense PCR primers as sequencing primers. The sequences were analyzed using an automated DNA sequencer (ABI PRISM 3130 DNA sequencer; Applied Biosystems). The MER9 LTR sequenced in this study will appear in the DDBJ/EMBL/GenBank nucleotide sequence databases with the successive accession nos. AB443932–AB443937.

LTR (MER9) sequence data from GenBank:

MER9-LTR DNA sequences for SNP analysis and association with HLA-A alleles were also obtained by extracting the MER9 from genomic sequences that were available within the public DNA database GenBank at NCBI (http://www.ncbi.nlm.nih.gov/). The accession numbers (cell-line name, MHC class I allele) of previously sequenced MER9-LTR within extended genomic sequences that were downloaded for analysis were solo MER9 at the HG locus: BX284699 (SSTO, HLA-A32), CR388220 (DBB, HLA-A2), AL671561 (COX, HLA-A1), AL845454 (QBL, HLA-A26), and BX927141 (MANN, HLA-A29); and for 5′ and 3′ MER9 at the HG locus: CU104658 (GogoA), AL645929 (PGF, HLA-A3), and AC192848 (PatrA).

DNA sequence alignment analysis:

The nucleotide positions of the MER9.HG were first located within the genomic sequence of different accession numbers by using RepeatMasker v3.1.6 (http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker). The MER9 sequences were then manually extracted from the genomic sequences using the BLAST extraction tool at NCBI (http://www.ncbi.nlm.nih.gov/). Multiple alignments of MER9 DNA sequences were examined using the multiple alignment programs provided by the CLC Free Workbench v4 (http://www.clcbio.com/), GeneDoc (http://www.nrbsc.org/gfx/genedoc/index.html), and the CLUSTAL W 1.8 program at DDBJ (http://clustalw.ddbj.nig.ac.jp/top-e.html) with the default settings for “DNA” type. Needle, a Needleman–Wunsch algorithm and part of the EMBOSS Pairwise Alignment Algorithms at EMBL-EBI (http://www.ebi.ac.uk/emboss/align/index.html), was also used to calculate the percentage similarity and to identify the SNP and gap positions between the two MER9 DNA sequences as required.

Divergence date estimations:

The percentage divergence between pairs of LTR (MER9) sequences was calculated by counting the nucleotide differences and converting the count to a percentage of the aligned length, excluding regions containing deletions (gaps). Corrections were made to account for the presence of multiple mutations at the same site, back mutations, and convergent substitutions using the Kimura two-parameter model (Kimura 1980) and the computation algorithm provided as part of the CLUSTALW analysis at DDBJ (http://clustalw.ddbj.nig.ac.jp/top-e.html). Mutation rates of homologous sequence pairs for each solo, 5′, or 3′ LTR element at the HG locus were compared to estimate the duplication times. Because LTRs are identical at the time of retroelement integration, divergence distances were calculated between the 5′ and 3′ LTR of the same element at the HG locus to estimate the time of integration. The solo LTR sequences were compared with the 5′ and 3′ LTR sequences to estimate their time of deletion. The divergence dates were estimated on the basis that the divergence between pairs of LTR sequences within the primate lineage was on average 10% for synonymous sites with a divergence date of 28 MYA for human and Old World monkeys (Purvis 1995; Goodman et al. 1998; Takahata 2001). The divergence date of 28 MYA for the human and Old World monkeys corresponds to an average nucleotide substitution rate of 3.6 × 10⁻⁹ substitutions/site for each year (Tristem 2000; Hughes and Coffin 2004). These time estimations do not necessarily represent exact dates, but provide relative approximations.
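The dating procedure described above can be illustrated with a short sketch. It assumes two pre-aligned sequences of equal length, skips gapped or ambiguous columns, applies the standard Kimura two-parameter formula, and converts distance to time by dividing directly by the stated rate, which is the conversion under which 10% divergence corresponds to about 28 MY. The function names are hypothetical, and whether the authors scaled the pairwise distance differently is not stated in the text.

```python
import math

RATE_PER_SITE_PER_YEAR = 3.6e-9  # substitution rate quoted in the text
PURINES = {"A", "G"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    Columns containing a gap or a non-ACGT symbol in either sequence
    are excluded, mirroring the exclusion of gapped regions.
    """
    transitions = transversions = sites = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites in the alignment")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def divergence_time_my(distance: float) -> float:
    """Convert a distance (substitutions/site) to million years (assumed direct division)."""
    return distance / RATE_PER_SITE_PER_YEAR / 1e6


if __name__ == "__main__":
    print(round(divergence_time_my(0.10), 1))  # calibration check: about 28 MY
```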

Crossing-over percentage:

The crossing-over percentage (CO%) was calculated as a percentage ratio of the lowest number of HERVK9 insertions or deletions divided by the total number of HERVK9 alleles that were associated with a particular HLA-A allele.
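As a worked illustration of this definition, the sketch below computes CO% from the numbers of HERVK9 insertions and deletions observed with one HLA-A allele. The function name is hypothetical and the second pair of counts in the usage lines is invented for illustration only.

```python
def crossing_over_percent(insertions: int, deletions: int) -> float:
    """CO% = 100 * (smaller of the two counts) / (total HERVK9 alleles)."""
    total = insertions + deletions
    if total == 0:
        raise ValueError("no HERVK9 alleles observed for this HLA-A allele")
    return 100.0 * min(insertions, deletions) / total


if __name__ == "__main__":
    # An HLA-A allele seen only with the deletion gives CO% = 0.
    print(crossing_over_percent(0, 71))
    # Hypothetical counts: 2 insertions among 100 alleles gives CO% = 2.
    print(crossing_over_percent(2, 98))
```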

Statistical analyses:

Allele frequencies (AF) were calculated using the formula: AF equals the sum of each individual allele/2N, where N equals the total number of individuals. The test for deviation of the HERVK9 deletion from Hardy–Weinberg equilibrium (HWE) and the allele-frequency difference among Japanese, African Americans, and Australian Caucasians was performed using the web computer program at http://ihg2.helmholtz-muenchen.de/cgi-bin/hw/hwa1.pl (Sasieni 1997) and Genepop software at http://genepop.curtin.edu.au. Heterozygosity (H) was estimated as 1 − (p² + q²), where p and q are the allele frequencies (Ott 1992). The number and percentage of the HERVK9 insertions or deletions associated with each HLA-A allele [configured as a two-digit A allele, such as HLA-A01 (or A1), HLA-A02 (or A2), and HLA-A11 (or A11)] were manually counted and converted to a percentage of the total number of each particular HLA-A allele. Fisher's exact test was used to assess whether a HERVK9 insertion or deletion was associated with particular HLA-A alleles. The statistical methods used to calculate the significance between the HLA-A alleles and the structural polymorphism were corrected for multiple comparisons using the Bonferroni-corrected P-value (pc).
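The allele-frequency, heterozygosity, and HWE expected-count formulas can be checked with a few lines of code. The sketch below is illustrative only; the function name is hypothetical and the example counts are the Japanese genotype counts reported in Table 2 (32 insertion homozygotes, 51 heterozygotes, 16 deletion homozygotes). It does not reproduce the exact test used for the HWE P-values.

```python
from typing import Tuple


def hwe_summary(n_ins_hom: int, n_het: int, n_del_hom: int) -> Tuple[float, Tuple[float, float, float], float]:
    """Return (insertion allele frequency, expected genotype counts, heterozygosity)."""
    n = n_ins_hom + n_het + n_del_hom       # number of individuals, N
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_ins_hom + n_het) / (2 * n)   # AF = allele count / 2N
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    heterozygosity = 1.0 - (p * p + q * q)  # H = 1 - (p^2 + q^2)
    return p, expected, heterozygosity


if __name__ == "__main__":
    p, expected, h = hwe_summary(32, 51, 16)
    print(round(p, 3), [round(x, 2) for x in expected], round(h, 3))
```

With these counts the expected numbers come out at about 33.40, 48.21, and 17.40, matching the "No. expected" column; the unrounded insertion frequency is roughly 0.58, close to the reported 0.59.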

Analyses were carried out in SPlus 7.0 (Insightful, Seattle). The linkage disequilibrium (LD) value D′ was calculated as a pairwise analysis of the association between HLA-A alleles and HERVK9 alleles using the Haploview v4 software (Barrett et al. 2005) downloaded from http://www.broad.mit.edu/mpg/haploview/.
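For orientation, the pairwise D′ statistic for one HLA-A allele against one HERVK9 allele can be sketched as below. This is the standard two-allele definition (Lewontin's D′), offered as an assumption about the building block only; it is not Haploview's code, and the multi-allelic D′ that Haploview reports combines such pairwise values. The function name and the example frequencies are hypothetical.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Absolute Lewontin D' for alleles A and B.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when an allele is fixed or absent")
    return abs(d) / d_max


if __name__ == "__main__":
    # Hypothetical case: an HLA-A allele (frequency 0.18) that always occurs
    # with the deletion (frequency 0.63) gives complete association.
    print(round(d_prime(0.18, 0.18, 0.63), 3))
```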

RESULTS AND DISCUSSION

Location of HERVK9.HG polymorphism:

Figure 1 shows a genomic map of the HERVK9.HG deletion and insertion polymorphism located between the HLA-H and -G genes within the MHC class I region of two different haplotypes, HAPL A and HAPL B, respectively. The location of the HERVK9.HG polymorphism is 62.6 kb upstream of the classical HLA-A gene, 5.4 kb upstream of the HLA-H pseudogene, and 44.9 kb downstream of the nonclassical HLA-G gene. The HERVK9.HG internal sequence in HAPL B is flanked by the 5′ and 3′ LTR-MER9 sequences, whereas HAPL A has the solitary LTR-MER9.HG sequence as a remnant of the HERVK9.HG internal sequence deletion. The deletion appears to have occurred ancestrally as a homologous recombination involving the 5′ and 3′ LTR sequences of the HERVK9, leaving behind a solitary copy of the LTR, designated here as solo LTR.HG or sLTR (Figure 2).

PCR detection of the HERVK9.HG insertion and deletion polymorphism:

Because the insertion product size of the internal HERVK9 sequence (HERVK9i) is ∼5.15 kb in length and beyond the amplification efficiency of normal PCR protocols, we developed two separate PCR assays to detect (1) the insertion by amplifying a fragment of the HERVK9i sequence using an HERVK9 internal and external primer set (3Si2 and 3ASe2) and (2) the deletion by amplifying the solitary LTR-MER9.HG sequence using primer sets (1Se1 and 3ASe2) that flank the LTR-MER9 sequence. Figure 2 shows an example of the results of the electrophoresis of the PCR products obtained for Japanese samples and the COX and PGF control DNA. The specificity of the PCR assays was confirmed by using the PAC clones PAC 544A6 that has the duplicated LTR-MER9.AK locus (no amplification) and PAC 779F20 that has the LTR-MER9.HG locus (amplification). The amplification bands were easily visualized, genotyped, and scored as deletion and insertion homozygotes or heterozygotes by employing these two primer sets.

The frequencies and the results of the HWE test for the HERVK9 insertion and deletion genotypes in the MHC class I region of 99 Japanese, 97 African–American, and 174 Australian–Caucasian DNA samples are shown in Table 2. We were unable to amplify either deletion or insertion PCR products in 1 of the 100 Japanese samples and in 3 of the 100 African–American samples, possibly because of the poor quality of the template DNA or because of nucleotide mutations at the primer binding sites within the template DNA that prevented hybridization between the primer(s) and the DNA template. These negative samples were excluded from our frequency analysis of the HERVK9.HG polymorphisms.

TABLE 2.

Frequency and HWE test for the HERVK9 insertion in the MHC class I region of Japanese, African–American, and Australian–Caucasian DNA samples

HERVK9 genotypes No. observed No. expected HWE test P exact Insertion frequency (±SD)
Japanese (n = 99)
Homozygote insertion 32 33.40 0.681
Heterozygote 51 48.21
Homozygote deletion 16 17.40
Insertion frequency (±SD) 0.59 ± 0.034
African American (n = 97)
Homozygote insertion 13 11.23 0.496
Heterozygote 40 43.55
Homozygote deletion 44 42.23
Insertion frequency (±SD) 0.34 ± 0.035
Australian Caucasian (n = 174)
Homozygote insertion 27 23.54 0.257
Heterozygote 74 80.92
Homozygote deletion 73 69.54
Insertion frequency (±SD) 0.37 ± 0.027

The HERVK9 insertion frequency was 0.59 for Japanese, 0.34 for African Americans, and 0.37 for Australian Caucasians with no deviation from HWE (P > 0.05), confirming the reliability of the genotyping method and that the HERVK9 dimorphism is distributed normally in the investigated populations. However, the difference between the Japanese and African Americans or Australian Caucasians in allele frequency for the HERVK9 insertion or deletion was statistically significant (P = 1.083e−06, Pearson's goodness-of-fit chi square, d.f. = 1, χ² = 23.77). The heterozygosity value for the HERVK9 allele frequencies was estimated to be 0.4838 for the Japanese, 0.4488 for the African Americans, and 0.4662 for the Australian Caucasians.

HLA-A allelic polymorphisms in Japanese, African Americans, and Australian Caucasians:

Table 3 shows the HLA-A allele types, numbers, and percentage of the total number of HLA-A alleles in the 100 Japanese, 100 African–American, and 174 Australian–Caucasian DNA samples. HLA-A genotyping by the PCR–SSOP–Luminex method was unsuccessful for eight African–American DNA samples and the HERVK9 PCR failed in three of these samples. Overall, the number of different HLA-A alleles detected was 8 in Japanese with 25 homozygous samples, 16 in Australian Caucasians, and 20 in African Americans. HLA-A homozygosity was more frequent in Japanese (25% of 100 samples) than in the Australian Caucasians (13.8% of 174 samples) or African Americans (13% of 92 samples). The percentage frequency of HLA-A allele distribution for the 100 Japanese samples was fairly typical of previous findings for the Japanese population (Moriyama et al. 2006) with HLA-A24 (38.5%), -A2 (25%), -A11 (12%), -A31 (10.5%), and -A26 (10%) being the five most common alleles. In comparison, the five most common alleles for the African Americans were HLA-A2 (16.5%), -A30 (14%), -A23 (7.5%), -A33 (7%), and -A68 (6%), similar to the findings of a different study (Tu et al. 2007). The African Americans had one or other of the Japanese HLA-A alleles, except for HLA-A20, whereas the Japanese had 7 of the 18 (38.9%) African–American HLA-A alleles. The Australian Caucasians had HLA-A allele frequencies similar to those of Caucasians from Europe and North America (Middleton et al. 2007; http://www.allelefrequencies.net).

TABLE 3.

HLA-A alleles in African Americans, Japanese, and Australian Caucasians

HLA-A alleles African–American total no. (%) Japanese total no. (%) Australian–Caucasian total no. (%)
A1 8 (4) 1 (0.5) 62 (17.8)
A2 33 (16.5) 50 (25) 86 (24.7)
A3 10 (5) 43 (12.4)
A11 2 (1) 24 (12) 29 (8.3)
A20 1 (0.5)
A23 15 (7.5) 1 (0.3)
A24 9 (4.5) 77 (38.5) 36 (10.3)
A25 2 (0.6)
A26 2 (1) 19 (10) 9 (2.5)
A29 11 (5.5) 21 (6)
A30 28 (14) 11 (3.2)
A31 5 (2.5) 21 (10.5) 14 (4)
A32 6 (3) 11 (3.2)
A33 14 (7) 6 (3) 3 (0.9)
A34 9 (4.5)
A36 3 (1.5)
A43 1 (0.5)
A66 4 (2)
A68 12 (6) 17 (4.9)
A69 1 (0.3)
A74 10 (5)
A77 1 (0.5)
A80 1 (0.5)
Unknown 16 (8) 2 (0.6)
Total 200 (100) 200 (100) 348 (100)

HERVK9 and HLA-A haplotypes in homozygous cell lines:

To test the reliability of the HERVK9.HG PCR assay and examine the linkage between the HLA-A and HERVK9 alleles, we first analyzed a DNA reference set of 50 B-lymphoblastoid cell lines of various ethnic origins that were homozygous for different HLA-A alleles. Table 1 lists the individual homozygous cell lines by name, ID, ethnicity, HLA-A alleles and HLA-B alleles, and the HERVK9 insertion and deletion PCR results. Essentially there was a significant (P < 0.05, two-tail binomial probability test) 100% linkage between the HERVK9 insertion and HLA-A3 (seven samples) or HLA-A24 (five samples) and a 100% linkage between the HERVK9 deletion and HLA-A1 (10 samples) or HLA-A2 (14 samples). The sample numbers for the other HERVK9 and HLA-A linkages were too few to assess statistically.

On the basis of discrepancies in the expected HERVK9 PCR results, we found that two of our commercially purchased cell-line DNA samples, PMA–TISI and PMA–SSTO (cell-line nos. 32 and 42 in Table 1), were mislabeled. The cell-line no. 42, originally named SSTO, had the HERVK9 insertion instead of the expected HERVK9 deletion as represented by the SSTO DNA sequence with the GenBank accession no. BX284699. The cell-line SSTO was therefore retyped for HLA alleles and found to have HLA-A31, -B15, and -DRB1*08 instead of the previously reported HLA-A32, -B4402, and -DRB1*0403 alleles that were originally linked with SSTO. We therefore renamed this previously mislabeled cell-line DNA sample as PMA–SSTO, although its retyped alleles suggest that it is similar to the cell-line DNA sample no. 7, named SPL. Similarly, we found that the PMA–TISI DNA sample reference no. 32 (originally named TISI) is composed of the allele combination HLA-A1, -B57, and -DRB1*7 and the HERVK9 deletion instead of the expected HLA-A2402, -B3508, and -DRB1*1103 and the HERVK9 insertion. These two examples highlight the potential value of using the HERVK9 PCR for detecting mislabeled or contaminated HLA DNA samples.

HERVK9 and HLA-A allelic associations in Japanese, African Americans, and Australian Caucasians:

Table 4 provides a summary of the statistical significance (corrected and uncorrected for multiple comparisons) of association between the HERVK9 insertion or deletion with particular HLA-A alleles in the Japanese, African–American, and Australian–Caucasian DNA samples as determined by the Fisher's exact test for association or for equality of proportions of HERVK9 insertion or deletion homozygotes with HLA-A type. The HLA-A3, -A11, -A23, and -A24 alleles were associated significantly (P < 0.05) with the HERVK9 insertion, whereas HLA-A1 and -A2 were associated significantly (P < 0.05) with the HERVK9 deletion in one or the other of the three populations.

TABLE 4.

Association of homozygous HERVK9 insertion or deletion pairs with HLA-A alleles in Japanese, African Americans, and Australian Caucasians

HERVK9 homozygous pairs
HERVK9 heterozygous no. Fisher's exact tests
HLA alleles Insertion no. Deletion no. Homozygous insertion vs. deletion P-value two-tailed (pc)a Overall HERVK9 genotype P-value two-tailed (pc)a
A1 J 0 1 0 0.33 (1.0) 0.16 (1.0)
AA 0 7 1 0.17 (1.0) 0.04 (0.74)
AC 0 37 17 3.5e-7 (1.0e-5) 2.2e-7 (7.3e-6)
A2 J 1 11 31 2.0e-6 (5.8e-5) 7.2e-9 (2.4e-7)
AA 0 19 10 0.002 (0.056) 0.0019 (0.061)
AC 0 43 33 2.0e-8 (5.8e-7) 4.5e-8 (1.5e-6)
A3 AA 3 1 6 0.04 (0.69) 0.026 (0.58)
AC 16 1 24 1.2e-10 (3.5e-9) 1.4e-11 (4.6e-10)
A11 J 6 6 9 0.18 (1.0) 0.22 (1.0)
AA 0 2 0 1.0 (1.0) 0.63 (1.0)
AC 11 0 17 8.2e-8 (2.4e-6) 1.6e-8 (5.3e-7)
A23 AA 6 0 7 0.0001 (0.003) 4.0e-5 (0.0013)
AC 1 0 0 0.27 (1.0) 0.16 (1.0)
A24 J 27 0 36 9.0e-9 (2.6e-7) 5.1e-9 (1.7e-7)
AA 2 0 6 0.06 (0.83) 0.015 (0.39)
AC 16 0 18 8.2e-12 (2.4e-10) 4.0e-12 (1.3e-10)
A25 AC 0 1 1 1.0 (1.0) 1.0 (1.0)
A26 J 1 5 11 0.012 (0.30) 0.014 (0.37)
AA 1 1 0 0.43 (1.0) 0.26 (1.0)
AC 0 4 5 0.57 (1.0) 0.53 (1.0)
A29 AA 0 8 3 0.18 (1.0) 0.11 (0.98)
AC 0 14 7 0.019 (0.43) 0.024 (0.55)
A30 AA 1 12 13 0.15 (0.99) 0.21 (1.0)
AC 0 8 3 0.10 (0.95) 0.096 (0.96)
A31 J 11 0 9 0.009 (0.23) 0.012 (0.33)
AA 0 0 5 0.031 (0.65)
AC 3 0 11 0.018 (0.41) 0.0007 (0.023)
A32 AA 0 3 2 0.57 (1.0) 0.85 (1.0)
AC 0 8 3 0.10 (0.95) 0.096 (0.96)
A33 J 4 0 2 0.29 (1.0) 0.24 (1.0)
AA 3 0 11 0.012 (0.30) 0.0004 (0.013)
AC 2 0 1 0.070 (0.88) 0.033 (0.67)
A34 AA 2 4 2 0.63 (1.0) 0.36 (1.0)
A36 AA 2 0 1 0.06 (0.83) 0.027 (0.59)
A43 AA 0 1 0 1.0 (1.0) 1.0 (1.0)
A66 AA 0 1 3 1.0 (1.0) 0.38 (1.0)
A68 AA 1 9 1 0.42 (1.0) 0.019 (0.47)
AC 0 11 5 0.034 (0.63) 0.047 (0.78)
A69 AC 0 0 1 0.57 (1.0)
A74 AA 0 5 5 0.32 (1.0) 0.48 (1.0)
A77 AA 0 1 0 1.0 (1.0) 1.0 (1.0)
A80 AA 0 0 1 0.57 (1.0)
Total J 32 16 51
AA 13 40 39
AC 27 74 73

J, Japanese; AA, African American; AC, Australian Caucasian.

a pc, Bonferroni-corrected P-value.
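A brief numerical note on the corrected values may help readers who wish to recompute them. The sketch below contrasts the simple Bonferroni product, min(1, nP), with the related form 1 − (1 − P)^n. The tabulated pc values (for example, P = 0.04 giving 0.69 in one column and 0.74 in the other) appear to be closer to the second form with n of about 29 and 33, respectively. Those test counts and the choice of formula are inferences from the printed numbers, not details stated by the authors, and the function names are hypothetical.

```python
def bonferroni(p: float, n_tests: int) -> float:
    """Simple Bonferroni correction: multiply by the number of tests, cap at 1."""
    return min(1.0, p * n_tests)


def sidak_style(p: float, n_tests: int) -> float:
    """Probability of at least one P-value this small among n independent tests."""
    return 1.0 - (1.0 - p) ** n_tests


if __name__ == "__main__":
    # n = 29 and n = 33 are assumed values inferred from Table 4.
    for p, n in [(0.04, 29), (0.04, 33), (0.002, 29), (3.5e-7, 29)]:
        print(p, n, round(bonferroni(p, n), 4), round(sidak_style(p, n), 6))
```

For very small P-values the two forms are practically identical, so the distinction only matters for the weaker associations in the table.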

Homozygous HERVK9 deletions were detected in 16 Japanese represented by four different HLA-A alleles: one HLA-A1, 11 HLA-A2, six HLA-A11, and five HLA-A26. Homozygous HERVK9 insertions were found in 32 Japanese individuals represented by six different HLA-A alleles: one HLA-A2, six HLA-A11, 27 HLA-A24, one HLA-A26, 11 HLA-A31, and four HLA-A33. All 13 individuals homozygous for the HLA-A24 allele and all 51 heterozygous individuals with an HLA-A24 allele had the HERVK9 internal insertion. The African–American and Australian–Caucasian association analysis matched the Japanese trends to a large degree, but also showed that HLA-A23 and -A33 were associated significantly (P < 0.05) with the HERVK9 insertion in African Americans while HLA-A1 was associated significantly (P < 0.05) with the HERVK9 deletion in Australian Caucasians. The HERVK9 and HLA-A allelic associations observed for the three populations (Table 4) are consistent with the results of the homozygous HLA-A cell line study (Table 1).

LD and crossing-over analysis of HERVK9/HLA-A haplotypes:

LD analysis using the GenPop software revealed a highly significant (P < 1.0e−5) linkage between the HLA-A and the HERVK9 alleles in the Japanese, African Americans, and Australian Caucasians. The multi-allelic D′ estimation by the Haploview software was 0.87 for the Japanese, 0.86 for the African Americans, and 0.96 for the Australian Caucasians. Table 5 shows the haplotypes, haplotype frequency, D′ values, and crossing-over percentage between the two loci in the pairwise LD analysis of the HERVK9 and the HLA-A haplotypes in the Japanese, African Americans, Australian Caucasians, and a combination of all three populations. There were 33 HLA-A/HERVK9 haplotypes for all three populations with 25 haplotypes for the African Americans, 19 for the Australian Caucasians, and 12 for the Japanese. The two highest frequencies for all three populations were haplotype 2 (HLA-A2/HERVK9 deletion) at 0.2294 and haplotype 11 (HLA-A24/HERVK9 insertion) at 0.1648. The highest haplotype frequency was haplotype 2 for African Americans and Australian Caucasians and haplotype 11 for Japanese. The results of the LD analysis (Table 5) support the association results shown in Table 4. Taken together, the HERVK9 deletions in the cell lines (Table 1) and the Japanese, African–American, and Australian–Caucasian samples were strongly haplotypic for HLA-A1, -A2, -A26, and -A68. The homozygote HERVK9 insertions were strongly haplotypic for HLA-A3, -A23, -A24, -A31, and -A33.
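As a hedged illustration of how a pairwise D′ value of the kind listed in Table 5 can be obtained, the following minimal Python sketch applies the standard Lewontin normalization to a single HLA-A allele and a single HERVK9 state. It is not the Haploview implementation. The function name `d_prime` is hypothetical, and the counts in the usage lines are an assumption: they were obtained by summing the all-populations columns of Table 5.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute Lewontin D' for one pair of alleles at two loci.

    p_ab: frequency of the haplotype carrying both alleles
    p_a:  frequency of the allele at the first locus (e.g. HLA-A2)
    p_b:  frequency of the allele at the second locus (e.g. HERVK9 deletion)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # one of the alleles is fixed or absent
    return abs(d) / d_max


# Illustrative counts (assumed from Table 5, all three populations combined).
total = 728        # chromosomes in the combined sample
a2_del = 167       # HLA-A2 with the HERVK9 deletion (haplotype 2)
a2_all = 167 + 2   # all HLA-A2 haplotypes (haplotypes 2 and 3)
del_all = 421      # all deletion haplotypes, summed over the D rows

value = d_prime(a2_del / total, a2_all / total, del_all / total)
print(round(value, 3))
```

With these counts the sketch returns approximately 0.972, in line with the D′ listed for the HLA-A2 haplotypes in the combined sample. The absolute value is used because Table 5 lists the same D′ for the insertion and deletion haplotypes of a given HLA-A allele.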

TABLE 5.

Haplotypes, haplotype frequency, D′ values and CO% for the pairwise LD analysis of HERVK9 insertion or deletion and HLA-A alleles in Japanese, African Americans, Australian Caucasians, and the combination of all three populations

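Column groups: All, all three populations; AA, African American; AC, Australian Caucasian; J, Japanese.

| Haplotype | HLA-A | HERVK9.HG | All n | All frequency | All D′ | All CO% | AA n | AA frequency | AA D′ | AA CO% | AC n | AC frequency | AC D′ | AC CO% | J n | J frequency | J D′ | J CO% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | D | 71 | 0.0975 | 1 | 0 | 8 | 0.0435 | 1 | 0 | 62 | 0.1792 | 1 | 0 | 1 | 0.0051 | 1 | 0 |
| 2 | 2 | D | 167 | 0.2294 | 0.972 | | 33 | 0.1793 | 1 | 0 | 85 | 0.2457 | 0.968 | | 49 | 0.2475 | 0.966 | |
| 3 | 2 | I | 2 | 0.0027 | 0.972 | 1.18 | | | | | 1 | 0.0029 | 0.968 | 1.16 | 1 | 0.0051 | 0.966 | 2 |
| 4 | 3 | D | 2 | 0.0027 | 0.935 | 3.77 | 1 | 0.0054 | 0.845 | 10 | 1 | 0.0029 | 0.963 | 2.38 | | | | |
| 5 | 3 | I | 51 | 0.0701 | 0.935 | | 9 | 0.0489 | 0.845 | | 42 | 0.1214 | 0.963 | | | | | |
| 6 | 11 | D | 16 | 0.0220 | 0.497 | 29.09 | 2 | 0.0109 | 1 | | | | | | 14 | 0.0707 | 0.283 | |
| 7 | 11 | I | 39 | 0.0536 | 0.497 | | | | | | 29 | 0.0838 | 1 | 0 | 10 | 0.0505 | 0.283 | 41.7 |
| 8 | 20 | I | 1 | 0.0014 | 1 | | | | | | | | | | 1 | 0.0051 | 1 | |
| 9 | 23 | I | 16 | 0.0220 | 1 | 0 | 15 | 0.0815 | 1 | 0 | 1 | 0.0029 | 1 | | | | | |
| 10 | 24 | D | 2 | 0.0027 | 0.972 | 1.67 | | | | | | | | | 2 | 0.0101 | 0.942 | |
| 11 | 24 | I | 120 | 0.1648 | 0.972 | | 9 | 0.0489 | 1 | 0 | 36 | 0.1040 | 1 | 0 | 75 | 0.3788 | 0.942 | 2.6 |
| 12 | 25 | D | 2 | 0.0027 | 1 | | | | | | 2 | 0.0058 | 1 | | | | | |
| 13 | 26 | D | 27 | 0.0371 | 0.836 | | 1 | 0.0054 | 0.227 | | 9 | 0.0260 | 1 | 0 | 17 | 0.0859 | 0.904 | |
| 14 | 26 | I | 2 | 0.0027 | 0.836 | 6.9 | 1 | 0.0054 | 0.227 | | | | | | 1 | 0.0051 | 0.904 | 5.56 |
| 15 | 29 | D | 30 | 0.0412 | 0.852 | | 11 | 0.0598 | 1 | 0 | 19 | 0.0549 | 0.741 | | | | | |
| 16 | 29 | I | 2 | 0.0027 | 0.852 | 6.25 | | | | | 2 | 0.0058 | 0.741 | 9.52 | | | | |
| 17 | 30 | D | 37 | 0.0508 | 0.878 | | 26 | 0.1413 | 0.798 | | 11 | 0.0318 | 1 | 0 | | | | |
| 18 | 30 | I | 2 | 0.0027 | 0.878 | 5.13 | 2 | 0.0109 | 0.798 | | | | | | | | | |
| 19 | 31 | I | 38 | 0.0522 | 0.878 | | 5 | 0.0272 | 1 | | 12 | 0.0347 | 0.774 | | 21 | 0.1061 | 1 | 0 |
| 20 | 31 | D | 2 | 0.0027 | 0.914 | | | | | | 2 | 0.0058 | 0.774 | 14.28 | | | | |
| 21 | 32 | D | 17 | 0.0234 | 1 | 0 | 6 | 0.0326 | 1 | | 11 | 0.0318 | 1 | 0 | | | | |
| 22 | 33 | I | 23 | 0.0316 | 1 | 0 | 14 | 0.0761 | 1 | 0 | 3 | 0.0087 | 1 | 0 | 6 | 0.0303 | 1 | 0 |
| 23 | 34 | D | 5 | 0.0069 | 0.039 | | 5 | 0.0272 | 0.141 | | | | | | | | | |
| 24 | 34 | I | 4 | 0.0055 | 0.039 | 44.44 | 4 | 0.0217 | 0.141 | 44.44 | | | | | | | | |
| 25 | 36 | I | 3 | 0.0041 | 1 | | 3 | 0.0163 | 1 | | | | | | | | | |
| 26 | 43 | D | 1 | 0.0014 | 1 | | 1 | 0.0054 | 1 | | | | | | | | | |
| 27 | 66 | D | 3 | 0.0041 | 0.407 | | 3 | 0.0163 | 0.292 | | | | | | | | | |
| 28 | 66 | I | 1 | 0.0014 | 0.407 | 25 | 1 | 0.0054 | 0.292 | 25 | | | | | | | | |
| 29 | 68 | D | 27 | 0.0371 | 0.836 | | 10 | 0.0543 | 0.528 | | 17 | 0.0491 | 1 | 0 | | | | |
| 30 | 68 | I | 2 | 0.0027 | 0.836 | 6.9 | 2 | 0.0109 | 0.528 | 16.67 | | | | | | | | |
| 31 | 69 | I | 1 | 0.0014 | 1 | | | | | | 1 | 0.0029 | 1 | | | | | |
| 32 | 74 | D | 11 | 0.0151 | 1 | 0 | 11 | 0.0598 | 1 | | | | | | | | | |
| 33 | 80 | D | 1 | 0.0014 | 1 | | 1 | 0.0054 | 1 | | | | | | | | | |
| Multi-allelic D′ | | | | | 0.9 | | | | 0.86 | | | | 0.96 | | | | 0.87 | |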

I, insertion; D, deletion.

The crossing-over analysis of haplotypes revealed a degree of variability in the crossing-over event between the HLA-A and the HERVK9 alleles. The common HLA-A alleles, such as HLA-A1, HLA-A2, and HLA-A24, had low crossing-over percentages (CO% <1.7%) with the HERVK9 alleles, suggesting that haplotypes 1, 2, and 11 are well established or fixed in the three populations. Most CO% were <7%, suggesting a relatively low level of crossing over between the loci, but some HLA-A alleles showed a high CO% such as 44.4% for HLA-A34, 29% for HLA-A11, and 25% for HLA-A66.
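The CO% column of Table 5 is not defined in this section. Most of its entries are, however, consistent with the count of the rarer HERVK9 state for a given HLA-A allele divided by the total count for that allele. The sketch below works through that reading as an assumption, not as the authors' stated formula; `crossover_percent` is a hypothetical helper name.

```python
def crossover_percent(n_major, n_minor):
    """Percentage of an HLA-A allele's haplotypes that carry the rarer HERVK9 state.

    n_major: count of the common haplotype (e.g. HLA-A2 with deletion)
    n_minor: count of the rare haplotype (e.g. HLA-A2 with insertion)
    """
    total = n_major + n_minor
    if total == 0:
        raise ValueError("no haplotypes observed for this allele")
    return 100.0 * n_minor / total


# Counts from the all-populations columns of Table 5.
print(round(crossover_percent(39, 16), 2))   # HLA-A11: 39 insertion, 16 deletion
print(round(crossover_percent(5, 4), 2))     # HLA-A34: 5 deletion, 4 insertion
print(round(crossover_percent(167, 2), 2))   # HLA-A2: 167 deletion, 2 insertion
```

Under this assumption the three calls give 29.09, 44.44, and 1.18, matching the corresponding table entries. A few entries, such as the 1.67 listed for HLA-A24, differ slightly from this calculation, so the exact denominator used in the table may not be the same in every row.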

The HLA-A2 allele is distributed widely in humans and found in most ethnic groups of Caucasian, Asian, and African origin, suggesting that it is one of the oldest HLA-A alleles (Tanaka et al. 1997; Tu et al. 2007). There was only one HLA-A2 sample that was associated with the Japanese or Australian–Caucasian homozygous HERVK9 insertion rather than with the more common haplotypic association between HLA-A2 and the HERVK9 deletion. This exception may be the result of a historical recombination event at a genomic site between the HLA-A and HERVK9 loci, which are ∼62.6 kb apart. The HLA-A68 allele in the African Americans and Australian Caucasians is closely related to the HLA-A2 alleles in sequence and phylogenetic analysis (data not shown), but is less strongly associated than HLA-A2 with the HERVK9 deletion, possibly as a result of the smaller sample numbers. Since the HLA-A68 frequency was relatively similar in the African Americans and Australian Caucasians, but the allele was absent in the Japanese, this allele might have evolved more rapidly in Africans and Caucasians than in Japanese as a consequence of more frequent migrations and interbreeding events.

HLA-A30 was the second most frequent allele (14%) after HLA-A2 (16.5%) in the African Americans in our study. Some HLA-A30 alleles were linked to a HERVK9 insertion as in the LBF cell line (HLA-A3001-Cw6-B13-DR7-DQ2) and others to a HERVK9 deletion as in the cell line EJ32B (HLA-A3002-Cw5-B18-DR3-DQ2). In this regard, the different associations and/or linkages between HLA-A30 and HERVK9.HG support previous reports that the HLA-A30 allelic group has evolved into at least two major subgroups of two different ancestral haplotypes (Bodmer et al. 1997; Tanaka et al. 1997; Kulski et al. 2001).

The HLA-A11 allele was associated almost equally with the HERVK9 insertion and deletion alleles in the Japanese, defining at least two distinct and relatively frequent HLA-A11 haplotypes. In contrast, HLA-A11 was associated only with the HERVK9 insertion in the Australian Caucasians. A similar variability between the HLA-A11 allele and a polymorphic retroelement was found previously between the HLA-A11 alleles and the AluyHG polymorphic marker in the Japanese (Kulski et al. 2001). The associations between HLA-A11 and HERVK9.HG might vary in the Japanese and not in the Australian Caucasians because (1) the original HERVK9.HG insertion that was previously linked to HLA-A11 was deleted from one or more founder individuals, (2) the original linkage between the HLA-A11 and HERVK9.HG deletion or insertion was lost due to recombinational crossing over in one or more founder individuals, or (3) HLA-A11 was originally linked with the HERVK9.HG deletion, but then a new HERVK9 insertion occurred at exactly the same location by recombination or a new retrotransposition event. While recombinational crossing over is the most likely explanation for the association of a HERVK9 insertion or deletion with a particular HLA-A allele, more extensive population and family studies are needed to support this hypothesis using other haplotype markers in addition to the HERVK9 marker. Nevertheless, the examples of the association between the HERVK9.HG deletion and the different HLA alleles, such as the HLA-A2, -A11, -A30, and -A68 haplotypes, demonstrate the potential usefulness of HERVK9.HG as an informative polymorphic or haplotype marker for population studies of human migration and the origins of contemporary populations.

Estimated dates for HERVK9 insertion and deletion based on the sequence divergence of the LTR-MER9:

The DNA nucleotides of the two LTRs flanking a HERV are considered to be identical in sequence at the time of insertion, but then mutate and diverge in sequence with evolutionary time (Hughes and Coffin 2004). Similarly, the solitary LTR, which results from a hybridization of the 5′ and 3′ LTR sequences with the deletion of the internal HERV sequence, will mutate and diverge in sequence with evolutionary time. If the sequence divergence can be determined for the solo, 5′, and 3′ LTR at a polymorphic locus, then evolutionary dates for the HERV insertion and deletion can be calculated on the basis of the assumption that an average nucleotide divergence of 10% for synonymous sites corresponds to 28 MY for human and nonhuman primates (Tristem 2000; Hughes and Coffin 2004).
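The calibration described above amounts to a linear scaling of the percentage of base differences. A minimal sketch, assuming only the rate stated in the text (10% divergence corresponds to 28 MY) and using the hypothetical helper name `divergence_to_my`:

```python
MY_PER_PERCENT = 28.0 / 10.0  # text: 10% divergence corresponds to 28 MY


def divergence_to_my(percent_difference):
    """Convert a percentage of base differences into million years (MY)."""
    if percent_difference < 0:
        raise ValueError("percentage difference cannot be negative")
    return percent_difference * MY_PER_PERCENT


# Percentages taken from the text and Table 6.
examples = [
    ("5' vs 3' MER9, HLA-A3", 6.7),
    ("solo vs 5'/3' MER9, average", 5.4),
    ("solo MER9, maximum", 1.3),
]
for label, pct in examples:
    print(f"{label}: {pct}% -> {divergence_to_my(pct):.1f} MY")
```

The three examples give 18.8, 15.1, and 3.6 MY. Some averaged values in Table 6 may differ in the last digit from this direct scaling, presumably because the published percentages are rounded.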

Table 6 shows the number and percentage of base differences between the solo, 5′, and 3′ LTR of the HERVK9.HG sequences and the calculated evolutionary time for HERVK9 insertion and deletion events. The percentage of sequence differences between the 5′ and 3′ MER9.HG in humans with the HLA-A3 allele and the gorilla and chimpanzee was 6.2% on average, suggesting that the HERVK9.HG integration occurred ∼17.3 MYA, which is well after the split between the human and the Old World monkey lineages ∼28 MYA (Goodman et al. 1998). This evolutionary date for the HERVK9.HG integration and its evolution in the great apes is confirmed by the presence of the ERVK9 sequence in the HG locus of the chimpanzee (Kulski et al. 2005), gorilla (Sanger Institute, NCBI accession no. CU104658), and orangutan (M. Yawata, personal communication), but not in the rhesus macaque (Kulski et al. 2004), which is an Old World monkey (Goodman et al. 1998).

TABLE 6.

Base differences among solo, 5′, and 3′ LTR (MER9) sequences at the HG locus for estimating the evolutionary time of the HERVK9 insertion and deletion

| Pairwise LTR (MER9) sequence comparison | No. of base differences | % of base differences | MYA | Accession no. of MER9 sequence |
| --- | --- | --- | --- | --- |
| HERVK9 insertion | | | | |
| 5′ MER9.A3 vs. 3′ MER9.A3 | 35 | 6.7 | 18.8 | AL645929 |
| 5′ MER9.Patr vs. 3′ MER9.Patr | 31 | 6.4 | 18.0 | AC192848 |
| 5′ MER9.Gogo vs. 3′ MER9.Gogo | 28 | 5.4 | 15.1 | CU104658 |
| Average | 31.3 | 6.2 | 17.3 | |
| SD | 2.87 | 0.56 | 1.59 | |
| HERVK9 deletion | | | | |
| 5′ MER9.A3 vs. sMER9.A1 | 24 | 4 | 11.2 | AL645929 |
| 3′ MER9.A3 vs. sMER9.A1 | 31 | 6.6 | 18.5 | AL645929 |
| 5′ MER9.Patr vs. sMER9.A1 | 25 | 4.3 | 12 | AC192848 |
| 3′ MER9.Patr vs. sMER9.A1 | 33 | 6.9 | 19.3 | AC192848 |
| 5′ MER9.Gogo vs. sMER9.A1 | 24 | 4.3 | 12 | CU104658 |
| 3′ MER9.Gogo vs. sMER9.A1 | 31 | 6.2 | 17.3 | CU104658 |
| Average | 28.0 | 5.4 | 15.05 | |
| SD | 3.7 | 1.2 | 3.38 | |
| MER9-LTR sequence diversity | | | | |
| sMER9.A1 vs. sMER9.A1 | 0 | 0 | 0 | AL671561 |
| sMER9.A2 vs. sMER9.A1 | 2 | 0.4 | 1.1 | CR388220 |
| sMER9.A26 vs. sMER9.A1 | 2 | 0.4 | 1.1 | AL845454 |
| sMER9.A29 vs. sMER9.A1 | 2 | 0.4 | 1.1 | BX927141 |
| sMER9.A32 vs. sMER9.A1 | 4 | 0.7 | 2 | BX284699 |
| sMER9.A34 vs. sMER9.A1 | 6 | 1.3 | 3.6 | AB443937ᵃ |
| Average | 2.67 | 0.53 | 1.56 | |
| SD | 1.89 | 0.4 | 1.2 | |
| 3′ MER9.A3 vs. 3′ MER9.A3 | 0 | 0 | 0 | AB443936ᵃ |
| 3′ MER9.A11 vs. 3′ MER9.A3 | 2 | 0.4 | 1.1 | AB443932ᵃ |
| 3′ MER9.A24 vs. 3′ MER9.A3 | 5 | 1 | 2.8 | AB443933ᵃ |
| 3′ MER9.A30 vs. 3′ MER9.A3 | 6 | 1.2 | 3.4 | AB443934ᵃ |
| 3′ MER9.A31 vs. 3′ MER9.A3 | 1 | 0.2 | 0.6 | AB443935ᵃ |
| Average | 2.8 | 0.56 | 1.58 | |
| SD | 2.31 | 0.46 | 1.3 | |
| 3′ MER9.HG Patr vs. 3′ MER9.A3 | 15 | 2.9 | 8.1 | AC192848 |
| 3′ MER9.HG Gogo vs. 3′ MER9.A3 | 12 | 2.3 | 6.4 | CU104658 |
| Average | 13.5 | 2.6 | 7.25 | |
| SD | 1.5 | 0.3 | 0.85 | |

Ten percent nucleotide difference corresponds to 28 MY of sequence divergence as a relative comparison.

ᵃ Sequences were obtained in this study by sequencing PCR products as described in materials and methods.

In addition, the percentage of sequence differences between a representative solo MER9 sequence from a person with the HLA-A1 allele and the 5′ and 3′ MER9.HG in a person with the HLA-A3 allele and the gorilla and chimpanzee was 5.4% on average, suggesting that the HERVK9.HG deletion occurred ∼15.1 MYA in a founding ancestor, which, from an evolutionary perspective, is soon after the HERVK9 insertion. The evolutionary date of 15.1 MYA for the HERVK9.HG deletion can be confirmed in future detailed population studies of the Mhc-A and MER9/ERVK9 sequence variations in the nonhuman great apes, chimpanzee, gorilla, and orangutan.

Despite the large sequence divergence between the solo MER9 and 3′ or 5′ MER9, the sequence divergence of the solo MER9 for six different solo MER9/HLA-A allelic haplotypes is small at an average of 0.53% and only up to 1.3% sequence divergence that has evolved during the past 3.6 MY, which is at a time well after the emergence of the chimpanzee and the gorilla (Goodman et al. 1998; Takahata 2001). Similarly, there is relatively small sequence divergence (average of 0.56%) between different human 3′ MER9 sequences, which suggests that the solo MER9 and the 3′ MER9 sequences have co-emerged in the modern human population possibly as a result of population bottlenecks contributing to minor sequence divergence during the period of 3.6 MY. In comparison, the sequence divergence between the human 3′ MER9/HLA-A3 haplotype and the 3′ MER9 sequence of the chimpanzee (Patr) or gorilla (Gogo) was greater at 2.9 and 2.3%, respectively, which suggests a sequence divergence of 6.4–8.1 MY between the human and the chimpanzee or the gorilla.

Hitchhiking effect and concluding remarks:

This study has provided a unique insight into the haplotypic association between a HERV insertion/deletion polymorphism and HLA-A alleles within the MHC genomic region. A complex combination of population genetic and evolutionary factors, including random genetic drift, migration, interbreeding, balancing selection, and the hitchhiking effect, has probably contributed to the difference in the HERVK9 deletion frequency between the Japanese and African Americans or Australian Caucasians. The strong haplotypic association of the HERVK9.HG polymorphism with particular HLA-A alleles, at loci separated by 62.6 kb, suggests that the HERVK9 deletion frequency is largely dependent on the hitchhiking effect (linkage) of the HLA-A locus, which is under balancing selection. Genetic hitchhiking occurs when alleles at neutral loci change in frequency because of a strong association or linkage with alleles at a selected locus (Kojima and Schaffer 1967; Hedrick 1980; Asmussen and Clegg 1981). In this regard, the HLA-A gene is considered to be under dominant balancing or heterozygous selection and the formation of polymorphisms at the HLA-A locus favors fitness or resistance to infections (Takahata et al. 1992; Black and Hedrick 1997). Consequently, the HERVK9 allelic frequency appears to be strongly dependent on its linkage to particular HLA-A alleles that are under balancing selection. This correlation is supported by our results, in which the HLA-A alleles associated with the HERVK9 deletions occurred at a total frequency of 35.4% in the Nagano Japanese population and at 63.5% in the African Americans or Australian Caucasians, which is about the same as the frequency of the HERVK9 deletion in the respective populations.
The association of the HERVK9.HG deletion with the HLA-B alleles in Japanese or the cell lines (data not shown) was more random or mixed than the association between HERVK9.HG and HLA-A alleles, probably because of the greater likelihood of recombination breakpoints in the genomic sequence between the HLA-B and HERVK9.HG loci. It is evident from our analysis that while there is a strong hitchhiking effect between HERVK9 and the HLA-A locus, there is no obvious hitchhiking relationship between the frequency of the HERVK9 deletion and the alleles at HLA-B or HLA-DR (data not shown).

The hitchhiking effect at the HLA-A locus might be complicated by the effect of selection on the HLA-G alleles (Tan et al. 2005) and by competing selection forces acting on the HLA-A and HLA-G genes. Since the HERVK9 locus is slightly closer to the HLA-G than the HLA-A locus, the HERVK9 polymorphism might also be used as a haplotypic and evolutionary marker to examine its linkage with HLA-G alleles. Future analyses could be extended to an examination of disease associations and the haplotypic and evolutionary relationships among HERVK9.HG, HLA-A, HLA-G, and other polymorphic retroelements, such as the haplo-specific Alu elements, which are within close vicinity to the HLA-A gene (Kulski and Dunn 2005). The HERVK9 PCR genotyping is a simple and economical method that can be easily applied in conjunction with the more difficult and expensive HLA-A or HLA-G typing methods in population, evolutionary, and disease studies.

References

  1. Asmussen, M. A., and M. T. Clegg, 1981. Dynamics of the linkage disequilibrium function under models of gene-frequency hitchhiking. Genetics 99: 337–356.
  2. Barbulescu, M., G. Turner, M. I. Seaman, A. S. Deinard, K. K. Kidd et al., 1999. Many human endogenous retrovirus K (HERV-K) proviruses are unique to humans. Curr. Biol. 9: 861–868.
  3. Barrett, J. C., B. Fry, J. Maller and M. J. Daly, 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21: 263–265.
  4. Black, F. L., and P. W. Hedrick, 1997. Strong balancing selection at HLA loci: evidence from segregation in South Amerindian families. Proc. Natl. Acad. Sci. USA 94: 12452–12456.
  5. Bodmer, J., A. Cambon-Thomsen, J. Hors, A. Piazza and A. Sanchez-Mazas, 1997. Anthropology report: introduction, pp. 269–284 in Genetic Diversity of HLA: Functional and Medical Implication, edited by D. Charron. EDK, Paris.
  6. Dangel, A. W., A. R. Mendoza, B. J. Baker, C. M. Daniel, M. C. Carroll et al., 1994. The dichotomous size variation of human complement C4 genes is mediated by a novel family of endogenous retroviruses, which also establishes species-specific genomic patterns among Old World primates. Immunogenetics 40: 425–436.
  7. Dawkins, R., C. Leelayuwat, S. Gaudieri, G. Tay, J. Hui et al., 1999. Genomics of the major histocompatibility complex: haplotypes, duplication, retroviruses and disease. Immunol. Rev. 167: 275–304.
  8. Goodman, M., C. A. Porter, J. Czelusniak, S. L. Page, H. Schneider et al., 1998. Toward a phylogenetic classification of primates based on DNA evidence complemented by fossil evidence. Mol. Phylogenet. Evol. 9: 585–598.
  9. Hampe, A., O. Coriton, N. Andrieux, G. Carn, M. Lepourcelet et al., 1999. A 356-kb sequence of the subtelomeric part of the MHC class I region. DNA Seq. 10: 263–299.
  10. Hedrick, P. W., 1980. Hitchhiking: a comparison of linkage and partial selfing. Genetics 94: 791–808.
  11. Hughes, J. F., and J. M. Coffin, 2004. Human endogenous retrovirus K solo-LTR formation and insertional polymorphisms: implications for human and viral evolution. Proc. Natl. Acad. Sci. USA 101: 1668–1672.
  12. Itoh, Y., N. Mizuki, T. Shimada, F. Azuma, M. Itakura et al., 2005. High-throughput DNA typing of HLA-A, -B, -C, and -DRB1 loci by a PCR-SSOP-Luminex method in the Japanese population. Immunogenetics 57: 717–729.
  13. Kapitonov, V. V., A. Pavlicek and J. Jurka, 2004. Anthology of human repetitive DNA, pp. 251–305 in Encyclopedia of Molecular Cell Biology and Molecular Medicine, Vol. 1, edited by R. A. Meyers. Wiley-VCH Verlag GmbH, KGaA, Weinheim, Germany.
  14. Kimura, M., 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 116–120.
  15. Kojima, K., and H. E. Schaffer, 1967. Survival process of linked mutant genes. Evolution 21: 518–531.
  16. Kulski, J. K., and D. S. Dunn, 2005. Polymorphic Alu insertions within the major histocompatibility complex class I genomic region: a brief review. Cytogenet. Genome Res. 110: 193–202.
  17. Kulski, J. K., and H. Inoko, 2003. Major histocompatibility complex (MHC) genes, pp. 778–785 in Nature Encyclopedia of the Human Genome, Vol. 3, edited by D. N. Cooper. Macmillan, Nature Publishing Group, London.
  18. Kulski, J. K., S. Gaudieri, H. Inoko and R. L. Dawkins, 1999. Comparison between two human endogenous retrovirus (HERV)-rich regions within the major histocompatibility complex. J. Mol. Evol. 48: 675–683.
  19. Kulski, J. K., P. Martinez, N. Longman-Jacobsen, W. Wang, J. Williamson et al., 2001. The association between HLA-A alleles and an Alu dimorphism near HLA-G. J. Mol. Evol. 53: 114–123.
  20. Kulski, J. K., T. Anzai, T. Shiina and H. Inoko, 2004. Rhesus macaque class I duplicon structures, organization, and evolution within the alpha block of the major histocompatibility complex. Mol. Biol. Evol. 21: 2079–2091.
  21. Kulski, J. K., T. Anzai and H. Inoko, 2005. ERVK9, transposons and the evolution of MHC class I duplicons within the alpha-block of the human and chimpanzee. Cytogenet. Genome Res. 110: 181–192.
  22. Mack, M., K. Bender and P. M. Schneider, 2004. Detection of retroviral antisense transcripts and promoter activity of the HERV-K(C4) insertion in the MHC class III region. Immunogenetics 56: 321–332.
  23. Mager, D. L., and P. Medstrand, 2003. Retroviral repeat sequences, pp. 57–63 in Nature Encyclopedia of the Human Genome, Vol. 5, edited by D. N. Cooper. Macmillan, Nature Publishing Group, London.
  24. Mayer, J., and E. Meese, 2005. Human endogenous retroviruses in the primate lineage and their influence on host genomes. Cytogenet. Genome Res. 110: 448–456.
  25. Medstrand, P., and J. Blomberg, 1993. Characterization of novel reverse transcriptase encoding human endogenous retroviral sequences similar to type A and type B retroviruses: differential transcription in normal human tissues. J. Virol. 67: 6778–6787.
  26. Middleton, D., L. Menchaca, H. Rood and R. Komerofsky, 2007. New allele frequency database. Tissue Antigens 61: 403–407.
  27. Moriyama, Y., K. Kato, T. Mura and T. Juji, 2006. Analysis of HLA gene frequencies and HLA haplotype frequencies for bone marrow donors in Japan. MHC 12: 183–201 (in Japanese).
  28. Ott, J., 1992. Strategies for characterizing highly polymorphic markers in human gene mapping. Am. J. Hum. Genet. 51: 283–290.
  29. Prugnolle, F., A. Manica, M. Charpentier, J. F. Guegan, V. Guernier et al., 2005. Pathogen-driven selection and worldwide HLA class I diversity. Curr. Biol. 15: 1022–1027.
  30. Purvis, A., 1995. A composite estimate of primate phylogeny. Philos. Trans. R. Soc. Lond. B Biol. Sci. 348: 405–421.
  31. Sasieni, P. D., 1997. From genotypes to genes: doubling the sample size. Biometrics 53: 1253–1261.
  32. Schneider, P. M., K. Witzel-Schlomp, C. Steinhauer, B. Stradmann-Bellinghausen and C. Rittner, 2001a. Rapid detection of the ERV-K(C4) retroviral insertion reveals further structural polymorphism of the complement C4 genes in Old World primates. Exp. Clin. Immunogenet. 18: 130–134.
  33. Schneider, P. M., K. Witzel-Schlomp, C. Rittner and L. Zhang, 2001b. The endogenous retroviral insertion in the human complement C4 gene modulates the expression of homologous genes by antisense inhibition. Immunogenetics 53: 1–9.
  34. Seifarth, W., O. Frank, U. Zeilfelder, B. Spiess, A. D. Greenwood et al., 2005. Comprehensive analysis of human endogenous retrovirus transcriptional activity in human tissues with a retrovirus-specific microarray. J. Virol. 79: 341–352.
  35. Shiina, T., G. Tamiya, A. Oka, N. Takishima, T. Yamagata et al., 1999. Molecular dynamics of MHC genesis unraveled by sequence analysis of the 1,796,938-bp HLA class I region. Proc. Natl. Acad. Sci. USA 96: 13282–13287.
  36. Shiina, T., H. Inoko and J. K. Kulski, 2004. An update of the HLA genomic region, locus information and disease associations: 2004. Tissue Antigens 64: 631–649.
  37. Stewart, C. A., R. Horton, R. J. Allcock, J. L. Ashurst, A. M. Atrazhev et al., 2004. Complete MHC haplotype sequencing for common disease gene mapping. Genome Res. 14: 1176–1187.
  38. Takahata, N., 2001. Molecular phylogeny and demographic history of humans, pp. 299–305 in Humanity From African Naissance to Coming Millennia: Colloquia in Human Biology and Palaeoanthropology, edited by P. V. Tobias, M. A. Ratth, J. Morri-Cecchi and G. A. Doyle. Firenze University Press, Firenze/Witwatersrand University, Johannesburg, South Africa.
  39. Takahata, N., Y. Satta and J. Klein, 1992. Polymorphism and balancing selection at major histocompatibility complex loci. Genetics 130: 925–938.
  40. Tan, Z., A. M. Shon and C. Ober, 2005. Evidence of balancing selection at the HLA-G promoter region. Hum. Mol. Genet. 14: 3619–3628.
  41. Tanaka, H., K. Tokonaga, H. Inoko, K. Tsuji, N. O. Chimge et al., 1997. Distribution of HLA-A, B and DRB1 alleles and haplotypes in Northeast Asia, pp. 285–291 in Genetic Diversity of HLA: Functional and Medical Implication, edited by D. Charron. EDK, Paris.
  42. Tassabehji, M., T. Strachan, M. Anderson, R. D. Campbell, S. Collier et al., 1994. Identification of a novel family of human endogenous retroviruses and characterization of one family member, HERV-K(C4), located in the complement C4 gene cluster. Nucleic Acids Res. 22: 5211–5217.
  43. Tristem, M., 2000. Identification and characterization of novel human endogenous retrovirus families by phylogenetic screening of the human genome mapping project database. J. Virol. 74: 3715–3730.
  44. Tu, B., S. J. Mack, A. Lazaro, A. Lancaster, G. Thomson et al., 2007. HLA-A, -B, -C, -DRB1 allele and haplotype frequencies in an African American population. Tissue Antigens 69: 73–85.
  45. Turner, G., M. Barbulescu, M. Su, M. I. Jensen-Seaman, K. K. Kidd et al., 2001. Insertional polymorphisms of full-length endogenous retroviruses in humans. Curr. Biol. 11: 1531–1535.

Articles from Genetics are provided here courtesy of Oxford University Press
