Nucleic Acids Research. 2004 Feb 18;32(3):e31. doi: 10.1093/nar/gnh035

Large-scale determination of the methylation status of retrotransposons in different tissues using a methylation tags approach

Konstantin Khodosevich 1,*, Yuri Lebedev 1, Eugene D Sverdlov 1
PMCID: PMC373429  PMID: 14973327

Abstract

A technique for simultaneous determination of the methylation status of numerous loci containing retroelements (REs) is reported. It is based on the observation that methylated and unmethylated areas in the genome are usually extended, and therefore the methylation of particular methyl-sensitive restriction endonuclease recognition sites might reflect the methylation status of DNA regions around them. The method includes dot-blot hybridization of repeat flanking sequences arrayed on a solid support with specifically amplified flanking regions of presumably unmethylated repeats. A multitude of flanking regions of REs adjacent to unmethylated restriction sites are amplified simultaneously, providing a complex hybridization probe. The technique thus allows the determination of the methylation status of restriction sites, which serve as tags of the methylation status of the surrounding regions. The validity of the technique was confirmed by various means, including bisulfite sequencing. The technique was successfully applied to the identification of methylation patterns of the regions surrounding 38 human-specific HERV-K(HML-2) long terminal repeats in cerebellum- and lymph node-derived genomic DNAs. The described technique can be readily adapted to the use of DNA microarray technology.

INTRODUCTION

It is widely recognized that DNA methylation has profound effects on gene expression. DNA methylation plays an important role in embryonic development, X-chromosome inactivation, imprinting, and suppression of repetitive sequences in the genome (1). Approximately 80% of CpG residues in the mammalian genome are methylated in most adult tissues (2). CpG dinucleotides are distributed non-randomly along DNA and clustered within so-called CpG islands that are usually protected from methylation in normal cells (3) and possibly are involved in tissue specificity of gene expression, although this suggestion is still not reliably proven (1). Even less is known about functions of CpG dinucleotides located outside CpG islands. One of the important functions usually assigned to methylation is suppression of the activity of retroelements (REs), including that of long interspersed elements (LINEs), short interspersed elements (SINEs) and long terminal repeats (LTRs).

Due to evolutionary selection, REs are mostly located in ‘inactive’ genomic regions such as heterochromatin and are heavily methylated. Moreover, it is quite possible that newly integrated REs methylated by the cellular methylation machinery become centers of spreading methylation over surrounding genomic regions, thus suppressing their functional activity (4). On the other hand, a multitude of REs were ‘domesticated’ by mammalian genomes and might be involved in modulation of expression of resident genes (5,6). The involvement of REs in regulation of gene expression has been demonstrated in a number of studies (7–10). REs of this group may represent active, non-methylated regulatory units located in active, non-methylated genome areas.

Determination of the methylation status of REs and their adjacent regions may be used to discriminate between potentially active and inactive REs. Furthermore, insofar as each type of RE represents a group of repetitive elements with conserved sequence features, REs can be used as universal genomic anchors allowing the design of easily detectable tags of the methylation status of the loci where they are located.

In this work, we developed a new technique for detection of tags of the methylation status of genomic loci containing interspersed repeats, and successfully applied it to 38 LTRs of human endogenous retroviruses (HERVs) belonging to the most functionally active family, HERV-K(HML-2), and located in different genomic loci (11–13). The methylation patterns obtained in such a way for cerebellum- and lymph node-derived genomic DNAs were compared and shown to be different. The validity of this technique for determining the methylation status was confirmed by bisulfite sequencing and PCR amplification.

MATERIALS AND METHODS

Genomic DNA isolation, restriction and adaptor ligation

Genomic DNA from human tissues was isolated using proteinase K digestion and phenol extraction as described (14). A 1 µg aliquot of the genomic DNA from each tissue was digested with 70 U of HpaII restriction enzyme (New England Biolabs) at 37°C overnight in 100 µl of the reaction mixture. The completeness of HpaII digestion was confirmed by PCR with primers to either side of four previously identified non-methylated CCGG sites located within exons, introns or promoter regions of housekeeping genes (see table 1 of the Supplementary Material available at NAR Online). The digested DNAs were ethanol precipitated, dissolved in 10 µl of sterile water and ligated to an excess of adaptor (2 mM) at 16°C overnight using T4 DNA ligase (Promega). To form the adaptor, 300 pmol of each of the oligonucleotides A1A2 and a1 (Table 1) were annealed in 20 µl of TM (10 mM Tris–HCl, pH 7.8, 10 mM MgCl2) buffer. The ligation was terminated by incubation at 65°C for 15 min. The ligates were then purified from excess oligonucleotides by passing them through a QIAquick DNA Purification Kit (Qiagen), and their 3′ ends were filled in by the Klenow fragment of DNA polymerase I (Fermentas). Finally, the ligates were ethanol precipitated and redissolved in 40 µl of sterile water.

Table 1. Oligonucleotides and primers used in hybridization probe construction.

Primers Structure (5′–3′)
A1A2 AGCAGCGAACTCAGTACAACAAGTCGACGCGTGCCCGGGCTGGT
A1 AGCAGCGAACTCAGTACAACA
A2 AGTCGACGCGTGCCCGGGCTGGT
a1 CGACCAGCCC
T1 CCACCTTACGAGAAACACCCACAG
T2 TGTTTCAGAGAGCACGGGGTTGGG
T3 CGAGAAACACCCACAGGTGTG
T4 TCTGATCTCTCTTGCTTTTCCCCAC
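
The relationship between the adaptor oligonucleotides in Table 1 can be checked computationally. The following Python sketch (the helper names are illustrative and not part of the original protocol) confirms that A1A2 is the concatenation of A1 and A2, and that a1 pairs with the 3′ end of A1A2 while leaving a two-base 5′ overhang. That this overhang matches the 5′-CG ends produced by HpaII (C^CGG) is our reading of the sequences, not a statement made in the text.

```python
# Sequences copied from Table 1 (5'->3'), whitespace removed.
A1A2 = "AGCAGCGAACTCAGTACAACAAGTCGACGCGTGCCCGGGCTGGT"
A1 = "AGCAGCGAACTCAGTACAACA"
A2 = "AGTCGACGCGTGCCCGGGCTGGT"
a1 = "CGACCAGCCC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(long_oligo: str, short_oligo: str) -> str:
    """Return the unpaired 5' bases of short_oligo when its 3' part
    anneals to the 3' end of long_oligo (illustrative helper)."""
    rc = reverse_complement(short_oligo)
    # Look for the longest prefix of the reverse complement that matches
    # the 3' terminus of the long oligonucleotide.
    for paired in range(len(rc), 0, -1):
        if long_oligo.endswith(rc[:paired]):
            return short_oligo[: len(short_oligo) - paired]
    raise ValueError("oligonucleotides do not anneal end to end")


print(A1 + A2 == A1A2)                # A1A2 is A1 followed by A2
print(five_prime_overhang(A1A2, a1))  # unpaired 5' bases of a1
```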

Preparation of hybridization probes

Fractions of LTR-U5 and LTR-U3 flanking sequences were amplified using a PCR suppression approach (15). In the first round of suppressive PCR, A1 primer was used with T1 or T2 primers (Table 1). Primers T1 and T3, and T2 and T4 (Table 1) correspond to the most conserved parts of the U5 and U3 regions of HERV-K(HML-2) LTRs, respectively. The PCR mixture contained 10 ng of the ligate in 25 µl of PCR Buffer for Advantage™ Taq (BD Biosciences, Clontech) containing 200 µM of each dNTP, 0.4 µM of each primer and 3 U of Advantage Taq DNA polymerase (BD Biosciences, Clontech). PCR was carried out in a thermal cycler (MJ Research) as follows: 22 cycles at 95°C for 20 s, 65°C for 30 s, 72°C for 2 min. The PCR products were diluted 1000-fold and re-amplified with A2, T3 and T4 primers (Table 1) in the second PCR round (17 cycles at 95°C for 20 s, 65°C for 30 s, 72°C for 2 min).
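
The nested arrangement of the LTR primers can be illustrated from the sequences in Table 1. The sketch below (the function name is hypothetical) measures how far the 3′ end of a first-round primer overlaps the 5′ end of the corresponding second-round primer. For T1/T3 the second-round primer is shifted toward the 3′ side while sharing 16 nt with T1; T2 and T4 share no terminal overlap. This says nothing about their relative genomic positions beyond what the text states.

```python
# LTR-specific primers copied from Table 1 (5'->3').
T1 = "CCACCTTACGAGAAACACCCACAG"
T2 = "TGTTTCAGAGAGCACGGGGTTGGG"
T3 = "CGAGAAACACCCACAGGTGTG"
T4 = "TCTGATCTCTCTTGCTTTTCCCCAC"


def suffix_prefix_overlap(first: str, second: str) -> int:
    """Length of the longest suffix of `first` that is a prefix of `second`."""
    for n in range(min(len(first), len(second)), 0, -1):
        if first.endswith(second[:n]):
            return n
    return 0


u5_overlap = suffix_prefix_overlap(T1, T3)
u3_overlap = suffix_prefix_overlap(T2, T4)
print("T1/T3 shared bases:", u5_overlap)
print("T3 extends beyond T1 by:", len(T3) - u5_overlap)
print("T2/T4 shared bases:", u3_overlap)
```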

Hybridization membranes preparation

Primers against flanking sequences of 38 human-specific HERV-K(HML-2) LTRs were designed using the ‘Gene Runner (Version 3.00)’ program (Hastings Software, Inc.). They were synthesized and used for PCR amplification of human placenta DNA. Aliquots of the PCR products (50 ng each) were then spotted and immobilized on nylon membranes (Hybond N; Amersham) according to the manufacturer’s recommendations. The same amount of λ phage DNA was also spotted on the membranes as a negative hybridization control.

Dot-blot hybridization

Hybridization probes were prepared by random labeling of the second round suppressive PCR products (see Preparation of hybridization probes). The DNA fragments dotted on the membrane and the probes had the same sequences at their 5′ or 3′ termini, i.e. 25 and ∼40 bp when amplified from the LTR U3 or U5 terminal regions, respectively. To prevent cross-hybridization between these sequences, the hybridization mixture was supplemented with a 200-fold molar excess of unlabeled competitive oligonucleotides corresponding to the common parts of the spotted flanks. After random labeling, probes were diluted, mixed with competitive oligonucleotides (20 pmol each), denatured at 100°C for 5 min, immediately chilled on ice and added to the hybridization mixture. Dot-blot hybridization was performed at 68°C overnight in a buffer containing 6× SSC, 5× Denhardt’s, 0.5% SDS and 100 µg/ml salmon sperm DNA. The membranes were washed under high stringency conditions and exposed to X-ray film (Renex, Russia) for 2 days.

Amplification of the second round PCR products

The results of dot-blot hybridizations were verified by PCR to check a correlation of signal intensity with the abundance of the corresponding flanking DNA in the amplicon used as a hybridization probe. PCR amplifications used T3 or T4 LTR primers (Table 1) with the primers specific for particular LTR flanking sequences (primer sequences and PCR conditions are shown in table 2 of the Supplementary Material), and the second round PCR products as templates.

Bisulfite sequencing

RE flanks were sequenced using agarose beads by a modified method for bisulfite-based cytosine methylation analysis, as described (16). Lymph node-derived DNA was digested with EcoRI restriction endonuclease, denatured and then treated with bisulfite to convert all unmethylated cytosines into uracils. This reaction did not affect methylated cytosine residues. For each conversion, 100 ng of lymph node DNA was taken. The converted DNA fragments were subjected to two rounds of PCR amplification using a nested primer approach (Table 2) and then cloned into a TA cloning vector (Promega). Four clones for each RE-flanking fragment were sequenced. The cytosine residues converted with bisulfite are identified in the course of sequencing as Ts.
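
The conversion logic described above can be illustrated with a minimal Python sketch. The toy sequence and the function name are assumptions made for illustration only. Unmethylated cytosines are read as T after conversion and PCR, whereas methylated cytosines remain C.

```python
def bisulfite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Simulate the sequenced top strand after bisulfite treatment and PCR.

    Every C whose 0-based index is not in `methylated_positions` is read
    as T; methylated C residues are left unchanged.
    """
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(seq)
    )


toy_flank = "ACCGGTCGA"  # hypothetical sequence, not taken from the paper

# CpG cytosine at index 2 methylated, all other cytosines unmethylated.
print(bisulfite_convert(toy_flank, {2}))
# No methylation at all: every cytosine is read as T.
print(bisulfite_convert(toy_flank))
```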

Table 2. Primers used in bisulfite sequencing.

Primers Structure (5′–3′)
LTR11bisfor1 TATAAGTTGGATTTGTATTAGAGGATTTG
LTR11bisfor2 GGTTTAGAGTTTAGGAGTTTTGGTTGTTAG
LTR11bisrev1 CTTACTCACACTAAACCTAACAATAAATACTC
LTR11bisrev2 ACATCCCCAACCTACATCTCCCTC
LTR12bisfor1 GTAGGATATGAAGTATGGAGAATAGGTAAG
LTR12bisfor2 TATTTTTGTAGGGATGTATAGTGTTGGG
LTR12bisrev1 CACCTATAAACCATCCTTTACTATTATAAAC
LTR12bisrev2 AAAATCCTTTTACCTCCATTCATCTTC
LTR27bisfor1 TTTTGTTATTGAGAAGTATATAGTTTAGTGG
LTR27bisfor2 GGAAGAAGAGTTAATTGGTGGTATAAGG
LTR27bisrev1 AAATCTCTAATACCATTCCTCCTACC
LTR27bisrev2 ATAAAAATCCCAATATTAAAATCACATCC

The PCR primers were designed to be fully complementary to the deaminated DNA strand and did not include CG dinucleotides. First round amplifications (with primers for1 and rev1) were performed in 25 µl of PCR buffer for Taq DNA polymerase (Promega) containing 200 µM of each dNTP, 1.5 mM MgCl2, 0.4 µM of each primer and 1 U of Taq DNA polymerase (Promega), with individual 3 µl beads in a thermal cycler (MJ Research) as follows: 35 cycles at 95°C for 20 s, 63°C for 30 s, and 72°C for 1.5 min. Amplifications using nested primers (for2 and rev2) were performed using 1 µl of the PCR mixture after the first round under PCR conditions as follows: 22 cycles at 95°C for 20 s, 60°C for 30 s, and 72°C for 1.5 min.
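
The primer design rule stated above (no CG dinucleotides, full complementarity to deaminated DNA) can be checked against the sequences in Table 2. The following sketch is illustrative; the report structure is our own. A forward primer matching a fully converted strand should contain no C, and a reverse primer should contain no G.

```python
# Primer sequences copied from Table 2 (5'->3').
BISULFITE_PRIMERS = {
    "LTR11bisfor1": "TATAAGTTGGATTTGTATTAGAGGATTTG",
    "LTR11bisfor2": "GGTTTAGAGTTTAGGAGTTTTGGTTGTTAG",
    "LTR11bisrev1": "CTTACTCACACTAAACCTAACAATAAATACTC",
    "LTR11bisrev2": "ACATCCCCAACCTACATCTCCCTC",
    "LTR12bisfor1": "GTAGGATATGAAGTATGGAGAATAGGTAAG",
    "LTR12bisfor2": "TATTTTTGTAGGGATGTATAGTGTTGGG",
    "LTR12bisrev1": "CACCTATAAACCATCCTTTACTATTATAAAC",
    "LTR12bisrev2": "AAAATCCTTTTACCTCCATTCATCTTC",
    "LTR27bisfor1": "TTTTGTTATTGAGAAGTATATAGTTTAGTGG",
    "LTR27bisfor2": "GGAAGAAGAGTTAATTGGTGGTATAAGG",
    "LTR27bisrev1": "AAATCTCTAATACCATTCCTCCTACC",
    "LTR27bisrev2": "ATAAAAATCCCAATATTAAAATCACATCC",
}


def design_report(primers: dict) -> dict:
    """Record, for each primer, whether it contains CG, C or G."""
    return {
        name: {"has_CG": "CG" in seq, "has_C": "C" in seq, "has_G": "G" in seq}
        for name, seq in primers.items()
    }


REPORT = design_report(BISULFITE_PRIMERS)
for name, flags in REPORT.items():
    print(name, flags)
```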

PCR of genomic DNA digested by HpaII

Cerebellum- and lymph node-derived DNAs were digested as described above in Genomic DNA isolation, restriction and adaptor ligation. The digested DNAs were then PCR amplified using primers T3 or T4 (Table 1) and primers corresponding to LTR-flanking sequences (primer sequences are available on request). The PCR mixture contained 10 ng of the digested DNA in 25 µl of PCR buffer for Taq DNA polymerase (Promega) containing 200 µM of each dNTP, 0.4 µM of each primer and 1 U of Taq DNA polymerase (Promega). PCR was carried out in a thermal cycler (MJ Research) as follows: 28 cycles at 95°C for 20 s, 61°C for 30 s, and 72°C for 1 min.

RESULTS

Technique rationale

The technique we have developed is aimed at whole-genome identification of DNA methylation patterns of interspersed repeats (interspersed repeat-containing loci) and is based on the DNA array technique, which can be easily adapted for DNA microchip technology. The technique uses two basic properties of interspersed repeats: (i) characteristic conserved sequence features that can be used for selective simultaneous isolation of flanking sequences for all REs; and (ii) different flanking sequences that allow reliable discrimination of individual RE integration sites. Thus, REs can be considered natural repetitive universal markers of a multitude of genomic loci. Here we used these markers for evaluation of the methylation status of genomic regions surrounding HERV-K LTRs that were integrated in the human genome after the divergence of the human and chimpanzee lineages. The principles of the approach used are schematized in Figure 1.

Figure 1. Scheme of the method. After restriction and adaptor ligation (stage 1), an unmethylated pool of RE flanks is selectively PCR amplified (stage 2) and hybridized (stage 3) with a (micro)array of the whole pool of the same RE-flanking sequences. Methylated and non-methylated HpaII (HhaI) restriction sites are shown by short vertical lines with black circles or without circles, respectively. REs and oligonucleotide adaptors are shown as green and black-and-white boxes, respectively. Primer positions and orientations are indicated by arrows.

The sequences representing flanks of REs were immobilized on a solid support as a DNA array. These RE-flanking sequences can be prepared by any available technique (chemical synthesis, PCR amplification, etc.). In the present work, we used flanks of HERV-K(HML-2) LTRs prepared by selective PCR amplification (Fig. 1).

A pool of genomic RE-flanking regions was prepared by selective PCR amplification with one of the primers representing a universal RE-specific sequence designed from the consensus of the RE type under investigation. The other primer was also universal but targeted at the artificial adaptors attached to the termini of the restriction fragments obtained by digestion of the genomic DNA with a methyl-sensitive restriction endonuclease. The labeled amplicon was used for hybridization with the array (Fig. 1, stage 3). Such a hybridization is supposed to reveal only those spots on the array that correspond to non-methylated RE-flanking restriction sites.

In our case, we used HERV-K(HML-2) LTR-specific primers, and the amplicon/hybridization probe thus contained flanks of HERV-K(HML-2) LTRs obtained by selective PCR amplification of the genomic DNA (Fig. 1, stage 2) digested with a methyl-sensitive endonuclease, such as HpaII or HhaI (Fig. 1, stage 1). Therefore, the selection yielded only those genomic fragments that are bordered by the non-methylated restriction site nearest to the LTR.

To make the amplification of the target fragments as specific as possible, we used the PCR suppression effect through ligation of special PCR suppressive adaptors to the restriction fragments (Fig. 1, stage 1). This modification prevents PCR amplification of the fragments lacking REs, with the adaptors on both termini, but does not preclude the amplification of RE-containing fragments by PCR initiated from sites within REs.
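
The selection principle described in this section can be sketched in silico. The example below is a simplified model under stated assumptions: the toy sequence and the primer motif `GATTACA` are hypothetical, methylation is supplied as a set of site start positions, and PCR suppression is reduced to a filter that keeps only fragments containing the LTR primer site. It shows how methylation of the nearest CCGG site extends the amplifiable flank to the next unmethylated site.

```python
import re


def hpaii_digest(seq: str, methylated_sites=frozenset()) -> list:
    """Cut C^CGG at every CCGG whose 0-based start is not methylated."""
    cuts = [
        match.start() + 1
        for match in re.finditer("(?=CCGG)", seq)
        if match.start() not in methylated_sites
    ]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def amplifiable(fragments: list, ltr_primer: str) -> list:
    """Keep fragments with an LTR primer site; fragments carrying only
    adaptors are treated as suppressed (simplified model)."""
    return [fragment for fragment in fragments if ltr_primer in fragment]


# Hypothetical locus: CCGG sites start at indices 2, 17 and 24.
LOCUS = "TTCCGGAAGATTACAAACCGGTTTCCGGAA"
LTR_PRIMER = "GATTACA"  # stand-in for an LTR-specific primer site

# All sites unmethylated: the flank ends at the nearest site.
print(amplifiable(hpaii_digest(LOCUS), LTR_PRIMER))
# Nearest downstream site methylated: the flank reaches the next site.
print(amplifiable(hpaii_digest(LOCUS, {17}), LTR_PRIMER))
```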

Comparative analysis of the distribution of methylation sites neighboring the LTRs in human cerebellum and lymph node

The genomic DNAs isolated from cerebellum and lymph node and digested with methyl-sensitive restriction endonuclease HpaII were ligated to the suppressive adaptors (see Materials and Methods). The first round of the PCR-suppressive selective amplification was followed by nested primer PCR to improve the selectivity. The amplicons obtained were labeled with 32P and used as hybridization probes.

Individual flanking regions of 38 human-specific HERV-K (HML-2) LTRs were prepared by PCR amplification using primers against the outermost parts of the LTR (T3 or T4, Table 1) and a unique region of the adjacent genomic flank, respectively (primer sequences are available on request). All the 38 flanking sequences obtained were dotted on Hybond-N nylon membranes. One of the identical membranes was then hybridized with a human cerebellum DNA-derived probe, and the other with a probe prepared from human lymph node DNA.

Figure 2 shows the resulting hybridizations of the human-specific LTR-flanking sequences with the human cerebellum and lymph node DNA-derived probes. The results of the hybridizations were used to identify the methylation status of CpG dinucleotides within various HpaII cleavage sites in the LTR-flanking sequences. Although a similar number of positive hybridization signals (approximately 15) was observed with the cerebellum (Fig. 2A) and lymph node (Fig. 2C) probes, the intensities and patterns of the signals were significantly different (see Table 3). The differences in intensities might be due to different methylation levels of various HpaII sites that lead to the different abundance of the corresponding fragments in the hybridization probes. Some of the HpaII sites in the LTR-flanking regions have the same methylation status in both tissues. For example, the HpaII site adjacent to LTR28 (C4 position in Fig. 2) seems to be non-methylated in both cerebellum and lymph node. In contrast, the HpaII site adjacent to LTR2 (A2 position in Fig. 2) is apparently heavily methylated in both tissues. Some loci have a different methylation status in these tissues. In particular, the LTR18-adjacent region (B6 position in Fig. 2) is more methylated, and the flank of LTR27 (C3 position in Fig. 2) is less methylated in lymph node than in the cerebellum-derived DNA.
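
The tissue comparison can be summarized from the qualitative "+" calls in Table 3. The sketch below encodes those calls as we read them from the table; LTR38 is omitted because the table row as reproduced here does not show which tissue column its "+" belongs to. A "+" in both tissues does not imply equal intensity (LTR18 and LTR27, for instance, differ in intensity between tissues).

```python
# LTR numbers with a "+" hybridization signal, as read from Table 3.
# LTR38 is excluded: its tissue column cannot be recovered from the row.
CEREBELLUM = {7, 10, 11, 16, 18, 21, 26, 27, 28, 29, 30, 31, 33, 36, 37}
LYMPH_NODE = {5, 7, 8, 11, 18, 19, 23, 26, 27, 28, 29, 30, 31, 32, 33, 35, 37}
ALL_LTRS = set(range(1, 38))  # LTR1 to LTR37


def compare_tissues(first: set, second: set, universe: set) -> dict:
    """Partition loci by the tissues in which a signal was detected."""
    return {
        "both": sorted(first & second),
        "first_only": sorted(first - second),
        "second_only": sorted(second - first),
        "neither": sorted(universe - first - second),
    }


SUMMARY = compare_tissues(CEREBELLUM, LYMPH_NODE, ALL_LTRS)
for category, ltrs in SUMMARY.items():
    print(category, len(ltrs), ltrs)
```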

Figure 2. Dot-blot hybridization with cerebellum (A and B) and lymph node (C) DNA-derived probes. LTRs are numbered from left to right, and from top to bottom. In (A) and (C), position A1 corresponds to LTR1, A2 to LTR2, etc., B1 to LTR13, etc., C1 to LTR25, etc., D1 to LTR37, and D2 to LTR38. Spots D3 in (A) and (C) and A11 in (B) correspond to 50 ng of phage λ DNA. Designation of spots in (B) corresponds to that in (A). Rows A and B in (B) represent the same materials spotted in duplicate.

Table 3. Probe and dotted DNA characteristics for analyzed LTRs.

 
LTR | Accession no. | Length of probe fragments (bp); positions in corresponding accession no. | Length of dotted fragments (bp); positions in corresponding accession no. | Repeats within flank sequences for dotted fragments | Hybridization signal, cerebellum | Hybridization signal, lymph node
LTR1 | AC044819 | 1170 (130 901–131 973) | 280 (130 901–131 181) | 1–250 bp HERV7 | |
LTR2 | AC025420 | 1910 (46 613–48 529) | 120 (46 613–46 736) | 1–180 bp L2 | |
LTR3 | AC010267 | 1430 (106 778–108 212) | 130 (106 778–106 909) | 1–90 bp MSTA | |
LTR4 | AC002508 | 430 (33 657–34 088) | 300 (33 657–33 952) | 20–150 bp MLTJ | |
LTR5 | AL354855 | 490 (71 833–72 321) | 210 (71 833–72 042) | 120–250 bp L1 | | +
LTR6 | AC012146 | 990 (24 102–25 089) | 70 (24 102–24 171) | No repeats | |
LTR7 | AL671681 | 1120 (167 436–168 552) | 100 (167 436–167 534) | 1–120 bp MLT1A1 | + | +
LTR8 | AL139421 | 530 (69 211–69 744) | 820 (69 211–70 031) | No repeats | | +
LTR9 | AL592220 | 320 (93 880–94 203) | 270 (93 933–94 203) | 1–60 bp AluSg | |
LTR10 | AL359701 | 520 (61 942–62 464) | 650 (61 817–62 464) | 1–700 bp HERV9 | + |
LTR11 | AC021987 | 1100 (20 313–21 412) | 50 (21 360–21 412) | 1–20 bp LTR26 | + | +
LTR12 | AC074117 | 530 (154 755–155 289) | 160 (155 132–155 289) | No repeats | |
LTR13 | AL139404 | 600 (60 431–61 027) | 50 (60 977–61 027) | 1–50 bp LTR49 | |
LTR14 | AC084028 | 50 (141 209–141 258) | 150 (141 209–141 356) | 1–120 bp MLT1A1 | |
LTR15 | AL162412 | 1610 (66 382–67 997) | 210 (67 791–67 997) | 150–250 bp AluSx | |
LTR16 | AL139090 | 580 (1536–2115) | 280 (1536–1816) | No repeats | + |
LTR17 | AC024884 | 170 (43 103–43 169) | 40 (43 103–43 144) | No repeats | |
LTR18 | AC000389 | 180 (9548–9731) | 680 (9052–9731) | No repeats | + | +
LTR19 | AC113425 | 240 (41 008–41 248) | 202 (41 046–41 248) | No repeats | | +
LTR20 | AC021294 | 1410 (145 000–146 412) | 200 (146 216–146 412) | 1–200 bp L2 | |
LTR21 | AC091895 | 1080 (102 415–103 493) | 210 (102 415–102 625) | 1–240 bp L1PA13 | + |
LTR22 | AL162723 | 210 (19 165–19 370) | 130 (19 240–19 370) | 1–50 bp MSTA | |
LTR23 | AC099661 | 1590 (149 214–150 807) | 160 (150 643–150 807) | No repeats | | +
LTR24 | AC074019 | 950 (111 321–112 273) | 330 (111 947–112 273) | 1–300 bp L1MC | |
LTR25 | AL359644 | 550 (93 908–94 457) | 40 (94 417–94 457) | No repeats | |
LTR26 | AP000812 | 1020 (74 214–75 234) | 400 (74 838–75 234) | 1–40 bp MER61 | + | +
LTR27 | AC002400 | 380 (109 107–109 490) | 330 (109 107–109 437) | 1–150 bp L2 | + | +
LTR28 | U47924 | 410 (97 456–97 869) | 660 (97 206–97 869) | 250–500 bp AluSg | + | +
LTR29 | AP001591 | 700 (52 405–53 106) | 110 (52 405–52 510) | 70–150 bp MER58A | + | +
LTR30 | Z84493 | 90 (8120–8210) | 520 (7689–8210) | No repeats | + | +
LTR31 | AC002350 | 850 (45 768–46 614) | 260 (45 768–46 027) | No repeats | + | +
LTR32 | AC073898 | 900 (78 284–79 177) | 150 (79 029–79 177) | No repeats | | +
LTR33 | Z80898 | 310 (6071–6380) | 270 (6071–6337) | 1–250 bp L1PA1 | + | +
LTR34 | AC007326 | 1120 (35 759–36 874) | 770 (36 100–36 874) | 1–150, 450–550 bp AluJ, 150–450 bp ERV1 | |
LTR35 | AC023074 | 1060 (119 305–120 365) | 330 (120 037–120 365) | 120–160 bp L1PA8A | | +
LTR36 | AC006432 | 1810 (10 078–11 894) | 410 (10 078–10 490) | No repeats | + |
LTR37 | AL109763 | 210 (72 566–72 773) | 670 (72 566–73 233) | 1–700 bp L2 | + | +
LTR38 AP001631 120 (108 235–108 355) 110 (108 235–108 345) No repeats +

The hybridization signals did not depend on probe or dotted DNA lengths or on the presence of various repetitive elements in these DNAs (see Table 3). This finding suggests that the signal intensity correlates with the abundance of the corresponding flank DNA in the amplicon used as the hybridization probe. To verify this correlation, seven LTR flanks were PCR re-amplified using specific primers with the amplicon as a template. The amount of each individual flank DNA was estimated from the minimal number of PCR cycles sufficient to visualize the band on electropherograms of the PCR products. For all of the investigated LTRs, the number of PCR cycles was found to be in good inverse correlation with the intensities of the corresponding hybridization signals on the array. The results demonstrated that the intensity of the hybridization signals was mostly proportional to the content of the corresponding product in the hybridization probe (amplicon).
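
The inverse relationship between cycle number and template abundance follows from the exponential nature of PCR. As a rough sketch, assuming ideal doubling per cycle (an idealization; the cycle numbers below are hypothetical and not taken from the experiments), the relative abundance of two flanks can be estimated from the difference in the number of cycles needed to visualize each band:

```python
def fold_difference(cycles_a: int, cycles_b: int, efficiency: float = 1.0) -> float:
    """Estimated abundance of flank A relative to flank B.

    `efficiency` is the fractional gain per cycle; 1.0 means perfect
    doubling. A flank that becomes visible after fewer cycles is taken
    to be more abundant in the template.
    """
    return (1.0 + efficiency) ** (cycles_b - cycles_a)


# Hypothetical example: flank A is visible after 15 cycles, flank B after 18.
print(fold_difference(15, 18))
```

Under these assumptions, a difference of three cycles corresponds to an eight-fold difference in starting abundance; real amplification efficiencies are lower and vary between templates.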

To confirm the reproducibility of the technique, we repeated the hybridization using independently prepared probes and filters. In all cases, the relative intensities of the hybridization signals were in good accord with those presented in Figure 2. Figure 2A and B demonstrate the reproducibility of the hybridization as well as of the relative signal intensities in independent experiments.

PCR amplification through HpaII sites confirms the methylation status determined by the DNA array

The methylation status of some HpaII sites was additionally verified by PCR through them using primers hybridizing on either side of the sites and DNA digested with HpaII. In this case, the DNA fragments with an internal methylated HpaII site are supposed to be amplified, while the amplification of the fragments with an unmethylated site will fail because such sites are not protected from digestion with HpaII.
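
The two assays report methylation with opposite polarity: an array signal indicates an unmethylated site, whereas a PCR product across the site indicates a methylated one. The sketch below makes this explicit. The array calls are taken from Table 3, and the PCR calls encode the conclusions stated in the text for LTR28, LTR30 and LTR14 rather than a direct gel readout; the function names are illustrative.

```python
def status_from_array(signal_detected: bool) -> str:
    """A hybridization signal means the flanking HpaII site was cut."""
    return "unmethylated" if signal_detected else "methylated"


def status_from_pcr(product_detected: bool) -> str:
    """A PCR product across the site means HpaII did not cut it."""
    return "methylated" if product_detected else "unmethylated"


# Each tuple is (cerebellum, lymph node).
OBSERVED = {
    "LTR28": {"array": (True, True), "pcr": (False, False)},
    "LTR30": {"array": (True, True), "pcr": (False, False)},
    "LTR14": {"array": (False, False), "pcr": (True, True)},
}


def assays_agree(entry: dict) -> bool:
    """True if both assays give the same status in every tissue."""
    return all(
        status_from_array(signal) == status_from_pcr(product)
        for signal, product in zip(entry["array"], entry["pcr"])
    )


for ltr, entry in OBSERVED.items():
    print(ltr, status_from_array(entry["array"][0]), assays_agree(entry))
```

In heterogeneous tissue samples both assays could be positive for the same site, so this binary model is only a first approximation.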

As an example, Figure 3 presents PCR amplifications for three arbitrarily chosen LTR flanks. It can be seen that the HpaII sites adjacent to LTR30 and LTR28 are unmethylated in both lymph node and cerebellum DNAs, while the flank of LTR14 is methylated in both tissues. These and our previous data on LTR methylation analysis (17) are in good accord with the results of the DNA array hybridization.

Figure 3. PCR through HpaII sites of interest. (A) Cerebellum- and lymph node-derived DNAs were digested by the methyl-sensitive restriction endonuclease HpaII and amplified with primers specific to either side of the HpaII site (e.g. P1 and P2, or P3 and P2, where P2 is the primer against the LTR sequence). Methylated or unmethylated CpG/CCGG sites are designated by filled and empty circles, respectively. (B) Gel electrophoresis of the PCR products after amplification for the LTR28 (lanes 1–3), LTR30 (lanes 4–6) and LTR14 (lanes 7–9) flanks through adjacent HpaII sites. Lanes 2, 5 and 8, and 3, 6 and 9 represent PCR products generated by amplification of HpaII-digested DNAs from cerebellum and lymph node, respectively. Lanes 1, 4 and 7 correspond to the amplified DNA fragments from native placenta. Lane 10, length marker.

Confirmation of the correlation of the methylation status of HpaII sites and neighboring CpGs by bisulfite sequencing

Three arbitrarily chosen LTR flanks were taken for bisulfite sequencing (Table 2). CpG dinucleotides within the flank of LTR12 were found to be predominantly methylated, whereas the LTR11 flank was unmethylated in lymph node-derived DNA. The results shown in Figure 4 allow the following conclusions to be drawn. (i) For all the sequences analyzed, the results on the methylation status of HpaII sites obtained with all the techniques used (flank DNA arrays, PCR amplification of HpaII-digested genomic DNAs, and bisulfite sequencing) agreed well with each other. (ii) The methylation status of HpaII sites in all three studied cases coincided with the methylation status of the nearest CpG dinucleotides, thus demonstrating that methylation follows the cooperative principle due to which methylated and unmethylated CpGs are clustered. This feature of methylation was previously reported as ‘methylation spread’ (4). Due to this property, the methylation status of a particular CpG within such a cluster is characteristic of the whole cluster. Accordingly, a methylated/unmethylated recognition site of a methylation-sensitive restriction endonuclease within an extended region can serve as a tag of the methylation status of this region.
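
The way clone sequences are turned into per-CpG methylation calls can be sketched as follows. The reference and clone sequences are hypothetical toy data, and the function names are ours. At each CpG of the unconverted reference, a clone that retains C is scored as methylated and a clone that shows T as unmethylated; with four clones per flank, each CpG receives a fraction in steps of 0.25.

```python
def cpg_positions(reference: str) -> list:
    """0-based indices of the C in every CG dinucleotide."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_fractions(reference: str, clones: list) -> dict:
    """Fraction of clones retaining C at each CpG of the reference."""
    fractions = {}
    for position in cpg_positions(reference):
        calls = [clone[position] == "C" for clone in clones]
        fractions[position] = sum(calls) / len(calls)
    return fractions


# Hypothetical unconverted reference with CpGs at indices 1, 5 and 9;
# the non-CpG cytosine at index 4 is converted to T in every clone.
REFERENCE = "TCGACCGTACGT"
CLONES = [
    "TCGATCGTACGT",  # all three CpGs methylated
    "TCGATCGTACGT",  # all three CpGs methylated
    "TCGATCGTATGT",  # third CpG unmethylated
    "TTGATTGTATGT",  # no CpG methylated
]

FRACTIONS = methylation_fractions(REFERENCE, CLONES)
print(FRACTIONS)
```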

Figure 4. Results of bisulfite sequencing. For each of three LTR-flanking sequences, four clones were sequenced. The methylation results obtained for each of the clones are schematically presented at the bottom of the figure as lines of circles. Filled (black) and empty circles designate methylated and unmethylated CpGs, respectively.

DISCUSSION

Although REs are often considered inert components of the genome, there is evidence of their participation in genome functioning (6,13,18–21). However, genome-wide analysis of the functional status of LTRs is still at the very beginning and restricted to some individual LTRs chosen more or less on a random basis. The novel technique described here allows one to perform systematic genome-wide analysis of the methylation status of CpG sites neighboring LTRs. These sites might serve as tags of methylation of extended regions harboring the LTRs. In turn, the methylation status of a genomic region may reflect its functional status. A rationale for such a suggestion is a widely discussed methylation spread effect (4), i.e. an ability of methylation to spread over adjacent genome regions. The results obtained here with bisulfite sequencing are in agreement with this effect. Indeed, Figure 4 clearly demonstrates clustering of methylated and unmethylated CpGs. Thus we can consider HpaII sites with their methylation status as tags of the methylation status of the wider surrounding region.

The technique described here allowed us to show that some human-specific LTRs are differentially methylated in human tissues, their methylation status being probably linked with the expression of neighboring genes. The technique can be used for analysis of methylation patterns of any low and medium copy number REs of genomes. In particular, it might be useful for studying certain subgroups of such important human genome constituents as ERVs, SINEs and LINEs. Although these RE families are very abundant in the human genome, being represented by ∼4.5 × 10⁵, 1.6 × 10⁶ and 8.7 × 10⁵ individual elements (22), respectively, each family consists of many less abundant subfamilies amplified during various evolutionary periods. Using our technique, each subfamily can be analyzed separately. The youngest HERV, L1 and Alu subfamilies comprise from tens to hundreds of members (23–25). Some of these recently inserted REs are still not fixed in the human species, and it would be interesting to estimate their impact on the functioning of the host genome.

The technique can easily be applied to any family of repeats provided that their copy number is not too high, because the main limitation of the method is the complexity of hybridization probes prepared by selective PCR amplification. High complexity of the probes results in weak hybridization signals.

There are also some limitations to the technique regarding tissue samples due to their heterogeneity that leads to heterogeneous methylation of DNA isolated from different cells of one and the same tissue. However, positive hybridization signals always mean the presence of unmethylated sites in a given sample. Sorting of cells using, for example, a cell sorter might improve the results.

We hope that the technique described here will be helpful for a better understanding of the role of REs in genome functioning and molecular evolution. It is an efficient and inexpensive tool for genome-wide analysis of methylation profiles, and therefore for studying the epigenetic effects of REs, which remain a black box in the analysis of genome functioning.

SUPPLEMENTARY MATERIAL

Supplementary Material is available at NAR Online.


ACKNOWLEDGEMENTS

The authors thank Boris O. Glotov for critical reading of the manuscript and valuable comments. The work was supported by INTAS 01-0759 and the Russian Foundation for Basic Research 01-04-48900 and 2006.200054 grants, and by the Physico-Chemical Biological Program of the Russian Academy of Sciences.

REFERENCES

  • 1. Costello,J.F. and Plass,C. (2001) Methylation matters. J. Med. Genet., 38, 285–303.
  • 2. Jaenisch,R. and Bird,A. (2003) Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nature Genet., 33 Suppl., 245–254.
  • 3. Bird,A.P. (1986) CpG-rich islands and the function of DNA methylation. Nature, 321, 209–213.
  • 4. Turker,M.S. (2002) Gene silencing in mammalian cells and the spread of DNA methylation. Oncogene, 21, 5388–5393.
  • 5. van de Lagemaat,L.N., Landry,J.R., Mager,D.L. and Medstrand,P. (2003) Transposable elements in mammals promote regulatory variation and diversification of genes with specialized functions. Trends Genet., 19, 530–536.
  • 6. Whitelaw,E. and Martin,D.I.K. (2001) Retrotransposons as epigenetic mediators of phenotypic variation in mammals. Nature Genet., 27, 361–365.
  • 7. Mi,S., Lee,X., Li,X., Veldman,G.M., Finnerty,H., Racie,L., LaVallie,E., Tang,X.Y., Edouard,P., Howes,S. et al. (2000) Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis. Nature, 403, 785–789.
  • 8. Domansky,A.N., Kopantzev,E.P., Snezhkov,E.V., Lebedev,Y.B., Leib-Mosch,C. and Sverdlov,E.D. (2000) Solitary HERV-K LTRs possess bi-directional promoter activity and contain a negative regulatory element in the U5 region. FEBS Lett., 472, 191–195.
  • 9. Landry,J.R., Rouhi,A., Medstrand,P. and Mager,D.L. (2002) The Opitz syndrome gene Mid1 is transcribed from a human endogenous retroviral promoter. Mol. Biol. Evol., 19, 1934–1942.
  • 10. Schon,U., Seifarth,W., Baust,C., Hohenadl,C., Erfle,V. and Leib-Mosch,C. (2001) Cell type-specific expression and promoter activity of human endogenous retroviral long terminal repeats. Virology, 279, 280–291.
  • 11. Mager,D. and Medstrand,P. (2002) In Gardiner,K. (ed.), Encyclopedia of the Human Genome. Nature Publishing Group, http://www.ehgonline.net.
  • 12. Medstrand,P., van de Lagemaat,L.N. and Mager,D.L. (2002) Retroelement distributions in the human genome: variations associated with age and proximity to genes. Genome Res., 12, 1483–1495.
  • 13. Sverdlov,E.D. (2000) Retroviruses and primate evolution. Bioessays, 22, 161–171.
  • 14. Sambrook,J. and Russell,D.W. (2001) Molecular Cloning: A Laboratory Manual, 3rd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
  • 15. Buzdin,A., Khodosevich,K., Mamedov,I., Vinogradova,T., Lebedev,Y., Hunsmann,G. and Sverdlov,E. (2002) A technique for genome-wide identification of differences in the interspersed repeats integrations between closely related genomes and its application to detection of human-specific integrations of HERV-K LTRs. Genomics, 79, 413–422.
  • 16. Olek,A., Oswald,J. and Walter,J. (1996) A modified and improved method for bisulphite based cytosine methylation analysis. Nucleic Acids Res., 24, 5064–5066.
  • 17. Khodosevich,K., Lebedev,Y. and Sverdlov,E. (2004) Tissue-specific methylation of human-specific long terminal repeats of endogenous retroviruses. Rus. J. Bioorg. Chem., 30, in press.
  • 18. Brosius,J. (1999) RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene, 238, 115–134.
  • 19. Khodosevich,K., Lebedev,Y. and Sverdlov,E. (2003) Endogenous retroviruses and human evolution. Comp. Funct. Genom., 3, 494–498.
  • 20. Lower,R., Lower,J. and Kurth,R. (1996) The viruses in all of us: characteristics and biological significance of human endogenous retrovirus sequences. Proc. Natl Acad. Sci. USA, 93, 5177–5184.
  • 21. Sverdlov,E.D. (1998) Perpetually mobile footprints of ancient infections in human genome. FEBS Lett., 428, 1–6.
  • 22. Lander,E.S., Linton,L.M., Birren,B., Nusbaum,C., Zody,M.C., Baldwin,J., Devon,K., Dewar,K., Doyle,M., FitzHugh,W. et al. (2001) Initial sequencing and analysis of the human genome. Nature, 409, 860–921.
  • 23. Salem,A.H., Kilroy,G.E., Watkins,W.S., Jorde,L.B. and Batzer,M.A. (2003) Recently integrated Alu elements and human genomic diversity. Mol. Biol. Evol., 20, 1349–1361.
  • 24. Boissinot,S., Chevret,P. and Furano,A.V. (2000) L1 (LINE-1) retrotransposon evolution and amplification in recent human history. Mol. Biol. Evol., 17, 915–928.
  • 25. Gifford,R. and Tristem,M. (2003) The evolution, distribution and diversity of endogenous retroviruses. Virus Genes, 26, 291–315.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
