American Journal of Human Genetics. 2006 May 25;79(1):67–84. doi: 10.1086/504729

Intra- and Interindividual Epigenetic Variation in Human Germ Cells

James M Flanagan 1, Violeta Popendikyte 1, Natalija Pozdniakovaite 1, Martha Sobolev 1, Abbas Assadzadeh 1, Axel Schumacher 1, Masood Zangeneh 1, Lynette Lau 1, Carl Virtanen 1, Sun-Chong Wang 1, Arturas Petronis 1
PMCID: PMC1474120  PMID: 16773567

Abstract

Epigenetics represents a secondary inheritance system that has been poorly investigated in human biology. The objective of this study was to perform a comprehensive analysis of DNA methylation variation between and within the germlines of normal males. First, methylated cytosines were mapped using bisulphite modification–based sequencing in the promoter regions of the following disease genes: presenilins (PSEN1 and PSEN2), breast cancer (BRCA1 and BRCA2), myotonic dystrophy (DM1), and Huntington disease (HD). Major epigenetic variation was detected within samples, since the majority of sperm cells of the same individual exhibited unique DNA methylation profiles. In the interindividual analysis, 41 of 61 pairwise comparisons revealed distinct DNA methylation profiles (P=.036 to 6.8 × 10⁻¹⁴). Second, microarray-based epigenetic profiling of the same sperm samples was performed using a 12,198-feature CpG island microarray. The microarray analysis identified numerous DNA methylation–variable positions in the germ cell genome. The largest degree of variation was detected within the promoter CpG islands and pericentromeric satellites among the single-copy DNA fragments and repetitive elements, respectively. A number of genes, such as EED, CTNNA2, CALM1, CDH13, and STMN2, exhibited age-related DNA methylation changes. Finally, allele-specific methylation patterns in CDH13 were detected. This study provides evidence for significant epigenetic variability in human germ cells, which warrants further research to determine whether such epigenetic patterns can be efficiently transmitted across generations and what impact inherited epigenetic individuality may have on phenotypic outcomes in health and disease.


Phenotypic differences among individuals have traditionally been attributed to genetic (DNA sequence) variation and environmental differences. Over the past several decades, documentation of DNA sequence variants has been one of the top priorities in biomedical research. Numerous major international projects—from the sequencing of the Human Genome1,2 to the creation of SNP databases (dbSNP, now called “Entrez SNP”) and the Haplotype Map3 (HapMap)—have contributed significantly to the understanding of the position, degree, and structure of DNA polymorphisms. However, SNPs and other DNA sequence differences are relatively rare, and DNA sequences of two unrelated individuals are 99.5% identical. Furthermore, only a small fraction of these polymorphisms are functional—that is, polymorphisms that change amino acid sequence in the protein or have an impact on gene expression. Sequencing of the chimpanzee (Pan troglodytes) genome revealed 98.67% DNA sequence identity to the human genome, and, again, only a fraction of polymorphisms appear to result in structural or functional gene differences.4 Such findings raise the question: is this low DNA sequence variation across unrelated individuals and our closest related species sufficient to account for all major differences in physiological and psychological phenotypic outcomes?

One potential, although poorly investigated, source of phenotypic differences is epigenetic variation. By definition, “epigenetics” refers to the regulation of various genomic functions that are controlled by partially stable modifications of DNA and chromatin proteins.5 Epigenetic signals are critical to the proper functioning of the genome, as seen in Dnmt1-knockout mice that die in early embryogenesis,6 in several rare pediatric syndromes, and in cancer.7 One important feature of epigenetic regulation is partial epigenetic stability, or metastability. Epigenetic profiles in different cells of the same organism can be quite different, and developmental programs, environmental factors, or stochastic events in the nucleus of a cell can induce this variation. The first systematic effort to document DNA methylation differences and similarities across different genome regions, the Human Epigenome Project, was recently launched. The pilot study of the MHC locus on chromosome 6 investigated seven cell types (adipose, brain, breast, lung, liver, prostate, and muscle) across 32 individuals.8 In this study, which was not controlled for sex and age, around half (118/253) of the tested loci showed some interindividual variability in at least one tissue. The next phase of the Human Epigenome Project, which will be controlled for the above parameters, will provide coverage of >5,000 loci (∼3,000 genes) from chromosomes 6, 13, 20, and 22 across >20 tissue types.8 The Human Epigenome Project and other smaller-scale studies have investigated epigenetic variation primarily in somatic cells. However, there has been very little effort to document epigenetic variation in the germline, apart from imprinted genes9,10 and isolated cases of germ cell epimutations.11,12

There are several reasons to believe that the germline may contain substantial epigenetic variation. Epigenetic reprogramming during gametogenesis, fertilization, and embryogenesis involves dramatic chromatin remodeling.13 Methylation reprogramming during gametogenesis involves the erasure and reestablishment of methylation of imprinted genes and other nonimprinted genes and, then, a second wave of reprogramming during fertilization (paternal) and embryogenesis (maternal).13 This process is thought to (1) ensure that both gametes acquire the appropriate sex-specific epigenetic states and establish the epigenetic states required for early embryonic development and toti- or pluripotency and (2) allow the erasure of epimutations that adult germ cells may have inherited or developed during their lifetime.14,15 In parallel with DNA methylation, chromatin changes during spermatogenesis involve the compaction of the haploid genome by the replacement of the core histones via transition proteins to the much smaller basic protamines 1 and 2.16 However, a number of testis-specific histones and histone variants (such as TSH2B, histones H2A, H3, and H4, variants of H2B, and CENP-A) are present, to some extent, in the mature spermatozoa.17–19 How these remaining histones are arranged and to what extent interindividual variability in histone placement and modification can affect development and phenotype are subjects yet to be investigated. Despite dramatic changes, not all epigenetic signals are erased in the germline, and recent studies in mice have suggested that this phenomenon could underlie epigenetic inheritance.20,21 Therefore, there is ample opportunity during these phases of reprogramming to either maintain or generate substantial epigenetic variability in the germ cells.

Although there is evidence that some individual loci exhibit partial epigenetic stability during meiosis in mice and in other organisms, to further understand the degree, mechanisms, and importance of epigenetic inheritance across generations in humans, three main questions need to be addressed in the following order. (1) Is there any evidence for epigenetic variation in the germ cells? (2) To what extent is the epigenetic variation meiotically stable? (3) What is the impact of epigenetic variation in germ cells on phenotypic differences? In this study, we attempted to answer the first question by estimating the intra- and interindividual epigenetic variation detectable in the mature sperm of healthy individuals. For this goal, we used two different laboratory strategies. The first approach focused on promoter regions of several disease-related genes—such as PSEN1 (MIM 104311), PSEN2 (MIM 600759), BRCA1 (MIM 113705), BRCA2 (MIM 600185), DM1 (MIM 160900), and HD (MIM 143100)—in healthy individuals, with the use of bisulphite modification–based mapping of methylated cytosines and measured epigenetic “distances” between individuals. The second strategy was to perform a microarray-based epigenetic profiling of sperm DNA with the use of a CpG island microarray, which provides genomewide information on methylation variability across different unique and repetitive DNA sequences. Several loci of interest identified in the microarray experiments were further investigated using methylation-sensitive single-nucleotide primer extension (MS-SNuPE) reaction.

Material and Methods

Samples

Two sperm sample sets were used in this study. The first sample set was received from the Fairfax Cryobank, Genetics & IVF Institute, in Fairfax, VA, and consisted of 25 sperm samples collected from healthy white sperm donors at an average age of 27 years (range 22–35 years). The second set of sperm samples was collected at the Centre for Addiction and Mental Health in Toronto from 21 healthy white individuals at an average age of 39 years (range 24–56 years). This study was approved by an institutional ethics board, and informed consent was obtained from all participants. Some aspects of sperm DNA data analysis required a nonsperm tissue of reference; for this purpose, postmortem brain tissues were used (donated by The Stanley Medical Research Institute’s brain collection, courtesy of Drs. M. B. Knable, E. F. Torrey, M. J. Webster, S. Weis, and R. H. Yolken). These brain samples were from 22 white males who had an average age at death of 46 years (range 31–66 years). Extraction of DNA was performed using standard salt and phenol/chloroform extraction.

Bisulphite Modification–Based Mapping of Methylated Cytosines

Bisulphite modification–based mapping of methylated cytosines was performed as described elsewhere.22 In brief, genomic DNA (700 ng) was digested with BglII (Fermentas) for 1 h at 37°C, was denatured at 100°C for 5 min, was chilled on ice, and was then incubated at 50°C for 15 min in 0.3 M NaOH. The DNA was then mixed with 2% low–melting point agarose (SeaPlaque Agarose, FMC) and was dropped into ice-cold mineral oil to form seven beads of ∼10 μl, and, finally, the beads were placed into a freshly prepared solution containing 2.5 M sodium bisulphite (pH 5.0) plus 1 mM hydroquinone (both from Sigma). The beads were then incubated on ice for 30 min, followed by incubation at 50°C for 3.5 h. The beads were washed in four changes of Tris-EDTA (TE) (pH 8.0) for 1 h and then were desulphonated in 0.2 M NaOH for 30 min. After desulphonation, the beads were washed a second time in three changes of TE for 30 min. Before amplification, the beads were washed in H2O for 30 min. PCR amplification of the target sequences consisted of 5 μl of agarose beads containing the bisulphite-treated DNA, 2 mM MgCl2, 0.2 mM deoxynucleotide triphosphates (dNTPs), 0.4 μM each of forward and reverse primer, 250 ng/ml BSA, and 2.5 U Taq polymerase (New England Biolabs) in 1× PCR buffer, to a total volume of 50 μl. PCR was performed using either a seminested or a fully nested approach, with the first PCR consisting of one cycle at 97°C for 4 min, 53°C for 2 min, and 72°C for 2 min, followed by 24 cycles at 94°C for 45 s, 53°C for 1 min, and 72°C for 1 min. The second PCR used 5 μl of the first PCR as a template and consisted of one cycle at 97°C for 2 min, 53°C for 2 min, and 72°C for 1 min, followed by 24 cycles at 94°C for 45 s, 55°C for 45 s, and 72°C for 1 min. 
CpG islands in the 5′ promoter sequences were analyzed in six genes: PSEN1 (Entrez GeneID 5663; chromosome 14: 72672525–72673163), PSEN2 (GeneID 5664; chromosome 1: 223365273–223365990), BRCA1 (GeneID 672; chromosome 17: 38530561–38531181), BRCA2 (GeneID 675; chromosome 13: 31787367–31788153), HD (GeneID 3064; chromosome 4: 3113281–3113816), and DM1 (GeneID 1760; chromosome 19: 50964670–50965254). An intronic CpG island within the CDH13 gene (GeneID 1012; chromosome 16: 81218597–81218988) was also analyzed by bisulphite genomic sequencing. Nucleotide positions given are from the May 2004 Genome (hg17) version (UCSC Genome Browser). The primers used for amplification of bisulphite-modified DNA fragments are available in table 1.

Table 1. Primer Sequences

Gene and Primer Name  Methodᵃ  Sequence (5′→3′)
BRCA1:
 prBRCA1_for Bis-PCR GAGTAGAGGTTAGAGGGTAGGT
 prBRCA1_rev1 Bis-PCR CAAAACATATTCCAATTCCTATCAC
 prBRCA1_rev2 Bis-PCR TCAATACCCCCAACCTAATCCTC
BRCA2:
 prBRCA2_for Bis-PCR GGGGGATTTGGAGTAGGTATAGG
 prBRCA2_rev1 Bis-PCR CACTTCCCCAAAACAACAATATTCC
 prBRCA2_rev2 Bis-PCR AACCCACTACCACCACCACTAACC
PSEN1:
 prPSEN1for_1 Bis-PCR GATTTATTGAGTGGTGGGAGAG
 prPSEN1for_2 Bis-PCR TGATAGGTGTTAAATTTAGGATGG
 prPSEN1rev Bis-PCR CCCCTCATCTTTTAAAACACC
PSEN2:
 prPSEN2_for Bis-PCR GGGGTGGAGAGAGGAGAGTGT
 prPSEN2_rev1 Bis-PCR AAATACAATTACTTCACTCAACACC
 prPSEN2_rev2 Bis-PCR AACTCTATAACCTCAATTTCTTCATC
HD:
 prHDfor_1 Bis-PCR GGTTTTTTGGTTAGTTATTGGTAGAG
 prHDfor_2 Bis-PCR GTAGGTTAGGGTTGTTAATTATGTTGG
 prHDrev_1 Bis-PCR CAATACAACAACTCCTCAACCACAACC
DM1:
 prDM1for_1 Bis-PCR GTGGATGGGTAAATTGTAGGTTTGG
 prDM1rev_1 Bis-PCR AACATTCCCAACTACAAAAACCCTTC
 prDM1rev_2 Bis-PCR CTTTTCCTCCCCCAACCCTAATTC
CDH13:
 CDH13 44g5F1 Bis-PCR ATAAAATTTAAGTTAGGATGGGAGATATAG
 CDH13 44g5R1 Bis-PCR ATAAATAAACCAAAACAATACTTTACCTA
 CDH13 44g5F2 Bis-PCR TTTGGTATTTAGTAGTTGTTTAATAAAGTT
 CDH13 44g5R2 Bis-PCR TACAAAATATCATACTCTAATCACTAAACC
ARH1:
 22_F_7F1 Bis-PCR AGTGGATATTGTGTTAATTTTTAAG
 22_F_7R1 Bis-PCR ATATCCTATCTTTTCAATATCATTT
 22_F_7F2 Bis-PCR TATTTAAAGTGTAGTTGTAGGATTTGTTAG
 22_F_7R2 Bis-PCR ACTACTTTTAAACTTTCATCTAAAATCAAT
NELL2:
 44_A_4F1 Bis-PCR TTGTTTTATTATAATGTTGATGAGA
 44_A_4R1 Bis-PCR TTCTTAAACTCTACATTACCTCTATTT
 44_A_4F2 Bis-PCR GGGGTTATAGTTTGTGTGGAGT
 44_A_4R2 Bis-PCR CAATATATAATAAAATACAATAAAACCTCCTC
 44_A_4F3 Bis-PCR TTATTTATTATGGGTTTTTGTTTG
 44_A_4R3 Bis-PCR CTACTCTTTCTTATAAAAATAACTAAATTC
 44_A_4F4 Bis-PCR GTGTAGTGGTGTAATTATGTTTTATTGTAG
 44_A_4R4 Bis-PCR TCTAAACATACTAACTCATACCTATAATCC
RHOQ:
 50_H_2F1 Bis-PCR ATTAGATTTGGAGTTTGAGAGTTTAG
 50_H_2R1 Bis-PCR AAAAACAAAAACCCTAACTTATTACAT
 50_H_2F2 Bis-PCR GATAGTGAGGAATGGGAATATAATAG
 50_H_2R2 Bis-PCR CCATAACACAAACTTTAATCATTTACTA
SCAM:
 31_F_12F1 Bis-PCR GTATAGAGGAAAGGAATGTTATTTTTATT
 31_F_12R1 Bis-PCR TAATACTAAAACTCTAAATAATCACCCAAA
 31_F_12F2 Bis-PCR TTTTGTGTTATTAGGTTGGAATTT
 31_F_12R2 Bis-PCR AATACTTTCCTCATATACCTCCTCTAC
PDE:
 9_H_3F1 Bis-PCR GGAGATGAAATGGGTAATATTTTT
 9_H_3R1 Bis-PCR ATATTCTTATAACTACCATCAACCAAAAC
 9_H_3F2 Bis-PCR GAGGAGAGTTGGTTAGTTAAGATTT
 9_H_3R2 Bis-PCR AAACTCAACTTAAATTCCAAAAATAC
MKL:
 22_A_4F1 Bis-PCR TTAGTGTTTATTTTGATTGTAGAGTTG
 22_A_4R1 Bis-PCR ATTCTAACAAAATTAAAAACCACTATTC
 22_A_4F2 Bis-PCR GTTTATGGAGTTTTTGTTGTGTG
 22_A_4R2 Bis-PCR CTAAACTCCAATATTCCACTTCATTA
NEIL:
 43_B_5F1 Bis-PCR GTAGAAGAGGATTAGGTATTTAATTGGTTA
 43_B_5R1 Bis-PCR AACTATAAACCTCTAACACCTCCTAACT
 43_B_5F2 Bis-PCR AAAGTTTGATGAGGGGAAATAGTA
 43_B_5R2 Bis-PCR ATTCTAACCCACTACTACCAACTTATT
DSCAM:
 103_C_9F1 Bis-PCR ATATTTTATTGATGATAGAAGAGAAGGTAG
 103_C_9R1 Bis-PCR AACCTACTAATACAATACAAAATATAACCA
 103_C_9F2 Bis-PCR GGAGATGTAGGTAATATGTGTATTTAGTT
 103_C_9R2 Bis-PCR ACATTAAAACACTTTCCTAAAATAACAA
 103_C_9F3 Bis-PCR AAAGTTTAGTTGGATTTATAGTTTT
 103_C_9R3 Bis-PCR TTACTATATTAATCTATTTCCACACTACTA
 103_C_9F4 Bis-PCR TAGAATTATGGTGGGAGGTAAAAG
 103_C_9R4 Bis-PCR TAAATTAAACTTACAATTCCACATAACTAA
OLR1:
 53_H_11F1 Bis-PCR TTAATTTTTGTATTTTTAGTAGAGATAGGG
 53_H_11R1 Bis-PCR ATTACAATAAACTAAAATCACACCACTAC
 53_H_11F2 Bis-PCR TTTTTAAAGTGTTAGGATTATAGG
 53_H_11R2 Bis-PCR AAAACTTAAATCCACCAAAAA
 53_H_11F3 Bis-PCR TATTTGATTTTAATTTTTGGAGATG
 53_H_11R3 Bis-PCR ATAAATAAACTTCTTAAACTCCTCATATTT
 53_H_11F4 Bis-PCR TTTAGGTTGGAATGTAGTGGTTT
 53_H_11R4 Bis-PCR ACCAATCTACCTCCATTAACTCTATT
AHR:
 55_F_9F1 Bis-PCR GTTTTGTTAGAATGTTTTAAAGTTGTTT
 55_F_9R1 Bis-PCR CAAATAACTCCCACTTTTAATAAATATC
 55_F_9F2 Bis-PCR GATGTGTATAGGTATTTTTATATTTATTTTTAGG
 55_F_9R2 Bis-PCR ATTCTATCAATTACCAATATCCACATACT
 55_F_9F3 Bis-PCR GGATATAAGTTATGGAAATAATTAGAAAAT
 55_F_9R3 Bis-PCR CTAATCAACACAAACAATATATACATAAAA
 55_F_9F4 Bis-PCR AGTATTATAGGAATTTGAAGTAGAGAAAAA
 55_F_9R4 Bis-PCR TTTACACAATATTTACTTCAATTATTTACC
CDH13:
 44_G_5F1 Bis-PCR ATAAAATTTAAGTTAGGATGGGAGATATAG
 44_G_5R1 Bis-PCR ATAAATAAACCAAAACAATACTTTACCTA
 44_G_5F2 Bis-PCR TTTGGTATTTAGTAGTTGTTTAATAAAGTT
 44_G_5R2 Bis-PCR TACAAAATATCATACTCTAATCACTAAACC
 44_G_5F3 Bis-PCR GGATAGTTTTGATGTTGTATAATAAATAGT
 44_G_5R3 Bis-PCR ATAAATTTAACCAAATCTATATCTCAAAAC
 44_G_5F4 Bis-PCR TAATTTTAAGGATAGTGATTATGTAATTGG
 44_G_5R4 Bis-PCR ACTCCCATATCCCACCAAAA
FBN1:
 52_A_2F1 Bis-PCR ATTTTATGATAGATAGGATATAGGTATTGA
 52_A_2R1 Bis-PCR TACTAAATATATACAAATAAACATCCTTCC
 52_A_2F2 Bis-PCR GTAGTAGGGTAGAAATTTATAGTTAGGTTT
 52_A_2R2 Bis-PCR CCACTTTTATCCACCTATTTTCTAAT
 52_A_2F3 Bis-PCR TTTTTATTATTTTTAGATTGATGGTAGG
 52_A_2R3 Bis-PCR TAATATAAAACTACCTTTCAAATATCACAT
 52_A_2F4 Bis-PCR TGATTTAAATAATGAGAATAGATAGGTTT
 52_A_2R4 Bis-PCR TACCACTACAAACTTAATACTTTAATAACC
ARH1:
 22_F_7Sp1 MS-SNuPE (GACT)X6GYGGGTGGTTTTTGYGTAATYG
 22_F_7Sp2 MS-SNuPE GACTTTTAGGGATAATTYGTTTATAAATTTTTATTG
 22_F_7Sp5 MS-SNuPE AGYGGGGTGYGGGGG
 22_F_7Sp6 MS-SNuPE GGGTAATAGGTATAGATTTYGTTT
 22_F_7Sp3 MS-SNuPE GACTTAYGAATTAAGTAGTTTAGAAGATAAATG
 22_F_7Sp4 MS-SNuPE (GACT)X3TTTATTTTTTYGYGGTTTTATATTTYGATTTG
 22_F_7Sp7 MS-SNuPE GGTTGGTTTAAYGTAGAGYGG
 22_F_7Sp8 MS-SNuPE (GACT)X5AATAAGTTTTAGGTAAAATTTTGTTAATAAAAAT
NELL2:
 44_A_4Sp1 MS-SNuPE GTTTTTGTTTTGTTTTTTGGGTTGTT
 44_A_4Sp2 MS-SNuPE GACTGACTGTAGGGTTTTATTGTGTTATTATGTT
RHOQ:
 50_H_2Sp4 MS-SNuPE ATTTGGTTAGATGGGTGTGTTT
 50_H_2Sp3 MS-SNuPE (GACT)X3GGTGTTAGYGTAGAGTGTATTTT
 50_H_2Sp2 MS-SNuPE (GACT)X6AGGGTGATGGTTTAGTTGATTTT
 50_H_2Sp5 MS-SNuPE TAGAGGGGTAGGGGTTGT
 50_H_2Sp7 MS-SNuPE GACTGAGTTTAGGTATATTTGTYGGGTT
 50_H_2Sp1 MS-SNuPE (GACT)X4GTAGTTGGAAGTTTTTGGGATAAT
 50_H_2Sp6 MS-SNuPE TTTTAGAGTATTAGTGTGTATTAGTTTTT
 50_H_2Sp8 MS-SNuPE GACTGACTTGGTTTAATAYGTTGTATTTTTTTTAGTTAT
SCAM:
 31_F_12Sp5 MS-SNuPE GTTGTGGTTAYGAGGGGG
 31_F_12Sp1 MS-SNuPE GTTTTTTTTTAGGTTTTAAAGTGGATAG
 31_F_12Sp6 MS-SNuPE GACTGACTTGTGTTTATTTATTGGAAGATGTTGT
 31_F_12Sp4 MS-SNuPE GTTGGGTTTYGAGGTTGTGT
 31_F_12Sp3 MS-SNuPE GACTGACTTTATAGGYGTTTTTGGTTAGGAG
 31_F_12Sp2 MS-SNuPE (GACT)X3TGGTYGTTTTATTTTYGGTTTTATAAGAT
PDE:
 9_H_3Sp8 MS-SNuPE YGGYGGGAGYGATGGAG
 9_H_3Sp4 MS-SNuPE GACTGAGATTTGAATGAGTTAAAGTYGG
 9_H_3Sp5 MS-SNuPE (GACT)X4ATAGTAGGTYGTTGATYGGTYG
 9_H_3Sp3 MS-SNuPE (GACT)X4TYGGAAGTATTTTATTTTTTTTTTTYGTTAGT
 9_H_3Sp7 MS-SNuPE GGYGGTGGAGAAGTTGAGT
 9_H_3Sp2 MS-SNuPE GACTGAGTATTATAGATATGYGTGTTTAYG
 9_H_3Sp6 MS-SNuPE (GACT)X5AGGTTTTTAGGYGTTYGYGTYG
 9_H_3Sp1 MS-SNuPE (GACT)X5AGYGGAATAYGTGATATTATTTTATTTATTTT
MKL:
 22_A_4Sp3 MS-SNuPE GGYGGTGGGAGGGGAT
 22_A_4Sp5 MS-SNuPE GACTGACTGAGGTGGGTTGAGAGTAG
 22_A_4Sp2 MS-SNuPE (GACT)X4TGTATAGAGAGGTGGAGYGTT
 22_A_4Sp4 MS-SNuPE AGGGGGTGTTTGGGAG
 22_A_4Sp1 MS-SNuPE GACTTTGTGTGTYGGGTATTTAAGTTG
NEIL:
 43_B_5Sp6 MS-SNuPE GTYGGYGGTTTTGGAGGG
 43_B_5Sp5 MS-SNuPE GATTAGAGATAATTGTTTGTAGTTATGT
 43_B_5Sp3 MS-SNuPE (GACT)X3TGGGAGTTTTTTTTGTTYGAATAGTT
 43_B_5Sp1 MS-SNuPE TYGGTATTAGYGAGTGTAAGATG
 43_B_5Sp2 MS-SNuPE (GACT)X3AATAGGTGGTTAGGTAGTTGTTT
 43_B_5Sp4 MS-SNuPE (GACT)X4TGTATATTATTTAGATTGTTTTATGTAGG
DSCAM:
 103_C_9Sp1 MS-SNuPE TAGTTTTTTATTGAAGAGTGTTTAATTATTTT
 103_C_9Sp2 MS-SNuPE GGTAAAAGGTATTTTTTATATGGTAG
 103_C_9Sp3 MS-SNuPE GTTTTTGTTTTTATTTTATTTTTATTTTTTTTTTGT
OLR1:
 53_H_11Sp1 MS-SNuPE TGTTAGGATTATAGGTATGAGTTAT
 53_H_11Sp2 MS-SNuPE GGTTGGTTTTAAATTTTTTATTTTAAGTTATT
 53_H_11Sp3 MS-SNuPE TTGGGATTATAGGTGTGAGTTAT
 53_H_11SpN4 MS-SNuPE TAATTCTAATTAAAAATTTAAAACTTCTTACC
 53_H_11SpN5 MS-SNuPE AACATAAAAATTACTTAAACCCTAAAAAC
 53_H_11SpN6 MS-SNuPE CCTCTCAAAAAAAAAAAAAAAAAATTAACC
 53_H_11Sp7 MS-SNuPE GACTGACTGTGGTATATTTTYGGTTTATTGTAATTTT
 53_H_11Sp12 MS-SNuPE GATTATAGGTGTGAGTTATTGTGT
 53_H_11SpN8 MS-SNuPE AAACAAAAAAATCCCTCRAACCC
 53_H_11SpN9 MS-SNuPE CTAAAAATACAAAAATTAACCAAATATAATAAC
 53_H_11Sp10 MS-SNuPE GTTTTGAATTTTTGATTTYGGGTGATT
 53_H_11SpN11 MS-SNuPE CCCAACACTTTAAAAAACCRAAAC
AHR:
 55_F_9Sp1 MS-SNuPE GACTAGGTATTATATTTTGTAAAGTGGTTTTTT
 55_F_9Sp2 MS-SNuPE TGTATTTATAGTTTGGGAGGAAG
 55_F_9Sp3 MS-SNuPE TAGGAAAAGTGATAAGTTTATTTGG
CDH13:
 44_G_5Sp1 MS-SNuPE GACTGACTGYGTGATTTYGGTTTATTGTAAGTTT
 44_G_5Sp3 MS-SNuPE TTTYGAGTAGTTGGGATTATAGG
 44_G_5SpN2 MS-SNuPE AAACTAAAACAAAAAAATAACRTAAACCC
 44_G_5SpN4 MS-SNuPE ATCTCTACTAAAAATACAAAAAATTAACC
 44_G_5SpN5 MS-SNuPE ATAATCCTAACACTTTAAAAAACTAAAAC
 44_G_5Sp6 MS-SNuPE GTTAGGATTATAGGYGTGAGTTAT
 44_G_5SpN7 MS-SNuPE ATTTTAACTTTTTTAAAAAAAAAAAAAAAAACCC
 44_G_5Sp8 MS-SNuPE TGGTATTGGAGTTTGTGGTAG
FBN1:
 52_A_2Sp1 MS-SNuPE TTGTTGGATTGTAAAGGTTATTTATG
 52_A_2Sp2 MS-SNuPE TGAATTTTTGTAATGTAGAGTTTGTATT
ᵃ Bis-PCR = treated with sodium bisulfite and PCR amplified.

PCR products were electrophoresed on an agarose gel. DNA fragments were excised, were cleaned using Qiagen Gel Extraction Kit (Qiagen), and were cloned into the pGEM-T vector (Promega). Thirty clones from each PCR product (locus/individual) were sequenced. To evaluate the degree of intraindividual variation, we sequenced an additional 30 clones from separate bisulphite reactions in five cases: two in BRCA1, one in BRCA2, and two in PSEN2. A total of 1,020 clones were analyzed, which required >1,500 sequencing reactions, since some longer fragments had to be sequenced from both ends.

Analyses of DNA Methylation Variation in Bisulphite Modification–Based Experiments

The degree of epigenetic diversity within and across individuals was evaluated using the concept of epigenetic “distance.”23 Each of the 30 sequenced clones was binary coded, with “0” for an unmethylated cytosine and “1” for a methylated cytosine. Each clone was, therefore, represented by a row vector of n zeros and ones, where n is the number of cytosines in the tested region.

Estimation of intraindividual variation

Unique methylation profiles were identified for each set of 30 clones. For example, a set of clones 0101, 0101, 0111, and 1100 exhibits three types of methylation profiles (1/2, 3, and 4), and, therefore, the proportion of unique methylation profiles is 3/4. This calculation was performed for every set of 30 clones, and then the mean and SD of the proportion of unique clones across individuals were calculated for each locus. In the second round of analysis, because of possible imperfect C→T conversion with bisulphite treatment, two clones different by a single position were treated as identical. With use of the above example, profiles 0101 and 0111 are now treated as identical, and the degree of uniqueness is 2/4. In the final analysis, the tolerance was increased to two differences—that is, the clones that exhibited two or fewer differences were treated as identical.
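The counting rule above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names `hamming` and `proportion_unique` are hypothetical, and the greedy grouping (each clone is compared, in input order, with the profiles already kept) is an assumption, since the text does not say how clones were grouped when a tolerance is allowed.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same cytosine positions")
    return sum(x != y for x, y in zip(a, b))


def proportion_unique(clones, tolerance=0):
    """Fraction of clones that count as a distinct methylation profile.

    Assumption: clones are grouped greedily in input order. A clone starts a
    new profile only if it differs from every profile kept so far by more
    than `tolerance` positions.
    """
    representatives = []
    for clone in clones:
        if all(hamming(clone, rep) > tolerance for rep in representatives):
            representatives.append(clone)
    return len(representatives) / len(clones)


# The four-clone example from the text.
clones = ["0101", "0101", "0111", "1100"]
print(proportion_unique(clones, tolerance=0))  # 0.75 (3/4)
print(proportion_unique(clones, tolerance=1))  # 0.5  (2/4)
```

Because "differs by at most k positions" is not a transitive relation, a different grouping order could give a slightly different count on real data; the greedy rule is simply one reasonable reading.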

Comparison of DNA methylation distances across individuals

The average methylation-intensity vector for each locus/individual was calculated by dividing the sum of the methylated cytosines by 30 for each different cytosine position. The degree of epigenetic dissimilarity was measured by Euclidean distance, by use of the following equation:

$$d_{12} = \lVert m_1 - m_2 \rVert = \sqrt{\sum_{i=1}^{n} \left(m_{1,i} - m_{2,i}\right)^{2}}$$

where m1 is the average methylation vector of individual 1, m2 is the average methylation vector of individual 2, and d12 is the Euclidean DNA methylation distance between individuals 1 and 2. The larger the distance, the more dissimilar the two individuals’ methylation profiles are to each other. With this metric, we calculated the distances between all possible pairs of individuals for each promoter locus of BRCA1, BRCA2, HD, DM1, PSEN1, and PSEN2. To test statistical significance of methylation differences, we performed the following analysis. For each locus, all clones from all individuals were pooled together, and two sets of 30 randomly selected clones from the pool formed the methylation profiles of two pseudo-individuals. The epigenetic distance between the two pseudo-individuals was then calculated with the same procedure as above, and this procedure was repeated 100,000 times to generate 100,000 distances, the density distribution of which was plotted, and the mean (±2 SD) was calculated. The (one-tailed) P value of a distance was then obtained by finding the area under the distribution curve, from the left up to the calculated distance. An epigenetic distance in two real individuals with P<.05 (i.e., >2 SD) indicates that difference in the DNA methylation of two individuals is statistically significant.
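The distance calculation and the pseudo-individual resampling described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, each pseudo-individual is drawn from the pool without replacement and independently of the other, and the P value is computed as the fraction of resampled distances at least as large as the observed one (that is, one minus the area to the left of the observed distance). The toy data and the reduced number of resamples are for demonstration only.

```python
import math
import random


def mean_methylation(clones):
    """Average methylation at each cytosine position across binary-coded clones."""
    n_clones = len(clones)
    return [sum(column) / n_clones for column in zip(*clones)]


def euclidean_distance(m1, m2):
    """Euclidean distance between two average methylation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m1, m2)))


def null_distances(pool, set_size, n_resamples, rng):
    """Distances between pairs of pseudo-individuals drawn from the pooled clones."""
    distances = []
    for _ in range(n_resamples):
        # Assumption: each pseudo-individual is sampled without replacement,
        # independently of the other.
        pseudo_1 = rng.sample(pool, set_size)
        pseudo_2 = rng.sample(pool, set_size)
        distances.append(
            euclidean_distance(mean_methylation(pseudo_1), mean_methylation(pseudo_2))
        )
    return distances


def upper_tail_p(observed, null):
    """Fraction of null distances that are at least as large as the observed one."""
    return sum(d >= observed for d in null) / len(null)


# Toy data: two individuals, 30 clones each, 4 cytosine positions.
individual_a = [[0, 0, 1, 0]] * 30
individual_b = [[1, 1, 1, 0]] * 30
observed = euclidean_distance(
    mean_methylation(individual_a), mean_methylation(individual_b)
)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
null = null_distances(individual_a + individual_b, 30, 2000, rng)
print(round(observed, 3))  # 1.414
print(upper_tail_p(observed, null))
```

In the study itself the pool contained the clones of all individuals at a locus and 100,000 resamples were drawn; the sketch pools only two toy individuals to keep the example short.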

Microarray-Based DNA Methylation Analysis

Microarrays

Genomewide epigenetic profiling was performed using the 12,192 CpG island microarrays24 purchased from the University Health Network Microarray Facility in Toronto.

Enrichment of unmethylated DNA

We used our developed technology for enrichment of the unmethylated DNA fraction and for epigenetic profiling described in detail elsewhere.25 The general principle of the DNA methylation profiling consists of interrogation of the unmethylated fraction of genomic DNA on the microarray. Intensity of hybridization inversely correlates with the DNA methylation status at the genomic locus homologous to a specific DNA fragment on the array. In brief, methylation-sensitive restriction enzymes were used to digest 1 μg of genomic DNA, and two enzyme scenarios were used in this project. First, sperm DNA samples from 25 individuals were analyzed using methylation-sensitive enzymes HpaII, Hin6I, and AciI (designated “sperm DNA–HHA array” set). This enzyme “cocktail” strategy, however, is not ideal for GC-rich regions, such as CpG islands, since these three enzymes would generate DNA fragments too small for efficient amplification and hybridization. Therefore, a single-digestion approach with HpaII alone was used on a second set of sperm DNA samples from 21 individuals (designated “sperm DNA–HpaII” array set). DNA adaptors (annealing products of two primers, U-CG1a [5′-CGTGGAGACTGACTACCAGAT-3′] and U-CG1b [5′-AGTTACATCTGGTAGTCAGTCTCCA-3′]) were ligated to the restricted DNA fragments, followed by treatment with McrBC (New England Biolabs), which cleaves the fragments containing two or more methylated cytosines, thereby further enriching the unmethylated fraction. 
Adaptor-PCR amplification of the ligated products, with the use of primers complementary to the adaptor sequence, consisted of 250 ng of ligated DNA, 2.5 mM MgCl2, 0.2 mM aminoallyl-dNTPs (15 mM aminoallyl–2′-deoxyuridine 5′-triphosphate, 10 mM 2′-deoxythymidine 5′-triphosphate, and 25 mM each of 2′-deoxycytidine 5′-triphosphate, 2′-deoxyguanosine 5′-triphosphate, and 2′-deoxyadenosine 5′-triphosphate), 200 pmol primer U-CG1b, and 5 U Taq polymerase (New England Biolabs) in 1× PCR reaction buffer (Sigma), to a final volume of 100 μl. PCR conditions are adjusted in such a way that only fragments <1.5 kb (i.e., short, digested, and, therefore, unmethylated) will amplify preferentially. Cycling consisted of an initial cycle at 72°C for 5 min and 95°C for 1 min, 25 cycles at 95°C for 40 s and 68°C for 2 min 30 s, and a final extension at 72°C for 5 min. Equal amounts of amplicons from each sample were mixed to form the pooled control, which was labeled with Cy3 and was cohybridized against each individual amplicon labeled with Cy5. Hybridization was performed at 42°C with the use of standard procedure.25

For comparison with the sperm DNA methylation profiles, DNA samples from postmortem brains of 22 individuals who did not have any known brain disease were subjected to the same microarray-based DNA methylation profiling that used a single-digestion approach with HpaII (designated “brain DNA–HpaII” array set).

Microarray data processing and analysis

Methylation differences between the individuals and the pooled control were analyzed by the ratio of hybridization intensities of Cy5 (individual samples) over Cy3 (pooled control). As we have learned from our previous analyses of arrays used for DNA methylation analysis, such ratios show normal distribution; therefore, the data can be treated similarly to those in classical microarray experiments. The array data were normalized in two steps—first, in a global intensity normalization, to adjust the Cy5:Cy3 ratio to 1:1 across the entire array, followed by a block-by-block LOWESS normalization. The data were trimmed to remove spots with ambiguous genome locations, including spots with no sequence or annotation (647 spots), spots with >30% repetitive elements (2,706 spots), and translocation hotspots (633 spots). The spots for which the microarray clones represented identical sequences were averaged, which resulted in ∼4,970 unique loci. Coefficient of variation (CV) was calculated for each remaining spot by dividing the SD in Cy5/Cy3 by the mean of Cy5/Cy3 across all individuals. The sperm DNA–HHA experiments were performed in duplicate, and the data were averaged ratios. The sperm DNA–HpaII and the brain DNA–HpaII data sets consisted of one array per individual, because we opted for increased biological replicates rather than for increased technical replicates for the number of microarrays available.
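As a small illustration of the CV calculation described above, the following Python sketch ranks spots by their variability across individuals. The spot names and ratio values are invented for the example, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not state which form of the SD was used.

```python
import statistics


def coefficient_of_variation(ratios):
    """CV of one spot: SD of Cy5/Cy3 across individuals divided by the mean."""
    # Assumption: sample SD (n - 1 denominator).
    return statistics.stdev(ratios) / statistics.mean(ratios)


# Hypothetical normalized Cy5/Cy3 ratios for two spots across five individuals.
spots = {
    "spot_A": [0.98, 1.02, 1.00, 0.99, 1.01],
    "spot_B": [0.50, 1.60, 0.90, 1.30, 0.70],
}

# Rank spots from most to least variable.
ranked = sorted(
    spots, key=lambda name: coefficient_of_variation(spots[name]), reverse=True
)
print(ranked)  # ['spot_B', 'spot_A']
```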

Age-covariate analysis

For the CpG island microarray experiment, the age-covariate analysis was performed using a correlation coefficient between two series of quantities, to measure the linear relationship between the series. The Pearson correlation coefficient was calculated between the mean fold change (log Cy5/Cy3) across individuals and the ages across individuals, for each spot on the microarray. A large absolute value (|r|>0.5) of the coefficient indicates that the methylation intensity at the locus covaries with age in a positive or a negative way. To test their statistical significance, the ages across individuals were permuted, and, again, the coefficients were computed using the permuted age series. For each spot, the permutation was repeated 5,000 times to get 5,000 coefficients. The one-tailed P value of the coefficient was then obtained by finding the fraction of times that the coefficients were larger (or smaller) than the original coefficient. Although a P-value cutoff at .05 may lead to many false positives, the adjustment of P value for multiple testing, by controlling the probability of making at least one false positive, dramatically lowers the power of the experiment and is also considered too conservative for microarray studies.26 “False discovery rate” (FDR) is defined as the expected proportion of false-positive predictions among the positive predictions. For example, in the 100 positives declared by FDR at 0.1, 90 are expected to be true positives on average. The FDR criterion has increasingly been adopted over P value in microarray analysis. We have, therefore, applied FDR for the findings described in this article.
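The permutation test for a single spot can be sketched as below. This is an illustrative reading of the procedure rather than the authors' code: the function names and the toy data are hypothetical, and the tail is chosen from the sign of the observed coefficient, which is one way to interpret "larger (or smaller) than the original coefficient." The FDR step is not shown, because the text does not specify which FDR procedure was applied.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(log_ratios, ages, n_permutations=5000, seed=0):
    """One-tailed permutation P value for the correlation of one spot with age."""
    rng = random.Random(seed)
    observed = pearson_r(log_ratios, ages)
    shuffled = list(ages)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        r = pearson_r(log_ratios, shuffled)
        # Assumption: count coefficients at least as extreme as the observed
        # one, in the direction of the observed sign.
        if (observed >= 0 and r >= observed) or (observed < 0 and r <= observed):
            extreme += 1
    return observed, extreme / n_permutations


# Hypothetical log(Cy5/Cy3) values and ages for eight individuals.
ages = [24, 28, 31, 35, 40, 44, 50, 56]
log_ratios = [-0.30, -0.22, -0.15, -0.05, 0.02, 0.10, 0.21, 0.33]
r, p = permutation_p_value(log_ratios, ages)
print(round(r, 2), p)
```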

The autocorrelation clustering analysis for the CpG island microarray experiment was performed using the autocorrelation function ACF(x), which measures how strongly two methylation intensities x loci apart influence each other.
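The text does not give the formula used for ACF(x). A common choice, assumed here purely for illustration, is the standard sample autocorrelation of the methylation intensities ordered by genomic position; the function name `acf` and the toy series are hypothetical.

```python
def acf(values, lag):
    """Sample autocorrelation of a position-ordered series at a given lag.

    Assumption: the standard estimator, i.e. the lagged sum of products of
    deviations from the mean divided by the total sum of squared deviations.
    """
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(values)")
    mean = sum(values) / n
    denominator = sum((v - mean) ** 2 for v in values)
    numerator = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return numerator / denominator


# A toy series in which neighbouring loci alternate perfectly.
series = [1, -1, 1, -1]
print(acf(series, 0))  # 1.0
print(acf(series, 1))  # -0.75
```

Values near zero at a given lag would suggest that loci that far apart vary independently, whereas values well above zero would suggest clustering of similar methylation intensities.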

Measurement of Densities of Methylated Cytosines in the Selected Loci

Further analysis of a selected set of DNA fragments identified as the most variable was performed using the MS-SNuPE reaction on the ABI SNapShot platform accommodated for measuring the C/T ratios in the bisulphite-treated genomic DNA.27 In brief, genomic DNA was digested with NdeI (Fermentas), followed by treatment with sodium bisulphite, as described above. The loci of interest were amplified using nested PCR (primers available in table 1). Typical PCR amplification consisted of one cycle at 95°C for 1 min, then 40 cycles at 95°C for 30 s, 50°C for 30 s, and 72°C for 40 s, followed by a final extension at 72°C for 5 min. Quantitative interrogation of the bisulphite-induced C→T transition at CpG dinucleotides in such amplicons was performed with primers targeted to the CpG dinucleotides within the restriction sites for HpaII, Hin6I, or AciI.

Results

Intra- and Interindividual DNA Methylation Differences in the Promoters of BRCA1, BRCA2, HD, DM1, PSEN1, and PSEN2

The bisulphite modification–based mapping of methylated cytosines for all of these genes showed that individual clones (each representing an individual sperm cell) often had quite different DNA methylation profiles within the same individual (fig. 1A). This finding was confirmed by the analysis of the degree of uniqueness of DNA methylation profiles (fig. 1B). In the case of HD, ∼80% of all clones exhibited unique patterns of methylated cytosine distribution. This estimate did not change dramatically when potential bisulphite modification–induced artifacts were taken into account; on average, 72% of clones were different from one another when one methylated cytosine difference was tolerated, and up to 53% were different when two differences were allowed. The latter situation is a very conservative estimate of the degree of uniqueness, since such a high artifactual C→T nonconversion rate is unrealistic. In our experiments, the artifactual C→T bisulphite conversion was always <1%. The lowest degree of intraindividual DNA methylation uniqueness was detected for PSEN2: 36%, 20%, and 13% for 0, 1, and 2 levels of tolerance, respectively. This analysis of uniqueness is, however, related to the clone length and correlates specifically with the density of the CpGs analyzed (Pearson R=0.64, 0.93, and 0.98 for 0, 1, and 2 levels of tolerance, respectively), since more methylatable CpG sites allow more opportunity for variation.

Figure 1.

Intraindividual variability of DNA methylation. A, DNA methylation profiles of the promoter CpG islands of BRCA1 and PSEN2, determined on the basis of sequencing 60 clones of bisulphite-modified sperm DNA. The BRCA1 locus covered 32 CpGs, and the PSEN2 region included 45 CpGs. Nine monomorphic (unmethylated) CpGs (BRCA1 or PSEN2) were excluded from the figure. Each individual is represented, with individual CpG dinucleotides from left to right (black = methylated cytosines; white = unmethylated cytosines) and individual clones from top to bottom. Like the presented BRCA1 and PSEN2 cases, a substantial proportion of clones in other loci (HD, DM1, BRCA2, and PSEN1) revealed unique DNA methylation profiles. B, Estimates of the proportion of unique methylation profiles in the promoter regions of the six analyzed genes. The Y-axis shows the proportion of clones carrying unique methylation profiles over the total number of sequenced clones; the X-axis shows the proportion of unique profiles that contain at least 1, 2, and 3 differences (left, middle, and right bars, respectively), compared with the other profiles at the same locus in the same individual.

Whereas the intraindividual analysis can show variability within an individual, significantly variable methylation patterns between individuals were also revealed (fig. 2). The gene-specific results were as follows: in BRCA1, n=4, 32 CpGs were analyzed, and 5/6 pairwise comparisons exhibited statistically significant differences (average P=2.53×10⁻⁵); in BRCA2, n=4, 36 CpGs were analyzed, and 3/6 pairwise comparisons were significant (average P=8.56×10⁻⁷); in PSEN1, n=3, 43 CpGs were analyzed, and 2/3 pairwise comparisons were significant (average P=1.89×10⁻⁴); in PSEN2, n=5, 45 CpGs were analyzed, and 6/10 pairwise comparisons were significant (average P=5.11×10⁻³); in DM1, n=7, 99 CpGs were analyzed, and 13/21 pairwise comparisons were significant (average P=5.60×10⁻⁴); and, in HD, n=6, 108 CpGs were analyzed, and 12/15 pairwise comparisons were significant (average P=1.66×10⁻³). Overall, 67% (41/61) of the pairwise comparisons were significantly different, which suggests a high overall level of interindividual variability in the methylation patterns of the tested genes. For the five cases in which 60 sequenced clones were available for BRCA1, BRCA2, and PSEN2, the comparisons were performed using two randomly selected groups of 30 clones (fig. 2). As a validation of this statistical method, the additional sets of 30 clones representing BRCA1, BRCA2, and PSEN2 were compared with the primary sets of 30 clones of the same individuals. In all cases, the results (compare pairs 55′ and 66′ in BRCA1 in fig. 2) showed that their profiles were not different, which is to be expected since they are from the same individual.

Figure 2.

Interindividual variability of DNA methylation in six human disease genes. Bisulphite modification–based mapping of methylated cytosines in BRCA1, BRCA2, HD, DM1, PSEN1, and PSEN2. Thirty individual clones were sequenced from three to seven individuals. Analysis for each gene is represented in two panels. Left panels, graphical profile of the percentage of methylation (Y-axis, ranging from 0% to 40%) for every CpG dinucleotide (X-axis, ranging from 32 to 108 CpG dinucleotides), out of the total number of clones for each individual. Right panels, Euclidean distances (Y-axis) of pairwise comparisons between individual methylation profiles (X-axis). The blue line is the mean distance, and red lines are ±2 SD from the mean, both obtained for each gene from the permutation study (see the “Material and Methods” section). Pairwise comparisons are annotated—for example, as “16”—for the comparison of the Euclidean distance of individual 1 with that of individual 6. Primed individual numbers (e.g., 4′) represent a second set of 30 clones from those individuals. The error bars on some data points represent SDs from 100,000 permutations of 30 clone groups from the individuals from whom 60 clones were sequenced.
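The pairwise comparison summarized in figure 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the null distribution is generated here by pooling the clones of two individuals and reshuffling them into two groups, which is one plausible reading of the permutation study, and all function names and toy data are hypothetical.

```python
import math
import random


def methylation_profile(clones):
    """Percentage of methylated clones at each CpG (clones: lists of 0/1)."""
    n = len(clones)
    return [100.0 * sum(c[i] for c in clones) / n for i in range(len(clones[0]))]


def euclidean(p, q):
    """Euclidean distance between two per-CpG methylation profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def permutation_null(clones_a, clones_b, n_perm=1000, seed=0):
    """Mean and SD of distances obtained after randomly reassigning the pooled
    clones to two groups (assumed permutation scheme)."""
    rng = random.Random(seed)
    pooled = list(clones_a) + list(clones_b)
    k = len(clones_a)
    dists = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        dists.append(euclidean(methylation_profile(pooled[:k]),
                               methylation_profile(pooled[k:])))
    mean = sum(dists) / n_perm
    sd = math.sqrt(sum((d - mean) ** 2 for d in dists) / (n_perm - 1))
    return mean, sd


# Toy data: 30 clones per individual, 4 CpGs each.
individual_1 = [[1, 1, 0, 0] for _ in range(30)]
individual_2 = [[0, 0, 0, 0] for _ in range(30)]
observed = euclidean(methylation_profile(individual_1),
                     methylation_profile(individual_2))
null_mean, null_sd = permutation_null(individual_1, individual_2)
print(observed > null_mean + 2 * null_sd)
```

A pair of individuals is flagged as different when the observed distance lies outside the mean ±2 SD band of the permuted distances, mirroring the blue and red lines in the right panels.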

DNA Methylation Differences Detected by the CpG Island Microarrays

This CpG island microarray contains 12,192 DNA fragments; however, unique sequences are represented by 4,970 distinct loci, of which only about half meet the commonly used criteria for CpG islands: GC content of ⩾50%, length >200 bp, and observed/expected CG dinucleotide ratio >0.6.²⁸ As described in the “Material and Methods” section, we used two strategies to increase the informativeness of our microarray analysis. The first strategy was to use an enzyme cocktail of HpaII, Hin6I, and AciI (sperm DNA–HHA data set), which is more informative for lower GC content loci, such as those that do not meet the CpG island criteria. The second strategy was to use HpaII alone (sperm DNA–HpaII data set), which would be more informative for the higher GC-containing loci. Lastly, we analyzed a brain DNA microarray data set (brain DNA–HpaII data set), to compare tissue-specific differences. As a measure of methylation variation, we have calculated the CV across individuals for each array set. The CV is calculated by dividing the SD in the Cy5/Cy3 ratio by the mean of the Cy5/Cy3 ratio, and it is expressed as a percentage. The variation in CV among individuals across the genome ranged from 2.1% to 30.5% (mean=6.7%), from 0.8% to 66.2% (mean=9.2%), and from 2.1% to 97.4% (mean=10.9%) for the sperm DNA–HHA, sperm DNA–HpaII, and brain DNA–HpaII data sets, respectively (table 2). We considered the loci within the top 10% of CVs (90th percentile) as highly variable regions.
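The CV and the 90th percentile cutoff can be written out in a short Python sketch. The use of the sample SD (n−1 denominator) and the simple rank-based selection of the top 10% are assumptions; the locus names and ratios are invented for illustration.

```python
import statistics


def cv_percent(ratios):
    """CV of the Cy5/Cy3 ratio across individuals, as a percentage.
    The sample SD is used here; the text does not state which SD was used."""
    return 100.0 * statistics.stdev(ratios) / statistics.mean(ratios)


def highly_variable(cv_by_locus, top_fraction=0.10):
    """Loci whose CV falls in the top `top_fraction` of all loci."""
    ranked = sorted(cv_by_locus, key=cv_by_locus.get, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    return ranked[:n_top]


# Hypothetical Cy5/Cy3 ratios for one locus in three individuals.
print(cv_percent([0.9, 1.0, 1.1]))

# Hypothetical CVs for ten loci; the single most variable locus is returned.
toy_cvs = {f"locus{i}": float(i) for i in range(10)}
print(highly_variable(toy_cvs))
```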

Table 2.

Statistical Analysis of Microarray Data from the Sperm DNA–HHA, Sperm DNA–HpaII, and Brain DNA–HpaII Data Sets

| Descriptive Statistics | Sperm DNA–HHA (n=25) | Sperm DNA–HpaIIᵃ (n=21) | Brain DNA–HpaII (n=22) |
|---|---|---|---|
| Mean CV (±SD)ᵇ | 6.72 (2.48) | 9.23 (4.92) | 10.89 (5.40) |
| Loci countᶜ | 4,969 | 4,947 | 4,952 |
| 90th percentile (%)ᵈ | >9.53 | >13.71 | >16.33 |
| 10th percentile (%)ᵈ | <4.33 | <5.16 | <6.44 |
| SNP analysisᵉ: | | | |
| 90th percentile: No. with SNPs | 78 | 32 | ND |
| 90th percentile: No. with no SNPs | 72 | 118 | ND |
| 10th percentile: No. with SNPs | 74 | 22 | ND |
| 10th percentile: No. with no SNPs | 76 | 128 | ND |
| χ2 | .12 | 1.829 | ND |
| P value | .729 | .176 | ND |
| CGI analysisᶠ: | | | |
| Count: CGI | 2,523 | 2,512 | 2,478 |
| Count: Non-CGI | 2,446 | 2,435 | 2,401 |
| Mean CV (±SD): CGI | 6.69 (2.57) | 9.6 (5.28) | 11.06 (4.94) |
| Mean CV (±SD): Non-CGI | 6.79 (2.32) | 8.97 (4.39) | 11.05 (5.60) |
| t Test P value | .14815 | 4.92×10⁻⁶ | .93995 |
| CGI χ2 testᵍ: | | | |
| 90th percentile: No. CGI | 235 | 296 | 256 |
| 90th percentile: No. non-CGI | 217 | 198 | 238 |
| 10th percentile: No. CGI | 226 | 218 | 255 |
| 10th percentile: No. non-CGI | 209 | 277 | 238 |
| χ2 | .003 | 24.34 | .001 |
| P value | .955 | 5.81×10⁻⁷ | .974 |
| Promoter χ2 testᵍ: | | | |
| 90th percentile: No. promoter CGI | NA | 245 | NA |
| 90th percentile: No. non–promoter CGI | NA | 52 | NA |
| 10th percentile: No. promoter CGI | NA | 152 | NA |
| 10th percentile: No. non–promoter CGI | NA | 67 | NA |
| χ2 | NA | 11.44 | NA |
| P value | NA | 4.87×10⁻⁴ | NA |
ᵃ Values in bold type are statistically significant.

ᵇ The mean (±SD) of the CV in ratio Cy5/Cy3 across the individuals (n) for each data set was calculated.

ᶜ The count represents the number of unique loci remaining after data trimming (see the “Material and Methods” section).

ᵈ Loci with CVs >90th percentile are in the top 10% of methylation-variable regions, and loci with CVs <10th percentile are the least variable loci.

ᵉ The SNP analysis was performed to test for effects of SNPs on our DNA methylation analysis. We randomly selected 300 loci from the 10th and 90th percentile loci, and the clone sequence plus 1-kb flanking regions were screened for the presence of SNPs, in particular SNPs that create or disrupt HpaII, Hin6I, and AciI restriction sites. ND = not done.

ᶠ The CpG island (CGI) analyses were performed by separating loci into either CGI or non-CGI categories. A Student's t test was performed to analyze the difference in mean CV. The numbers of loci within each group in the 90th and 10th percentile loci were counted, and χ2 analysis was performed.

ᵍ The CGI loci were further subdivided into loci within promoter regions of genes or loci not within promoters, the numbers of loci in the 90th and 10th percentiles were counted, and χ2 analysis was performed. NA = not applicable.

The data for each locus were plotted on the genome (figs. 3 and 4) and are also available online as a custom annotation track (Center for Addiction and Mental Health) with use of the UCSC Genome Browser (fig. 3B). Figure 3A depicts the sperm DNA–HpaII data set and highlights the highly variable regions on the genome. To assess whether this distribution of highly variable spots is nonrandom, we performed an autocorrelation analysis; however, this analysis did not identify any evidence for autocorrelation, most likely because of the large genomic distance between microarray clones (average 0.6 Mb). Other analyses included testing if the detected variability is confounded by DNA sequence variation; comparison of DNA methylation variation in CpG islands and in non–CpG islands, as well as across different classes of repetitive elements; and assessing if DNA methylation variation correlates with the GC content, clone length, or particular chromosomal cytobands.

Figure 3.

Chromosomal view of methylation variability by CpG island microarray analysis. A, Unmethylated fraction of genomic DNA extracted from sperm samples (n=21) hybridized individually (Cy5), in contrast to the pooled reference control (Cy3). The CV of the Cy5/Cy3 ratio was calculated for each spot across the 21 individuals and was mapped to the corresponding genomic location. Each chromosome ideogram is overlaid with red bars that represent the position of each clone on the CpG island microarray. The bars highlighted in green are the loci that showed variance in the 90th percentile (the top 10% of loci exhibiting the largest degree of DNA methylation variation). B, Screenshot of the custom annotation track on the UCSC Genome Browser (available from the Center for Addiction and Mental Health). Shown is chromosome 6, which includes the major histocompatibility complex locus that was screened for epigenetic variability by the Human Epigenome Project pilot study.⁸

Figure 4.

Genomewide view of brain DNA–HpaII (A) and sperm DNA–HHA (B) data sets. The unmethylated fraction of genomic DNA was enriched from brain DNA (n=22) or sperm samples (n=25), and each was hybridized individually (Cy5), in contrast to the pooled samples (Cy3). The CV of ratio Cy5/Cy3 was calculated for each spot across the 22 or 25 individuals and was mapped to the corresponding genomic location. Each chromosome ideogram is overlaid with red bars that represent the position of each clone on the array. The bars highlighted in green are the loci that showed statistically significant variance (90th percentile).

Exclusion of Genetic Confounding Effects: SNPs and Copy-Number Polymorphisms

Any method that relies on restriction-enzyme digestion to differentiate between methylated and unmethylated DNA can be influenced by SNPs within the enzyme-restriction sites. Therefore, from each of the sperm DNA–HHA and sperm DNA–HpaII data sets, we selected 150 highly variable loci and 150 conserved loci and performed in silico screening, to identify all known SNPs within a 2-kb region of the selected clone that disrupt or create HpaII, Hin6I, or AciI enzyme sites for the sperm DNA–HHA data set or just HpaII sites for the sperm DNA–HpaII data set (SNP annotation of the UCSC Genome Browser). If SNPs were a significant confounding factor in DNA methylation variation, we would expect a higher proportion of SNPs in highly variable loci (e.g., 90th percentile) compared with the lowest variable loci (e.g., 10th percentile). The χ2 analysis revealed no association between the number of potentially disruptive enzyme-restriction sites and the degree of variability in either data set (sperm DNA–HHA χ2=0.12, P=.729; sperm DNA–HpaII χ2=1.83, P=.176). This finding suggests that the degree of variability in the sperm DNA microarray analysis is more dependent on DNA methylation differences than on DNA sequence differences.
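The χ2 comparison of SNP counts between highly variable and conserved loci can be reproduced with a small Python sketch. The text does not name the exact form of the test; the reported statistics are numerically consistent with a 2×2 χ2 test with Yates continuity correction applied to the counts in table 2, which is the variant assumed here. The function name is hypothetical.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Yates-corrected chi-square statistic and P value (1 df) for the table
        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: 90th percentile, 10th percentile; columns: with SNPs, with no SNPs.
hha_chi2, hha_p = chi2_yates_2x2(78, 72, 74, 76)        # sperm DNA-HHA
hpaii_chi2, hpaii_p = chi2_yates_2x2(32, 118, 22, 128)  # sperm DNA-HpaII
print(round(hha_chi2, 2), round(hha_p, 3))
print(round(hpaii_chi2, 3), round(hpaii_p, 3))
```

The same function applied to the CpG island and promoter counts in table 2 gives statistics close to the tabulated 24.34 and 11.44.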

Recent reports have identified >200 copy-number polymorphisms (CNPs) that represent large duplications and deletions that contribute significantly to genomic variation between individuals.²⁹⁻³¹ Like SNPs, CNPs could simulate DNA methylation variability in the microarray analysis. We have cross-referenced the CNPs identified in these studies with the CpG island microarray loci and have identified 25 microarray loci that occur within known CNP regions. These include large CNPs in chromosomes 3 (covering the genes OSTα, AB018337, UNQ3030, BC015560, and DLG1), 16 (BC008967, XYLT1, ARL6IP, MIR16, MGC16943, and CDR2), and 17 (AY302137, BHD, RAI1, FLJ20308, TOP3A, and SMCR8) and smaller CNPs on chromosomes 1 (NEGR1), 2 (AK024244), 6 (RDBP), 8 (TSTA3), 9 (LHX2), 11 (TNNT3), and 14 (AK090461). Microarray results for the genes listed could, therefore, be influenced by deletions or duplications as much as by methylation variability; however, none of these loci appear in the list of highly variable (>90th percentile) loci.

CpG Island Analysis

Not all DNA fragments on the CpG island microarray met the criteria for CpG islands. The loci were divided into “CpG islands” or “non–CpG islands,” and a Student's t test was performed to test for any statistically significant difference in the mean CV (table 2). A significantly increased DNA methylation variability was found in loci defined as CpG islands in the sperm DNA–HpaII data set (t test P=4.92×10⁻⁶), and this variability was exemplified by a bias towards CpG islands in the 90th percentile (highly variable regions) (χ2=24.34; P=5.81×10⁻⁷). In addition, when the CpG islands were split into promoter CpG islands and CpG islands not associated with known gene promoters, significantly higher variability in promoter CpG islands (χ2=11.44; P=4.87×10⁻⁴) was detected. However, analyses of methylation variability with other measures, including GC percent alone and clone length, did not reveal any association. No evidence for higher DNA methylation variation was detected in the promoter CpG islands in the brain DNA–HpaII data set, and there also was no association with SNPs. Therefore, this sperm DNA–HpaII experiment appears to have revealed genuine increased methylation differences in the promoter CpG islands.

Cytoband Analysis

It has been well described that different cytobands could have evolved in different ways and that the genes within each band could have evolutionary similarities.³²,³³ Since these bands are based on, among other things, GC content and Alu content, we sought to identify whether methylation variability was one of the aspects that showed similarities within bands. The CpG island microarray annotation includes the division of loci into different cytobands, including the G bands (Giemsa negative: gneg) and the four classes of R bands (Giemsa positive: gpos25, gpos50, gpos75, and gpos100). These bands are defined as follows. The darkest R bands, gpos100, are very rich in GC and Alu; the next darkest, gpos75, are very rich in GC but not in Alu; gpos50 are not rich in GC but are rich in Alu; and gpos25 are Giemsa-dark bands that are rich in neither GC nor Alu.³⁴ Mean CVs for all the loci within each of these cytobands were calculated, and a Student's t test was performed to identify any statistically significant differences. In each of the data sets, marginally significant associations with certain cytobands were identified. In the sperm DNA–HHA data set, variability was significantly lower in gpos75 band loci (CV = 6.51) than in the other three R bands, gpos25, gpos50, and gpos100 (average CV = 6.83; average P=.023). In the sperm DNA–HpaII data set, gpos25 exhibited a lower degree of methylation variability than gpos50 (CV = 8.97 and CV = 9.50, respectively; P=.041). Although the significance of these statistical tests diminished when corrected for multiple testing, the result is suggestive of an increase in variability in the Alu-rich cytobands, such as the gpos50 and gpos100 cytobands, compared with the Alu-poorer bands gpos25 and gpos75.

Age-Dependent DNA Methylation Changes in the Sperm

Methylation dynamics with age (sperm DNA–HHA age range 22–35 years; sperm DNA–HpaII age range 24–56 years) as a covariate were investigated. Individuals were ordered by increasing age, and the Pearson correlation between the age and relative methylation-signal intensity (ratio of case to reference) was calculated for each locus. In the sperm DNA–HpaII and sperm DNA–HHA data sets, 105 and 8 loci, respectively, were found, whose absolute correlation coefficients were >0.5 and whose P values were <.05. Numerous genes were identified in the germ cell data that corresponded to genes involved in spermatogenesis and development (e.g., INSM1, TZFP, and EED) and neurogenesis (e.g., CALM1, STMN2, ARHGEF9, and ARX) or to disease-related genes (e.g., MAF, DCC, and CDH13 [MIM 601364]). A number of examples are shown in figure 5. The lists of genes for each data set are available in tables 3 and 4.
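The age screen can be sketched in Python as a per-locus Pearson correlation followed by the |r|>0.5 filter. This sketch covers only the correlation threshold; the P<.05 criterion is omitted, and the ages, locus names, and signal values below are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def age_correlated_loci(ages, signal_by_locus, threshold=0.5):
    """Loci whose signal correlates with age with |r| above the threshold."""
    selected = {}
    for locus, signal in signal_by_locus.items():
        r = pearson_r(ages, signal)
        if abs(r) > threshold:
            selected[locus] = r
    return selected


# Hypothetical ages and case/reference signal ratios for three loci.
ages = [24, 30, 40, 56]
signals = {
    "up": [1.0, 1.2, 1.5, 2.0],
    "flat": [1.0, 1.1, 1.0, 1.1],
    "down": [2.0, 1.5, 1.2, 1.0],
}
print(age_correlated_loci(ages, signals))
```

Because the arrays interrogate the unmethylated fraction, a positive coefficient here corresponds to decreasing methylation with age and a negative coefficient to increasing methylation.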

Figure 5.

Age-related DNA methylation changes in the sperm. Individuals were ordered by increasing age (top left panel), and gene-specific DNA methylation dynamics were investigated using the individual ages (sperm DNA–HpaII age range 24–56 years) as a covariate. Pearson correlation was calculated for each locus, and the one-tailed P value of the coefficient was obtained. In the sperm DNA–HpaII data set, 105 loci were identified as significantly (P<.05) correlated (r>0.5) or inversely correlated (r<-0.5) with age. Since the unmethylated fraction of DNA was interrogated, positive correlation indicates decreasing DNA methylation with age, whereas inverse correlation reflects increasing methylation with age. The genes CTNNA2, EED, CALM1, CDH13, and STMN2 are shown as examples. Other genes for the sperm DNA–HpaII, sperm DNA–HHA, and brain DNA–HpaII data sets are available in tables 3 and 4.

Table 3.

Age-Related Correlation in Sperm DNA–HpaII Data Set

University Health Network Accession Number | Age Correlationᵃ | Genome Locationᵇ | Nearest Gene Distance (bp) | Nearest Gene Entrez GeneID | Nearest Gene
UHNhscpg0007432 −.918421653 chr 14: 73296338-73296842 0 91748 C14orf43
UHNhscpg0004878 −.644590225 chr 18: 58533334-58533572 1,366 23239 PHLPP
UHNhscpg0001279 −.635499301 chr 2: 15682160-15683012 0 1653 DDX1
UHNhscpg0010337 −.633961562 chr 4: 24692990-24693129 0 55203 LGI2
UHNhscpg0000523 −.628700319 chr 14: 89918538-89919063 14,066 801 CALM1ᵈ
UHNhscpg0009687 −.61929126 chr 11: 71430395-71430772 0 4926 NUMA1
UHNhscpg0002060 −.607457801 chr 7: 117447932-117448126 10,775 56311 ANKRD7
UHNhscpg0006282 −.603136381 chr 14: 50203675-50203889 0 60485 SAV1
UHNhscpg0008336 −.60134971 chr 3: 4211475-4211590 0 AY358092
UHNhscpg0002269 −.594487623 chr 16: 18719890-18721070 0 23204 ARL6IP
UHNhscpg0006883 −.590701745 chr 7: 20603836-20604518 4,091 221833 SP8
UHNhscpg0003324 −.57876246 chr 8: 80412773-80412879 272,989 11075 STMN2ᶜ
UHNhscpg0007072 −.578612647 chr 13: 20930559-20930802 0 253832 FLJ25952
UHNhscpg0002392 −.577309667 chr 12: 55758348-55759109 0 23306 AB006624
UHNhscpg0010814 −.574206485 chr 9: 111503780-111504547 0 548645 GNG10
UHNhscpg0002833 −.568147866 chr 11: 62202212-62203556 0 LOC51035
UHNhscpg0007366 −.562953631 chr 13: 20770028-20770950 73,766 FLJ25952
UHNhscpg0010640 −.55975393 chr 16: 81828071-81828454 0 1012 CDH13ᵈ
UHNhscpg0003417 −.55808056 chr 8: 80412825-80412879 272,989 11075 STMN2ᶜ
UHNhscpg0007351 −.557144063 chr 11: 133162749-133163293 52,433 219938 SPATA19ᵉ
UHNhscpg0011679 −.553844494 chr 19: 5999835-6000028 0 5990 RFX2ᵉ
UHNhscpg0011303 −.551957348 chr 1: 114639352-114639536 12,748 51592 TRIM33
UHNhscpg0000547 −.551565288 chr 9: 86126053-86126501 0 81689 HBLD2ᶜ
UHNhscpg0010993 −.550180272 chr 1: 46478949-46480024 0 10489 AF370430
UHNhscpg0000380 −.545084964 chr X: 134996172-134996628 0 2273 FHL1
UHNhscpg0003556 −.544679103 chr X: 62745590-62745765 0 23229 ARHGEF9ᶜ
UHNhscpg0002314 −.539967585 chr 3: 157491673-157492091 0 7881 KCNAB1
UHNhscpg0003596 −.531788057 chr 4: 172320378-172320459 934,465 51166 AADAT
UHNhscpg0001705 −.526787621 chr 15: 99009623-99010314 2,859 140460 ASB7
UHNhscpg0009822 −.526562678 chr 1: 21855364-21855679 370 AK026930
UHNhscpg0000063 −.52421876 chr 3: 72584222-72584350 5,758 23429 RYBP
UHNhscpg0001205 −.521667613 chr 17: 38793230-38793666 8,954 AK128207
UHNhscpg0009465 −.518509703 chr 5: 92982058-92982196 0 83989 DKFZP564D172
UHNhscpg0009804 −.517599705 chr 6: 95040381-95040442 854,395 2045 EPHA7ᵈ
UHNhscpg0010111 −.516980672 chr 3: 45163070-45163300 152 64866 CDCP1
UHNhscpg0008321 −.51607039 chr 4: 32393339-32393439 1,572,342 5099 PCDH7
UHNhscpg0006596 −.515801694 chr 12: 6667579-6668374 0 171017 ZNF384ᶜ
UHNhscpg0007045 −.514160426 chr 6: 39190376-39191144 0 55776 C6orf64
UHNhscpg0003064 −.513774745 chr 15: 50863986-50864266 0 3175 ONECUT1
UHNhscpg0008779 −.512617631 chr 12: 64004487-64004759 0 253827 LOC253827
UHNhscpg0002807 −.510358672 chr 9: 91266376-91266791 859 4783 NFIL3
UHNhscpg0007252 −.510001953 chr 13: 110249836-110250880 68,680 283487 LOC283487
UHNhscpg0008979 −.509333714 chr 3: 97915795-97915891 100,433 AY358738
UHNhscpg0002130 −.508146229 chr 20: 61676052-61676691 6,181 85441 PRIC285
UHNhscpg0000750 −.506824133 chr 11: 111449826-111451103 0 55216 FLJ10726
UHNhscpg0004733 −.506433015 chr 11: 77963050-77963233 0 79731 FLJ23441
UHNhscpg0007997 −.504269038 chr 11: 19691578-19692280 0 89797 AJ488207
UHNhscpg0011253 −.503229452 chr 5: 5123770-5123955 111,307 170690 ADAMTS16
UHNhscpg0001019 −.502788473 chr 17: 38793230-38793666 8,954 AK128207
UHNhscpg0000830 −.502263623 chr 20: 11257478-11257784 588,782 22903 BTBD3
UHNhscpg0008888 −.501752931 chr 14: 46575274-46575377 0 161357 MAMDC1ᶜ
UHNhscpg0005027 −.501245356 chr 10: 127671190-127671371 0 92565 AY251163
UHNhscpg0008252 .50313828 chr X: 24795508-24795821 1,997 170302 ARXᶜ
UHNhscpg0005849 .505284858 chr 9: 69169771-69170443 3,351 9413 C9orf61
UHNhscpg0002396 .505405308 chr 9: 15499859-15499960 0 11168 PSIP1
UHNhscpg0000331 .505543224 chr 1: 114065861-114066820 0 54665 FLJ11220
UHNhscpg0004028 .505850764 chr 6: 109920727-109920873 0 FLJ25791
UHNhscpg0003238 .508910182 chr 5: 72642966-72643484 134,358 2297 FOXD1
UHNhscpg0005599 .512086932 chr 11: 57161641-57162347 6,789 219539 YPEL4
UHNhscpg0004696 .515663002 chr 2: 26012699-26013466 0 55252 ASXL2
UHNhscpg0008470 .516009324 chr 1: 225950806-225950903 0 55746 NUP133
UHNhscpg0008340 .51741056 chr 4: 84313172-84313686 0 51138 COPS4
UHNhscpg0003990 .517872516 chr 12: 43376666-43376983 0 4753 NELL2ᶜ
UHNhscpg0008557 .518846707 chr 10: 239933-240068 0 10771 ZMYND11
UHNhscpg0000482 .520984085 chr 11: 85649369-85649478 0 8726 EEDᵉ
UHNhscpg0005884 .520992457 chr 3: 128381578-128381626 13,045 285311 AK097460
UHNhscpg0008792 .521754696 chr 4: 145190267-145191203 5,950 2996 GYPE
UHNhscpg0009520 .525317277 chr 1: 91678271-91678520 0 8317 CDC7ᵈ
UHNhscpg0004355 .526505907 chr 6: 27464330-27464706 0 441136 AK092633
UHNhscpg0001611 .526967248 chr 6: 30289787-30290357 683 7726 TRIM26
UHNhscpg0004783 .529777441 chr 22: 15638898-15638988 0 150165 MGC57211
UHNhscpg0000025 .532588588 chr 10: 75370203-75371061 22,943 5328 PLAUᵈ
UHNhscpg0005928 .535021909 chr 1: 212554059-212554157 0 7399 USH2A
UHNhscpg0001828 .537335195 chr 17: 17507093-17507808 17,703 10743 RAI1
UHNhscpg0005479 .543975527 chr 1: 215895774-215895935 122,221 127018 LYPLAL1
UHNhscpg0002565 .545509304 chr 6: 116998748-116999090 1,641 51389 RWDD1
UHNhscpg0008258 .545922515 chr 1: 114408830-114409330 316 148281 SYT6
UHNhscpg0010487 .547254207 chr 12: 92273678-92274142 234 11163 NUDT4
UHNhscpg0002717 .55323384 chr 12: 48302621-48302970 649 AK123353
UHNhscpg0002673 .553941933 chr 15: 81825124-81825456 80,654 646 BNC1
UHNhscpg0010939 .554017064 chr 10: 8108807-8109019 23,399 FLJ45983
UHNhscpg0001474 .557227088 chr 8: 53488737-53489332 3,881 9705 ST18ᵈ
UHNhscpg0008206 .559375672 chr 14: 89155480-89155723 41,909 29018 AF118074
UHNhscpg0004409 .560612707 chr 12: 63850471-63850714 0 23592 MAN1
UHNhscpg0005280 .566882321 chr 8: 114808371-114808497 289,953 114788 CSMD3
UHNhscpg0002623 .566952142 chr 12: 14237814-14238207 171,686 55729 BC063855ᵉ
UHNhscpg0003738 .567376726 chr 2: 88155376-88155675 10,310 51315 LOC51315
UHNhscpg0008444 .570746223 chr 10: 122728932-122729159 69,907 55717 WDR11ᵈ
UHNhscpg0007259 .575414924 chr 1: 208139197-208139388 390 7779 SLC30A1
UHNhscpg0002607 .582792699 chr 2: 26481086-26481443 0 165082 GPR113
UHNhscpg0008656 .58347414 chr 1: 115293188-115293360 4,205 7252 TSHB
UHNhscpg0002406 .584415421 chr 18: 48120746-48121468 0 1630 DCCᵈ
UHNhscpg0009002 .5849705 chr 16: 78361983-78362481 169,871 4094 MAFᵈ
UHNhscpg0007649 .586330872 chr 14: 80296231-80296366 0 145508 C14orf145
UHNhscpg0004597 .587453741 chr 22: 15638898-15638988 0 MGC57211
UHNhscpg0003289 .613126037 chr 14: 20640986-20641802 4,239 554207 BC031469
UHNhscpg0002376 .616193621 chr 18: 33116047-33117280 0 56853 BRUNOL4
UHNhscpg0002145 .617261027 chr 1: 116672902-116673528 13,466 476 ATP1A1
UHNhscpg0000928 .628741946 chr 18: 32114987-32115360 12,305 55034 MOCOS
UHNhscpg0008601 .636604373 chr 19: 40897102-40898011 0 27033 TZFPᵉ
UHNhscpg0002312 .640025401 chr 12: 52959901-52960341 457 3178 HNRPA1
UHNhscpg0004507 .640711129 chr 2: 80187757-80187860 0 1496 CTNNA2ᶜ
UHNhscpg0002864 .644197159 chr 20: 20293918-20294056 2,708 3642 INSM1ᵉ
UHNhscpg0008280 .693689486 chr 3: 111231569-111231663 692,505 55211 DPPA4ᵉ
UHNhscpg0003180 .698353684 chr 2: 45078887-45079081 1,606 6496 SIX3

Note.— Individuals were ordered by increasing age, and gene-specific DNA methylation dynamics were investigated using the individual ages (sperm DNA–HpaII age range 24–56 years) as a covariate.

ᵃ Negative score = increasing methylation with respect to age. Positive score = decreasing methylation with respect to age.

ᵇ chr = chromosome.

ᶜ Genes related to brain/neuronal development.

ᵈ Genes related to cancer or other disease.

ᵉ Genes related to spermatogenesis, embryogenesis, and development.

Table 4.

Age-Related Correlation in Sperm DNA–HHA Data Set

University Health Network Accession Number | Age Correlationᵃ | Genome Locationᵇ | Nearest Gene Distance (bp) | Nearest Gene Entrez GeneID | Nearest Gene
UHNhscpg0006311 −.571461 chr 16: 73575728-73575882 0 79726 BC004519
UHNhscpg0000757 −.5414097 chr 16: 3084575-3084850 1,713 84891 ZNF206
UHNhscpg0002369 −.5080423 chr 12: 102467385-102467934 15,583 55576 STAB2
UHNhscpg0001087 −.5049054 chr 2: 222992625-222992945 0 0 FLJ32447
UHNhscpg0000495 −.4991295 chr 2: 38215365-38216278 448 1545 CYP1B1
UHNhscpg0000562 −.4981613 chr 10: 21822844-21823534 18,875 387640 FLJ45187
UHNhscpg0009206 −.4953882 chr 2: 120996167-120996370 56,240 84931 FLJ14816
UHNhscpg0005717 −.4945551 chr 21: 46567171-46567396 1,086 5116 PCNT2
UHNhscpg0007805 −.49356 chr 2: 223623949-223624362 0 2181 ACSL3
UHNhscpg0006512 −.4900278 chr 12: 21985242-21985363 0 10060 BC033804
UHNhscpg0001346 −.4890173 chr 12: 94687435-94687958 0 0 METAP2
UHNhscpg0000367 −.4675992 chr 9: 123770922-123771907 0 57706 AK024782
UHNhscpg0009946 −.4518449 chr 17: 70661243-70661746 0 51155 HN1
UHNhscpg0007913 −.449313 chr 3: 44011141-44011398 246,983 375337 AK093476
UHNhscpg0002168 −.437548 chr 7: 13802034-13802474 0 2115 ETV1
UHNhscpg0000973 −.4372439 chr 18: 54013705-54014057 0 23327 NEDD4Lᶜ
UHNhscpg0001825 −.432339 chr 12: 94687426-94687979 0 0 METAP2
UHNhscpg0008872 −.428427 chr 3: 25681850-25682051 1,058 7155 TOP2Bᵈ
UHNhscpg0004998 −.425298 chr 18: 24676574-24676664 665,482 1000 CDH2ᶜ
UHNhscpg0003543 −.4174782 chr 13: 93951550-93951661 21,626 1638 DCT
UHNhscpg0000752 −.4161375 chr 19: 52308085-52308632 0 23211 C19orf7
UHNhscpg0000402 −.4145143 chr 10: 70330981-70331591 0 0 AK056044
UHNhscpg0000851 −.4097783 chr 15: 43466690-43467020 8,677 2628 GATM
UHNhscpg0008103 −.4090966 chr 5: 78724780-78724896 0 9456 HOMER1
UHNhscpg0000874 −.4090138 chr 13: 45524409-45524808 0 23091 BC019000
UHNhscpg0003075 −.4080136 chr 12: 22379747-22379948 832 0 SIAT8A
UHNhscpg0007389 −.405673 chr 1: 192571739-192571804 354,765 343450 SLICK
UHNhscpg0001303 −.405598 chr 12: 50749103-50749772 254 60673 FLJ11773
UHNhscpg0008902 −.4048977 chr 14: 46575274-46575748 0 161357 MAMDC1ᶜ
UHNhscpg0001410 −.4042417 chr 15: 43466690-43467020 8,677 2628 GATM
UHNhscpg0001498 −.4035585 chr 19: 14052743-14053182 5,870 113230 BC011002
UHNhscpg0001795 −.4022476 chr 22: 45478515-45478970 97 25771 C22orf4
UHNhscpg0000092 −.4017695 chr 1: 243700095-243700378 45,036 317705 VN1R5
UHNhscpg0000407 −.4016029 chr 3: 159310225-159310601 0 51319 MGC12197
UHNhscpg0001152 −.401599 chr 4: 85773951-85774343 0 4825 NKX6-1
UHNhscpg0011244 .400513 chr 1: 233050707-233050741 0 55127 AK098212
UHNhscpg0008036 .4027472 chr 22: 22430339-22430668 0 150248 FLJ36561
UHNhscpg0009859 .4084672 chr 20: 38753133-38753612 1,843 9935 MAFBᵈ
UHNhscpg0011733 .4095515 chr 5: 78381324-78381406 0 29958 DMGDH
UHNhscpg0010258 .4143755 chr 1: 10393997-10394274 0 5226 PGD
UHNhscpg0005083 .4183838 chr 6: 122167296-122167347 354,726 2697 GJA1
UHNhscpg0008843 .422049 chr 3: 54282033-54282522 0 55799 AF516696
UHNhscpg0003506 .4240404 chr 2: 36496443-36496693 0 0 CRIM1ᶜ
UHNhscpg0005633 .4240989 chr 18: 30341908-30341962 85,357 1837 DTNAᶜ
UHNhscpg0005683 .4247309 chr 4: 84391329-84391815 0 51316 PLAC8
UHNhscpg0010991 .4284163 chr 12: 78167517-78167778 0 6857 SYT1ᶜ
UHNhscpg0010414 .428641 chr 14: 69723455-69723658 0 6547 SLC8A3
UHNhscpg0011471 .4289618 chr 19: 63431698-63431877 765 27300 ZNF544
UHNhscpg0011163 .4336394 chr 19: 58188067-58188175 0 90338 ZNF160
UHNhscpg0003691 .4439328 chr 8: 116422851-116423033 66,866 7227 TRPS1
UHNhscpg0011833 .4440889 chr 13: 99427912-99428530 3,789 0 ZIC2ᶜ
UHNhscpg0005658 .4461968 chr 7: 4636276-4636727 0 55698 FLJ10324
UHNhscpg0003824 .4462915 chr 2: 107673648-107673883 238,538 285190 BX537861
UHNhscpg0006149 .4473537 chr 1: 193309526-193310390 370 343450 SLICK
UHNhscpg0005850 .4475083 chr 11: 959952-960191 0 161 AP2A2
UHNhscpg0009680 .4498291 chr 14: 38935426-38935732 978 254170 FBXO33
UHNhscpg0009704 .4509138 chr 19: 45765931-45766515 0 57731 SPTBN4
UHNhscpg0007755 .4520231 chr 15: 76517964-76518338 0 0 IREB2
UHNhscpg0008364 .4540589 chr 3: 172679809-172679986 18,989 23043 AB011123
UHNhscpg0008495 .4544119 chr 21: 36354670-36354909 10 54093 C21orf18
UHNhscpg0006360 .4566517 chr 10: 102972869-102973961 2,762 10660 LBX1ᵉ
UHNhscpg0005626 .4603208 chr 1: 43493737-43494422 0 991 CDC20
UHNhscpg0011755 .4645775 chr 11: 18684679-18684913 0 0 FLJ37794
UHNhscpg0009584 .4827054 chr 14: 38935426-38935732 978 254170 FBXO33
UHNhscpg0010541 .4897825 chr 20: 1875274-1875325 6,734 140885 PTPNS1ᶜ
UHNhscpg0007861 .4988212 chr 19: 8563202-8563420 0 81794 ADAMTS10ᵈ
UHNhscpg0010637 .5227931 chr 20: 1875274-1875325 6,734 140885 PTPNS1ᶜ
UHNhscpg0005162 .5267622 chr 4: 123228934-123229348 16,718 0 TRPC3
UHNhscpg0011884 .5376706 chr 8: 39748508-39748595 0 2515 ADAM2e
UHNhscpg0005387 .5521166 chr 19: 19634927-19635609 0 57130 ATP13A

Note.— Individuals were ordered by increasing age, and gene-specific DNA methylation dynamics were investigated using the individual ages (sperm DNA–HHA age range 22–35 years) as a covariate.

ᵃ Negative score = increasing methylation with respect to age. Positive score = decreasing methylation with respect to age.

ᵇ chr = chromosome.

ᶜ Genes related to brain/neuronal development.

ᵈ Genes related to cancer or other disease.

ᵉ Genes related to spermatogenesis, embryogenesis, and development.

DNA Methylation in the Repetitive Elements

All the above analyses were performed on unique DNA sequences; however, the CpG island microarray also contains a large number of clones containing repetitive elements, which, as a rule, are heavily methylated.³⁵ Although it is difficult to directly distinguish between methylation and copy-number differences, one possible approach is to compare methylation of repetitive elements in the sperm to that in other tissues. For this reason, the sperm DNA–HpaII data set was analyzed in comparison with the brain DNA–HpaII data set. The microarray loci that contain a single repetitive element were separated into each repeat class, and the mean CV (±SD) was calculated. If the repetitive elements were influencing the methylation variability, one would expect that those loci containing repetitive elements would display significantly different mean CVs than those of nonrepetitive loci. This analysis revealed the overall average repetitive-element CV of 10.5 in the sperm, compared with the overall average CV in nonrepetitive elements of 9.6. The breakdown of CV for each type of repetitive element represented on the microarray is shown in figure 6. This analysis identified that satellite DNA repeats were statistically more variable than other repetitive elements in the sperm DNA–HpaII data set (P=6.12×10⁻¹⁷). In comparison, this effect was far less pronounced in the brain DNA–HpaII data set (P=.0027) (fig. 6A). When the satellite repeats were further separated into specific repeat classes, a number of repeat classes, predominantly centromeric or pericentromeric satellite repeats, were identified as responsible for this increase in interindividual variability, including (GAATTC)n (CV=18.5), ALR/α (human α-repetitive DNA [CV=25.0]), CER (human D22Z3-centromeric–repetitive DNA [CV=18.7]), and HSATII repeats (human satellite II DNA [CV=34.8]) (fig. 6B).

Figure 6.

Repetitive-element analysis in sperm DNA–HpaII and brain DNA–HpaII data sets. A, The microarray loci that contain a single repetitive element were separated into each repeat class, and the mean CV (±SD) was calculated. The repeat classes include DNA transposons (n=209), long interspersed transposable elements (LINEs [n=771]), low-complexity repeats (n=461), long terminal repeats (LTRs [n=360]), satellites (n=208), simple repeats (n=346), SINEs (n=1,058), small nuclear RNA (snRNA [n=30]), and tRNA (n=40); the nonrepetitive loci (n=6,976) are presented for comparison. The satellite repeats were the only class to show significantly increased variability in the sperm DNA–HpaII data set (P=6.12×10^-17), with a less significant increase in the brain DNA–HpaII data set (P=.0027). B, Breakdown of the satellite repeats into specific satellite-repeat classes, which reveals a number of repeat classes with increased variability, predominantly the centromeric satellite repeats, including (GAATTC)n (P=8.44×10^-17; n=55), ALR/α (P=4.08×10^-25; n=119), CER (P=.0026; n=6), and HSATII repeats (P=3.91×10^-5; n=19) but not BSR/β repeats (P>.05; n=7).

Validation of the Microarray Data with Use of Bisulphite Modification–Based Methylated/Unmethylated Cytosine Analysis

For validation of the microarray data, 12 loci that were detected as variable in the CpG island microarray analysis (table 5) were analyzed using the MS-SNuPE reaction on the ABI SNaPshot platform27 at the CpG dinucleotides in the HpaII and the Hin6I or AciI restriction sites. Initially, such loci were selected on the basis of increased variability (>90th percentile) in the sperm DNA–HHA data set; in addition, a number of these loci were also highly variable in the sperm DNA–HpaII data set (CDH13, SCAM1, MKL2, and DIRAS3). Each of the 12 selected loci was first resequenced to confirm the identity of the sequence. DNA samples from 11 individuals were treated with sodium bisulphite and were PCR amplified, and primer extension reactions were performed to interrogate 65 CpG dinucleotides within the 12 sequences. Examples of six loci are presented in figure 7. This analysis revealed variable levels of methylation in at least one enzyme-restriction site in 11 of the 12 loci tested. It should be noted here that DNA methylation differences in a single restriction site may be sufficient to generate significant differences in the microarray analysis. Only one locus (DIRAS3) showed no methylation differences among the 11 individuals; however, we were able to test only 5 of 20 CpG sites at this locus, so methylation variation in the untested CpG sites cannot be ruled out. To assess the replicability of the assay, we repeated the MS-SNuPE/SNaPshot experiment on five loci in five individuals. Consistent with published data,27 the results in this second round of experiments were within 5% of the first experiment, on average (range 1.7%–9.9%).
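The replicate comparison at the end of this paragraph amounts to summarizing the differences between two rounds of measurements. A minimal sketch follows, assuming that "within 5%" refers to absolute differences in percentage methylation (percentage points) at the same CpG sites; the function name and the values are hypothetical and do not come from the study.

```python
def replicate_differences(run1, run2):
    """Absolute differences (percentage points) between paired methylation estimates."""
    if len(run1) != len(run2):
        raise ValueError("both runs must cover the same CpG sites")
    return [abs(a - b) for a, b in zip(run1, run2)]

# Hypothetical percent-methylation estimates for four CpG sites in two rounds.
first_round = [10.0, 55.0, 90.0, 30.0]
second_round = [12.0, 50.0, 93.0, 31.0]

diffs = replicate_differences(first_round, second_round)
mean_diff = sum(diffs) / len(diffs)
print(f"mean difference = {mean_diff:.2f}, range = {min(diffs):.1f} to {max(diffs):.1f}")
```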

Table 5.

List of Clones Selected for Bisulfite-Modification MS-SNuPE Analysis

| University Health Network Accession Number | Gene | Gene Description | Chromosome Location | Total Enzyme CpGs (a) | CpGs Tested (b) | Variable Methylated CpGs (c) |
|---|---|---|---|---|---|---|
| UHNhscpg0004931 | OLR1 | Oxidized low-density lipoprotein receptor 1 | 12p13.2 | 12 | 11 | 9 |
| UHNhscpg0004063 | CDH13 | Cadherin 13 preproprotein | 16q23.3 | 8 | 8 | 8 |
| UHNhscpg0002847 | SCAM1 | Sorbin and SH3 domain–containing 3 | 8p21.3 | 13 | 6 | 5 |
| UHNhscpg0003990 | NELL2 | NEL-like 2 (chicken) | 12q12 | 2 | 2 | 2 |
| UHNhscpg0003907 | NEIL2 | Nei-like 2 (Escherichia coli) | 8p23.1 | 7 | 6 | 4 |
| UHNhscpg0001947 | MKL2 | Megakaryoblastic leukemia 2 | 16p13.12 | 10 | 5 | 2 |
| UHNhscpg0000823 | 2-PDE | 2′-Phosphodiesterase | 3p14.3 | 20 | 6 | 1 |
| UHNhscpg0004641 | RHOQ | Ras-related GTP-binding protein TC10 | 2p21 | 12 | 6 | 1 |
| UHNhscpg0002006 | DIRAS3 | Ras homolog gene family, member I | 1p31.2 | 20 | 5 | 0 |
| UHNhscpg0005090 | AHR | RWD domain–containing 3 | 1p21.3 | 3 | 3 | 2 |
| UHNhscpg0009548 | DSCAM | Down syndrome cell-adhesion molecule | 21q22.2 | 4 | 3 | 3 |
| UHNhscpg0004745 | FBN1 | Fibrillin 1 (Marfan syndrome) | 15q21.1 | 2 | 2 | 2 |

a. The number of CpG sites within the recognition sequence of restriction enzymes HpaII, Hin6I, and AciI.

b. The number of CpG sites tested by MS-SNuPE.

c. The number of CpG sites that showed variable methylation between individuals.

Figure 7.

MS-SNuPE analysis of densities of methylated cytosines in CpG dinucleotides of selected genes. Genomic DNA from 11 individuals was treated with sodium bisulphite and then was PCR amplified for each gene. The genes NELL2, SCAM1, NEIL2, MKL2, CDH13, and OLR1 are represented. The methylation status of CpG dinucleotides within each of the restriction-enzyme sites was interrogated using the primer-extension reactions. Methylation of each of the CpG dinucleotides is represented as a percentage of methylated PCR products: completely unmethylated (white circles), partially methylated (partially black circles), or completely methylated (black circles).

Finally, for further validation of the MS-SNuPE method and microarray results, we performed bisulphite genomic sequencing of 30 clones from each of five individuals on a locus within the gene that encodes cadherin 13 (CDH13 [University Health Network accession number UHNhscpg0004063]). This analysis revealed a clear-cut bimodal distribution of epialleles, with the majority of clone sequences being either mostly methylated across all 16 CpG dinucleotides tested or predominantly unmethylated. In addition, this sequencing analysis identified a C/G SNP (also listed in dbSNP as rs16961372) with a rare C allele frequency of 0.396 in whites. Of the five individuals sequenced, one was homozygous C, one was homozygous G, and the other three were C/G heterozygous (fig. 8). Of particular interest is the substantially higher density of methylated cytosines on the G allele, whereas the C alleles predominantly exhibit a low degree of methylation. Counting all clones across all five individuals together, we found that 67 (77%) of 87 sequences with the G allele were methylated, whereas only 14 (22%) of 63 sequences containing the C allele were methylated (χ²=40.4; P=2.08×10^-10).
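The allele-by-methylation comparison above is a 2×2 contingency test on the pooled clone counts. The sketch below computes an uncorrected Pearson χ² with 1 df from the counts given in the text. The source does not state which variant of the test was used, so this is an assumption, and the statistic it returns need not reproduce the reported value exactly. The function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Pooled clone counts from the text: methylated / total per allele.
g_methylated, g_total = 67, 87
c_methylated, c_total = 14, 63

chi2, p_value = chi_square_2x2(
    g_methylated, g_total - g_methylated,
    c_methylated, c_total - c_methylated,
)
print(f"chi-square = {chi2:.2f}, P = {p_value:.2e}")
```

Pooling clones across individuals treats every clone as an independent observation; a per-individual analysis of the three heterozygotes would be a more conservative design, but it requires the clone-level data shown in figure 8.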

Figure 8.

Methylation profiles of CDH13. Methylation status of 16 CpG sites surrounding the CDH13 C/G SNP across 30 clones sequenced in each of five tested individuals. Seventy-seven percent (67/87) of the G alleles are methylated (four or more methylated CpGs), whereas 78% (49/63) of the C (bisulphite-converted to T) alleles are unmethylated. The first seven CpG dinucleotides interrogated by MS-SNuPE in figure 7 are represented in this figure as CpGs 5, 6, 9, 10, 13, 15, and 16. CpG 9 is the third MS-SNuPE primer that was predominantly unmethylated in all individuals. Each individual is represented, with single CpG dinucleotides from left to right (black = methylated; white = unmethylated) and with individual clones from top to bottom.
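The clone-level scoring described in this legend (a clone counts as methylated when four or more of its 16 CpGs are methylated) can be sketched as follows. The data layout, the function names, and the toy clones are hypothetical; only the threshold comes from the legend.

```python
METHYLATED_THRESHOLD = 4  # "four or more methylated CpGs", per the figure 8 legend

def is_methylated_clone(cpg_states):
    """cpg_states: sequence of 0/1 flags, one per CpG site (1 = methylated)."""
    return sum(cpg_states) >= METHYLATED_THRESHOLD

def tally_by_allele(clones):
    """Return {allele: (methylated clones, total clones)}."""
    counts = {}
    for allele, cpg_states in clones:
        methylated, total = counts.get(allele, (0, 0))
        counts[allele] = (methylated + int(is_methylated_clone(cpg_states)), total + 1)
    return counts

# Hypothetical clones: (SNP allele, methylation state at 16 CpG sites).
toy_clones = [
    ("G", [1] * 16),
    ("G", [1] * 3 + [0] * 13),
    ("C", [0] * 16),
    ("C", [1, 0, 1, 0, 1, 1] + [0] * 10),
]

tally = tally_by_allele(toy_clones)
print(tally)
```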

Given that the microarray analysis suggested that promoter CpG islands were significantly more variable, we also performed bisulphite genomic sequencing of the promoter CpG island of CDH13, which was not represented on the CpG microarray. This analysis, however, found that the promoter CpG island of CDH13 is predominantly unmethylated in all individuals, with only solitary methylation sites present in one to three clones for each of the individuals.

Discussion

In this study, we performed an in-depth analysis to address the question of epigenetic variability in the germline. The main conclusions are that (1) the male germline exhibits locus-, cell-, and age-dependent DNA methylation differences and that (2) DNA methylation variation is significant across unrelated individuals, at a level that, by far, exceeds DNA sequence variation. These findings are interesting from both basic molecular biological and biomedical points of view.

First, our study contributes to the understanding of epigenetic peculiarities of gene regulatory regions in the germline. It has been generally accepted that CpG islands are predominantly unmethylated,36 which implies that DNA methylation differences would not be expected there. From our studies, we find that even relatively low densities of methylated cytosines in the CpG islands are sufficient to generate unique epigenetic profiles in DNA regions that do not exhibit any DNA sequence variation, both in different cells of the same individual and across individuals. Fine-mapping of methylated cytosines in relatively short DNA fragments of BRCA1, BRCA2, PSEN1, PSEN2, DM1, and HD suggests that each sperm cell is unique not only in terms of DNA sequence but also in epigenomic profile, and that variation of the latter by far exceeds that of the former.

At the genomewide level, unexpectedly, promoter CpG islands exhibited larger interindividual variation than other single-copy DNA sequences, including the non–promoter CpG islands. This epigenetic phenomenon seems to be discordant with the general rule that functionally important loci exhibit a low degree of DNA variation, as is seen in the case of SNPs being less common in promoters and exonic sequences than in introns and intergenic regions. In addition, promoter CpG–rich regions are often highly conserved between species; for instance, the mouse genome contains 15,500 CpG islands, of which ∼10,000 are highly conserved.37 Therefore, if the epigenetic variability were just “noise” of little functional relevance, one would expect more variability in the less biologically important regions, such as introns and intergenic sequences. Evidence of the opposite (increased epigenetic variability in the regions that directly control gene activity) may indicate some peculiarities of DNA methylation machinery during gametogenesis that may or may not be of functional importance in the somatic cells (see below for discussion of the postzygotic [in]stability of inherited epigenetic profiles).

Our study has also identified a larger degree of interindividual variability of centromeric satellite repeats. Although we cannot strictly rule out the possibility of DNA copy-number differences, which are common in centromeric satellite repeats,38 the fact that the germ cell data set showed substantially larger CVs than the brain DNA data set suggests that satellite methylation differences in the germ cells could be a genuine biological phenomenon. Interindividual methylation variability in satellite repeats is consistent with current knowledge39 and may contribute to phenotypic variability in immunodeficiency-centromeric instability-facial anomalies syndrome (ICF [MIM 242860]), a disease that is associated with methylation defects in pericentromeric satellites.40 In addition, microRNAs (or siRNAs) regulate gene expression, heterochromatin formation, and genome stability and often arise from demethylation of tandem repeats that are common in pericentromeric sequences.41 Therefore, interindividual methylation variability in tandem repeats that give rise to microRNAs could also be involved in the variability in gene expression that results in inherited phenotypic variation. A recent study has described increased interindividual variability in the methylation of Alu repeats42 in whole-blood DNA, a finding that was not obvious in our analysis of germ cells, in which short interspersed transposable elements (SINEs) were not statistically different from nonrepetitive elements. However, Sandovici et al. noted that the parental-origin differences in methylation were identified only for Alu elements in pericentromeric chromosomal bands,42 which is consistent with our results.

Second, epigenetic variation within and across germline samples could be of significant interest in human morbid genetics, which, thus far, has nearly exclusively concentrated on DNA sequence differences. Inherited epigenetic variation may provide the basis for new hypotheses and experimental designs in the studies of various human diseases, where the traditional DNA sequence–based studies are reaching the limit of explanatory power. For example, although Huntington disease (HD) is caused by trinucleotide-repeat expansion in the HD gene, the correlation between the number of trinucleotide repeats and age at onset for patients with later-onset HD (at age >50 years) is low.43 The epigenetic status of the HD promoter region may contribute to the steady-state HD mRNA levels and, therefore, to the production of toxic polyglutamine-containing proteins. HD genes containing identical trinucleotide-repeat expansion but differential DNA methylation and chromatin compaction in the promoter region may exhibit significant differences, in terms of their pathogenic potential reflected in the age at disease onset and severity of disease.

The role of differential germline epigenetic modification in complex non-Mendelian disease may be even more critical. Despite significant efforts over the past several decades, DNA sequence–based risk factors have been uncovered in only a small fraction of complex diseases, such as familial breast cancer and early-onset Alzheimer disease. For a number of complex diseases, genetic epidemiological studies showed that DNA sequence differences account for only a small proportion of phenotypic variance among relatives, whereas the substantial remaining fraction of phenotypic differences (in some cancers, 58%–82%44) are typically attributed to environment. Identification of causal environmental factors is very difficult because methodologically impeccable designs in epidemiological studies, as a rule, cannot be applied to humans.45 At the same time, there is an increasing body of evidence that environmental factors play a minimal role in a number of complex traits and disease conditions.46 In this context, epigenetic variation in the germline arises as a new molecular mechanism that may help the understanding of complex phenotypes that are not the outcome of DNA sequence variation or differential environment. The recent finding of germline epimutations of MLH1 in three individuals affected with multiple cancers11,47 provides a good starting point for a systematic search for disease-specific epimutations in the germline.12

In our bisulphite modification–based analyses, the overwhelming majority of loci exhibited rather subtle DNA methylation differences (“shades of gray” type), whereas methylation of the locus within CDH13 is clearly bimodal (“black or white” type). CDH13 is a putative mediator of cell-cell interaction in the heart and may act as a negative regulator of neural cell growth. This gene is not imprinted; however, the promoter is hypermethylated in numerous cancers.48–53 Of particular interest is the finding that DNA methylation profiles are associated with DNA alleles; the C allele of CDH13 is predominantly unmethylated, whereas the G allele is predominantly methylated. To our knowledge, only a few human studies have identified associations between DNA sequence and epigenetic profiles. In Beckwith-Wiedemann syndrome, loss of maternal allele–specific methylation was more common on the G allele at the T382G SNP (CAGA haplotype) of the differentially methylated region KvDMR1.54 A common variant, 677C→T, in the 5,10-methylenetetrahydrofolate reductase gene (MTHFR) is associated with an increased risk of imprinting defects in the Prader-Willi syndrome/Angelman syndrome region of 15q.55 A comprehensive screen of chromosome 21q has identified a single CpG island with a C/G SNP that was methylated in peripheral blood DNA on the C allele, regardless of the parent of origin.56 Finally, the C102T polymorphism in the serotonin 5-HT2A receptor gene (5HT2AR), which has been associated with several psychiatric disorders, was methylated specifically on the C allele.57 In mice, a recent study identified a number of genes that, on in vitro mutation, affect epigenetic reprogramming during gametogenesis and early development on a genomewide level, suggesting a further mechanism by which DNA sequence (mutations) in trans can affect the epigenetic state.58 A comprehensive epigenetic analysis of SNPs is warranted, and this effort may shed new light on rather inconsistent genetic association studies in complex disease. Epialleles and epihaplotypes that combine both DNA sequence and epigenetic information may be better predictors of the risk for a disease than either of the two components analyzed separately.

A number of genes in the sperm exhibited DNA methylation changes that correlate with age (fig. 5 and tables 3 and 4). This finding is particularly interesting in light of the evidence that older paternal age is associated with risk for schizophrenia in the offspring.59,60 Although it has been hypothesized that such effects could be due to epigenetic changes in the paternal genome, no locus-specific and age-dependent epigenetic changes in the human male germline have been identified thus far. In this study, a number of genes that show age-related changes in their DNA methylation have been detected, including a number of important developmental genes. The embryonic ectoderm development gene, EED, is a polycomb group gene involved in maintaining the epigenetically regulated repressive state of developmental genes over successive cell generations.61 CTNNA2, or catenin, is a neuronal cadherin-associated protein and may play a major role in the folding and lamination of the cerebral cortex.62 CALM1, or calmodulin, is a key calcium-modulated protein that functions in growth and in the cell cycle, as well as in signal transduction and in the synthesis and release of neurotransmitters. STMN2, or stathmin-like 2, is a neuronal growth-associated protein that shares significant amino acid–sequence similarity with the phosphoprotein stathmin, and CDH13, as described above, is the heart cadherin and is hypermethylated in a number of cancers.

All the above phenotype-related aspects were discussed under the assumption that the epigenetic peculiarities of the germline are, at least to some extent, reflected in the somatic cells after birth. What proportion and to what extent these inherited epigenetic signals can “survive” the reprogramming that immediately follows fertilization, as well as during the later stages of embryogenesis,13,63,64 remain to be investigated. The methylation clearing is not complete and, on a global DNA level, is reduced to ∼10%.65,66 That could represent 90% of all methylation for each gene being erased or could mean that 90% of methylated genes are completely cleared and that 10% of genes retain their methylation, or there could be numerous combinations of the two. It is also unknown what happens to the histone modifications through these phases of loss of DNA methylation signals. Since modifications of DNA and histones are codependent, even if the DNA methylation signals are erased, the histones may be able to carry on specific epigenetic messages to the next stage until the DNA gets remethylated. This concept of cellular memory through histone modifications has been demonstrated for polycomb group proteins through H3K27 trimethylation.67 A combined analysis of both histone modifications and DNA methylation dynamics, from zygote to postnatal stage, is required for the understanding of the importance of germline epigenetics to phenotypic outcomes.
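The ambiguity noted above, that a global methylation level of about 10% says nothing about how the remaining methylation is distributed across genes, can be made concrete with a toy calculation. The ten hypothetical genes and their starting levels are assumptions for illustration only.

```python
import math

def global_level(per_gene_levels):
    """Average methylation across genes (each level on a 0 to 1 scale)."""
    return sum(per_gene_levels) / len(per_gene_levels)

# Ten hypothetical genes, all fully methylated before reprogramming.
before = [1.0] * 10

# Scenario 1: every gene loses 90% of its methylation.
uniform_erasure = [0.1 * level for level in before]

# Scenario 2: nine genes are cleared completely and one keeps all of its methylation.
all_or_none = [1.0] + [0.0] * 9

print(global_level(uniform_erasure), global_level(all_or_none))
same_global = math.isclose(global_level(uniform_erasure), global_level(all_or_none))
```

Both scenarios give the same global average, so only locus-specific measurements after fertilization can distinguish between them or between the many possible mixtures of the two.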

The second aspect that will determine biological importance of the epigenetic variation in the germline is transgenerational epigenetic inheritance: can complex DNA methylation patterns, at least to some extent, be inherited from the parents and transmitted to the offspring? There is already experimental evidence demonstrating epigenetic meiotic inheritance across different species, such as yeast,68 Arabidopsis,69 Drosophila,70,71 and mice.20,21 Although there is no doubt that transgenerational epigenetic inheritance does exist, it is not clear if this is limited to a few loci or if it is a common genomewide phenomenon.

Acknowledgments

This research has been supported by the Special Initiative grant from the Ontario Mental Health Foundation, Canadian Institutes for Health and Research, National Institute of Mental Health, and by the National Alliance for Research on Schizophrenia and Depression, the Stanley Foundation, and the Crohn’s and Colitis Foundation of Canada. We acknowledge Sigrid Ziegler for technical assistance.

Web Resources

Accession numbers and URLs for data presented herein are as follows:

  1. Center for Addiction and Mental Health: Epigenomics, http://www.epigenomics.ca (for the online data linking the germline epigenetic variation to the genome, by use of the UCSC Genome Browser)
  2. dbSNP, http://www.ncbi.nlm.nih.gov/SNP/ (for rs16961372)
  3. Entrez Gene, http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene (for PSEN1 [accession number 5663], PSEN2 [accession number 5664], BRCA1 [accession number 672], BRCA2 [accession number 675], HD [accession number 3064], DM1 [accession number 1760], CDH13 [accession number 1012], and genes in tables and )
  4. HapMap, http://www.hapmap.org/
  5. Human Epigenome Project, http://www.epigenome.org/index.php
  6. Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim/ (for PSEN1, PSEN2, BRCA1, BRCA2, DM1, HD, CDH13, and ICF)
  7. UCSC Genome Browser, http://genome.ucsc.edu/
  8. University Health Network CpG Island Microarray Database, http://data.microarrays.ca/cpg/index.htm

References

  • 1.Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, et al (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921 10.1038/35057062 [DOI] [PubMed] [Google Scholar]
  • 2.Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, et al (2001) The sequence of the human genome. Science 291:1304–1351 10.1126/science.1058040 [DOI] [PubMed] [Google Scholar]
  • 3.Altshuler D, Brooks LD, Chakravarti A, Collins FS, Daly MJ, Donnelly P (2005) A haplotype map of the human genome. Nature 437:1299–1320 10.1038/nature04226 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Chimpanzee Sequencing and Analysis Consortium (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437:69–87 10.1038/nature04072 [DOI] [PubMed] [Google Scholar]
  • 5.Henikoff S, Matzke MA (1997) Exploring and explaining epigenetic effects. Trends Genet 13:293–295 10.1016/S0168-9525(97)01219-5 [DOI] [PubMed] [Google Scholar]
  • 6.Li E, Bestor TH, Jaenisch R (1992) Targeted mutation of the DNA methyltransferase gene results in embryonic lethality. Cell 69:915–926 10.1016/0092-8674(92)90611-F [DOI] [PubMed] [Google Scholar]
  • 7.Jaenisch R, Bird A (2003) Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat Genet 33:S245–S254 10.1038/ng1089 [DOI] [PubMed] [Google Scholar]
  • 8.Rakyan VK, Hildmann T, Novik KL, Lewin J, Tost J, Cox AV, Andrews TD, Howe KL, Otto T, Olek A, Fischer J, Gut IG, Berlin K, Beck S (2004) DNA methylation profiling of the human major histocompatibility complex: a pilot study for the human epigenome project. PLoS Biol 2:e405 10.1371/journal.pbio.0020405 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Li E (2002) Chromatin modification and epigenetic reprogramming in mammalian development. Nat Rev Genet 3:662–673 10.1038/nrg887 [DOI] [PubMed] [Google Scholar]
  • 10.Reik W, Walter J (2001) Genomic imprinting: parental influence on the genome. Nat Rev Genet 2:21–32 10.1038/35047554 [DOI] [PubMed] [Google Scholar]
  • 11.Suter CM, Martin DI, Ward RL (2004) Germline epimutation of MLH1 in individuals with multiple cancers. Nat Genet 36:497–501 10.1038/ng1342 [DOI] [PubMed] [Google Scholar]
  • 12.Martin DI, Ward R, Suter CM (2005) Germline epimutation: a basis for epigenetic disease in humans. Ann N Y Acad Sci 1054:68–77 10.1196/annals.1345.009 [DOI] [PubMed] [Google Scholar]
  • 13.Reik W, Dean W, Walter J (2001) Epigenetic reprogramming in mammalian development. Science 293:1089–1093 10.1126/science.1063443 [DOI] [PubMed] [Google Scholar]
  • 14.Morgan HD, Sutherland HG, Martin DI, Whitelaw E (1999) Epigenetic inheritance at the agouti locus in the mouse. Nat Genet 23:314–318 10.1038/15490 [DOI] [PubMed] [Google Scholar]
  • 15.Allegrucci C, Thurston A, Lucas E, Young L (2005) Epigenetics and the germline. Reproduction 129:137–149 10.1530/rep.1.00360 [DOI] [PubMed] [Google Scholar]
  • 16.Kimmins S, Sassone-Corsi P (2005) Chromatin remodelling and epigenetic features of germ cells. Nature 434:583–589 10.1038/nature03368 [DOI] [PubMed] [Google Scholar]
  • 17.Zalensky AO, Siino JS, Gineitis AA, Zalenskaya IA, Tomilin NV, Yau P, Bradbury EM (2002) Human testis/sperm-specific histone H2B (hTSH2B): molecular cloning and characterization. J Biol Chem 277:43474–43480 10.1074/jbc.M206065200 [DOI] [PubMed] [Google Scholar]
  • 18.Churikov D, Zalenskaya IA, Zalensky AO (2004) Male germline-specific histones in mouse and man. Cytogenet Genome Res 105:203–214 10.1159/000078190 [DOI] [PubMed] [Google Scholar]
  • 19.Churikov D, Siino J, Svetlova M, Zhang K, Gineitis A, Morton Bradbury E, Zalensky A (2004) Novel human testis-specific histone H2B encoded by the interrupted gene on the X chromosome. Genomics 84:745–756 10.1016/j.ygeno.2004.06.001 [DOI] [PubMed] [Google Scholar]
  • 20.Rakyan V, Whitelaw E (2003) Transgenerational epigenetic inheritance. Curr Biol 13:R6 10.1016/S0960-9822(02)01377-5 [DOI] [PubMed] [Google Scholar]
  • 21.Chong S, Whitelaw E (2004) Epigenetic germline inheritance. Curr Opin Genet Dev 14:692–696 10.1016/j.gde.2004.09.001 [DOI] [PubMed] [Google Scholar]
  • 22.Hajkova P, el-Maarri O, Engemann S, Oswald J, Olek A, Walter J (2002) DNA-methylation analysis by the bisulfite-assisted genomic sequencing method. Methods Mol Biol 200:143–154 [DOI] [PubMed] [Google Scholar]
  • 23.Yatabe Y, Tavare S, Shibata D (2001) Investigating stem cells in human colon by using methylation patterns. Proc Natl Acad Sci USA 98:10839–10844 10.1073/pnas.191225998 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Heisler LE, Torti D, Boutros PC, Watson J, Chan C, Winegarden N, Takahashi M, Yau P, Huang TH, Farnham PJ, Jurisica I, Woodgett JR, Bremner R, Penn LZ, Der SD (2005) CpG island microarray probe sequences derived from a physical library are representative of CpG islands annotated on the human genome. Nucleic Acids Res 33:2952–2961 10.1093/nar/gki582 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Schumacher A, Kapranov P, Kaminsky Z, Flanagan J, Assadzadeh A, Yau P, Virtanen C, Winegarden N, Cheng J, Gingeras T, Petronis A (2006) Microarray-based DNA methylation profiling: technology and applications. Nucleic Acids Res 34:528–542 10.1093/nar/gkj461 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Dudoit S, Fridlyand J (2002) A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biol 3:RESEARCH0036 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Kaminsky ZA, Assadzadeh A, Flanagan J, Petronis A (2005) Single nucleotide extension technology for quantitative site-specific evaluation of metC/C in GC-rich regions. Nucleic Acids Res 33:e95 10.1093/nar/gni094 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Gardiner-Garden M, Frommer M (1987) CpG islands in vertebrate genomes. J Mol Biol 196:261–282 10.1016/0022-2836(87)90689-9 [DOI] [PubMed] [Google Scholar]
  • 29.Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi M, Navin N, Lucito R, Healy J, Hicks J, Ye K, Reiner A, Gilliam TC, Trask B, Patterson N, Zetterberg A, Wigler M (2004) Large-scale copy number polymorphism in the human genome. Science 305:525–528 10.1126/science.1098918 [DOI] [PubMed] [Google Scholar]
  • 30.Tuzun E, Sharp AJ, Bailey JA, Kaul R, Morrison VA, Pertz LM, Haugen E, Hayden H, Albertson D, Pinkel D, Olson MV, Eichler EE (2005) Fine-scale structural variation of the human genome. Nat Genet 37:727–732 10.1038/ng1562 [DOI] [PubMed] [Google Scholar]
  • 31.Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, Scherer SW, Lee C (2004) Detection of large-scale variation in the human genome. Nat Genet 36:949–951 10.1038/ng1416 [DOI] [PubMed] [Google Scholar]
  • 32.Craig JM, Bickmore WA (1993) Chromosome bands: flavours to savour. Bioessays 15:349–354 10.1002/bies.950150510 [DOI] [PubMed] [Google Scholar]
  • 33.Furey TS, Haussler D (2003) Integration of the cytogenetic map with the draft human genome sequence. Hum Mol Genet 12:1037–1044 10.1093/hmg/ddg113 [DOI] [PubMed] [Google Scholar]
  • 34.Holmquist GP (1992) Chromosome bands, their chromatin flavors, and their functional features. Am J Hum Genet 51:17–37 [PMC free article] [PubMed] [Google Scholar]
  • 35.Yoder JA, Walsh CP, Bestor TH (1997) Cytosine methylation and the ecology of intragenomic parasites. Trends Genet 13:335–340 10.1016/S0168-9525(97)01181-5 [DOI] [PubMed] [Google Scholar]
  • 36.Bird AP (1986) CpG-rich islands and the function of DNA methylation. Nature 321:209–213 10.1038/321209a0 [DOI] [PubMed] [Google Scholar]
  • 37.Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, et al (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562 10.1038/nature01262 [DOI] [PubMed] [Google Scholar]
  • 38.Jabs EW, Goble CA, Cutting GR (1989) Macromolecular organization of human centromeric regions reveals high-frequency, polymorphic macro DNA repeats. Proc Natl Acad Sci USA 86:202–206 10.1073/pnas.86.1.202 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Miniou P, Jeanpierre M, Bourc’his D, Coutinho Barbosa AC, Blanquet V, Viegas-Pequignot E (1997) α-Satellite DNA methylation in normal individuals and in ICF patients: heterogeneous methylation of constitutive heterochromatin in adult and fetal tissues. Hum Genet 99:738–745 10.1007/s004390050441 [DOI] [PubMed] [Google Scholar]
  • 40.Gisselsson D, Shao C, Tuck-Muller CM, Sogorovic S, Palsson E, Smeets D, Ehrlich M (2005) Interphase chromosomal abnormalities and mitotic missegregation of hypomethylated sequences in ICF syndrome cells. Chromosoma 114:118–126 10.1007/s00412-005-0343-7 [DOI] [PubMed] [Google Scholar]
  • 41.Lippman Z, Martienssen R (2004) The role of RNA interference in heterochromatic silencing. Nature 431:364–370 10.1038/nature02875 [DOI] [PubMed] [Google Scholar]
  • 42.Sandovici I, Kassovska-Bratinova S, Loredo-Osti JC, Leppert M, Suarez A, Stewart R, Bautista FD, Schiraldi M, Sapienza C (2005) Interindividual variability and parent of origin DNA methylation differences at specific human Alu elements. Hum Mol Genet 14:2135–2143 10.1093/hmg/ddi218 [DOI] [PubMed] [Google Scholar]
  • 43.Wexler NS, Lorimer J, Porter J, Gomez F, Moskowitz C, Shackell E, Marder K, et al (2004) Venezuelan kindreds reveal that genetic and environmental factors modulate Huntington’s disease age of onset. Proc Natl Acad Sci USA 101:3498–3503 10.1073/pnas.0308679101 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, Pukkala E, Skytthe A, Hemminki K (2000) Environmental and heritable factors in the causation of cancer: analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 343:78–85 10.1056/NEJM200007133430201 [DOI] [PubMed] [Google Scholar]
  • 45.Taubes G (1995) Epidemiology faces its limits. Science 269:164–169 [DOI] [PubMed] [Google Scholar]
  • 46.Wong AH, Gottesman II, Petronis A (2005) Phenotypic differences in genetically identical organisms: the epigenetic perspective. Hum Mol Genet 14:R11–R18 10.1093/hmg/ddi116 [DOI] [PubMed] [Google Scholar]
  • 47.Hitchins M, Williams R, Cheong K, Halani N, Lin VA, Packham D, Ku S, Buckle A, Hawkins N, Burn J, Gallinger S, Goldblatt J, Kirk J, Tomlinson I, Scott R, Spigelman A, Suter C, Martin D, Suthers G, Ward R (2005) MLH1 germline epimutations as a factor in hereditary nonpolyposis colorectal cancer. Gastroenterology 129:1392–1399 10.1053/j.gastro.2005.09.003 [DOI] [PubMed] [Google Scholar]
  • 48. Hibi K, Kodera Y, Ito K, Akiyama S, Nakao A (2004) Methylation pattern of CDH13 gene in digestive tract cancers. Br J Cancer 91:1139–1142
  • 49. Ogama Y, Ouchida M, Yoshino T, Ito S, Takimoto H, Shiote Y, Ishimaru F, Harada M, Tanimoto M, Shimizu K (2004) Prevalent hyper-methylation of the CDH13 gene promoter in malignant B cell lymphomas. Int J Oncol 25:685–691
  • 50. Sakai M, Hibi K, Koshikawa K, Inoue S, Takeda S, Kaneko T, Nakao A (2004) Frequent promoter methylation and gene silencing of CDH13 in pancreatic cancer. Cancer Sci 95:588–591. doi: 10.1111/j.1349-7006.2004.tb02491.x
  • 51. Hibi K, Nakayama H, Kodera Y, Ito K, Akiyama S, Nakao A (2004) CDH13 promoter region is specifically methylated in poorly differentiated colorectal cancer. Br J Cancer 90:1030–1033. doi: 10.1038/sj.bjc.6601647
  • 52. Widschwendter A, Ivarsson L, Blassnig A, Muller HM, Fiegl H, Wiedemair A, Muller-Holzner E, Goebel G, Marth C, Widschwendter M (2004) CDH1 and CDH13 methylation in serum is an independent prognostic marker in cervical cancer patients. Int J Cancer 109:163–166. doi: 10.1002/ijc.11706
  • 53. Roman-Gomez J, Castillejo JA, Jimenez A, Cervantes F, Boque C, Hermosin L, Leon A, Granena A, Colomer D, Heiniger A, Torres A (2003) Cadherin-13, a mediator of calcium-dependent cell-cell adhesion, is silenced by methylation in chronic myeloid leukemia and correlates with pretreatment risk profile and cytogenetic response to interferon alfa. J Clin Oncol 21:1472–1479. doi: 10.1200/JCO.2003.08.166
  • 54. Murrell A, Heeson S, Cooper WN, Douglas E, Apostolidou S, Moore GE, Maher ER, Reik W (2004) An association between variants in the IGF2 gene and Beckwith-Wiedemann syndrome: interaction between genotype and epigenotype. Hum Mol Genet 13:247–255. doi: 10.1093/hmg/ddh013
  • 55. Zogel C, Bohringer S, Gross S, Varon R, Buiting K, Horsthemke B (2006) Identification of cis- and trans-acting factors possibly modifying the risk of epimutations on chromosome 15. Eur J Hum Genet (http://www.nature.com/ejhg/journal/vaop/ncurrent/full/5201602a.html) (electronically published April 5, 2006; accessed May 22, 2006)
  • 56. Yamada Y, Watanabe H, Miura F, Soejima H, Uchiyama M, Iwasaka T, Mukai T, Sakaki Y, Ito T (2004) A comprehensive analysis of allelic methylation status of CpG islands on human chromosome 21q. Genome Res 14:247–266. doi: 10.1101/gr.1351604
  • 57. Polesskaya OO, Aston C, Sokolov BP (2006) Allele C-specific methylation of the 5-HT2A receptor gene: evidence for correlation with its expression and expression of DNA methylase DNMT1. J Neurosci Res 83:362–373. doi: 10.1002/jnr.20732
  • 58. Blewitt ME, Vickaryous NK, Hemley SJ, Ashe A, Bruxner TJ, Preis JI, Arkell R, Whitelaw E (2005) An N-ethyl-N-nitrosourea screen for genes involved in variegation in the mouse. Proc Natl Acad Sci USA 102:7629–7634. doi: 10.1073/pnas.0409375102
  • 59. Malaspina D, Harlap S, Fennig S, Heiman D, Nahon D, Feldman D, Susser ES (2001) Advancing paternal age and the risk of schizophrenia. Arch Gen Psychiatry 58:361–367. doi: 10.1001/archpsyc.58.4.361
  • 60. Byrne M, Agerbo E, Ewald H, Eaton WW, Mortensen PB (2003) Parental age and risk of schizophrenia: a case-control study. Arch Gen Psychiatry 60:673–678. doi: 10.1001/archpsyc.60.7.673
  • 61. Montgomery ND, Yee D, Chen A, Kalantry S, Chamberlain SJ, Otte AP, Magnuson T (2005) The murine polycomb group protein Eed is required for global histone H3 lysine-27 methylation. Curr Biol 15:942–947. doi: 10.1016/j.cub.2005.04.051
  • 62. Smith A, Bourdeau I, Wang J, Bondy CA (2005) Expression of catenin family members CTNNA1, CTNNA2, CTNNB1 and JUP in the primate prefrontal cortex and hippocampus. Brain Res Mol Brain Res 135:225–231. doi: 10.1016/j.molbrainres.2004.12.025
  • 63. Oswald J, Engemann S, Lane N, Mayer W, Olek A, Fundele R, Dean W, Reik W, Walter J (2000) Active demethylation of the paternal genome in the mouse zygote. Curr Biol 10:475–478. doi: 10.1016/S0960-9822(00)00448-6
  • 64. Mayer W, Niveleau A, Walter J, Fundele R, Haaf T (2000) Demethylation of the zygotic paternal genome. Nature 403:501–502
  • 65. Walsh CP, Chaillet JR, Bestor TH (1998) Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nat Genet 20:116–117. doi: 10.1038/2413
  • 66. Hajkova P, Erhardt S, Lane N, Haaf T, El-Maarri O, Reik W, Walter J, Surani MA (2002) Epigenetic reprogramming in mouse primordial germ cells. Mech Dev 117:15–23. doi: 10.1016/S0925-4773(02)00181-8
  • 67. Czermin B, Imhof A (2003) The sounds of silence: histone deacetylation meets histone methylation. Genetica 117:159–164. doi: 10.1023/A:1022927725945
  • 68. Klar AJ (1998) Propagating epigenetic states through meiosis: where Mendel’s gene is more than a DNA moiety. Trends Genet 14:299–301. doi: 10.1016/S0168-9525(98)01535-2
  • 69. Mittelsten Scheid O, Afsar K, Paszkowski J (2003) Formation of stable epialleles and their paramutation-like interaction in tetraploid Arabidopsis thaliana. Nat Genet 34:450–454. doi: 10.1038/ng1210
  • 70. Cavalli G, Paro R (1998) The Drosophila Fab-7 chromosomal element conveys epigenetic inheritance during mitosis and meiosis. Cell 93:505–518. doi: 10.1016/S0092-8674(00)81181-2
  • 71. Cavalli G, Paro R (1999) Epigenetic inheritance of active chromatin after removal of the main transactivator. Science 286:955–958. doi: 10.1126/science.286.5441.955

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
