Am. J. Hum. Genet. 2009 Dec 11;85(6):942–945. doi: 10.1016/j.ajhg.2009.11.006

Response to Falush: A Role for cis-Element Polymorphisms in HD

Simon C Warby, Henk Visscher, Stefanie Butland, Christopher E Pearson, Michael R Hayden
PMCID: PMC2790565  PMID: 20004773

Main Text

To the Editor: We thank Falush for his important insights into the evolution of CAG expansion in the huntingtin (HTT) gene. Following observations from his computational model, Falush argues that the patterns observed in the genotyping at the HTT locus1 can be explained by a mutational mechanism that is solely dependent on the size of the CAG tract, and that the evolution of Huntington's disease (HD) chromosomes is most simply explained by a common founder CAG expansion. On this basis, he argues that there is no need to invoke cis-elements having a role in the evolution of HD chromosomes.

We agree that founder CAG expansions likely play a role in the evolution of HD chromosomes in European populations. However, there remain several observations in the genotyping data that are difficult to reconcile with the hypothesis that HD chromosomes evolved exclusively from a common CAG-expanded founder. Furthermore, invalid assumptions made in Falush's computational model weaken the argument for this hypothesis. Instead, we argue that the simplest explanation for our data is that cis-elements make an important contribution to CAG instability at the HTT locus. We propose that cis-element polymorphisms are an influential force behind the apparent multiple founder chromosomes based on the observed pattern of haplotypes in the European population and strong biological precedents for cis-elements influencing trinucleotide instability.

In our original publication, we described three important patterns in the genotyping data.1 The first observation was that CAG expansion in the European population was highly enriched in very specific haplogroup A variants (A1, A2, and A3), but not in other haplogroup A variants (A4 and A5) or other haplogroups (B and C). It is important to note that the two variants with the strongest disease association (A1 and A2) are less similar to each other than either variant is to other non-disease-associated variants; the A1 variant is more closely related to A4 and A5 (differs at one or two SNP positions from A1) than to A2 (differs at three SNP positions). This observation makes it less likely that A1 and A2 are derived from a simple common CAG-expanded founder and more likely that CAG expansion arose independently in these haplotypes.

The computational model described by Falush argues that the multiple founders are the result of genetic drift. This model assumes that CAG instability is strictly CAG-size dependent and that small random increases in the modal CAG size of these specific haplotypes (Falush proposes 17 to >22 CAG in A1 and 17 to 20 CAG in A2, but is unclear about A3) were sufficient to cause an enrichment of HD mutations in these haplotypes that are otherwise neutral. However, if haplotypes (cis-elements) play no role in CAG instability, it is not clear why genetic drift would favor expansion multiple times in similar variants of haplogroup A rather than in the other common haplogroup in the general population (haplogroup C). The genetic drift mechanism proposed by Falush appears to be nonrandom. Haplogroup C also has chromosomes in the 20–22 CAG range, but very few expanded into the HD range (we detected none with >35 CAG in our sample). Additionally, CAG expansion is much more common in variant A3 than in A4, even though both have the same modal CAG size of 17 repeats (Figure 4C in Warby et al.1). These discrepancies make it difficult to explain the enrichment of CAG expansion in specific haplotypes through simple genetic drift and CAG dependency alone.

The second observation in our genotyping data was that the pattern of SNP linkage was punctuated rather than decaying as a function of genetic distance, as would be expected for a recent founder chromosome. We agree with Falush that this punctuated pattern could be explained by secondary changes such as new SNPs or gene conversion events, given enough time.

The third observation was that we did not find haplotypes that were unique to expanded chromosomes. We had anticipated that secondary genetic events, such as those generating the punctuated pattern of SNP linkage described above, would produce haplotypes that were found exclusively on CAG-expanded chromosomes. However, even the highest-risk haplotypes (A1 and A2) had CAG tract sizes that extended down into the normal (<27 CAG) range. If the CAG-expanded founder and nonexpanded chromosomes have been separated for substantial time, why has the founder not undergone further genetic variation that discriminates between them?

One possibility is that the haplotypes with founder CAG expansions (and all variants of that founder) have contracted back to the normal CAG size range, as explained by the computational model. We argue that this is unlikely because of the long timescale required for these types of change (see below). The other possible explanation is that the multiple founder CAG expansions are always very small expansions in CAG size, such as those that could result from genetic drift. In addition to the inconsistencies with the genetic drift model stated above, the hypothesis that a modest increase (17 to 22 repeats) in modal CAG size can result in a massive enrichment of CAG expansion in that haplotype relies heavily on insights from the computational model. However, the validity of the model depends on a few key assumptions that require further examination.

The first assumption of Falush's computational model is that for all CAG tract sizes, the mutation rate is CAG dependent such that it is proportional to the CAG tract size to the power of eight. This mutation rate is based on sperm typing data from 26 men in a Venezuelan HD cohort with CAG sizes ranging from 37 to 62 repeats.2 We agree that this mutational model appropriately describes changes to the CAG tract in sperm for chromosomes in this disease range (37–62 CAG). However, there are no data to validate this mutation rate formula for chromosomes in the intermediate allele range (27–35 CAG), let alone the normal CAG range (<27 CAG). As far as we are aware, there are no data to suggest that all chromosomes with 22 CAG repeats will mutate more rapidly than those with 17 CAG repeats or that the same upward bias exists for normal CAG sizes as for expanded CAG sizes. At high CAG tract sizes, the CAG dependency of the mutation rate may be overwhelming. However, we argue that it is likely that the influence of CAG size on the mutation rate is relatively small at normal-sized CAG tracts, and there is opportunity for other factors to influence stability. The CAG dependency and upward bias of the mutation rate may have a lower cutoff threshold dictated by mechanistic factors such as Okazaki fragment length, DNA damage susceptibility, repair excision tract size, and cis-elements.3,4 Additional experiments are needed to determine whether this large extrapolation of the data from unstable expanded CAG tracts to stable normal-sized CAG tracts is appropriate.
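To make the scale of this first assumption concrete, the following minimal sketch computes relative mutation rates under a pure power law in which the rate is proportional to the CAG tract size raised to the eighth power. The function name and the example tract-size pairs are illustrative choices, not part of Falush's model code; because the proportionality constant is not given here, only ratios between tract sizes are meaningful. The sketch shows what the assumption implies if it is extrapolated to normal-sized tracts, which is the extrapolation questioned above.

```python
def relative_mutation_rate(cag_length: int, reference_length: int,
                           exponent: float = 8.0) -> float:
    """Ratio of mutation rates for two CAG tract sizes, assuming
    rate is proportional to (tract size) ** exponent.

    The exponent of 8 is the value attributed to the model in the text;
    the unknown proportionality constant cancels in the ratio.
    """
    if cag_length <= 0 or reference_length <= 0:
        raise ValueError("CAG tract sizes must be positive")
    return (cag_length / reference_length) ** exponent


if __name__ == "__main__":
    # Illustrative pairs: modal sizes discussed for A1 and A2 versus 17,
    # and the two ends of the sperm-typing range (37 to 62 repeats).
    for longer, shorter in [(22, 17), (20, 17), (62, 37)]:
        ratio = relative_mutation_rate(longer, shorter)
        print(f"{longer} vs {shorter} CAG: {ratio:.1f}-fold higher rate")
```

Under this assumption, a shift in modal size from 17 to 22 repeats alone would imply a severalfold higher mutation rate, which is why the validity of the power law below 37 repeats matters so much for the founder argument.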

The second assumption of the computational model is that negative selection acts strongly against chromosomes with 50 CAG repeats. In fact, individuals with 50 CAG repeats typically do not become symptomatic until their 30s5 and would therefore have an opportunity to produce offspring before the onset of disease symptoms. Furthermore, other investigators have observed that adult HD individuals have increased numbers of offspring relative to the general population, and this increased fertility results in positive selection for HD chromosomes.6 The model also makes assumptions about, but does not incorporate data from, maternally inherited alleles, which are more likely to undergo contractions7,8 than paternally inherited alleles and would increase the number of generations needed before anticipation results in negative selection. The influence of selection on HD chromosomes is likely not straightforward, and we argue that the computational model overestimates the speed and strength of negative selection on the HD population in the 50–60 CAG range.

The timescale of changes in the computational model is extremely long and may even be longer as a result of overestimating the influence of negative selection. The computational founder model, which starts with 1/3 of the population (rather than a single founder chromosome) at the 23 CAG size and eliminates chromosomes at 50 CAG, already requires 2,000 generations to form a flattened CAG population distribution (Figure 1 in Falush, bottom left panel, red line) that resembles the data that we observed for variants A1 or A2 (Figure 4C in Warby et al.1). Further extending this long timescale (2,000 generations × 25 years per generation = 50,000 years) may require these founder expansions to occur prior to the colonization of Europe. We agree that it is possible that genetic drift and CAG dependency alone account for the observed pattern in the data, but given the assumptions and the extended amount of time required for this model, it is not the most likely hypothesis.

Instead, we propose that specific haplotypes in the general population have cis-element polymorphisms that make CAG expansion more likely compared to other haplotypes. There are multiple stepwise CAG expansions that originate from the general population in these predisposed haplotypes, explaining the unusual similarity between normal and expanded haplotypes as well as the specific enrichment of a few haplotypes on disease chromosomes. The punctuated pattern of disease-associated SNP linkage is due to events that generate new diversity on normal CAG-sized chromosomes containing the predisposing cis-element polymorphisms that later undergo CAG expansion. To support this idea, we would agree that the CAG distribution of haplogroup A in the general population of Europe (Figure 3C in Warby et al.,1 red bars) looks similar to the cis-elements model (Figure 1 in Falush, right panel, green haplotype). Furthermore, there is evidence to suggest that specific haplotypes on chromosome 4 (which includes the HTT locus) modify the HD age of onset,9,10 which is inversely related to the CAG tract size and influenced by age-dependent somatic CAG instability.11

Finally, it is clear that numerous cis-elements influence trinucleotide stability at different loci in the genome. Many genes with similar-sized CAG tracts are found throughout the genome but have different propensities for CAG instability.12 A cis-element adjacent to the ATXN7 repeat that regulates against CAG hyperinstability has been described.13 We do not agree with Falush that invoking cis-elements that alter particular mutational properties seems unparsimonious, because there is a clear biological precedent for this.14 cis-elements can influence the bias for expansion or contraction15 and the tissue and developmental specificity of trinucleotide instability.16 Additionally, cis-elements such as CAG tract interruptions can significantly influence the CAG-size dependency of the mutation rate.17

With numerous polymorphic satellites in and around the HTT gene, as well as SNPs occurring at a frequency of one per 300 bp, we think it is probable that cis-element polymorphisms differentially impact CAG stability in the human HD population. Ultimately, further studies are clearly needed to determine how founder chromosomes, cis-elements, trans-genetic factors, environmental influences, and stochastic factors contribute to CAG tract instability at the HTT locus.

Acknowledgments

Funding for this research was provided by the Canadian Genetic Diseases Network, the Huntington Society of Canada, the Michael Smith Foundation for Health Research, the Canadian Institutes of Health Research, the Child & Family Research Institute, the Huntington's Disease Society of America/High Q Foundation, the Muscular Dystrophy Association USA, and the University of Rochester Wellstone Muscular Dystrophy Cooperative Research Center with support from the National Institutes of Health. The authors thank the HD families for the generous use of their DNA for these studies.

References

1. Warby S.C., Montpetit A., Hayden A.R., Carroll J.B., Butland S.L., Visscher H., Collins J.A., Semaka A., Hudson T.J., Hayden M.R. CAG expansion in the Huntington disease gene is associated with a specific and targetable predisposing haplogroup. Am. J. Hum. Genet. 2009;84:351–366. doi: 10.1016/j.ajhg.2009.02.003.
2. Leeflang E.P., Tavaré S., Marjoram P., Neal C.O., Srinidhi J., MacFarlane H., MacDonald M.E., Gusella J.F., de Young M., Wexler N.S., Arnheim N. Analysis of germline mutation spectra at the Huntington's disease locus supports a mitotic mutation mechanism. Hum. Mol. Genet. 1999;8:173–183. doi: 10.1093/hmg/8.2.173.
3. Pollard L.M., Sharma R., Gómez M., Shah S., Delatycki M.B., Pianese L., Monticelli A., Keats B.J., Bidichandani S.I. Replication-mediated instability of the GAA triplet repeat mutation in Friedreich ataxia. Nucleic Acids Res. 2004;32:5962–5971. doi: 10.1093/nar/gkh933.
4. Cleary J.D., Nichol K., Wang Y.H., Pearson C.E. Evidence of cis-acting factors in replication-mediated trinucleotide repeat instability in primate cells. Nat. Genet. 2002;31:37–46. doi: 10.1038/ng870.
5. Langbehn D.R., Hayden M.R., Paulsen J.S., and the PREDICT-HD Investigators of the Huntington Study Group. CAG-repeat length and the age of onset in Huntington disease (HD): A review and validation study of statistical approaches. Am. J. Med. Genet. B. Neuropsychiatr. Genet. 2009. doi: 10.1002/ajmg.b.30992. Published online June 22, 2009.
6. Eskenazi B.R., Wilson-Rich N.S., Starks P.T. A Darwinian approach to Huntington's disease: Subtle health benefits of a neurological disorder. Med. Hypotheses. 2007;69:1183–1189. doi: 10.1016/j.mehy.2007.02.046.
7. Semaka A., Collins J.A., Hayden M.R. Unstable familial transmissions of Huntington disease alleles with 27-35 CAG repeats (intermediate alleles). Am. J. Med. Genet. B. Neuropsychiatr. Genet. 2009. doi: 10.1002/ajmg.b.30970. Published online May 19, 2009.
8. Wheeler V.C., Persichetti F., McNeil S.M., Mysore J.S., Mysore S.S., MacDonald M.E., Myers R.H., Gusella J.F., Wexler N.S., US-Venezuela Collaborative Research Group. Factors associated with HD CAG repeat instability in Huntington disease. J. Med. Genet. 2007;44:695–701. doi: 10.1136/jmg.2007.050930.
9. Djoussé L., Knowlton B., Hayden M.R., Almqvist E.W., Brinkman R.R., Ross C.A., Margolis R.L., Rosenblatt A., Durr A., Dode C. Evidence for a modifier of onset age in Huntington disease linked to the HD gene in 4p16. Neurogenetics. 2004;5:109–114. doi: 10.1007/s10048-004-0175-2.
10. Nørremølle A., Budtz-Jørgensen E., Fenger K., Nielsen J.E., Sørensen S.A., Hasholt L. 4p16.3 haplotype modifying age at onset of Huntington disease. Clin. Genet. 2009;75:244–250. doi: 10.1111/j.1399-0004.2008.01136.x.
11. Swami M., Hendricks A.E., Gillis T., Massood T., Mysore J., Myers R.H., Wheeler V.C. Somatic expansion of the Huntington's disease CAG repeat in the brain is associated with an earlier age of disease onset. Hum. Mol. Genet. 2009;18:3039–3047. doi: 10.1093/hmg/ddp242.
12. Butland S.L., Devon R.S., Huang Y., Mead C.L., Meynert A.M., Neal S.J., Lee S.S., Wilkinson A., Yang G.S., Yuen M.M. CAG-encoded polyglutamine length polymorphism in the human genome. BMC Genomics. 2007;8:126. doi: 10.1186/1471-2164-8-126.
13. Libby R.T., Hagerman K.A., Pineda V.V., Lau R., Cho D.H., Baccam S.L., Axford M.M., Cleary J.D., Moore J.M., Sopher B.L. CTCF cis-regulates trinucleotide repeat instability in an epigenetic manner: A novel basis for mutational hot spot determination. PLoS Genet. 2008;4:e1000257. doi: 10.1371/journal.pgen.1000257.
14. Pearson C.E., Nichol Edamura K., Cleary J.D. Repeat instability: Mechanisms of dynamic mutations. Nat. Rev. Genet. 2005;6:729–742. doi: 10.1038/nrg1689.
15. Martins S., Coutinho P., Silveira I., Giunti P., Jardim L.B., Calafell F., Sequeiros J., Amorim A. Cis-acting factors promoting the CAG intergenerational instability in Machado-Joseph disease. Am. J. Med. Genet. B. Neuropsychiatr. Genet. 2008;147B:439–446. doi: 10.1002/ajmg.b.30624.
16. Cleary J.D., Pearson C.E. The contribution of cis-elements to disease-associated repeat instability: Clinical and experimental evidence. Cytogenet. Genome Res. 2003;100:25–55. doi: 10.1159/000072837.
17. Sobczak K., Krzyzosiak W.J. Patterns of CAG repeat interruptions in SCA1 and SCA2 genes in relation to repeat instability. Hum. Mutat. 2004;24:236–247. doi: 10.1002/humu.20075.

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
