Abstract
The fragile X mental retardation (FMR1) gene contains an expansion-prone CGG repeat within its 5′ UTR. Alleles with 55–200 repeats are known as premutation (PM) alleles and confer risk for one or more of the FMR1 PM disorders, which include Fragile X-associated Tremor/Ataxia Syndrome (FXTAS), Fragile X-associated Primary Ovarian Insufficiency (FXPOI), and Fragile X-Associated Neuropsychiatric Disorders (FXAND). PM alleles expand on intergenerational transmission, with the children of PM mothers being at risk of inheriting alleles with > 200 CGG repeats (full mutation (FM) alleles) and thus developing Fragile X Syndrome (FXS). PM alleles can also be somatically unstable, which can lead to individuals being mosaic for alleles of multiple sizes. Here, we describe a detailed evaluation of somatic mosaicism in a large cohort of female PM carriers and show that 94% display some evidence of somatic instability, with the presence of a series of expanded alleles that differ from the next allele by a single repeat unit. Using two different metrics for instability that we have developed, we show that, as with intergenerational instability, there is a direct relationship between the extent of somatic expansion and the number of CGG repeats in the originally inherited allele and an inverse relationship with the number of AGG interruptions. Expansions are progressive, as evidenced by a positive correlation with age and by examination of blood samples from the same individual taken at different time points. Our data also suggest the existence of other genetic or environmental factors that affect the extent of somatic expansion. Importantly, the analysis of candidate single nucleotide polymorphisms (SNPs) suggests that two DNA repair factors, FAN1 and MSH3, may be modifiers of somatic expansion risk in the PM population, as observed in other repeat expansion disorders.
Subject terms: Neurodevelopmental disorders, Genetics
Introduction
Over 35 different inherited genetic disorders are caused by the expansion of a specific short tandem repeat tract1. In these repeat expansion disorders, the repeat is unstable and shows a strong expansion bias. The FMR1 disorders, or Fragile X-related disorders (FXDs), are members of this group that result from the presence of an evolutionarily conserved, but expansion-prone, CGG repeat tract at the 5′ end of the transcriptional unit of the X-linked FMR1 gene. The repeats are situated upstream of the open reading frame for FMRP, an RNA binding protein important for the regulation of translation in post-synaptic neurons in response to synaptic activation. The repeats are thought to modulate mGluR-dependent enhancement of FMRP synthesis via repeat-associated non-AUG (RAN) translation through the repeat tract2. Premutation (PM) alleles have 55–200 repeats and are associated with a risk of developing one or more of the PM-associated disorders: Fragile X-associated tremor/ataxia syndrome (FXTAS), Fragile X-associated primary ovarian insufficiency (FXPOI), and Fragile X-associated neuropsychiatric disorders (FXAND)3–6. Pathology is thought to arise from some deleterious effect of the excess number of repeats in the FMR1 transcript7. Carriers of PM alleles are also at risk of transmitting larger alleles to their children, with increasing CGG repeat number being associated with increased risk8. In particular, female PM carriers with ~ 90 CGG repeats have a > 90% probability of transmitting alleles with > 200 CGG repeats to their children. Such alleles are known as full mutation (FM) alleles and result in Fragile X syndrome (FXS), a neurodevelopmental disorder that is the most common inherited form of intellectual disability and the most common monogenic cause of autism spectrum disorder. Pathology in this instance is thought to be related to the repeat-mediated silencing of the FMR1 promoter9.
The prevalence of the PM allele among the general population is 1:110–200 females and 1:430 males. However, the PM disorders have a variable penetrance with 40–75% of males and 8–16% of females developing FXTAS10,11 and ~ 20% of females developing FXPOI12,13.
The increased use of higher resolution techniques for the analysis of PM alleles has demonstrated that some carriers of PM alleles show somatic repeat size mosaicism, i.e., the presence of two or more alleles of different sizes in a particular tissue. Previous studies of mosaicism have focused on individuals containing a combination of multiple discrete alleles, often in both the PM and FM range14–23. The origin of the smaller alleles is uncertain, but likely reflects contractions of larger alleles. A second type of mosaicism, in which multiple alleles differing by a single repeat are seen, is also present in some PM carriers24. This form of mosaicism is reminiscent of the products of somatic expansion seen in an FXD mouse model and in humans with other repeat expansion diseases25. Molecular modeling of these products suggests that they arise via small but frequent events that accumulate over the lifetime of the individual26. In an FXD mouse model, the frequency with which these events occur differs between tissues and cell types. While this phenomenon has not been extensively examined in the FMR1 disorders, it has been reported to occur in humans for other repeat expansion diseases such as Huntington's Disease (HD) and Myotonic Dystrophy type 1 (DM1)27–29. The extent of this somatic expansion has been shown to be affected by repeat length and purity as well as a variety of genetic factors, with the extent of expansion affecting the age of onset and severity of many of these diseases29–39. This is the first study of the somatic instability of the FMR1 repeat in a large cohort of female PM carriers.
Materials and methods
Study
Peripheral blood was collected from a total of 426 female PM participants, each of whom signed an informed consent form, using a protocol approved by the UC Davis Institutional Review Board.
For the analysis of the correlation of a subset of molecular measures, data from the entire cohort of 426 females were used. For the analysis of the correlation between instability and molecular measures, data from a subset of 384 participants were used. Individuals were excluded from this subset because the quality of the capillary electrophoresis trace was too poor to allow calculation of instability (n = 19), no AR value was available (n = 1), or the allele corresponding to that on the inactive X could not be identified (n = 8). Individuals with an activation ratio (AR, defined as the fraction of cells carrying the normal allele on the active X chromosome) of > 0.8 who showed no evidence of expansion (n = 14) were also excluded, since in these individuals the proportion of alleles able to expand would be relatively small and thus any expansion, should it occur, would be difficult to detect.
For the study of changes in premutation allele stability over time, a subset of 24 female PM participants was selected based on the availability of at least two blood draws taken a minimum of 2 years apart (mean 6.7; SD 2.9). The mean age was 46.7 years (SD 19.5); the mean number of CGG repeats (based on the draw at the first visit) was 100.1 (SD 27.2) (Supplementary Table 1).
CGG sizing, methylation status, AGG interruptions, and SNP selection
Genomic DNA (gDNA) was isolated from 3 ml of peripheral blood using the Gentra Puregene Blood Kit (Qiagen, Valencia, CA, United States). CGG repeat allele size and methylation status were assessed using a combination of PCR and Southern blot analysis. A PCR that specifically targeted FMR1 amplification (AmplideX PCR/CE, Asuragen, Inc.) was used to determine CGG repeat length, and PCR products were visualized by capillary electrophoresis (CE) and analyzed as previously reported40. Southern blotting was performed using the StB12.3 FMR1-specific chemiluminescent intronic probe, as detailed in Ref.41. Briefly, 10 μg of isolated gDNA was digested with EcoRI and NruI, run on an agarose gel, transferred to a nylon membrane, and hybridized with the FMR1-specific dig-labeled StB12.3 probe. Southern blot analysis was also used to determine the methylation status of the FMR1 alleles (activation ratio, AR, and the percent of methylation) as previously described42.
To visualize the methylation status of alleles by capillary electrophoresis, a modified version of the assay described in Ref.43 was employed. Briefly, 600 ng of genomic DNA was placed in a 40 μl volume of 50 mM Tris–HCl pH 9.0, 1.75 mM MgCl2, 22 mM (NH4)2SO4, and 1 μl of HindIII restriction enzyme was added. This was divided into two equal aliquots and 0.5 μl of HpaII restriction enzyme was added to one. Digestion was allowed to proceed overnight at 37 °C. Then 5 μl of each digest was made up to 20 μl containing 50 mM Tris–HCl pH 9.0, 1.75 mM MgCl2, 22 mM (NH4)2SO4, 2.5 M betaine, 2% DMSO, 0.5 μM each primer, 0.2 mM dATP and dTTP, 0.475 mM dCTP and dGTP, and 0.75 U of KAPA2G Robust HotStart polymerase. The PCR conditions were 98 °C for 3 min; 32 cycles of 98 °C for 30 s, 65 °C for 30 s, and 72 °C for 210 s; followed by 72 °C for 10 min. The primers used were:
Not FraxC: AGTTCAGCGGCCGCGCTCAGCTCCGTTTCGGTTTCACTTCCGGT
Not FraxR4: FAM-CAAGTCGCGGCCGCCTTGTAGAAAGCGCCATTGGAGCCCCGCA
The number of AGG interruptions was determined using a triplet-primed PCR protocol as described in Ref.8, visualized by CE, and analyzed with Gene Mapper software. The number of AGG interruptions in a sample was taken to be the number of sharp depressions seen in the CE trace8.
A total of ten single nucleotide polymorphisms (SNPs) were investigated in the subset of 384 female PM participants for whom the extent of somatic instability could be reliably determined. The choice of SNPs was based on their significant association with instability in other trinucleotide repeat expansion disorders38. SNP analysis was performed using the TaqMan Single Nucleotide Polymorphism Allele Discrimination Assay for sample genotyping (Applied Biosystems, Inc., Foster City, CA). Predesigned TaqMan assays were used for genotyping. Briefly, probes were mixed with TaqMan Master Mix in a ratio of 2.5 TaqMan Master Mix to 0.125 µl of SNP probe per well, and aliquoted into plates containing 50–100 ng of genomic DNA. The cluster plots for each plate were inspected to rule out poor clustering of the SNP. Internal positive and negative controls with all the known genotypes for each SNP were included in each plate. Genotypes were determined using the Applied Biosystems automated TaqMan genotyping software, SDS v2.1. Genotype data were blinded for statistical analysis.
FMR1 mRNA expression levels
Total RNA was isolated from 2.5 ml of peripheral blood collected in PAXgene Blood RNA tubes using the PAXgene Blood RNA Kit (Qiagen, Valencia, CA, United States) and quantified using the Agilent 2100 Bioanalyzer system. RNA isolation was performed in a clean and RNA-designated area. cDNA was synthesized as previously described44. FMR1 transcript levels were measured by performing reverse transcription followed by real-time PCRs (qRT-PCR). qRT-PCR was performed using both Assays-On-Demand from Applied Biosystems (Applied Biosystems, Foster City, CA, United States) and custom-designed TaqMan primers and probe assays44.
Measurement of instability
Two different metrics for the extent of expansion were used. Since expansion is limited to the active X chromosome, the smaller alleles represented by Peak 1 correspond to the originally inherited allele. Our primary measure of expansion, ∆Rpts, is the difference in the number of repeats in a repeat profile between the modal expanded allele (Peak 2) and the modal stable allele (Peak 1). Since X inactivation does not occur in males, we also adapted a second metric from Ref.26 that is based on the increase in the dispersion of the allele populations in the PCR profile. This was calculated by first identifying the modal peaks of the stable (Peak 1) and unstable (Peak 2) allele populations. The RFU values of the peaks exceeding a threshold value (≥ 0.2 × RFU of the modal peak) in each population were then converted into a histogram, which was treated as being derived from a normal distribution; the standard deviation of that distribution became the dispersion (D) value. To minimize the contribution of alleles in Peak 1 to the dispersion of Peak 2 and vice versa, we determined the dispersion of Peak 2 (D2) using only Peak 2 and the peaks lying to the right of it. Similarly, the dispersion of Peak 1 (D1) was calculated using only Peak 1 and the peaks lying to the left of it.
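The two metrics described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input format (a dictionary of repeat number to RFU), the function names, and the example profile are all hypothetical, and the dispersion is computed here as an RFU-weighted standard deviation about the modal peak using one flank only, which is one plausible reading of "treated as being derived from a normal distribution".

```python
from math import sqrt

def delta_rpts(peak1_repeats, peak2_repeats):
    """Repeat-number difference between the modal expanded and modal stable alleles."""
    return peak2_repeats - peak1_repeats

def one_sided_dispersion(profile, modal_repeats, side, threshold=0.2):
    """RFU-weighted standard deviation about the modal peak, one flank only.

    profile: dict of repeat number -> peak height in RFU (hypothetical format).
    side: "right" keeps the modal peak and larger alleles (as for D2);
          "left" keeps the modal peak and smaller alleles (as for D1).
    Peaks below threshold x RFU of the modal peak are ignored.
    """
    cutoff = threshold * profile[modal_repeats]
    if side == "right":
        kept = {r: h for r, h in profile.items() if r >= modal_repeats and h >= cutoff}
    elif side == "left":
        kept = {r: h for r, h in profile.items() if r <= modal_repeats and h >= cutoff}
    else:
        raise ValueError("side must be 'left' or 'right'")
    total = sum(kept.values())
    variance = sum(h * (r - modal_repeats) ** 2 for r, h in kept.items()) / total
    return sqrt(variance)

# Invented bimodal profile: stable population near 90 repeats, expanded near 95.
profile = {88: 150, 89: 400, 90: 1000, 91: 300, 92: 250, 93: 350,
           94: 500, 95: 600, 96: 450, 97: 250, 98: 100}
peak1, peak2 = 90, 95

d_rpts = delta_rpts(peak1, peak2)
d1 = one_sided_dispersion(profile, peak1, "left")
d2 = one_sided_dispersion(profile, peak2, "right")
print(d_rpts, round(d1, 2), round(d2, 2))
```

In this invented profile the expanded population is broader than the stable one, so the sketch returns a larger value for D2 than for D1.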
To determine the proportion of alleles that expand, both the area under the stable peaks in a PCR profile (StableArea) and the area under the curve of the unstable peaks (UnstableArea) were calculated. The proportion of alleles that expand (AUC2) is given by UnstableArea/ (UnstableArea + StableArea) and the proportion of alleles that are stable (AUC1) is then 1 − AUC2.
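The ratio defined above is simple enough to state as code. The sketch below assumes the two areas have already been measured from the PCR profile; the function name and the example areas are hypothetical and chosen only for illustration.

```python
def expanded_fraction(stable_area, unstable_area):
    """Return (AUC1, AUC2) from the areas under the stable and unstable peaks.

    AUC2 = UnstableArea / (UnstableArea + StableArea); AUC1 = 1 - AUC2.
    """
    total = stable_area + unstable_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    auc2 = unstable_area / total
    return 1.0 - auc2, auc2

# Invented areas (arbitrary units) for one profile.
auc1, auc2 = expanded_fraction(stable_area=7100.0, unstable_area=2900.0)
print(round(auc1, 2), round(auc2, 2))
```

A profile with no detectable expanded population gives AUC2 = 0 and AUC1 = 1.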
Statistical analysis
Statistical analysis was used to determine the correlation between the FMR1 molecular measures, instability, age, CGG repeat size, AGG interruption, FMR1 mRNA, and AR. FMR1 mRNA expression was analyzed by CGG repeat number using linear regression, adjusting for activation ratio (AR) by including it as a covariate. The largest CGG repeat number was used for subjects with different numbers of CGG repeats reported. The above analyses were conducted in R version 4.0.5 (2021-03-31). The overall correlation of factors with instability (as measured by Peak2 − Peak1) was determined using the CORR procedure, along with the generation of Pearson correlation coefficients. Relationships of individual factors with instability were determined using the GLM procedure. Association of repeat expansion with genetic and other risk factors was tested by negative binomial regression, using the glm.nb() function in R. We estimated the variance inflation factor (VIF) for each variable in R using the VIF() function in the ‘regclass’ package. The VIFs ranged from 1.13 (AGG) to 2.97 (Peak1), which are comfortably below the cutoff of 5 commonly used to indicate problematic collinearity45.
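To make the VIF range easier to interpret, the sketch below uses the standard definition VIF = 1/(1 − R²), where R² comes from regressing one predictor on all the others. It only converts the two reported VIFs back to the implied R²; it does not reproduce the regression itself, and the function names are illustrative.

```python
def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to get the R^2 of a predictor on the others."""
    if vif < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif

def vif_from_r_squared(r2):
    """Standard definition of the variance inflation factor."""
    return 1.0 / (1.0 - r2)

reported_vifs = {"AGG": 1.13, "Peak1": 2.97}  # values reported in the text
implied_r2 = {name: r_squared_from_vif(v) for name, v in reported_vifs.items()}
for name, r2 in implied_r2.items():
    print(name, round(r2, 2))
```

Under this definition, the cutoff of 5 corresponds to a predictor for which 80% of the variance is explained by the other predictors.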
Results
Study participants
Blood samples were collected from a total of 426 female PM carriers. The studies and all protocols were carried out in accordance with the Institutional Review Board at the University of California, Davis. All participants gave written informed consent before participating in the study, in line with the Declaration of Helsinki. Capillary electrophoresis PCR profiles were determined for the PM alleles in all participants as previously described40. Standard practice is to report the number of repeats present in the most common allele as the individual’s repeat number. The number of AGG interruptions was determined by triplet-primed PCR as previously described8. The activation ratio (AR), the fraction of normal alleles that are located on the active X chromosome, was determined by Southern blot analysis42. The FMR1 mRNA levels were determined by real-time PCR as described previously44. The ages of the participants at the time their blood was drawn, their CGG repeat number, number of AGG interruptions, AR, and FMR1 mRNA levels are shown in Table 1.
Table 1.
Molecular measures | Total group (n = 426): n | Total group: Mean | Total group: Std. Dev | Subset (n = 384): n | Subset: Mean | Subset: Std. Dev | Unstable (n = 361): Mean ± Std. Err | Stable (n = 23): Mean ± Std. Err | Welch two-sample t-test p-value | Actual p-value |
---|---|---|---|---|---|---|---|---|---|---|
CGG repeat | 425** | 92.08 | 22.94 | 383** | 90.40 | 19.00 | 91.69 ± 0.98 | 70.13 ± 2.65 | < 0.0001 | ~ 3e−8 |
AGG | 426 | 0.75 | 0.78 | 384 | 0.78 | 0.79 | 0.72 ± 0.04 | 1.65 ± 0.12 | < 0.0001 | ~ 6e−8 |
AR | 424 | 0.54 | 0.17 | 382 | 0.53 | 0.16 | 0.53 ± 0.008 | 0.58 ± 0.04 | 0.135153 | |
FMR1 mRNA | 401 | 2.18 | 0.91 | 361 | 2.21 | 0.89 | 2.24 ± 0.05 | 1.60 ± 0.18 | 0.001696 | |
Age | 423 | 42.49 | 17.18 | 381 | 42.01 | 16.84 | 42.80 ± 0.87 | 29.76 ± 3.99 | 0.003882 | |
AUC1 | 413 | 0.71 | 0.23 | 384 | 0.71 | 0.21 | 0.69 ± 0.01 | 0.99 ± 0.007 | < 0.0001 | ~ 8e−60 |
AUC2 | 413 | 0.29 | 0.23 | 384 | 0.29 | 0.21 | 0.31 ± 0.01 | 0.01 ± 0.007 | < 0.0001 | ~ 9e−60 |
D1 | 412 | 1.41 | 0.61 | 384 | 1.36 | 0.32 | 1.40 ± 0.02 | 0.77 ± 0.007 | < 0.0001 | ~ 4e−120 |
D2 | 411 | 2.44 | 2.09 | 383 | 2.50 | 2.01 | 2.66 ± 0.10 | 0 ± 0 | < 0.0001 | ~ 2e−83 |

Number of AGG interruptions | Total group, n = 426 | Subset, n = 384 | Unstable, n = 361 | Stable, n = 23 |
---|---|---|---|---|
0 | 197 (46.2%)* | 171 (44.5%)* | 170 (47.1%)* | 1 (4.3%)* |
1 | 138 (32.4%)* | 127 (33.1%)* | 121 (33.5%)* | 6 (26.1%)* |
2 | 91 (21.4%)* | 86 (22.4%)* | 70 (19.4%)* | 16 (69.6%)* |
*Percentage of females relative to the total number, presenting with 0, 1 or 2 AGG interruptions.
**Number of females for whom the CGG repeat allele size was included (one participant was removed because she carried two premutation alleles).
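The Welch comparison of the unstable and stable groups in Table 1 can be approximated from the summary statistics alone. The sketch below is illustrative rather than the authors' analysis: it computes the Welch t statistic and the Welch–Satterthwaite degrees of freedom from the reported means and standard errors, and it assumes group sizes of 361 and 23 for the CGG repeat row, which may differ slightly from the true per-variable counts. The p-value is not computed because the Python standard library has no t-distribution function.

```python
from math import sqrt

def welch_from_summary(mean1, se1, n1, mean2, se2, n2):
    """Welch t statistic and Welch-Satterthwaite df from means and standard errors."""
    var_sum = se1 ** 2 + se2 ** 2            # squared standard error of the difference
    t_stat = (mean1 - mean2) / sqrt(var_sum)
    df = var_sum ** 2 / (se1 ** 4 / (n1 - 1) + se2 ** 4 / (n2 - 1))
    return t_stat, df

# CGG repeat row of Table 1: unstable 91.69 +/- 0.98, stable 70.13 +/- 2.65.
# Group sizes of 361 and 23 are an assumption taken from the column headers.
t_cgg, df_cgg = welch_from_summary(91.69, 0.98, 361, 70.13, 2.65, 23)
print(round(t_cgg, 2), round(df_cgg, 1))
```

Because the stable group is small and has the larger standard error, it dominates the degrees of freedom, which end up close to the size of that group rather than the total sample size.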
Characterization of somatic expansion
The CGG repeat number showed a normal distribution in our study population (Fig. 1A). The proportion of alleles with no interruptions increased from 40% for alleles with ≤ 64 repeats to > 80% for alleles with ≥ 125 repeats (Fig. 1B). The AR for the study participants was also normally distributed, with a mean of ~ 0.5 (Fig. 1C), as previously reported46. There was no significant association of repeat size with AR. Consistent with previous reports, higher levels of FMR1 mRNA were associated with larger repeat lengths (Fig. 1D), even after correction for AR (p < 0.0001).
A variety of different repeat PCR profiles were seen. Some females showed a single sharp and asymmetric PCR profile with a small number of PCR products smaller than the modal allele (Fig. 2A). This resembles the PCR profile seen in the blood of very young female PM mice or in the tissue of mice with mutations that block somatic expansion47,48. As such, this PCR profile likely reflects a stable allele population with little, or no, somatic expansion, and with some, if not all, of the peaks smaller than the modal allele representing PCR “stutter”. Other individuals showed PCR profiles in which a “shoulder” was seen corresponding to alleles larger than the modal allele (Fig. 2B). A third group of women had a clear bimodal distribution of allele populations, with the smaller allele population showing a narrow distribution of allele sizes and the larger allele population showing a broader distribution (Fig. 2C,D). These profiles resemble those seen in older female PM mice with a genetic background permissive to somatic expansion. In mice, the smaller of the two allele populations in older animals is similar in size to the alleles present in the tail at 3 weeks of age, an approximate measure of the number of repeats in the originally inherited allele, and the size of this population does not change over time. In contrast, the larger of the two allele populations tends to have a modal repeat number that increases with the age of the animal and thus reflects alleles that have expanded, or gained repeats, during the animal's lifetime49.
Interestingly, as in mice, HpaII pre-digestion of the PCR template from women with evidence of alleles larger than the modal allele eliminates such alleles from the PCR profile resulting in the production of a unimodal PCR profile characteristic of stable alleles (Fig. 3). Since HpaII is a methylation-sensitive enzyme with recognition sites within the amplicon used for PCR analysis of the repeat, pre-digestion eliminates any PCR template derived from an active X chromosome. Thus, the disappearance of these products after HpaII digestion suggests that they are derived from the active X chromosome. We interpret this to mean that these products represent expanded alleles with expansions being limited to the active X as in mice.
The association between expansion and the presence of the PM allele on the active X is supported by the fact that there is a direct relationship between the fraction of alleles that expand, as assessed by an estimation of the area under the curve of the expanded allele (AUC2) and the fraction of alleles where the PM is on the active X (1 − AR) (Fig. 4). Thus, the allele population with the smaller repeat number corresponds to unexpanded alleles on the inactive X, with the modal repeat number likely reflecting the repeat number present on the originally inherited allele. This is consistent with our previous more limited analysis49 and suggests that expansions are limited to the active X chromosome, as they are in mice47. This indicates that transcription or a euchromatin configuration is required for these expansions.
To investigate PM allele stability over time, a subset of 24 female participants with specimens available from multiple blood draws was selected. In 20 of the cases examined, the time between draws was < 10 years. Eight participants showed changes in CGG repeat number (1–12 CGGs; Supplementary Table 1 and Fig. 5). The remaining sixteen individuals (66.7%) showed no evidence of change in their repeat PCR profile between draws, regardless of the age at first sampling and the time between draws. Of these, 11 had < 96 CGG repeats and five had alleles with > 96 CGG repeats, with three of the alleles with > 96 repeats having AGG interruptions. The eight individuals whose PCR profile changed showed an increase in the modal number of CGG repeats in the larger of the two allele populations. Seven of these individuals had inherited alleles with > 96 CGG repeats and no AGG interruptions. A female with ~ 144 CGG repeats in her expanded allele at the first blood draw, at two years of age, showed a gain of ~ 8 repeats relative to her originally inherited allele (Fig. 5A). She had alleles with a mean repeat number of ~ 147 CGG repeats at the second draw two years later, i.e., a gain of three repeats in 2 years. Fig. 5B shows the PCR profile of a female with ~ 160 CGG repeats on her expanded allele at the first blood draw, at eight years of age, 19 repeats more than the original allele. At the second blood draw six years later, the expanded alleles had gained an average of an additional 11 CGG repeats. In addition, as we previously described in an FXD mouse model24, the size distribution of expanded alleles broadens with age. This is consistent with mathematical modeling, which suggests that each expansion event adds one to two repeats26. As a result, the dispersion of the population of expanding alleles, D2, increases over time.
Relationship between the extent of expansion, AR, AGG, age, and the dispersion of the expanded alleles.
The fact that the smaller of the two allele populations corresponds to the originally inherited allele and the larger corresponds to alleles that have expanded suggests that the difference in the modal number of repeats of the expanded and stable peaks, a metric we call ∆Rpts, reflects the extent of somatic expansion. We used this metric to examine the relationship between the extent of expansion and AGG number, AR, and age. For this purpose, we excluded individuals with AR > 0.8 who showed no evidence of expansion, on the grounds that the absence of a detectable second peak might reflect expansions present at levels below the limit of detection by capillary electrophoresis, as could occur if extensive expansion had happened.
In addition, we excluded poor quality capillary electrophoresis traces and individuals in whom the stable peak could not be identified, leaving us with 384 individuals. We then used ∆Rpts as a measure of expansion and performed negative binomial regression of this on the initial repeat number, AGG, AR, age, and the fraction of stable vs unstable alleles (represented by the area under the curve (AUC) of Peak 1 and Peak 2). We found a significant association between ∆Rpts and the size of the original allele, along with a significant direct relationship with age (Table 2).
Table 2.
Variable | Estimate | Std. Error | p-value |
---|---|---|---|
D1 | − 0.06 | 0.17 | 0.74 |
D2 | 0.34 | 0.02 | < 2e−16 |
Peak 1 | 0.02 | 0.004 | 1.40E−07 |
AGG | − 0.23 | 0.06 | 0.0001 |
Age | 0.01 | 0.003 | 1.19E−05 |
AR | 0.49 | 0.28 | 0.08 |
Amount of FMR1 | 0.000412 | 0.05 | 0.99 |
The values in this table refer to a multivariable negative binomial regression of ∆Rpts on all of the molecular measures simultaneously.
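The estimates in Table 2 are easier to read as multiplicative effects. The sketch below assumes the model used the log link, which is the default for glm.nb() in R, so that exp(estimate) is the factor by which the expected ∆Rpts changes per unit increase in a variable with the others held fixed. The approximate 95% intervals use estimate ± 1.96 × standard error and are an illustration computed from the rounded table values, not a result reported by the authors.

```python
from math import exp

# (estimate, standard error) pairs copied from Table 2.
table2 = {
    "D1": (-0.06, 0.17),
    "D2": (0.34, 0.02),
    "Peak 1": (0.02, 0.004),
    "AGG": (-0.23, 0.06),
    "Age": (0.01, 0.003),
    "AR": (0.49, 0.28),
    "Amount of FMR1": (0.000412, 0.05),
}

def rate_ratio(estimate, std_error, z=1.96):
    """Return (ratio, lower, upper) on the multiplicative scale, assuming a log link."""
    return (exp(estimate),
            exp(estimate - z * std_error),
            exp(estimate + z * std_error))

rate_ratios = {name: rate_ratio(est, se) for name, (est, se) in table2.items()}
for name, (ratio, lower, upper) in rate_ratios.items():
    print(f"{name}: {ratio:.3f} ({lower:.3f} to {upper:.3f})")
```

Read this way, each additional AGG interruption multiplies the expected ∆Rpts by about exp(−0.23) ≈ 0.79, while the interval for AR spans 1, in line with its non-significant p-value.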
There is also an inverse relationship between ∆Rpts and the number of AGG interruptions (Fig. 6A), which is consistent with the stabilizing effect of AGGs observed on intergenerational transmission50,51. Since the dispersion about the mean of the expanding alleles increases with increasing expansion, we also tested the association of the ∆Rpts metric with a measure of the dispersion of the stable (D1) and unstable (D2) alleles. There was a significant association between the ∆Rpts metric and D2 (Table 2). This is consistent with the data shown in Fig. 6B, in which the heterogeneity of the expanding allele population increases with time. There was no association with D1, consistent with the fact that the size distribution of the stable allele population shows no increase over time. There was also no relationship between instability and the amount of FMR1 transcript after correction for the initial repeat number, AGG, AR, and age.
Genetic factors affecting the expansion
Genome-Wide Association Studies (GWAS) have identified a number of single nucleotide polymorphisms (SNPs) that are significantly associated with the risk of somatic expansion or age of disease onset in various other Repeat Expansion Diseases29–38. To assess whether some of the same SNPs were associated with somatic expansion risk in our PM population, we examined the association of the ∆Rpts metric with 10 SNPs previously found to be associated with variation in the age of onset, disease severity, or extent of somatic expansion in studies of other Repeat Expansion Diseases. Of the ten SNPs reported in Table 3, two, rs701383 and rs150393409, showed a significant association with the extent of instability, although neither of them would survive correction for multiple testing.
Table 3.
SNP | Candidate modifier gene(s) (and distance in kb) | Test allele | Effect (Std. Error) | p-value | HW p-val |
---|---|---|---|---|---|
rs1650742 | MSH3 (0), DHFR (40.1) | T | 0.12 (0.08) | 0.12 | 0.37 |
rs1799977 | MLH1 (0) | G | − 0.03 (0.08) | 0.73 | 0.93 |
rs274883 | LIG1 (0) | G | − 0.001 (0.09) | 0.99 | 0.31 |
rs34017474 | FAN1 (0), MTMR10 (0) | T | − 0.02 (0.07) | 0.78 | 0.8 |
rs35811129 | FAN1 (6.04) & MTMR10 (0) | G | 0.02 (0.08) | 0.78 | 0.38 |
rs3791767 | PMS1 (8.8) | C | − 0.07 (0.09) | 0.4 | 0.06 |
rs701383 | DHFR (8.77), MSH3 (37.2) | G | − 0.22 (0.08) | 0.007 | 0.86 |
rs74302792 | PMS2 (31.3) | T | − 0.05 (0.11) | 0.61 | 0.81 |
rs145821638 | LIG1 (0) | C | − 0.11 (0.53) | 0.84 | 0.94 |
rs150393409 | FAN1 (0) | G | − 0.70 (0.31) | 0.02 | 0.74 |
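The statement that neither nominally significant SNP survives correction for multiple testing can be checked directly from the p-values in Table 3. The sketch below applies a Bonferroni correction for ten tests at a family-wise level of 0.05; the choice of Bonferroni is an assumption made for illustration, since the text does not name the correction method.

```python
# p-values copied from Table 3.
snp_p_values = {
    "rs1650742": 0.12,
    "rs1799977": 0.73,
    "rs274883": 0.99,
    "rs34017474": 0.78,
    "rs35811129": 0.78,
    "rs3791767": 0.4,
    "rs701383": 0.007,
    "rs74302792": 0.61,
    "rs145821638": 0.84,
    "rs150393409": 0.02,
}

alpha = 0.05
bonferroni_threshold = alpha / len(snp_p_values)  # 0.05 / 10 tests

nominal_hits = sorted(s for s, p in snp_p_values.items() if p < alpha)
bonferroni_hits = sorted(s for s, p in snp_p_values.items() if p < bonferroni_threshold)

print("nominal:", nominal_hits)
print("Bonferroni:", bonferroni_hits)
```

With ten tests the per-test threshold is 0.005, so p = 0.007 for rs701383 and p = 0.02 for rs150393409 are nominally significant but fall short of the corrected threshold.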
Discussion
In this study, we describe the first large-scale characterization of somatic expansion in female premutation allele carriers. We show that most PM carriers show some degree of somatic expansion in blood, as evidenced by their PCR profile and by the serial sampling of a subset of individuals. The extent of this expansion is directly related to the CGG-repeat number and inversely related to the number of AGG interruptions, as with intergenerational expansions14,46,52. There was also a relationship between the extent of expansion and age, consistent with the observation of a maternal age effect on the risk of a female PM carrier having a child with an FM allele49–51. We also showed that the extent of expansion correlates with the proportion of the PM allele that is on the active X chromosome (Fig. 3). This is consistent with the fact that expansion in humans requires transcription or open chromatin, as it does in mice47. While expansions were not seen on the inactive X chromosome, we observed a relationship between AR and the extent of expansion of the allele on the active X. No evidence of CGG repeat allele contractions was seen in this data set, although the occurrence of low-frequency contraction events or contraction events that generate heterogeneous deletion products cannot be definitively excluded.
The measurement of somatic expansion in females is facilitated by the fact that expansion is limited to alleles on the active X chromosome, so that the size of the inherited allele can be inferred from the size of the allele on the inactive X. However, this is not possible in males. Our demonstration that the extent of expansion as measured by ∆Rpts shows a direct relationship with D2, the dispersion of the expanded allele about the mean, suggests that the D metric could be useful for examining somatic expansion in male PM carriers.
The association of the rs701383 SNP with the extent of somatic expansion is of interest since this SNP is located 8.77 kb from the dihydrofolate reductase (DHFR) gene and 37.2 kb from MSH3, whose gene product is important for mismatch repair and is required for both somatic and germline expansion in the mouse model of FXDs1. rs701383 is an eQTL for MSH3 in GTEx that is significant in several tissues (minimum p = 1.5 × 10^−71 in cultured fibroblasts), with the minor allele (A) at rs701383 being associated with higher expression of MSH332. rs701383 is also an eQTL for DHFR in artery (p = 6.7 × 10^−22) and nerve (p = 5.9 × 10^−19), but the association is comparatively weak in whole blood (p = 1.3 × 10^−8 compared to 2.8 × 10^−63 for MSH3). The minor allele at this SNP is associated with an earlier age at onset of HD (p = 5.46 × 10^−10)38 and increasing somatic instability in HD and DM132.
The rs150393409 SNP is located within FAN1, a DNA repair gene that encodes a nuclease that protects against expansion in the FXD mouse53,54. This SNP results in the substitution of His for Arg at amino acid 507 of FAN1 (Arg507His), a change predicted to be deleterious or damaging in SIFT and PolyPhen, respectively. The directionality of the observed effect of the rs150393409 SNP would be consistent with FAN1 normally protecting against repeat expansion in women with the PM as well. Thus, although studies of larger cohorts are needed, our data suggest that genetic factors that affect somatic expansion in women with the PM are consistent with data from a mouse model of the FXDs and with other Repeat Expansion Diseases. This similarity between humans and mice with respect to the genetic factors involved in somatic expansion supports the idea that the FXD mouse model can provide useful insights into the expansion process in human PM carriers. The fact that the same SNPs are associated with disease risk in other Repeat Expansion Diseases lends weight to the idea that these diseases share a common underlying mutational mechanism.
It is notable that expansion can be readily detected in the blood of many human PM carriers. In an FXD mouse model, blood shows much less expansion than the brain48. A similar difference between the extent of expansion in blood and brain has been reported in other Repeat Expansion Diseases55–58. Thus, in PM carriers where expansion can be detected in blood, the extent of expansion in the brain may be even larger. Since there is a direct relationship between repeat number and FXTAS age of onset39, this raises the possibility that the propensity to undergo somatic expansion could contribute to the variable penetrance of FXTAS pathology seen in PM carriers. Furthermore, since in the FXD mouse model the same genetic factors that affect expansion risk in somatic cells affect expansion in the germline, the genetic factors identified in this study as potential modifiers of somatic expansion risk may also be modifiers of intergenerational expansion risk. These factors may account for some of the variance in expansion risk that is not explained by repeat number or the number of AGG interruptions14. Thus, a better understanding of the full range of genetic factors affecting expansion risk may contribute to better assessments of disease risk in PM carriers as well as the risk of transmission of FXS.
Author contributions
Conceptualization: F.T. and K.U. Writing, original draft preparation: Y.H. and B.H. Writing, review, and editing: Y.H., B.H., M.Z., J.K., B.D.J., P.H., K.U., and F.T. Methodology and analysis: Y.H., B.H., K.U., B.D.J., P.H., and F.T. All authors have read and agreed to the published version of the manuscript.
Data availability
Data and results generated from this project will be fully available from the corresponding author upon request. Biological samples from subjects included in this study will be available under a material transfer agreement (MTA), in accordance with University of California, Davis policy.
Competing interests
Dr. Flora Tassone received funds from the Azrieli Foundation and Zynerba for studies in Fragile X syndrome. The other authors declare no conflict of interest.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Ye Hyun Hwang and Bruce Eliot Hayward.
Contributor Information
Karen Usdin, Email: karenu@nih.gov.
Flora Tassone, Email: ftassone@ucdavis.edu.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-022-14183-0.
References
- 1. Zhao X-N, Usdin K. The repeat expansion diseases: The dark side of DNA repair. DNA Repair. 2015;32:96–105. doi: 10.1016/j.dnarep.2015.04.019.
- 2. Rodriguez CM, et al. A native function for RAN translation and CGG repeats in regulating fragile X protein synthesis. Nat. Neurosci. 2020;23:386–397. doi: 10.1038/s41593-020-0590-1.
- 3. Rajaratnam A, et al. Fragile X syndrome and fragile X-associated disorders. F1000Research. 2017;6:2112. doi: 10.12688/f1000research.11885.1.
- 4. Hagerman RJ, et al. Fragile X-associated neuropsychiatric disorders (FXAND). Front. Psychiatry. 2018;9:564. doi: 10.3389/fpsyt.2018.00564.
- 5. Leehey MA, et al. FMR1 CGG repeat length predicts motor dysfunction in premutation carriers. Neurology. 2008;70:1397–1402. doi: 10.1212/01.wnl.0000281692.98200.f5.
- 6. Loesch DZ, et al. Psychological status in female carriers of premutation FMR1 allele showing a complex relationship with the size of CGG expansion. Clin. Genet. 2015;87:173–178. doi: 10.1111/cge.12347.
- 7. Hagerman RJ, Hagerman P. Fragile X-associated tremor/ataxia syndrome—features, mechanisms and management. Nat. Rev. Neurol. 2016;12:403–412. doi: 10.1038/nrneurol.2016.82.
- 8. Yrigollen CM, Tassone F, Durbin-Johnson B, Tassone F. The role of AGG interruptions in the transcription of FMR1 premutation alleles. PLoS One. 2011;6:e21728. doi: 10.1371/journal.pone.0021728.
- 9. Hagerman RJ, et al. Fragile X syndrome. Nat. Rev. Dis. Primers. 2017;3:1–19. doi: 10.1038/nrdp.2017.65.
- 10. Tassone F, et al. FMR1 CGG allele size and prevalence ascertained through newborn screening in the United States. Genome Med. 2012;4:100. doi: 10.1186/gm401.
- 11. Jacquemont S. Penetrance of the fragile X-Associated tremor/ataxia syndrome in a premutation carrier population. JAMA. 2004;291:460. doi: 10.1001/jama.291.4.460.
- 12. Allen EG, et al. Refining the risk for fragile X-associated primary ovarian insufficiency (FXPOI) by FMR1 CGG repeat size. Genet. Med. 2021;23:1648–1655. doi: 10.1038/s41436-021-01177-y.
- 13. Allen EG, et al. Examination of reproductive aging milestones among women who carry the FMR1 premutation. Hum. Reprod. 2007;22:2142–2152. doi: 10.1093/humrep/dem148.
- 14. Nolin SL, et al. Fragile X AGG analysis provides new risk predictions for 45–69 repeat alleles. Am. J. Med. Genet. A. 2013;161A:771–778. doi: 10.1002/ajmg.a.35833.
- 15. Pretto D, et al. Clinical and molecular implications of mosaicism in FMR1 full mutations. Front. Genet. 2014;5:318. doi: 10.3389/fgene.2014.00318.
- 16. Cohen IL, et al. Mosaicism for the FMR1 gene influences adaptive skills development in fragile X-affected males. Am. J. Med. Genet. 1996;64:365–369. doi: 10.1002/(SICI)1096-8628(19960809)64:2<365::AID-AJMG26>3.0.CO;2-C.
- 17. Dobkin CS, et al. Tissue differences in fragile X mosaics: Mosaicism in blood cells may differ greatly from skin. Am. J. Med. Genet. 1996;64:296–301. doi: 10.1002/(SICI)1096-8628(19960809)64:2<296::AID-AJMG13>3.0.CO;2-A.
- 18. Han X-D, Powell BR, Phalin JL, Chehab FF. Mosaicism for a full mutation, premutation, and deletion of the CGG repeats results in 22% FMRP and elevated FMR1 mRNA levels in a high-functioning fragile X male. Am. J. Med. Genet. A. 2006;140A:1463–1471. doi: 10.1002/ajmg.a.31291.
- 19. Jiraanont P, et al. Size and methylation mosaicism in males with Fragile X syndrome. Expert Rev. Mol. Diagn. 2017;17:1023–1032. doi: 10.1080/14737159.2017.1377612.
- 20. Field M, et al. Significantly elevated FMR1 mRNA and mosaicism for methylated premutation and full mutation alleles in two brothers with autism features referred for fragile X testing. Int. J. Mol. Sci. 2019;20:3907. doi: 10.3390/ijms20163907.
- 21. Saldarriaga W, González-Teshima LY, Forero-Forero JV, Tang H-T, Tassone F. Mosaicism in fragile X syndrome: A family case series. J. Intellect. Disabil. 1999 doi: 10.1177/1744629521995346.
- 22. Schmucker B, Seidel J. Mosaicism for a full mutation and a normal size allele in two fragile X males. Am. J. Med. Genet. 1999;84:221–225. doi: 10.1002/(SICI)1096-8628(19990528)84:3<221::AID-AJMG11>3.0.CO;2-M.
- 23. Mailick MR, et al. Health profiles of mosaic versus non-mosaic FMR1 premutation carrier mothers of children with fragile X syndrome. Front. Genet. 2018;9:173. doi: 10.3389/fgene.2018.00173.
- 24. Zhao X-N, et al. MutSβ generates both expansions and contractions in a mouse model of the Fragile X-associated disorders. Hum. Mol. Genet. 2015 doi: 10.1093/hmg/ddv408.
- 25. Lokanga RA, et al. Somatic expansion in mouse and human carriers of fragile X premutation alleles. Hum. Mutat. 2013;34:157–166. doi: 10.1002/humu.22177.
- 26. Møllersen L, Rowe AD, Larsen E, Rognes T, Klungland A. Continuous and periodic expansion of CAG repeats in Huntington’s disease R6/1 mice. PLoS Genet. 2010;6:e1001242. doi: 10.1371/journal.pgen.1001242.
- 27. Pinto RM, et al. Patterns of CAG repeat instability in the central nervous system and periphery in Huntington’s disease and in spinocerebellar ataxia type 1. Hum. Mol. Genet. 2020;29:2551–2567. doi: 10.1093/hmg/ddaa139.
- 28. Monckton DG, Wong LJ, Ashizawa T, Caskey CT. Somatic mosaicism, germline expansions, germline reversions and intergenerational reductions in myotonic dystrophy males: Small pool PCR analyses. Hum. Mol. Genet. 1995;4:1–8. doi: 10.1093/hmg/4.1.1.
- 29. Wong LJ, Ashizawa T, Monckton DG, Caskey CT, Richards CS. Somatic heterogeneity of the CTG repeat in myotonic dystrophy is age and size dependent. Am. J. Hum. Genet. 1995;56:114–122.
- 30. Ciosi M, et al. A genetic association study of glutamine-encoding DNA sequence structures, somatic CAG expansion, and DNA repair gene variants, with Huntington disease clinical outcomes. EBioMedicine. 2019;48:568–580. doi: 10.1016/j.ebiom.2019.09.020.
- 31. Kim K-H, et al. Genetic and functional analyses point to FAN1 as the source of multiple Huntington disease modifier effects. Am. J. Hum. Genet. 2020;107:96–110. doi: 10.1016/j.ajhg.2020.05.012.
- 32. Flower M. Genetic Variation in DNA Repair Proteins Modifies the Course of Huntington’s Disease. 2019.
- 33. Morales F, et al. A polymorphism in the MSH3 mismatch repair gene is associated with the levels of somatic instability of the expanded CTG repeat in the blood DNA of myotonic dystrophy type 1 patients. DNA Repair. 2016;40:57–66. doi: 10.1016/j.dnarep.2016.01.001.
- 34. Cumming SA, et al. De novo repeat interruptions are associated with reduced somatic instability and mild or absent clinical features in myotonic dystrophy type 1. Eur. J. Hum. Genet. 2018;26:1635–1647. doi: 10.1038/s41431-018-0156-9.
- 35. Morales F, et al. Longitudinal increases in somatic mosaicism of the expanded CTG repeat in myotonic dystrophy type 1 are associated with variation in age-at-onset. Hum. Mol. Genet. 2020;29:2496–2507. doi: 10.1093/hmg/ddaa123.
- 36. Veitch NJ, et al. Inherited CAG.CTG allele length is a major modifier of somatic mutation length variability in Huntington disease. DNA Repair. 2007;6:789–796. doi: 10.1016/j.dnarep.2007.01.002.
- 37. Bettencourt C, et al. DNA repair pathways underlie a common genetic mechanism modulating onset in polyglutamine diseases. Ann. Neurol. 2016;79:983–990. doi: 10.1002/ana.24656.
- 38. CAG repeat not polyglutamine length determines timing of Huntington’s disease onset. Cell. 2019;178:887–900.e14.
- 39. Tassone F, et al. CGG repeat length correlates with age of onset of motor signs of the fragile X-associated tremor/ataxia syndrome (FXTAS). Am. J. Med. Genet. B Neuropsychiatr. Genet. 2007;144B:566–569. doi: 10.1002/ajmg.b.30482.
- 40. Filipovic-Sadic S, et al. A novel FMR1 PCR method for the routine detection of low abundance expanded alleles and full mutations in fragile X syndrome. Clin. Chem. 2010;56:399–408. doi: 10.1373/clinchem.2009.136101.
- 41. Tassone F, Pan R, Amiri K, Taylor AK, Hagerman PJ. A rapid polymerase chain reaction-based screening method for identification of all expanded alleles of the fragile X (FMR1) gene in newborn and high-risk populations. J. Mol. Diagn. 2008;10:43–49. doi: 10.2353/jmoldx.2008.070073.
- 42. Tassone F, Hagerman RJ, Ikle DN, Dyer PN, Lampe M, Willemsen R, Oostra BA, Taylor AK. FMRP expression as a potential prognostic indicator in fragile X syndrome. Am. J. Med. Genet. 1999;84:250–261. doi: 10.1002/(SICI)1096-8628(19990528)84:3<250::AID-AJMG17>3.0.CO;2-4.
- 43. Hayward BE, Usdin K. Assays for determining repeat number, methylation status, and AGG interruptions in the fragile X-related disorders. Methods Mol. Biol. 2019;1942:49–59. doi: 10.1007/978-1-4939-9080-1_4.
- 44. Tassone F, et al. Elevated levels of FMR1 mRNA in carrier males: A new mechanism of involvement in the fragile-X syndrome. Am. J. Hum. Genet. 2000;66:6–15. doi: 10.1086/302720.
- 45. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning: with Applications in R. Springer Science & Business Media; 2013.
- 46. Yrigollen CM, et al. AGG interruptions within the maternal FMR1 gene reduce the risk of offspring with fragile X syndrome. Genet. Med. 2012;14:729–736. doi: 10.1038/gim.2012.34.
- 47. Lokanga RA, Zhao X-N, Usdin K. The mismatch repair protein MSH2 is rate limiting for repeat expansion in a fragile X premutation mouse model. Hum. Mutat. 2014;35:129–136. doi: 10.1002/humu.22464.
- 48. Zhao X, Zhang Y, Wilkins K, Edelmann W, Usdin K. MutLγ promotes repeat expansion in a Fragile X mouse model while EXO1 is protective. PLoS Genet. 2018;14:e1007719. doi: 10.1371/journal.pgen.1007719.
- 49. Zhao X, et al. Repeat instability in the Fragile X-related disorders: Lessons from a mouse model. Brain Sci. 2019;9:52. doi: 10.3390/brainsci9030052.
- 50. Nolin SL, et al. Expansions and contractions of the FMR1 CGG repeat in 5,508 transmissions of normal, intermediate, and premutation alleles. Am. J. Med. Genet. A. 2019;179:1148–1156. doi: 10.1002/ajmg.a.61165.
- 51. Yrigollen CM, et al. AGG interruptions and maternal age affect FMR1 CGG repeat allele stability during transmission. J. Neurodev. Disord. 2014;6:24. doi: 10.1186/1866-1955-6-24.
- 52. Latham GJ, Coppinger J, Hadd AG, Nolin SL. The role of AGG interruptions in fragile X repeat expansions: A twenty-year perspective. Front. Genet. 2014;5:244. doi: 10.3389/fgene.2014.00244.
- 53. Zhao X-N, Usdin K. FAN1 protects against repeat expansions in a Fragile X mouse model. DNA Repair. 2018;69:1. doi: 10.1016/j.dnarep.2018.07.001.
- 54. Zhao X, et al. Modifiers of somatic repeat instability in mouse models of Friedreich Ataxia and the Fragile X-related disorders: Implications for the Mechanism of somatic expansion in Huntington’s disease. J. Huntington’s Dis. 2021;10:149–163. doi: 10.3233/JHD-200423.
- 55. Ballester-Lopez A, et al. Preliminary findings on CTG expansion determination in different tissues from patients with myotonic dystrophy type 1. Genes. 2020;11:1321. doi: 10.3390/genes11111321.
- 56. Kacher R, et al. Propensity for somatic expansion increases over the course of life in Huntington disease. Elife. 2021;10:e64674. doi: 10.7554/eLife.64674.
- 57. Corrales E, et al. Analysis of mutational dynamics at the DMPK (CTG)n locus identifies saliva as a suitable DNA sample source for genetic analysis in myotonic dystrophy type 1. PLoS One. 2019;14:e0216407. doi: 10.1371/journal.pone.0216407.
- 58. Kennedy L, et al. Dramatic tissue-specific mutation length increases are an early molecular event in Huntington disease pathogenesis. Hum. Mol. Genet. 2003;12:3359–3367. doi: 10.1093/hmg/ddg352.