Molecular Genetics and Metabolism Reports. 2021 Nov 11;29:100820. doi: 10.1016/j.ymgmr.2021.100820

Long-read single molecule real-time (SMRT) sequencing of GBA1 locus in Gaucher disease national cohort from Argentina reveals high frequency of complex allele underlying severe skeletal phenotypes: Collaborative study from the Argentine Group for Diagnosis and Treatment of Gaucher Disease

Guillermo I Drelichman a, Nicolas Fernández Escobar a, Barbara C Soberon a, Nora F Basack a, Joaquin Frabasil b, Andrea B Schenone b, Gabriel Aguilar c, Maria S Larroudé c, James R Knight d, Dejian Zhao d, Jiapeng Ruan e, Pramod K Mistry e; Argentine Group for Diagnosis and Treatment of Gaucher Disease1
PMCID: PMC8600149  PMID: 34820281

Abstract

Gaucher disease is renowned for extreme phenotypic diversity that does not show consistent genotype/phenotype correlations. In Argentina, a national collaborative group, Grupo Argentino de Diagnóstico y Tratamiento de la Enfermedad de Gaucher (GADTEG), delineated uniformly severe type 1 Gaucher disease manifestations presenting in childhood with a large burden of irreversible skeletal disease. Here, using long-read single molecule real-time (SMRT) sequencing of the GBA1 locus, we show that the RecNciI allele is highly prevalent and is associated with severe skeletal manifestations with onset in childhood or young adulthood. Additionally, we describe novel GBA1 variants not previously reported.

Keywords: Gaucher disease, Bone disease, Mutation analysis, Genotype phenotype correlation

Abbreviations: GD, Gaucher disease; BD, bone disease; GL1, Glucosylceramide; GADTEG, The Argentine Group for Diagnosis and Treatment of Gaucher Disease (Grupo Argentino de Diagnóstico y Tratamiento de la Enfermedad de Gaucher); ERT, Enzyme replacement therapy

1. Introduction

Gaucher disease (GD) is a prototype lysosomal storage disease due to bi-allelic mutations in GBA1, which encodes lysosomal acid β-glucosidase (glucocerebrosidase, EC 3.2.1.45) [1]. Deficiency of acid β-glucosidase leads to a progressive accumulation of glucosylceramide (GlcCer) and glucosylsphingosine (GlcSph) in the lysosomes of myeloid cells, most prominently displayed by the macrophages [2,3]. Three broad phenotype categories have been classified based on the absence of (type 1, GD1: non-neuronopathic, OMIM # 230800), or the presence of, and severity of early onset neurodegenerative symptoms (type 2, GD2: acute neuronopathic form, OMIM # 230900; type 3, GD3: chronic neuronopathic forms, OMIM # 231000) [3,4]. In GD1, some patients develop neurodegeneration as adults manifesting as Parkinson's disease and Lewy Body Dementia [5].

GBA1, located on chromosome 1q21 of GRCh37/hg19 (1q22 in the latest version, GRCh38/hg38), comprises 11 exons and 10 introns spanning 7.6 kb of DNA. Notably, it lies in a highly gene-dense region that harbors seven genes and two pseudogenes within only 85 kb of DNA [6,7] (Fig. 1). A highly homologous pseudogene (GBAP1) lies 16 kb downstream of GBA1, with an exon and intron organization similar to that of GBA1 and spanning 5.7 kb. The assigned exons of GBAP1 share up to 98% sequence homology with the coding region of GBA1 [6]. Notably, GBAP1 harbors many mutations which, if present in GBA1, cause Gaucher disease. The genomic organization of the GBA locus results in a propensity for gene conversion events, which underlie numerous disease mutations and complex alleles involving GBA1 and GBAP1. These attributes of the GBA locus present significant challenges for accurate and comprehensive genotyping of patients in the clinic and in large cohort studies.

Fig. 1.


Location of GBA1 on Chromosome 1q21 with flanking genes, and the LR-PCR amplicons for SMRT sequencing. The 133 kb human GBA1 locus (GRCh38.p13, Chr1q22; NC_000001.11) consists of 15 genes: PKLR (pyruvate kinase L/R, Chr1:155,289,293..155,301,438. Length: 12,146 nt), HCN3 (hyperpolarization activated cyclic nucleotide gated potassium channel 3, complement Chr1:155,277,427..155,289,848. Length: 12,422 nt), CLK2 (CDC like kinase 2, Chr1:155,262,868..155,273,504. Length: 10,637 nt), SCAMP3 (secretory carrier membrane protein 3, Chr1:155,255,981..155,262,360. Length: 6,380 nt), FAM189B (family with sequence similarity 189 member B, Chr1:155,247,205..155,255,892. Length: 8,688 nt), GBA1 (glucosylceramidase beta, Chr1:155,234,452..155,244,627. Length: 10,176 nt), MTX1P1 (metaxin 1 pseudogene 1, complement Chr1:155,230,976..155,234,451. Length: 3,476 nt), GBAP1 (glucosylceramidase beta pseudogene 1, Chr1:155,213,825..155,227,534. Length: 13,710 nt), MTX1 (metaxin 1, complement Chr1:155,208,699..155,213,839. Length: 5,141 nt), THBS3 (thrombospondin 3, Chr1:155,195,588..155,209,180. Length: 13,593 nt), LOC (THBS3-AS1/LOC105371450, complement Chr1:155,196,035..155,200,571. Length: 4,537 nt), MIR92B (microRNA 92b, complement Chr1:155,195,177..155,195,272. Length: 96 nt), TRIM46 (tripartite motif containing 46, complement Chr1:155,173,381..155,184,971. Length: 11,591 nt), MUC1 (mucin 1, cell surface associated, Chr1:155,185,824..155,192,915. Length: 7,092 nt), KRTCAP2 (keratinocyte associated protein 2, Chr1:155,169,408..155,173,304. Length: 3,897 nt). The GBA1 pseudogene (GBAP1) is approximately 12 kb downstream of the GBA1 gene. Red bar, length and location of the six long-range (LR) SMRT amplicons used in this study (primers in Table 1). Lower panel, purified LR-PCR amplicons on 0.75% agarose gel.
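The lengths quoted in the Fig. 1 caption follow from the inclusive coordinate ranges (end minus start plus one). The short Python sketch below reproduces that arithmetic for four of the listed genes; the dictionary layout and the function name `span_length` are illustrative assumptions and are not part of the study.

```python
# Inclusive GRCh38 chr1 coordinates copied from the Fig. 1 caption.
GENES = {
    "PKLR": (155_289_293, 155_301_438),
    "GBA1": (155_234_452, 155_244_627),
    "MTX1P1": (155_230_976, 155_234_451),
    "GBAP1": (155_213_825, 155_227_534),
}


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive, 1-based interval."""
    return end - start + 1


lengths = {name: span_length(*coords) for name, coords in GENES.items()}
for name, length in lengths.items():
    print(f"{name}: {length:,} nt")
```

The computed values match the lengths given in the caption (for example, 10,176 nt for GBA1 and 13,710 nt for GBAP1).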

Currently, more than 600 GBA1 pathogenetic variants have been catalogued (HGMD professional 2020.4; CentoLSD™, https://www.centogene.com/centolsd.html). The most prevalent GBA1 variants are N370S (c.1226A > G; p.Asn409Ser), a founder mutation from Eastern Europe, and L444P (c.1448T > C; p.Leu483Pro), a variant in GBAP1 transferred to GBA1 by gene conversion events that occur recurrently in all populations of the world [7,8].

Delineation of genotype/phenotype correlations has been the focus of many studies. The N370S mutation is predictive of GD type 1 and is therefore neuroprotective against childhood-onset neurodegenerative disease (GD2 or GD3). However, it does not protect from the late-onset neurodegenerative diseases, Parkinson's disease and Lewy Body Dementia, seen in some patients with GD1. It accounts for more than 70% of pathogenetic variants in Ashkenazi Jewish patients with GD type 1, and ~30% of pathogenetic variants in European non-Jewish GD patients [9]. In contrast, homozygosity for the L444P mutation is strongly associated with GD3 and, when present in the context of a complex allele, L444P can be associated with the most severe forms of neuronopathic GD [8].

A wide spectrum of GBA1 mutations has been reported, including missense/nonsense, indels, gross deletions/insertions, duplications, alternative splicing, promoter elements, regulatory RNAs, and complex recombinant alleles. GD complex alleles arise from the high homology and physical proximity between GBA1 and GBAP1, which enable reciprocal as well as nonreciprocal homologous recombination events [10]. To date, more than 20 GBAP1-derived recombinant alleles have been reported, with variable recombination sites spanning intron 2 to exon 11 of GBA1. The most frequently encountered complex alleles are RecNciI and RecDelta55 (c.1263-1317Del55) [11]. The recombination events underlying the RecNciI mutation occur in the region from intron 9 to exon 10 and involve the incorporation of a GBAP1 segment that harbors three variants: L444P, A456P, and the silent change V460V [12]. Therefore, targeted NGS or Sanger sequencing analysis of only L444P (as occurs in many diagnostic panels that do not use whole-gene sequencing) can miss a complex allele, hindering optimal genetic care and confounding genotype/phenotype studies. Taken together, the complexity of the GBA1 locus confounds precise genotype assignment, which can hamper accurate assessment of individual patients and large cohort studies.
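To illustrate why testing only the L444P site can mislabel a complex allele, the following Python sketch contrasts a call made on a fully phased haplotype with a call made by a single-site assay. It is a simplified illustration: the helper names `classify_allele` and `l444p_only_assay` are hypothetical, the variant labels use the legacy nomenclature from the text, and real pipelines would work from aligned, phased reads rather than lists of variant names.

```python
# The three GBAP1-derived changes that together define the RecNciI
# complex allele, as described in the text (legacy nomenclature).
RECNCII_VARIANTS = frozenset({"L444P", "A456P", "V460V"})


def classify_allele(variants_in_cis):
    """Label one phased haplotype (hypothetical helper, for illustration)."""
    found = set(variants_in_cis)
    if RECNCII_VARIANTS <= found:
        return "RecNciI"
    if "L444P" in found:
        return "L444P"
    return "other"


def l444p_only_assay(variants_in_cis):
    """Mimic a targeted test that interrogates only the L444P site."""
    return "L444P" if "L444P" in variants_in_cis else "not detected"


haplotype = ["L444P", "A456P", "V460V"]
full_call = classify_allele(haplotype)        # sees all three changes in cis
targeted_call = l444p_only_assay(haplotype)   # sees only the L444P site
print(full_call, targeted_call)
```

The same haplotype is reported as RecNciI when all three changes are visible in cis, but as a simple L444P allele by the single-site assay.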

A major cause of morbidity and disability in GD1 is complex skeletal disease, manifesting as chronic unrelenting bone pain, avascular osteonecrosis, complex lytic bone lesions and fragility fractures. In Europe and the US, bone disease occurs in GD patients with a frequency of 50-60% [13,14]. In contrast, in the Argentinian GD population, bone involvement is more frequent at diagnosis (71%) and remains predominant after long-term follow-up (69.8%), despite enzyme replacement therapy (ERT) [15]. To understand which types of mutation are associated with bone disease in GD, comprehensive GBA1 analysis is necessary. Previous studies have examined prevalent individual pathogenetic variants and have in general shown that “N370S/other allele variant(s)” is associated with more severe skeletal disease [16]. Hitherto, ascertainment of GBA1 mutations in such studies has relied mostly on screening for common mutations rather than full gene sequencing. Therefore, the nature of the genotype/phenotype correlation with respect to bone disease in GD is not fully understood.

The Argentine Group for Diagnosis and Treatment of Gaucher Disease (Grupo Argentino de Diagnóstico y Tratamiento de la Enfermedad de Gaucher, GADTEG) was created in 2006. Our collaborative group is formed by ~70 physicians throughout Argentina, to monitor the phenotypic spectrum, natural history, and treatment outcomes in 300 patients across the country. We showed a striking prevalence of skeletal disease in our GD1 population [15]. Until 2017, only 30% of our patients underwent limited genotyping for common pathogenetic variants. Therefore, our cohort of GD patients offers a unique opportunity to understand the genetic contribution of GBA1 variants to the high burden of skeletal disease. Such studies are an essential prelude to unravelling putative modifier genes. Moreover, knowledge of the comprehensive genotype in our patients promises to advance precision medicine for optimal management of Gaucher disease.

2. Materials and methods

2.1. Patients

A total of 192 patients provided informed consent to participate in the study. All patients had enzymatically confirmed diagnoses of Gaucher disease in peripheral blood leucocytes. Patients have been longitudinally followed and comprehensively evaluated for hematological, visceral and bone disease indicators as described previously [14]. Response to ERT has also been carefully documented.

DNA samples were processed at Laboratorio “Dr. N. A. Chamoles” in Argentina and sequenced at the Yale Center for Genome Analysis (YCGA). PacBio long-read Single Molecule Real-Time (SMRT) GBA1 deep sequencing was developed using the GBA1-specific primers depicted in Fig. 1, covering the GBA1 gene from the 5'-UTR to the 3'-UTR while avoiding amplification of GBAP1.

To fully genotype GBA1, a total of six specific LR-PCR amplicons (5.7 to 8.15 kb, spanning 19.4 kb) were designed as shown in Fig. 1. Primers (Table 1) were optimized to yield similar amounts of each amplicon to enhance sequencing efficiency and loading capacity for SMRT sequencing. LR-PCR fragments were amplified from 100 ng of genomic DNA, using 1× PrimeSTAR GXL polymerase (R050B, Takara Bio USA Inc.) in a 25 μl PCR reaction volume with 200 nM of barcode-tagged primers. Initial denaturation was performed for 8 min at 98 °C, followed by 30 cycles of 10 s at 98 °C, 15 s at 60 °C, and 10 min at 68 °C. Final extension was 10 min at 68 °C. In some cases, 5 μl of 5 M betaine solution (B0300, Sigma-Aldrich USA) was added to the 25 μl LR-PCR reaction to increase PCR efficiency. Pooled PCR products were size selected, purified, and visually inspected on agarose gel. Equal amounts of pooled amplicons were used to generate SMRT libraries and sequenced on SMRT cells in the PacBio RS II system [17] according to the manufacturer's instructions. Briefly, DNA damage in the pooled amplicons was first repaired, followed by end-repair and A-tailing. Next, PacBio sequencing adaptors were ligated to the pooled amplicons, and the products were purified using AMPure PB beads. The final PacBio libraries were then annealed to the sequencing primer, bound to the polymerase, and loaded on the PacBio RS II for sequencing. SMRT phasing was then performed to resolve individual alleles. The genotype was validated in the original patient DNA sample by Sanger sequencing.
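As a rough planning aid, the cycling conditions above can be written down as data and summed. The Python sketch below does this for the SMRT LR-PCR program and also computes the amount of each primer per reaction; the variable and function names are illustrative assumptions, and ramping time between temperatures is ignored, so the real run time on a thermal cycler would be longer.

```python
# Cycling conditions as stated in the Methods (hold times in seconds).
INITIAL_DENATURATION_S = 8 * 60   # 8 min at 98 °C
CYCLE_STEPS_S = {
    "denature_98C": 10,           # 10 s at 98 °C
    "anneal_60C": 15,             # 15 s at 60 °C
    "extend_68C": 10 * 60,        # 10 min at 68 °C
}
N_CYCLES = 30
FINAL_EXTENSION_S = 10 * 60       # 10 min at 68 °C


def total_hold_time_s() -> int:
    """Sum of programmed hold times; temperature ramping is ignored."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + N_CYCLES * per_cycle + FINAL_EXTENSION_S


def primer_pmol(conc_nM: float, volume_ul: float) -> float:
    """Amount of primer in pmol (nM x microlitres gives fmol; /1000 gives pmol)."""
    return conc_nM * volume_ul / 1000


smrt_program_s = total_hold_time_s()
primer_amount = primer_pmol(200, 25)
print(f"Programmed hold time: {smrt_program_s} s ({smrt_program_s / 3600:.2f} h)")
print(f"Primer per 25 ul reaction at 200 nM: {primer_amount} pmol")
```

Under these assumptions the program holds for 19,830 s (about 5.5 h), dominated by the 10 min extension step repeated over 30 cycles.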

Table 1.

LR-PCR primers used to amplify the 1q21 GBA1 region for PacBio SMRT.

Primer ID | Primer sequence (universal tag + gene-specific sequence) | Amplicon size (bp)
F5704/F8155 | GGTAGGCGCTCTGTGTGCAGCtCGGGGTTGGGATTCGCACT |
R5704 | CCATCTCATATGTAGTACTCTtGATGTCCAGGGGCTGGCAA | 5704
R8155 | CCATCTCATATGTAGTACTCTtGATGTCCAGGGGCTGGCAA | 8155
F6242 | GGTAGGCGCTCTGTGTGCAGCgGCCACACCATGGACAGCTT |
R6242 | CCATCTCATATGTAGTACTCTtTGGGTCCTCCTTCGGGGTT | 6242
F5900 | GGTAGGCGCTCTGTGTGCAGCaGCAGATGTGTCCATTCTCCATGT |
R5900 | CCATCTCATATGTAGTACTCTtTGTCTCCATCCAGCGGGCA | 5900
F6746 | GGTAGGCGCTCTGTGTGCAGCgGTCCACTTTCTTGGCCGGA |
R6746 | CCATCTCATATGTAGTACTCTaACCTATTGCTATGAAAAGGAGCAG | 6746
F8077 | GGTAGGCGCTCTGTGTGCAGCgGACCGACTGGAACCTTGCC |
R8077 | CCATCTCATATGTAGTACTCTgCCAGCACACCCTTAGTGGG | 8077

Each primer has a 5-base padding sequence at the 5′-end (underlined in the original table), followed by the barcode sequence and the GBA1 gene-specific sequence (which starts at the lower-case nucleotide).
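Because the gene-specific part of each primer starts at the lower-case nucleotide, the two parts can be separated programmatically. The Python sketch below shows one way to do this for two primers copied from Table 1; the function name `split_primer` is a hypothetical helper, not something described by the authors.

```python
def split_primer(primer: str):
    """Split a tagged primer at its first lower-case base.

    Returns (tag, gene_specific), with the gene-specific part upper-cased.
    Raises ValueError if the primer contains no lower-case base.
    """
    for i, base in enumerate(primer):
        if base.islower():
            return primer[:i], primer[i:].upper()
    raise ValueError("no lower-case base marks the gene-specific start")


# Two primers copied from Table 1.
F6242 = "GGTAGGCGCTCTGTGTGCAGCgGCCACACCATGGACAGCTT"
R6242 = "CCATCTCATATGTAGTACTCTtTGGGTCCTCCTTCGGGGTT"

forward_tag, forward_specific = split_primer(F6242)
reverse_tag, reverse_specific = split_primer(R6242)
print("F6242:", forward_tag, forward_specific)
print("R6242:", reverse_tag, reverse_specific)
```

In Table 1, all forward primers share one 21-base tag and all reverse primers share another, so the same split applies to every row.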

For Sanger sequencing, four Sanger LR-PCR amplicons were designed (Table 2) covering the whole GBA1 (Table 1). The Sanger LR-PCR enzyme and conditions were the same as for PacBio LR-PCR, except that the extension time was decreased to 3 min. For each patient, four gel-purified Sanger LR-PCR fragments were prepared and stored individually at -20 °C. Each variant detected by SMRT was verified at least twice, from both the 5′ and 3′ directions, by LR-PCR-based Sanger sequencing.

Table 2.

LR-PCR primers used to amplify GBA1 gene for Sanger sequencing.

Primer ID | Primer sequence | Amplicon size (bp)
NA2568F | CCATCCTCTGGGATTTAGGAGC |
NA2568R | GAAGTCAGGGTCCAAAGAAAGGG | 2568
NB2664F | TGCATCCCTAAAAGCTTCGGCTA |
NB2664R | GGTGAGTACTGTTGGCGAGGG | 2664
NC2470F | CTCAAGACCAATGGAGCGGT |
NC2470R | TCGACAAAGTTACGCACCCA | 2470
C1600F | CTTCCTGCAAAGCAGACCTCA |
C1600R | TTGGGCCCAGCTTTCCTAGTC | 1600

Genotyping results were evaluated for correlation with skeletal phenotype, and our genotype distribution was compared with that reported in the Gaucher International Registry data (International Collaborative Gaucher Group, ICGG, https://clinicaltrials.gov/NCT00358943).

2.2. Statistical methods

Qualitative variables are expressed as frequency and percentage, while quantitative variables are expressed as mean, median, minimum, and maximum. For univariate analysis, contingency tables were evaluated using the chi-square test or Fisher's exact test, as appropriate. Yates' continuity correction was applied to 2 × 2 contingency tables. Logistic regression models were used for multivariate analysis. A significance level (alpha) of 0.05 was used.
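For readers unfamiliar with the continuity correction, the Python sketch below computes a Yates-corrected chi-square statistic and its p-value for a single 2 × 2 table using only the standard library. The counts are invented purely for illustration and are not data from this study, and the function name `yates_chi2_2x2` is an assumption; the authors do not state which software they used.

```python
import math


def yates_chi2_2x2(a: int, b: int, c: int, d: int):
    """Yates-corrected chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Returns (chi2, p_value). With 1 degree of freedom, the upper-tail
    chi-square probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (NOT study data): rows are two genotype groups,
# columns are bone lesions present / absent.
chi2, p_value = yates_chi2_2x2(10, 20, 20, 10)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

For tables with small expected counts, Fisher's exact test would be the usual alternative, as the text notes.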

In the multivariate statistical analysis we included 4 variables: RecNciI allele, RecNciI/N370S genotype, RecNciI/other genotype and Argentine ancestors.

In addition, we fitted two models, each with a different dependent variable:

  • MODEL I: less severe bone manifestations during follow-up, i.e., bone marrow infiltration and Erlenmeyer flask deformity (EFD).

  • MODEL II: severe bone manifestations during follow-up, i.e., acute and/or chronic avascular necrosis and bone marrow infarcts.

In both models, the RecNciI/N370S genotype and heterozygous carriage of a RecNciI allele were significantly associated with the dependent variable (model I, p = 0.017; model II, p = 0.004). Argentine ancestors were not a significant variable (p = 0.059) in either of the two models.
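The kind of model described above can be sketched in a few lines of Python. The example below fits a logistic regression by plain gradient ascent using only the standard library, with a single binary predictor (carrier status) and invented counts; it is not the authors' analysis, which used four variables and unspecified software, and the function name `fit_logistic` is an assumption.

```python
import math


def fit_logistic(X, y, lr=0.5, n_iter=5000):
    """Fit logistic regression by gradient ascent on the mean log-likelihood.

    X: list of feature lists (without intercept); y: list of 0/1 outcomes.
    Returns [intercept, coef_1, ..., coef_k].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            resid = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Hypothetical data (NOT study data): 40 carriers, 30 with severe lesions,
# and 40 non-carriers, 15 with severe lesions.
X = [[1]] * 40 + [[0]] * 40
y = [1] * 30 + [0] * 10 + [1] * 15 + [0] * 25

intercept, coef = fit_logistic(X, y)
odds_ratio = math.exp(coef)
print(f"odds ratio for carrying the allele: {odds_ratio:.2f}")
```

With one binary predictor, the fitted odds ratio coincides with the cross-product ratio of the 2 × 2 table, here (30/10)/(15/25) = 5; adding further columns to `X` would extend the sketch to several variables.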

3. Results

Of a total of 192 samples, 146 (76%) were successfully genotyped by SMRT sequencing. In 46 samples, full GBA1 genotype could not be ascertained in the first round of PacBio SMRT analysis. As the key objective of our study was delineating the comprehensive GBA1 genotype of our cohort, we removed these 46 samples from further analysis. Separately, we are conducting studies to optimize our sequencing strategies to overcome this limitation. All genotypes assigned by SMRT were confirmed by Sanger sequencing in the original gDNA sample.

3.1. Pathogenetic variants

The most frequent Gaucher disease mutation in the Argentine cohort is N370S, with 126 patients (86.3%) harboring at least one N370S allele. Notably, this frequency is similar to that reported in the ICGG (June 2014) [13] for Ashkenazi Jewish and non-Jewish European populations (Table 3).

Table 3.

N370S allele frequency in Argentina compared with other regions worldwide (ICGG 2014) [13].

Region | Argentina | Europe1 | JAPAC2 | Latin America3 (without Argentina) | Middle East4 | North America5 | Total (without Argentina)
Total genotyped | 146 | 971 | 135 | 497 | 697 | 1854 | 4156
1 N370S allele (source: ICGG 2014) | 86.3% | 74.8% | 4.4% | 79% | 84.6% | 83.8% | 79.2%

1Europe: Albania, Austria, Balearic Islands, Belgium, Bulgaria, Czech Republic, Denmark, England, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Lithuania, Netherlands, Norway, Poland, Portugal, Romania, Russia, Serbia, Slovenia, Spain, Switzerland, Turkey, and Ukraine. 2JAPAC: China, Hong Kong, India, Japan, Korea, Malaysia, Philippines, Taiwan, and Thailand. 3Latin America: Bolivia, Brazil, Chile, Colombia, Costa Rica, Dominican Republic, Ecuador, Guatemala, Mexico (n = 13; 1.4%), Panama, Paraguay, Peru, Suriname, Uruguay, and Venezuela. 4Middle East: Egypt, Israel, Jordan, Kuwait, Oman, Saudi Arabia, and United Arab Emirates. 5North America: Canada and United States.

The second most frequent allele was RecNciI, with 77 patients (52.7%) harboring this complex mutation (Table 4). Notably, this frequency of RecNciI is the highest reported in the literature, including a previous small study from Argentina [18,19,20,21,22,23].

Table 4.

Comparison of RecNciI allele frequency in Argentina and other regions.

Country | Argentina [15] | Egypt [16] | Spain [17] | Argentina [18] | India [19] | Brazil [20]
Total genotyped | 146 | 26 | 193 | 31 | 22 | 58
1 RecNciI allele | 52.7% | 13.4% | 0.7% | 21% | 7% | 15.5%

3.2. Genotypes

We retained the older mutation nomenclature to facilitate comparison with past studies, i.e., N370S for p.Asn409Ser and L444P for p.Leu483Pro, respectively. The most frequent genotype in our population was N370S/RecNciI. This compound heterozygous genotype occurred in 68 patients (46.6%), the highest frequency of N370S/RecNciI reported in the literature; by comparison, it accounts for only 1.9% of genotypes in the International Gaucher Registry (p = 0.001) (Table 5). Interestingly, 14 patients (9.6%) were homozygous for the N370S mutation. An equally prevalent genotype was N370S/L444P, with 14 patients (9.6%) harboring this compound heterozygous genotype. This is similar to other regions worldwide. In 14 patients (9.6%), we found multiple rare or novel mutations (Table 6); in 10 patients, the rare mutation was in compound heterozygous state with N370S and in only one patient with L444P. One patient was homozygous for the F411I mutation, and 6 patients were heterozygous for F411I.

Table 5.

Genotype frequency in Argentina compared with other regions of the world (ICGG 2019)12.

Region | Argentina | Europe1 | JAPAC2 | Latin America3 (without Argentina) | Middle East4 | North America5 | Oceania6 | Total (without Argentina)
Total genotyped | 146 | 1345 | 139 | 284 | 921 | 2146 | 42 | 4877
N370S/RecNciI (source: ICGG 2019) | 46.6% | 4.8% | 0% | 2.8% | 0.5% | 0.7% | 4.8% | 1.9%
N370S/L444P (source: ICGG 2019) | 9.6% | 17.8% | 2.9% | 26.8% | 4.7% | 13.4% | 19% | 13.8%

1Europe: Albania, Austria, Balearic Islands, Belgium, Bulgaria, Czech Republic, Denmark, England, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Lithuania, Netherlands, Norway, Poland, Portugal, Romania, Russia, Serbia, Slovenia, Spain, Switzerland, Turkey, and Ukraine. 2JAPAC: China, Hong Kong, India, Japan, Korea, Malaysia, Philippines, Taiwan, and Thailand. 3Latin America: Bolivia, Brazil, Chile, Colombia, Costa Rica, Dominican Republic, Ecuador, Guatemala, Mexico (n = 13; 1.4%), Panama, Paraguay, Peru, Suriname, Uruguay, and Venezuela. 4Middle East: Egypt, Israel, Jordan, Kuwait, Oman, Saudi Arabia, and United Arab Emirates. 5North America: Canada and United States. 6Oceania: Australia and New Zealand.

Table 6.

Summary of GBA1 genotypes in Argentine GD national cohort.

Genotype (n = 146) | Frequency
RecNciI/N370S | 68 (46.6%)
N370S/N370S | 14 (9.6%)
New (see Table 10) | 14 (9.6%)
RecNciI/F411I | 9 (6.2%)
L444P/N370S | 9 (6.2%)
F411I/F411I | 6 (4.1%)
F411I/N370S | 4 (2.7%)
R285C/N370S | 2 (1.4%)
L444P/R496H | 2 (1.4%)
H255Q-D409H/F411I | 2 (1.4%)
F411I/R48W | 2 (1.4%)
G202R/N370S | 2 (1.4%)
G195W/N370S | 2 (1.4%)
H255Q-D409H/N370S | 1 (0.7%)
R120Q/N370S | 1 (0.7%)
V394L/N370S | 1 (0.7%)
R163X/R463C | 1 (0.7%)
I161N/N370S | 1 (0.7%)
L371V/L444P | 1 (0.7%)
I260T/N370S | 1 (0.7%)
F411I/E233X | 1 (0.7%)
F397S/N370S | 1 (0.7%)
F411I/Y135C | 1 (0.7%)
Total | 146
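The counts in Table 6 can be tallied to check the reported totals and percentages. The Python sketch below copies the genotype counts from the table, confirms that they sum to 146, and recomputes the proportion of patients carrying at least one RecNciI allele; the dictionary and variable names are illustrative assumptions.

```python
# Genotype counts copied from Table 6 (n = 146).
GENOTYPE_COUNTS = {
    "RecNciI/N370S": 68, "N370S/N370S": 14, "New": 14,
    "RecNciI/F411I": 9, "L444P/N370S": 9, "F411I/F411I": 6,
    "F411I/N370S": 4, "R285C/N370S": 2, "L444P/R496H": 2,
    "H255Q-D409H/F411I": 2, "F411I/R48W": 2, "G202R/N370S": 2,
    "G195W/N370S": 2, "H255Q-D409H/N370S": 1, "R120Q/N370S": 1,
    "V394L/N370S": 1, "R163X/R463C": 1, "I161N/N370S": 1,
    "L371V/L444P": 1, "I260T/N370S": 1, "F411I/E233X": 1,
    "F397S/N370S": 1, "F411I/Y135C": 1,
}

total = sum(GENOTYPE_COUNTS.values())

# Patients whose genotype includes RecNciI on at least one allele.
recncii_carriers = sum(
    n for genotype, n in GENOTYPE_COUNTS.items()
    if "RecNciI" in genotype.split("/")
)

top_genotype_pct = round(100 * GENOTYPE_COUNTS["RecNciI/N370S"] / total, 1)
carrier_pct = round(100 * recncii_carriers / total, 1)
print(f"total = {total}, RecNciI carriers = {recncii_carriers} ({carrier_pct}%)")
print(f"RecNciI/N370S = {top_genotype_pct}%")
```

The tally reproduces the 77 RecNciI carriers (52.7%) and the 46.6% frequency of the RecNciI/N370S genotype reported in the text.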

3.3. Clinical characteristics of N370S/RecNciI genotype patients at diagnosis

The finding of the high prevalence of the N370S/RecNciI genotype in our cohort offered an unprecedented opportunity to understand the phenotypic spectrum associated with this combination of mutations. First, all patients had type 1 GD, underscoring the notion that the presence of at least one N370S mutation predicts GD1 and that it is neuroprotective against neuronopathic GD [24]. Patients with the N370S/RecNciI genotype had severe disease, indicated by onset of symptoms at a mean age of 8.9 years; it was not until a mean age of 21.4 years that they were diagnosed with Gaucher disease, and they started ERT at a mean age of 28.1 years. These results reveal a wide interval between onset of symptoms in the pediatric age group and diagnosis in adulthood and, moreover, a significant delay from diagnosis to initiation of ERT in a population that is severely affected. Notably, patients with this genotype had severe visceral and bone disease at diagnosis as well as during follow-up. The numbers of patients with other genotypes were too small to conduct meaningful comparisons.

Bone involvement is the most disabling and incapacitating complication of GD1. The major goal of treatment for Gaucher disease is to improve bone health by preventing irreversible complications such as avascular necrosis and fragility fractures, as well as to ameliorate chronic bone pain. In our cohort of N370S/RecNciI Gaucher disease patients, we found a higher prevalence of skeletal disease at diagnosis (71%) as well as during follow-up on ERT (69.8%), compared to the ICGG Registry, which shows bone disease in 59.8% of patients. Moreover, there was a major diagnostic delay, with onset of symptoms in childhood and diagnosis in adulthood. These findings underscore the large burden of irreversible bone complications, such as avascular osteonecrosis, at initiation of treatment (Table 7).

Table 7.

Bone manifestations at diagnosis and follow-up in the Argentine population vs. world population (Source: ICGG 2014) [13].

Manifestation | Argentina, baseline | ICGG, baseline | Argentina, follow-up | ICGG, follow-up
Patients, N | 260 | 4625 | 266 | 4625
Bone pain | 65.6% | 49.8% | 41% | 37.8%
Bone crisis | 35.6% | 13.6% | 14.5% | 2.6%
Infiltration | 91% | 87.3% | 89.5% | 80.8%
Erlenmeyer | 89.7% | 73.1% | 87.7% | 71.1%
Infarcts | 60% | 43.8% | 59.6% | 48.9%
Necrosis | 43.6% | 32% | 42.7% | 28%
% Total of bone disease | 71% | 60.8% | 69.8% | 58.8%

3.3.1. N370S/RecNciI genotype and bone manifestations

Comparing the presence of bone manifestations in our RecNciI/N370S genotype patients with those reported by the ICGG (June 2019), we observed that bone manifestations were significantly more frequent in our population not only at diagnosis (61% vs. 38%, p = 0.001), but also after long-term follow-up (median 154 months) (70% vs. 57%, p = 0.001).

3.3.2. Univariate analysis

The presence of bone lesions at diagnosis and follow-up (mean duration of ERT 154 months) was compared in N370S/RecNciI and F411I/RecNciI patients (n = 77) and those with other genotypes (n = 69). The prevalence of bone lesions at diagnosis did not differ between the two genotype groups (p = 0.183). At follow-up, bone lesions were significantly more frequent among patients harboring the RecNciI allele (p = 0.017). Additionally, at follow-up on ERT, both genotypes (RecNciI/N370S and RecNciI/F411I) were significantly associated with bone lesions (p = 0.017 and p = 0.001, respectively). Rates of splenectomy in different genotypes were similar, ~9% (data not shown).

3.3.3. Argentine ancestor analyses in patients harboring RecNciI allele

The preponderance of the RecNciI allele in nearly 50% of our national cohort is the highest of any cohort described to date. We examined the origin of patients using questionnaires designed by experts in population genetics, based on the origins of the families to the birthplace of the third generation. Of the respondents, approximately 70% (n = 56) of patients reported at least one Argentine ancestor and 49% (n = 39) had two ancestors from Argentina in one paternal and/or maternal lineage.

3.3.4. Multivariate analysis

Variables included were: 1) RecNciI allele, 2) N370S/RecNciI genotype, 3) RecNciI/other genotype, and 4) Argentine ancestors. Taking as dependent variables solitary bone lesion at follow-up (model I) and severe multiple lesions at follow-up (model II), we found, in both models, that the N370S/RecNciI genotype and the presence of a RecNciI allele were significantly associated with severe bone lesions at follow-up (p = 0.017) as well as with irreversible severe lesions (necrosis and acute and chronic infarcts) (p = 0.004). Argentine ancestors were not a significant variable (p = 0.059) in either of the two models.

Interestingly, we found a significant correlation between the presence of a RecNciI allele and the presence of one Argentine ancestor (p < 0.001) or two Argentine ancestors (p < 0.001) (Table 8). Notably, there was also a significant correlation between the N370S/RecNciI genotype and at least one Argentine ancestor (p = 0.001) or two Argentine ancestors (p = 0.001) (Table 9). There was no significant correlation between history of Argentine ancestors and bone lesions at diagnosis (p = 0.470) or at follow-up (p = 0.549).

Table 8.

Association between RecNciI allele and Argentine ancestor.

Group | Without association | With association | P value
One Argentine ancestor, N = 80 | 24 (30%) | 56 (70%) | <0.001
Two Argentine ancestors, N = 50 | 11 (22%) | 39 (48.8%) | <0.001

Table 9.

RecNciI/N370S genotype and Argentine ancestors.

Group | Without association | With association | P value
One Argentine ancestor, N = 80 | 20 (29.4%) | 48 (70.6%) | 0.001
Two Argentine ancestors, N = 50 | 17 (34%) | 33 (66%) | 0.001

3.3.5. Novel GD pathogenetic variants

Using our comprehensive genotyping strategy, we found rare GD pathogenetic variants in 14 patients (9.6%) that have not been previously reported. We assessed their damaging properties in silico, as shown in Table 10. The phenotype of patients with these pathogenetic variants is severe, with early childhood presentation (mean age at diagnosis: 6.9 years) and almost 64% prevalence of advanced bone disease (Table 10).

Table 10.

Novel Pathogenic variants found in PacBio Argentina patients.

Novel variant | Number in GD | ArgGD ID | PolyPhen score | Position/change | Clinical significance | Genotype
L-17SfsX36 | 1 | #31 |  | 1:155210462 (Del:AG > A) | Pathogenic | L-17SfsX36/N370S
R48GfsX4 (c.259DelC) | 1 | #186 |  | 1:155209724 (Del:CG > C) | Pathogenic | R48GfsX4 (c.259DelC)/N370S
D218A | 1 | #178 | 0.892 | 1:155207361 (T > G) | Uncertain | L444P/D218A
P332L | 1 | #147 | 1 | 1:155206148 (G > A) | Uncertain | P332L/N370S
W348R | 1 | #10 | 0.863 | rs765182863/1:155206101 (A > T) | Pathogenic | L444P/W348R
L372P | 1 | #59 | 0.987 | 1:155205628 (A > G) | Pathogenic | F411I/L372P
Y313H + V375L G > C | 3 | #86, #153, #154 | 0.795 | rs398123528/1:155205620 (C > G) | Pathogenic | N370S/Y313H + V375L G > C
P401R | 1 | #93 | 1 | (rs74598136)/1:155205541 (G > C) | Pathogenic | P401R/N370S
S424R | 1 | #190 | 1 | 1:155205102 (G > C) | Uncertain | S424R/N370S
F426S | 1 | #58 | 1 | 1:155205097 (A > G) | Uncertain | F426S/N370S
S488IfsX38 | 1 | #165 |  | 1:155204819 (Ins:G > GTAGC) | Pathogenic | S488IfsX38/N370S
L461P | 1 | #105 | 0.998 | 1:155204992 (A > G) | Uncertain | L461P/N370S

4. Discussion

Comprehensive GBA1 genotyping remains challenging [10,11,25,26], owing to the vast multiplicity of disease mutations and to a highly homologous pseudogene in close proximity that harbors numerous disease mutations. Several variants normally present in the GBAP1 sequence generally cause severe forms of GD when they occur in GBA1, e.g., L444P, complex alleles with multiple pseudogene variants in tandem, D409H, and the RecDelta55 deletion. The occurrence of such disease mutations around the world has been attributed to the propensity of GBA1 for gene conversion events. Several approaches have been developed to analyze the coding regions of GBA1 by Sanger sequencing of amplicons generated using primers designed to exploit the limited differences between the GBA1 and GBAP1 sequences. While the Sanger sequencing approach is adequate for most clinical applications, its short reads (500 bp to 1000 bp) and low throughput may hinder its application to refined genetic counselling and to the study of large cohorts. Moreover, this standard approach does not allow phasing of the pathogenic variants. Next generation sequencing (NGS) platforms [27], which use short read lengths (25 bp to 400 bp), are also inadequate for the precise identification of GBA1 variants, especially complex alleles and structural variants (SVs) [11,26].

Third-generation sequencing (TGS) technologies display the capability to produce read lengths higher than 10,000 bp (typically 5,000 bp to 20,000 bp), offering the advantage of identifying repetitive and complex genome regions. Superior to short-read sequencing and arrays, long-read sequencing technologies have now reached a level of accuracy and yield that allows their application to variant detection on a scale of tens to thousands of samples [28]. Therefore, TGS is a promising tool for identifying recombinant alleles, SVs and phasing bi-allelic GD mutations.

However, TGS using Nanopore [29] or PacBio [30] reportedly yields reads with low accuracy. So far, only one algorithm has been designed to exploit GBA1 long-read data; it uses Nanopore technology and was evaluated against Sanger sequencing results in a relatively small cohort [31].

A high proportion of the RecNciI complex allele has been noticed in an earlier Argentinian GD report [21]. Our results confirmed the superiority of SMRT to NGS in detecting the RecNciI allele, and to Sanger sequencing in scaled-up bi-allelic sequencing of the whole GBA1 gene.

In the Argentinian GD cohort, bone disease is highly prevalent. Advanced bone lesions are frequent both at diagnosis (71%) and at follow-up (69.8%), underscoring the irreversible nature of bone disease such as osteonecrosis. There was a large gap between onset of symptoms in childhood and initiation of treatment in young adulthood, which may contribute to this finding. Previous studies have demonstrated that the maximal impact of treatment in preventing irreversible bone lesions occurs when ERT is initiated within 2 years of diagnosis, compared to when there is a longer interval between diagnosis and starting ERT [32,33]. There are no published descriptions of a comprehensive genotype/phenotype correlation regarding bone manifestations in GD patients; hence our cohort provides valuable insight.

We found that the high frequency of advanced bone lesions in Argentinian GD patients is correlated with an unprecedentedly high frequency of the N370S/RecNciI genotype. Additional contributors to advanced bone disease in the Argentinian GD phenotype include prolonged intervals between childhood onset of disease and diagnosis, and long gaps between diagnosis and initiation of ERT in adulthood, which likely promote irreversible bone lesions such as osteonecrosis and bone marrow fibrosis around focal collections of lipid-laden macrophages, ‘Gaucheromas’. ERT targets tissue macrophages via the mannose receptor and exhibits very high uptake by the liver and spleen, hindering delivery to the bone marrow compartment [34,35].

In 2016, the GADTEG identified five unfavorable prognostic factors for advanced bone manifestations of Gaucher disease, namely history of splenectomy, diagnostic delay, a long gap between diagnosis and initiation of ERT, poor compliance, and suboptimal dose of ERT [15].

Our study found that the Argentinian GD population harbors the highest burden of the RecNciI GBA1 mutation ever reported, accounting for 52.7% of all disease alleles. This is higher than the frequencies reported in the ICGG Registry and in a smaller cohort study from Argentina (1.9% and 21%, respectively) [21].

The high prevalence of the RecNciI allele found in our study in Argentina is the most accurate ascertainment to date, enabled by the SMRT strategy to decipher the GBA locus. Previous studies reporting a lower prevalence of this disease allele may have underestimated it because they screened only for common variants.

There are several important findings in our study. First, our patient population harbors a high prevalence of the RecNciI GBA1 complex allele, which is associated with severe early childhood onset of GD1. Second, there are large gaps between the onset of GD1 symptoms, diagnosis, and initiation of ERT. Therefore, by the time ERT is initiated, there is already a high burden of irreversible bone manifestations, attenuating the impact of therapy in the skeletal compartment. Third, the high prevalence of the RecNciI mutation raises the possibility that the gene conversion event(s) giving rise to this variant disrupt another gene in the tightly packed GBA locus, which in turn could affect the skeletal phenotype. This topic merits further investigation for a potential candidate gene for severe skeletal manifestations. Finally, we found that about 10% of disease variants had not been described before, thus contributing to the catalogue of GBA1 mutations.

It is intriguing that we found a strong association of the N370S/RecNciI genotype with Argentinian ancestry. It seems likely that the RecNciI mutation is a founder mutation in the Argentine GD patient population. The Argentinian population is generally said to be made up largely of immigrants, with only a minor proportion having Argentinian ancestors. However, in 2012, Avena et al. [36] demonstrated, after analyzing national surveys and census data, that 65% of Argentinians have European ancestors, 31% have Argentinian ancestors, and 4% have African ancestors. These frequencies vary by geographical region; for example, the proportion of indigenous Americans is higher in the northwest (60%) and south (50%) [12,13]. Our analysis showed that 54.8% of GD patients have at least one Argentinian ancestor, and 30.5% have two ancestors in any paternal/maternal lineage.

Univariate and multivariate analyses showed a statistically significant correlation between the presence of the RecNciI allele and at least one Argentine ancestor (p < 0.001) or two Argentinian ancestors (p < 0.001). This demonstrates a strong relationship between the presence of Argentine ancestors and the RecNciI allele as well as the RecNciI/N370S genotype (p = 0.001). This high proportion of patients with Argentinian ancestors may explain the high frequency of the RecNciI allele.
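
As an illustration of the kind of univariate test that can yield such p-values, the following minimal Python sketch computes a two-sided Fisher exact test on a 2×2 table of RecNciI carrier status against Argentine ancestry. The specific test used in this study is not stated here, so the choice of Fisher's exact test, the function name `fisher_exact_two_sided`, and the counts in `example_table` are all assumptions; the counts are hypothetical placeholders and are not cohort data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: RecNciI carrier / non-carrier (hypothetical layout).
    Columns: at least one Argentine ancestor / none.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of a table whose top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table at least as unlikely as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical placeholder counts, NOT taken from the study.
example_table = (30, 10, 10, 30)
p_value = fisher_exact_two_sided(*example_table)
print(f"two-sided p = {p_value:.2e}")
```

With real data, the four cells would be filled from the ancestry questionnaire and the genotyping results; a multivariate analysis would require a regression model instead of this single-table test.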

A limitation of our study was that, of a total of 192 samples, 146 (76%) were successfully genotyped by SMRT sequencing but for 46 samples full genotypes could not be ascertained. However, having 146 patients comprehensively genotyped enabled a robust genotype/phenotype correlation study. We are currently optimizing and adapting our strategy to achieve successful genotyping in all samples.
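
The sample counts above, together with the 52.7% RecNciI allele frequency reported earlier, can be cross-checked with simple arithmetic. The short Python sketch below does this under one explicit assumption: that each of the 146 fully genotyped patients contributes two disease alleles to the denominator. The implied allele count is a back-calculation for illustration and is not a figure stated in the text.

```python
total_samples = 192   # reported in the text
genotyped = 146       # reported in the text

not_genotyped = total_samples - genotyped
genotyping_yield_pct = 100 * genotyped / total_samples

# Assumption: two disease alleles per fully genotyped patient.
disease_alleles = 2 * genotyped

recnci_allele_pct = 52.7  # reported RecNciI share of disease alleles
# Back-calculated (not reported) number of RecNciI alleles.
implied_recnci_alleles = round(disease_alleles * recnci_allele_pct / 100)

print(f"samples without a full genotype: {not_genotyped}")
print(f"genotyping yield: {genotyping_yield_pct:.1f}%")
print(f"implied RecNciI alleles: {implied_recnci_alleles} of {disease_alleles}")
```

If the implied count, divided by the assumed denominator, rounds back to the reported percentage, the denominator assumption is at least consistent with the published figure.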

5. Conclusions

Complete GBA1 gene sequencing using long-read SMRT plus short-read validation sequencing provides rapid, scaled-up mutation analysis, with essential information that cannot be obtained by Sanger or NGS sequencing alone. In our population, a high frequency of the RecNciI allele (52.7%) and of the RecNciI/N370S genotype (46.6%) was detected, data not captured in the international registry. Using a questionnaire to assess familial origin, we show that 54.8% of Argentine GD patients have at least one Argentine ancestor and 30.5% have two. Moreover, there was a statistically significant correlation between the presence of a RecNciI allele and one Argentine ancestor (p < 0.001) or two Argentine ancestors (p < 0.001). Our study indicates that the RecNciI allele and RecNciI/N370S genotype are significantly associated with severe bone manifestations at presentation and during follow-up on ERT (p = 0.017).

Funding

This study was supported by Yale Center of Genome Analysis (YCGA) and by Sanofi.

Acknowledgements

We express our deep gratitude to our patients for their participation in this study.

References

  • 1. Beutler E., Grabowski G. In: The Metabolic and Molecular Bases of Inherited Disease. Scriver C., Beaudet A., Sly W., Valle D., editors. McGraw Hill; New York: 2001. The mucopolysaccharidoses; pp. 3635–3668.
  • 2. Grabowski G.A. Gaucher disease and other storage disorders. Hematology Am Soc Hematol Educ Program. 2012;2012:13–18. doi: 10.1182/asheducation-2012.1.13.
  • 3. Drelichman G., Fernández Escobar N., Basack N., et al. 2015. Actualización del consenso argentino de enfermedad de gaucher: grupo argentino para el diagnóstico y tratamiento de la enfermedad de gaucher. HEMATOLOGÍA Volumen 19 Suplemento Enfermedad de Gaucher; pp. 4–51.
  • 4. Murugesan V., Chuang W.L., Liu J., Lischuk A., Kacena K., Lin H., Pastores G.M., Yang R., Keutzer J., Zhang K., Mistry P.K. Glucosylsphingosine is a key biomarker of Gaucher disease. Am. J. Hematol. 2016;11:1082–1089. doi: 10.1002/ajh.24491.
  • 5. Bultron G., Kacena K., Pearson D., Boxer M., Yang R., Sathe S., Pastores G., Mistry P.K. The risk of Parkinson’s disease in type 1 Gaucher disease. J. Inherit. Metab. Dis. 2010;33(2):167–173. doi: 10.1007/s10545-010-9055-0.
  • 6. Horowitz M., Wilder S., Horowitz Z., Reiner O., Gelbart T., Beutler E. The human glucocerebrosidase gene and pseudogene: structure and evolution. Genomics. 1989;4(1):87–96. doi: 10.1016/0888-7543(89)90319-4.
  • 7. Winfield S.L., Tayebi N., Martin B.M., Ginns E.I., Sidransky E. Identification of three additional genes contiguous to the glucocerebrosidase locus on chromosome 1q21: implications for Gaucher disease. Genome Res. 1997;7:1020–1026. doi: 10.1101/gr.7.10.1020.
  • 8. Koprivica V., Stone D.L., Park J.K., Callahan M., Frisch A., Cohen I.J., Tayebi N., Sidransky E. Analysis and classification of 304 mutant alleles in patients with type 1 and type 3 Gaucher disease. Am. J. Hum. Genet. 2000;66(6):1777–1786. doi: 10.1086/302925.
  • 9. Vanier T., Froissart R. In: Advances in Gaucher Disease: Basic and Clinical Perspectives. Grabowski G.A., 1st, editors. Future Medicine; London: 2013. Diagnostic testing (enzyme and mutational analysis) pp. 170–181.
  • 10. Tayebi N., Stubblefield B.K., Park J.K., Orvisky E., Walker J.M., LaMarca M.E., Sidransky E. Reciprocal and nonreciprocal recombination at the glucocerebrosidase gene region: implications for complexity in Gaucher disease. Am. J. Hum. Genet. 2003;72:519–534. doi: 10.1086/367850.
  • 11. Zampieri S., Cattarossi S., Bembi B., Dardis A. GBA1 Analysis in Next-Generation Era: Pitfalls, Challenges, and Possible Solutions. J. Mol. Diagnost. 2017;19(5):733–741. doi: 10.1016/j.jmoldx.2017.05.005.
  • 12. Eyal N., Wilder S., Horowitz M. Prevalent and rare pathogenetic variants among Gaucher patients. Gene. 1990;96:277–283. doi: 10.1016/0378-1119(90)90264-r.
  • 13. International Collaborative Gaucher Group (ICGG) Gaucher Registry. 2014. Standard Report Population Report: Annual Report Based on 26 June.
  • 14. Data Analysis Request Report ICGG Gaucher Registry DAR Committee Date: 5/13/2019.
  • 15. Drelichman G., Fernández Escobar N., Soberón B., Aguilar G., Larroude M., et al. Skeletal involvement in Gaucher disease: an observational multicenter study of prognostic factors in the Argentine Gaucher disease patients. Am. J. Hematol. 2016;91:E448–E453. doi: 10.1002/ajh.24486.
  • 16. Taddei T.H., Kacena K.A., Yang M., Yang R., Malhotra A., Boxer M., Aleck K.A., Rennert G., Pastores G.M., Mistry P.K. The underrecognized progressive nature of N370S Gaucher disease and assessment of cancer risk in 403 patients. Am. J. Hematol. 2009;84(4):208–214. doi: 10.1002/ajh.21362.
  • 17. Eid J., Fehr A., Gray J., et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133–138. doi: 10.1126/science.1162986.
  • 18. Drelichman G., Fernández Escobar N., Mistry P.K., et al. 2018. Correlación fenotípica /genotípica de la Prevalencia del genotipo N370S/RecNcil en una corte argentina de 197 pacientes con Enfermedad de Gaucher. Poster durante el LATAM round table. Buenos Aires Argentina.
  • 19. Saleem T., Hassan M., El-Abd Ahmed A., Sayed A., et al. Clinical and genetic assessment of pediatric patients with Gaucher disease in Upper Egypt. Egypt. J. Med. Human Genet. 2017;18:249–255.
  • 20. Alfonso P., Cenarro A., Pérez Calvo J., Giraldo P., et al. Mutation prevalence among 51 unrelated spanish patients with gaucher disease: identification of 11 novel pathogenetic variants. Blood Cells Mol. Dis. 2001;27:882–8912. doi: 10.1006/bcmd.2001.0461.
  • 21. Cormand B., Harboe T.L., Gort L., Campoy C., Blanco M., Chamoles N., Chabas A., Vilageliu L., Grinberg D. Mutation analysis of Gaucher disease patients from Argentina: high prevalence of the RecNciI mutation. Am. J. Med. Genet. 1998;80:343–351. doi: 10.1002/(sici)1096-8628(19981204)80:4<343::aid-ajmg8>3.0.co;2-w.
  • 22. Sheth J., Bhavsar R., Mistri M., Pancholi D., Bavdekar A., Dalal A., Ranganath P., Girisha K.M., Shukla A., Phadke S., et al. Gaucher disease: single gene molecular characterization of one-hundred Indian patients reveals novel variants and the most prevalent mutation. BMC Med. Genet. 2019;20:31. doi: 10.1186/s12881-019-0759-1.
  • 23. Basgaluppa S.P., Altmanna V., Vairob F.P., et al. Is there any difference in GBA1 allele frequencies depending on the region of Brazil. WORLD SymposiumTM. Mol. Genet. Metab. 2019;126:S2–S6.
  • 24. Grabowski G.A., Kolodny E.H., Weinreb N.J., Rosenbloom B.E., Prakash-Cheng A., Kaplan P., Charrow J., Pastores G.M., Mistry P.K. In: The Online Metabolic and Molecular Bases of Inherited Disease. Valle D.L., Antonarakis S., Ballabio A., Beaudet A.L., Mitchell G.A., editors. McGraw Hill; 2019. Gaucher disease: phenotypic and genetic variation. Accessed October 30.
  • 25. Mandelker D., Schmidt R.J., Ankala A., McDonald G.K., et al. Navigating highly homologous genes in a molecular diagnostic setting: a resource for clinical next-generation sequencing. Genet. Med. 2016;18:1282–1289. doi: 10.1038/gim.2016.58.
  • 26. Woo E.G., Tayebi N., Sidransky E. Next-generation sequencing analysis of GBA1: the challenge of detecting complex recombinant alleles. Front. Genet. 2021;12:684067. doi: 10.3389/fgene.2021.684067.
  • 27. Goodwin S., McPherson J.D., McCombie W.R. Coming of age: ten years of next-generation sequencing technologies. Nat. Rev. Genet. 2016;17(6):333–351. doi: 10.1038/nrg.2016.49.
  • 28. De Coster W., Weissensteiner M.H., Sedlazeck F.J. Towards population-scale long-read sequencing. Nat. Rev. Genet. 2021;22(9):572–587. doi: 10.1038/s41576-021-00367-3.
  • 29. Wick R.R., Judd L.M., Holt K.E. Performance of neural network base calling tools for Oxford Nanopore sequencing. Genome Biol. 2019;20(1):129. doi: 10.1186/s13059-019-1727-y.
  • 30. Ardui S., Ameur A., Vermeesch J.R., Hestand M.S. Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics. Nucleic Acids Res. 2018;46(5):2159–2168. doi: 10.1093/nar/gky066.
  • 31. Leija-Salazar M., Sedlazeck F.J., Toffoli M., Mullin S., Mokretar K., Athanasopoulou M., Donald A., Sharma R., Hughes D., Schapira A.H.V., Proukakis C. Evaluation of the detection of GBA missense mutations and other variants using the Oxford Nanopore MinION. Mol. Genet. Genomic Med. 2019;7(3) doi: 10.1002/mgg3.564.
  • 32. Mistry P.K., Deegan P., Vellodi A., Cole J.A., Yeh M., Weinreb N.J. Timing of initiation of enzyme replacement therapy after diagnosis of type 1 Gaucher disease: effect on incidence of avascular necrosis. Br. J. Haematol. 2009;147(4):561–570. doi: 10.1111/j.1365-2141.2009.07872.x.
  • 33. Hughes D., Mikosch P., Belmatoug N., Carubbi F., Cox T., Goker-Alpan O., Kindmark A., Mistry P., Poll L., Weinreb N., Deegan P. Gaucher disease in bone: from pathophysiology to practice. J. Bone Miner. Res. 2019;34:996–1013. doi: 10.1002/jbmr.3734.
  • 34. Mistry P.K., Wraight E.P., Cox T.M. Therapeutic delivery of proteins to macrophages: implications for treatment of Gaucher's disease. Lancet. 1996;348:1555–1559. doi: 10.1016/S0140-6736(96)04451-0.
  • 35. Barton N.W., Brady R.O., Dambrosia J.M., Di Bisceglie A.M., Doppelt S.H., Hill S.C., Mankin H.J., Murray G.J., Parker R.I., Argoff C.E., et al. Replacement therapy for inherited enzyme deficiency–macrophage-targeted glucocerebrosidase for Gaucher’s disease. N. Engl. J. Med. 1991;324(21):1464–1470. doi: 10.1056/NEJM199105233242104.
  • 36. Avena S., Via M., Ziv E., Pérez-Stable E.J., Gignoux C.R., Dejean C., Huntsman S., Torres-Mejía G., Dutil J., Matta J.L., Beckman K., Burchard E.G., Parolin M.L., Goicoechea A., Acreche N., Boquet M., Ríos Part Mdel C., Fernández V., Rey J., Stern M.C., Carnese R.F., Fejerman L. Heterogeneity in genetic admixture across different regions of Argentina. PLoS One. 2012;7:e34695. doi: 10.1371/journal.pone.0034695.

Articles from Molecular Genetics and Metabolism Reports are provided here courtesy of Elsevier
