Abstract
Background
In this study, the prevalence of different types of mucopolysaccharidoses (MPS) was estimated based on data from the exome aggregation consortium (ExAC) and the genome aggregation database (gnomAD). The population-based allele frequencies were used to identify potential disease-causing variants on each gene related to MPS I to IX (except MPS II).
Methods
We evaluated the canonical transcripts and excluded homozygous, intronic, 3′, and 5′ UTR variants. Frameshift and in-frame insertions and deletions were evaluated using the SIFT Indel tool. Splice variants were evaluated using SpliceAI and Human Splice Finder 3.0 (HSF). Loss-of-function single nucleotide variants in coding regions were classified as potentially pathogenic, while synonymous variants outside the exon–intron boundaries were deemed non-pathogenic. Missense variants were evaluated by five in silico prediction tools, and only those predicted to be damaging by at least three different algorithms were considered disease-causing.
Results
The combined frequencies of the selected variants (whose number ranged from 127 in GNS to 259 in IDUA) were used to calculate prevalence based on Hardy–Weinberg equilibrium. The maximum estimated prevalence ranged from 0.46 per 100,000 for MPS IIID to 7.1 per 100,000 for MPS I. Overall, the estimated prevalence of all types of MPS was higher than what has been published in the literature. This difference may be due to misdiagnoses and/or underdiagnoses, especially of the attenuated forms of MPS. However, overestimation of the number of disease-causing variants by in silico predictors cannot be ruled out. Even so, the disease prevalences are similar to those reported in diagnosis-based prevalence studies.
Conclusion
We report on an approach to estimate the prevalence of different types of MPS based on publicly available population-based genomic data, which may help health systems to be better prepared to deal with these conditions and provide support to initiatives on diagnosis and management of MPS.
Keywords: Mucopolysaccharidoses (MPS), Estimated prevalence, Exome aggregation consortium (ExAC), Genome aggregation database (gnomAD), In silico analysis
Introduction
The mucopolysaccharidoses (MPS) are a group of lysosomal diseases characterized by the deficiency of one of eleven enzymes involved in the breakdown of glycosaminoglycans (GAGs), which are constituents of the extracellular matrix. Impaired activity of any of these enzymes leads to downstream consequences at both the cellular and the physiological levels, affecting multiple organs and systems. The MPS may be divided into different types according to the enzyme deficiency and the accumulated substrate (type I, II, IIIA, IIIB, IIIC, IIID, IVA, IVB, VI, VII, and IX). Affected individuals usually have coarse facial features, cardiac and pulmonary problems, and, depending on the MPS type, bone dysplasia (dysostosis multiplex) and/or neurological impairment such as behavioural problems and developmental delay [1–3]. The severity of the diseases is variable. Individuals with MPS I, II, IVA, VI, and VII may benefit from market-approved enzyme replacement therapy, while novel therapies such as fusion proteins, gene therapy, and genome editing are under investigation for several MPS [4].
Incidence and prevalence data are important to back up health system decisions and are necessary to calculate the cost–benefit of new therapies and treatments. Although the genes that encode the enzymes involved in these diseases have been extensively characterized at the molecular level, with over 2,109 pathogenic variants reported in the Human Gene Mutation Database (HGMD®) [5], there is still a lack of specific epidemiological data on MPS. Newborn screening programs that include lysosomal diseases have arisen worldwide and may bring valuable information. However, such programs are still restricted to very few countries, and most types of MPS are not included in the list of screened diseases [6, 7]. Population-based genomic data can help narrow the information gap, since it is now possible to rely on carrier frequency instead of the incidence of a disease among live births. However, care must be taken when using in silico predictors to classify genetic variants in order to obtain the most reliable data possible.
Herein, we used the frequency of potential disease-causing variants present in population-based genomic databases such as the Exome Aggregation Consortium (ExAC) [8] and the Genome Aggregation Database (gnomAD) [9], to estimate the prevalence of the different types of MPS after applying Hardy–Weinberg principles [10].
Results
Table 1 shows the number of variants present in each database and after the merger, which ranged from 961 (IDS) to 2988 (GALNS). After subsequent filtering steps, these numbers were reduced, ranging from 31 (IDS) to 259 (IDUA) (Table 2). A detailed description of the excluded variants can be found in Additional file 1: Table S1.
Table 1.
MPS type | Gene | ExAC variants | gnomAD variants | Common | Retained variants** |
---|---|---|---|---|---|
MPS I | IDUA | 1246 | 1439 | 680 | 2005 |
MPS II | IDS | 300 | 920 | 259 | 961 |
MPS IIIA | SGSH | 1188 | 1400 | 545 | 2043 |
MPS IIIB | NAGLU | 640 | 805 | 397 | 1048 |
MPS IIIC | HGSNAT | 598 | 1456 | 521 | 1533 |
MPS IIID | GNS | 429 | 1116 | 404 | 1141 |
MPS IVA | GALNS | 1390 | 2254 | 656 | 2988 |
MPS IVB | GLB1* | 871 | 1322 | 564 | 1629 |
MPS VI | ARSB | 407 | 1122 | 370 | 1159 |
MPS VII | GUSB | 593 | 1067 | 519 | 1141 |
MPS IX | HYAL1 | 669 | 700 | 287 | 1082 |
*Variants may be associated with GM1 Gangliosidosis or with MPS IVB
**Retained variants represent unique variants after merging both databases
Table 2.
Gene | Frameshift** | In-frame insertion/deletion | Splice site** | Start loss | Stop gain** | Stop loss** | Missense** | Total** |
---|---|---|---|---|---|---|---|---|
IDUA | 17–18 | 12 | 16–37 | 1 | 10–15 | 0–1 | 86–175 | 142–259 |
IDS | 0 | 1 | 1–2 | 0 | 0 | 0 | 4–28 | 6–31 |
SGSH | 8–14 | 7 | 5–7 | 0 | 4–14 | 0 | 73–194 | 97–236 |
NAGLU | 11–20 | 2 | 6–10 | 1 | 8–16 | 0 | 87–176 | 115–225 |
HGSNAT | 11 | 4 | 22–37 | 0 | 8–9 | 0 | 18–98 | 63–159 |
GNS | 5 | 3 | 14–23 | 0 | 4 | 0–1 | 29–91 | 55–127 |
GALNS | 11 | 7 | 14–26 | 1 | 10–11 | 0–1 | 57–187 | 100–244 |
GLB1* | 12–13 | 3 | 18–34 | 1 | 11–13 | 0 | 67–161 | 112–225 |
ARSB | 9–12 | 5 | 10–18 | 0 | 8–12 | 0 | 48–141 | 80–188 |
GUSB | 11–13 | 6 | 17–27 | 2 | 13–14 | 0–2 | 62–160 | 111–224 |
HYAL1 | 12–13 | 8 | 1–3 | 1 | 8–9 | 0 | 57–107 | 87–141 |
All genes | 107–130 | 58 | 124–224 | 7 | 84–117 | 0–5 | 588–1518 | 968–2059 |
*Variants may be associated with GM1 gangliosidosis or with MPS IVB
**Numbers represent the variants counted towards the minimum and maximum frequencies. For frameshift, stop gain, and stop loss variants, the minimum frequency excludes variants in the last exon or located < 50 nucleotides upstream of the 3′-most splice-generated exon–exon junction. For splice site and missense variants, the minimum frequency considers only variants deemed pathogenic by a consensus of all software packages.
The number of variants excluded due to homozygosis ranged from 3 (in both GNS and GUSB) to 113 in IDS (in homozygosis or hemizygosis); none of them were stop gain, stop loss, or start loss. The overall number of heterozygous canonical and non-canonical splice site variants across all genes was 452, of which 224 were considered deleterious by the in silico algorithms. One splice site variant could not be analysed by either HSF or SpliceAI (Additional file 3: Table S3). In addition, 188 out of 218 frameshift and in-frame insertions and deletions were considered deleterious. Variants that could not be analysed by SIFT Indel were excluded from further analysis. All variants considered deleterious by only one splice program, as well as frameshift and nonsense variants in the last exon or located < 50 nucleotides upstream of the 3′-most splice-generated exon–exon junction, were excluded from the calculations of minimum frequency. The number of variants considered deleterious in each category is shown in Table 2.
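The two rules that separate the minimum from the maximum set for truncating and splice variants can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names, the boolean inputs, and the reading that the maximum set accepts a call from either splice tool are assumptions drawn from the description above.

```python
from typing import Optional

NMD_WINDOW_NT = 50  # distance threshold quoted in the text


def in_nmd_escape_region(in_last_exon: bool,
                         nt_upstream_of_last_junction: Optional[int]) -> bool:
    """True if a truncating variant lies in the last exon or fewer than
    50 nt upstream of the 3'-most exon-exon junction (hypothetical inputs)."""
    if in_last_exon:
        return True
    return (nt_upstream_of_last_junction is not None
            and 0 <= nt_upstream_of_last_junction < NMD_WINDOW_NT)


def truncating_in_minimum_set(in_last_exon: bool,
                              nt_upstream_of_last_junction: Optional[int]) -> bool:
    """Frameshift or nonsense variants count towards the minimum only
    when they fall outside the region defined above."""
    return not in_nmd_escape_region(in_last_exon, nt_upstream_of_last_junction)


def splice_in_minimum_set(hsf_deleterious: bool, spliceai_deleterious: bool) -> bool:
    """Minimum set: both splice predictors must agree."""
    return hsf_deleterious and spliceai_deleterious


def splice_in_maximum_set(hsf_deleterious: bool, spliceai_deleterious: bool) -> bool:
    """Maximum set (assumed): a call from either predictor is enough."""
    return hsf_deleterious or spliceai_deleterious
```

A variant flagged by only one splice predictor would therefore contribute to the maximum frequency but not to the minimum.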
All 3,111 missense variants were analysed by five different in silico tools. A consensus on pathogenicity was reached for 588 variants, while 548 variants were classified as pathogenic by four tools and 382 variants by three.
The allele frequencies of all retained variants for a given gene were added together, and the sums were taken as the minimum and maximum frequency of the deleterious recessive allele. These values were then used to calculate the minimum and maximum prevalence of disease based on the Hardy–Weinberg equilibrium (Table 3). As the number of variants retained for IDS was very low (31 variants), the estimated frequency of MPS II must be viewed with caution. It is worth noting that variants in GLB1 can be associated either with MPS IVB or with GM1 gangliosidosis.
Table 3.
Gene | Disease-causing variants | CI in 100,000 (max) | CI in 100,000 (min) |
---|---|---|---|
IDUA | 259 | 7.103–7.096 | 2.479–2.476 |
IDS | 29 | 0.0108–0.0107 | 0.00014–0.00013 |
SGSH | 236 | 2.365–2.363 | 0.4116–0.4112 |
NAGLU | 225 | 1.532–1.530 | 0.366–0.365 |
HGSNAT | 159 | 1.566–1.565 | 0.107–0.106 |
GNS | 127 | 0.459–0.458 | 0.0549–0.0548 |
GALNS | 244 | 2.363–2.361 | 0.25–0.25 |
GLB1* | 225 | 1.677–1.676 | 0.456–0.455 |
ARSB | 188 | 1.119–1.117 | 0.1761–0.1758 |
GUSB | 224 | 1.144–1.141 | 0.2081–0.2078 |
HYAL1 | 141 | 0.4393–0.4388 | 0.1081–0.1079 |
*Variants may be associated with GM1 gangliosidosis or with MPS IVB. CI = Confidence interval
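As a worked illustration of how the Table 3 figures relate to allele and carrier frequencies under Hardy–Weinberg assumptions, the maximum MPS I estimate of about 7.1 per 100,000 can be back-calculated as follows. This back-calculation is not reported by the authors; $q$ denotes the summed frequency of deleterious alleles and $p = 1 - q$.

```latex
\begin{align*}
q^2 &= 7.1 \times 10^{-5} \\
q   &= \sqrt{7.1 \times 10^{-5}} \approx 8.4 \times 10^{-3} \\
2pq &= 2(1 - q)\,q \approx 1.67 \times 10^{-2} \approx 1 \text{ in } 60
\end{align*}
```

Under these assumptions, the maximum estimate would imply that roughly one individual in 60 carries a potentially disease-causing IDUA allele.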
Only two of the 2,061 retained variants had frequencies over 0.001: p.(His356Pro) in NAGLU (0.007993) and p.(Asp152Asn) in GUSB (0.001153). After all five tiers of variant selection, the maximum and minimum estimated disease prevalences were calculated based on global allele frequency (Table 3).
In addition to the estimated overall disease prevalence, the prevalence of MPS was calculated for eight ethnic groups present in the databases (Figs. 1, 2 and Additional file 4: Table S4).
Discussion
In this study, we used public data from WES and WGS to estimate the prevalence of different types of MPS. As MPS symptoms usually show up in the first decade of life, it is unlikely that severely affected individuals would be part of such databases. However, the possibility that undiagnosed individuals with milder phenotypes were included cannot be ruled out. Importantly, individuals homozygous for rare variants in any MPS gene (Additional file 2: Table S2), who could represent individuals with attenuated forms of the disease, were filtered out in the second-tier variant selection.
The estimated global frequency found in this study for all types of MPS except types II and VI was either above or at the upper limit of the frequencies reported for different countries based on the number of diagnosed cases in reference centres [20] (Table 4). It is worth noting that the maximum prevalence reported by Khan et al., 2017 refers to a limited number of countries, whereas our data were calculated collectively for the different ethnic backgrounds present in the databases. This means that we may have overestimated the prevalence of the diseases in the general population. A recent study estimated the prevalence of MPS in Brazil based on 600 affected individuals with all types of MPS included in a national network database [21]. For MPS I, for example, the researchers found a discrepancy between the estimated prevalence based on diagnosis (0.24/100,000) and the estimated prevalence based on genetic screening of healthy volunteers for the most common pathogenic variant in IDUA (0.95/100,000). Furthermore, the estimated prevalence of MPS VI in Brazil was the second highest in the world and similar to that found in the present study (1.02/100,000 compared with 1.12/100,000).
Table 4.
MPS type | Gene | This study (max.–min.) | Khan et al. 2017 (max.–min.) |
---|---|---|---|
MPS I | IDUA | 7.10–2.48 | 3.62–0.11 |
MPS II | IDS | 0.0108–0.00013 | 2.16–0.1 |
MPS IIIA | SGSH | 2.36–0.41 | 1.62–0.08 |
MPS IIIB | NAGLU | 1.53–0.37 | 0.72–0.02 |
MPS IIIC | HGSNAT | 1.57–0.11 | 0.42–0.03 |
MPS IIID | GNS | 0.46–0.05 | 0.10–0.09 |
MPS IVA | GALNS | 2.36–0.25 | 1.30–0.15 |
MPS IVB | GLB1 | 1.68–0.46* | 0.14–0.01 |
MPS VI | ARSB | 1.12–0.18 | 7.85–0.02 |
MPS VII | GUSB | 1.14–0.21 | 0.29–0.02 |
MPS IX | HYAL1 | 0.44–0.11 | NA |
*Combined frequency of GM1 Gangliosidosis and MPS IVB
Several measures were taken to reduce the chance of prevalence overestimation. For example, variants were filtered in sequential steps in order to obtain the most specific data possible. Also, both homozygotes and variants with a frequency higher than 0.001 were excluded. Additional filtering based on functional predictions was performed in order to include only variants more likely to affect protein function. After that, all variants remaining for analysis had allele frequencies below 0.001, and most of them have not been previously reported as disease-causing. This was expected, since variants of uncertain significance (VUS) according to the standards and guidelines of the American College of Medical Genetics/Association of Molecular Pathology (ACMG/AMP) [10] are known to account for a substantial part of disease-causing variants for MPS and have a significant impact on incidence estimates. For example, Clark et al. [22] showed that 25% of the VUS analysed in MPS IIIB were potentially disease-causing and reduced enzyme activity.
It is worth noting that sequential filtering steps and the use of consensus scores do not guarantee that only pathogenic variants are selected or that only non-pathogenic variants are discarded, and the resulting estimation error is not directly measurable. Furthermore, the high-frequency filter is necessary to exclude variants with frequencies incompatible with MPS. Although this may lead to underascertainment, frequencies such as 0.007993 and 0.001153, observed for c.1067A > C; p.(His356Pro) in NAGLU and c.454G > A; p.(Asp152Asn) in GUSB, are not found in clinical practice. These were the only two variants excluded because of high frequency. We considered using only curated variants reported in either ClinVar or the Human Gene Mutation Database (HGMD); however, this would significantly reduce the number of retained variants (for instance, from 259 to 47 for IDUA, data not shown). Different in silico tools were used to estimate the likelihood of a variant being disease-causing. However, as no data on the sensitivity and specificity of such software are available for MPS genes, it is impossible to estimate the number of false-positive results. For instance, several well-characterized pathogenic variants reported in HGMD had low deleteriousness scores when evaluated by Combined Annotation-Dependent Depletion (CADD) [23], which has an overall higher performance than other predictors (data not shown).
The existence of compound heterozygotes cannot be ruled out. In fact, most individuals with MPS who are not the offspring of consanguineous marriages are compound heterozygotes. However, due to the structure of both databases used in this study, the occurrence of variants in cis cannot be ruled out, which would contribute to the overestimation of disease prevalence.
Despite these limitations, a similar approach has been used by Appadurai et al., 2015 to estimate the prevalence of cerebrotendinous xanthomatosis (CTX). As in the present study, the authors suggested an apparent underdiagnosis of CTX based on the allele frequency of potentially disease-causing variants present in ExAC. Interestingly, the discrepancy between genomic data and the diagnosis-based incidence is more pronounced for the rarest MPS diseases, such as MPS IIIC, IIID, IVB, VII, and IX. For some forms of MPS I, II, VI, and IX, it is possible that variants leading to deficient enzyme activity are not clinically recognized due to attenuated phenotypes [24–26]. On the other hand, severe cases of MPS VII may lead to premature death before the diagnosis is reached or even sought [27].
Notably, data emerging from large datasets of WES and WGS are disclosing novel phenotypes for well-known diseases, especially intermediate phenotypes [28–30]. This may also be the case for MPS and could help explain the higher prevalence predicted by our work, with patients not being recognized clinically due to an unusual presentation.
In the case of MPS IVB, there is an additional complexity since the same gene is involved in another lysosomal disorder with different accumulated substrate and clinical features, called GM1 gangliosidosis [31]. In this study, variants of GLB1 were considered disease-causing regardless of the associated phenotype. Therefore, the overall frequency of alleles was used to estimate the prevalence of MPS IVB, whereas in fact only about 13.3% of curated disease-causing variants in this gene are associated with MPS IVB, the rest leading to the three types of GM1 gangliosidosis [32].
After the filtering steps, IDS had a limited number of retained disease-causing variants (29 variants), and therefore the estimated prevalence for MPS II was lower than what has been previously reported [20]. The higher prevalence observed in studies based on reference centres and diagnostic laboratories may be related to the proportion of patients with de novo variants. Pollard et al. [33] showed that this happens in 22.5% of MPS II cases. In addition, recombination events between IDS and its pseudogene IDS2 are a common cause of the disease, with structural variants such as gross rearrangements and complete or partial deletions seen in 10 to 28% of affected individuals [34–40]. Those types of variants could not be taken into account in our estimates because of the structure of the population databases used. As a result, the estimated prevalence of MPS II is not as reliable as that of the other types of MPS. It is worth mentioning that another study that used a similar method for two X-linked diseases (Menkes disease and ATP7A-related disorders) [41] also found a very low number of variants, which could suggest that this strategy is not the best approach for X-linked disorders.
Conclusions
In summary, we report on an approach to estimate the prevalence of the different types of MPS based on publicly available population-based genomic data that may help to better tailor screening and diagnostic programs for these diseases, to prepare the health systems to deal with a more precise estimated number of patients, and may serve as a starting point for other rare-disease initiatives.
Methods
Database
Genetic variants (GRCh37/hg19) from ExAC V0.3.1 and gnomAD v2.0.2 [8, 9] were used to estimate the prevalence of different types of MPS. These public data aggregated information from 125,748 WES and 15,708 WGS collected from unrelated individuals and 1,756 parent–offspring trios with no known rare disease. The genetic data were collected from case–control studies of adult-onset common diseases, spanning six global and eight sub-continental ancestries, determined by ancestry-informative markers [9]. Although related individuals can have an influence upon the frequency of variants, the size of the database which has a total of 141,456 individuals makes the influence of 1,756 trios irrelevant.
The data were retrieved separately for each gene and then merged into a single unified dataset. When variants were common to both databases, the allele frequencies from gnomAD were used for further analysis, as gnomAD includes the ExAC data.
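The merge step can be sketched as follows. This is a minimal illustration, assuming each database export has been reduced to a mapping from a variant key to its allele count and allele number; the function name `merge_variants`, the key format, and the toy records are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, Tuple

# (allele_count, allele_number) per variant; keys are hypothetical HGVS-like strings
Variants = Dict[str, Tuple[int, int]]


def merge_variants(exac: Variants, gnomad: Variants) -> Variants:
    """Return the union of both call sets; gnomAD values win for shared variants."""
    merged = dict(exac)
    merged.update(gnomad)  # overwrites ExAC entries that are also present in gnomAD
    return merged


# Toy data, invented for illustration only
exac = {"c.1A>G": (2, 120_000), "c.5C>T": (1, 118_000)}
gnomad = {"c.5C>T": (3, 250_000), "c.9G>A": (1, 248_000)}

merged = merge_variants(exac, gnomad)
common = exac.keys() & gnomad.keys()

# Unique variants = ExAC + gnomAD - common
assert len(merged) == len(exac) + len(gnomad) - len(common)
```

The same identity reproduces the retained counts in Table 1, for example 1246 + 1439 − 680 = 2005 for IDUA.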
First-tier variant selection
Variants of the gene located in 5′ and 3′ UTR, upstream and downstream, as well as intronic and non-coding transcript exons, were excluded assuming that no disease-causing variant has been described in such positions for any MPS. In addition, synonymous variants outside the exon–intron boundaries were also excluded, as well as variants in non-canonical transcripts.
Second-tier variant selection
In the second-tier analysis, missense, nonsense (stop gain), stop loss, frameshift, and splice site variants present in homozygosis (and hemizygosis for IDS) were excluded, based on the assumption that neither ExAC nor gnomAD includes MPS-affected individuals, as both exclude samples from patients with severe pediatric diseases and their relatives [8]. Therefore, no homozygous variant should be pathogenic. Heterozygous loss-of-function variants such as stop gain, stop loss, and start loss were considered potentially disease-causing, considering the impact on protein function and the strong evidence of pathogenicity as per the ACMG/AMP guidelines [10].
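The first two tiers amount to a pair of simple predicates. The sketch below is illustrative only: the `Variant` fields and the consequence labels are assumptions loosely modelled on common annotation vocabularies, not the annotation actually used by the authors.

```python
from dataclasses import dataclass

# Consequence labels are assumptions, not the authors' exact annotation terms
EXCLUDED_CONSEQUENCES = {
    "5_prime_UTR", "3_prime_UTR", "upstream", "downstream",
    "intron", "non_coding_transcript_exon",
}
LOSS_OF_FUNCTION = {"stop_gained", "stop_lost", "start_lost"}


@dataclass
class Variant:
    consequence: str
    canonical: bool             # annotated on the canonical transcript
    near_splice_boundary: bool  # at an exon-intron boundary
    homozygote_count: int       # includes hemizygotes for X-linked IDS


def passes_first_tier(v: Variant) -> bool:
    """Drop non-canonical, non-coding, and non-boundary synonymous variants."""
    if not v.canonical or v.consequence in EXCLUDED_CONSEQUENCES:
        return False
    if v.consequence == "synonymous" and not v.near_splice_boundary:
        return False
    return True


def passes_second_tier(v: Variant) -> bool:
    """Drop any variant observed in homozygosis (or hemizygosis)."""
    return v.homozygote_count == 0


def is_presumed_lof(v: Variant) -> bool:
    """Heterozygous-only stop gain, stop loss, or start loss variants."""
    return (passes_first_tier(v) and passes_second_tier(v)
            and v.consequence in LOSS_OF_FUNCTION)
```

Variants that pass both predicates but are not in `LOSS_OF_FUNCTION` would move on to the tool-based tiers described next.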
Third-tier variant selection
Heterozygous alterations in canonical or non-canonical splice sites were analysed using Human Splice Finder [11] and SpliceAI [12]. In-frame insertions, deletions, and frameshift variants outside the last exon were analysed using SIFT Indel [13]. Variants were classified as deleterious using each algorithm's default parameters.
Fourth-tier variant selection
Missense variants were analysed using five in silico algorithms: MutPred [14], PolyPhen2 [15], PROVEAN [16], SIFT [17], and REVEL [18]. Since PolyPhen2 provides more than two categories, its results were transformed into binary data by considering "possibly damaging" and "probably damaging" as deleterious. For REVEL, an ensemble algorithm, a rank score over 0.75 was considered deleterious. To calculate the maximum prevalence of the disease, a variant was considered deleterious when at least three software packages agreed on pathogenicity. For the minimum prevalence, we included only missense variants for which all in silico tools agreed on pathogenicity.
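The voting rule for missense variants can be sketched as below. This is a minimal illustration: the function names are hypothetical, each tool's output is assumed to have already been reduced to a boolean call, and the PolyPhen2 label strings are an assumption about that tool's output.

```python
from typing import Mapping

TOOLS = ("MutPred", "PolyPhen2", "PROVEAN", "SIFT", "REVEL")


def polyphen2_call(label: str) -> bool:
    """Collapse PolyPhen2's categories into a binary call (labels assumed)."""
    return label in {"possibly damaging", "probably damaging"}


def revel_call(score: float, threshold: float = 0.75) -> bool:
    """REVEL is deleterious when the score is over the 0.75 cutoff."""
    return score > threshold


def deleterious_votes(calls: Mapping[str, bool]) -> int:
    """Count how many of the five tools call the variant deleterious."""
    return sum(bool(calls[tool]) for tool in TOOLS)


def in_maximum_set(calls: Mapping[str, bool]) -> bool:
    """At least three of five tools agree: used for the maximum prevalence."""
    return deleterious_votes(calls) >= 3


def in_minimum_set(calls: Mapping[str, bool]) -> bool:
    """All five tools agree: used for the minimum prevalence."""
    return deleterious_votes(calls) == len(TOOLS)
```

By construction every variant in the minimum set is also in the maximum set, which is why the minimum counts in Table 2 never exceed the maximum counts.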
Fifth-tier variant selection
The remaining variants were analysed to make sure that only rare alleles were retained. Therefore any variant with a frequency greater than 0.001 was excluded, as no variants associated with low enzymatic activity (≤ 15% wild type) were found with higher allele frequencies [19].
Calculation of disease prevalence using Hardy–Weinberg principles
The frequency of a given variant retained as being disease-causing was calculated by dividing the number of chromosomes bearing the genetic change by the total number of chromosomes analysed at that position. The sum of all variant frequencies for each gene was then used as the frequency of the recessive allele ($q$). The prevalence was calculated as $q^2$, from the Hardy–Weinberg formula $p^2 + 2pq + q^2 = 1$. The prevalence for each specific population was calculated using the population-specific frequencies.
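The calculation described above, together with the fifth-tier frequency cutoff, can be sketched in a few lines. This is an illustrative sketch, not the authors' script: the function name `prevalence_per_100k` and the toy allele counts are hypothetical.

```python
from typing import Iterable, Tuple


def prevalence_per_100k(variants: Iterable[Tuple[int, int]],
                        max_af: float = 0.001) -> float:
    """Estimate recessive disease prevalence per 100,000 individuals.

    variants: (allele_count, allele_number) pairs for one gene.
    max_af:   fifth-tier cutoff; variants above it are dropped.
    """
    q = 0.0
    for allele_count, allele_number in variants:
        af = allele_count / allele_number  # per-variant allele frequency
        if af <= max_af:                   # keep only rare alleles
            q += af                        # q = summed deleterious allele frequency
    return q * q * 100_000                 # Hardy-Weinberg: affected fraction = q^2


# Toy data, invented for illustration; the third variant exceeds the cutoff
toy = [(10, 250_000), (5, 250_000), (2_000, 250_000)]
estimate = prevalence_per_100k(toy)
```

Running the same function once on the minimum variant set and once on the maximum set would yield the two bounds reported per gene.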
Calculation of confidence interval
A script in R was used to estimate the confidence interval. The variances of the variant frequencies and of the prevalence estimate were calculated as shown in equations 5 and 13 of Clark et al. [22]. The confidence intervals were adapted to consider the sum of allele frequencies instead of probability, as suggested by Clark et al. [22].
Acknowledgements
The authors would like to thank the Research Incentive Fund of the Clinicas Hospital in Porto Alegre (Fundo de Incentivo à Pesquisa do Hospital de Clínicas de Porto Alegre, FIPE/HCPA).
Abbreviations
- MPS: Mucopolysaccharidoses
- GAGs: Glycosaminoglycans
- HGMD: Human Gene Mutation Database
- ExAC: Exome Aggregation Consortium
- gnomAD: Genome Aggregation Database
- VUS: Variants of uncertain significance
- CADD: Combined Annotation-Dependent Depletion
- CTX: Cerebrotendinous xanthomatosis
- WES: Whole exome sequencing
- WGS: Whole genome sequencing
Authors’ contributions
UM conceived the study, PB and GP collected the data; PB and FV carried out the analysis and interpretation of data; PB, UM, and FV wrote the manuscript; UM, RG, FV and GP revised the manuscript. All authors read and approved the submitted version of the manuscript.
Funding
This work was supported by the Brazilian National Council for Technological and Scientific Development (CNPq) and the Research Incentive Fund of the Clinicas Hospital in Porto Alegre (FIPE/HCPA).
Availability of data and materials
The authors confirm that the data supporting the findings of this study are available within the article and/or its supplementary materials.
Ethics approval and informed consent to participate
No ethical approval was required.
Consent for publication
Not applicable.
Competing interests
The authors declare no conflict of interests.
Supplementary information is available for this paper at 10.1186/s13023-020-01608-0.
References
- 1. Muenzer J. Overview of the mucopolysaccharidoses. Rheumatology (Oxford). 2011;50(5):v4–v12. doi: 10.1093/rheumatology/ker394.
- 2. Giugliani R. Mucopolysaccharidoses: From understanding to treatment, a century of discoveries. Genet Mol Biol. 2012;35(Suppl 4):924–931. doi: 10.1590/s1415-47572012000600006.
- 3. Sun A. Lysosomal storage disease overview. Ann Transl Med. 2018;6(24):476. doi: 10.21037/atm.2018.11.39.
- 4. Giugliani R, Federhen A, Vairo F, et al. Emerging drugs for the treatment of mucopolysaccharidoses. Expert Opin Emerg Drugs. 2016;21(1):9–26. doi: 10.1517/14728214.2016.1123690.
- 5. Stenson PD, Mort M, Ball EV, Shaw K, Phillips A, Cooper DN. The human gene mutation database: building a comprehensive mutation repository for clinical and molecular genetics, diagnostic testing and personalized genomic medicine. Hum Genet. 2014;133(1):1–9. doi: 10.1007/s00439-013-1358-4.
- 6. Robinson BH, Gelb MH. The importance of assay imprecision near the screen cutoff for newborn screening of lysosomal storage diseases. Int J Neonatal Screen. 2019;5(2):17. doi: 10.3390/ijns5020017.
- 7. Schielen PCJI, Kemper EA, Gelb MH. Newborn screening for lysosomal storage diseases: a concise review of the literature on screening methods, therapeutic possibilities and regional programs. Int J Neonatal Screen. 2017;3(2):6. doi: 10.3390/ijns3020006.
- 8. Lek M, Karczewski KJ, Minikel EV, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285–291. doi: 10.1038/nature19057.
- 9. Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. bioRxiv. 2019;531210. Available from: https://www.biorxiv.org/content/10.1101/531210v2
- 10. Appadurai V, DeBarber A, Chiang PW, et al. Apparent underdiagnosis of cerebrotendinous xanthomatosis revealed by analysis of ~60,000 human exomes. Mol Genet Metab. 2015;116(4):298–304. doi: 10.1016/j.ymgme.2015.10.010.
- 11. Desmet FO, Hamroun D, Lalande M, Collod-Béroud G, Claustres M, Béroud C. Human splicing finder: an online bioinformatics tool to predict splicing signals. Nucleic Acids Res. 2009;37(9):e67. doi: 10.1093/nar/gkp215.
- 12. Jaganathan K, Kyriazopoulou Panagiotopoulou S, McRae JF, et al. Predicting splicing from primary sequence with deep learning. Cell. 2019;176(3):535–548.e24. doi: 10.1016/j.cell.2018.12.015.
- 13. Hu J, Ng PC. SIFT Indel: predictions for the functional effects of amino acid insertions/deletions in proteins. PLoS One. 2013;8(10):e77940. doi: 10.1371/journal.pone.0077940.
- 14. Li B, Krishnan VG, Mort ME, et al. Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics. 2009;25(21):2744–2750. doi: 10.1093/bioinformatics/btp528.
- 15. Adzhubei IA, Schmidt S, Peshkin L, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–249. doi: 10.1038/nmeth0410-248.
- 16. Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the functional effect of amino acid substitutions and indels. PLoS One. 2012;7(10):e46688. doi: 10.1371/journal.pone.0046688.
- 17. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4(7):1073–1081. doi: 10.1038/nprot.2009.86.
- 18. Ioannidis NM, Rothstein JH, Pejaver V, et al. REVEL: an Ensemble method for predicting the pathogenicity of rare missense variants. Am J Hum Genet. 2016;99(4):877–885. doi: 10.1016/j.ajhg.2016.08.016.
- 19. Clarke LA, Giugliani R, Guffon N, et al. Genotype-phenotype relationships in mucopolysaccharidosis type I (MPS I): Insights from the International MPS I registry. Clin Genet. 2019;96(4):281–289. doi: 10.1111/cge.13583.
- 20. Khan SA, Peracha H, Ballhausen D, et al. Epidemiology of mucopolysaccharidoses. Mol Genet Metab. 2017;121(3):227–240. doi: 10.1016/j.ymgme.2017.05.016.
- 21. Federhen A, Pasqualim G, de Freitas TF, et al. Estimated birth prevalence of mucopolysaccharidoses in Brazil. Am J Med Genet A. 2020;182(3):469–483. doi: 10.1002/ajmg.a.61456.
- 22. Clark WT, Yu GK, Aoyagi-Scharber M, LeBowitz JH. Utilizing ExAC to assess the hidden contribution of variants of unknown significance to Sanfilippo Type B incidence. PLoS One. 2018;13(7):e0200008. doi: 10.1371/journal.pone.0200008.
- 23. Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019;47(D1):D886–D894. doi: 10.1093/nar/gky1016.
- 24. Kiykim E, Barut K, Cansever MS, et al. Screening mucopolysaccharidosis Type IX in patients with juvenile idiopathic arthritis. JIMD Rep. 2016;25:21–24. doi: 10.1007/8904_2015_467.
- 25. Pinto E, Vairo F, Conboy E, de Souza CFM, et al. Diagnosis of attenuated mucopolysaccharidosis VI: clinical, biochemical, and genetic pitfalls. Pediatrics. 2018;142(6):e20180658. doi: 10.1542/peds.2018-0658.
- 26. Rigoldi M, Verrecchia E, Manna R, Mascia MT. Clinical hints to diagnosis of attenuated forms of Mucopolysaccharidoses. Ital J Pediatr. 2018;44(Suppl 2):132. doi: 10.1186/s13052-018-0551-4.
- 27. Sands MS. Mucopolysaccharidosis type VII: a powerful experimental system and therapeutic challenge. Pediatr Endocrinol Rev. 2014;12(Suppl 1):159–165.
- 28. Bonafé L, Kariminejad A, Li J, et al. Brief report: peripheral osteolysis in adults linked to ASAH1 (Acid Ceramidase) mutations: a new presentation of Farber's disease. Arthritis Rheumatol. 2016;68(9):2323–2327. doi: 10.1002/art.39659.
- 29. Kim SY, Choi SA, Lee S, et al. Atypical presentation of infantile-onset Farber disease with novel ASAH1 mutations. Am J Med Genet A. 2016;170(11):3023–3027. doi: 10.1002/ajmg.a.37846.
- 30. Yu FPS, Amintas S, Levade T, Medin JA. Acid ceramidase deficiency: Farber disease and SMA-PME. Orphanet J Rare Dis. 2018;13(1):121. doi: 10.1186/s13023-018-0845-z.
- 31. Lee JS, Choi JM, Lee M, et al. Diagnostic challenge for the rare lysosomal storage disease: late infantile GM1 gangliosidosis. Brain Dev. 2018;40(5):383–390. doi: 10.1016/j.braindev.2018.01.009.
- 32. Caciotti A, Garman SC, Rivera-Colón Y, et al. GM1 gangliosidosis and Morquio B disease: an update on genetic alterations and clinical findings. Biochim Biophys Acta. 2011;1812(7):782–790. doi: 10.1016/j.bbadis.2011.03.018.
- 33. Pollard LM, Jones JR, Wood TC. Molecular characterization of 355 mucopolysaccharidosis patients reveals 104 novel mutations. J Inherit Metab Dis. 2013;36(2):179–187. doi: 10.1007/s10545-012-9533-7.
- 34. Bunge S, Rathmann M, Steglich C, et al. Homologous nonallelic recombinations between the iduronate-sulfatase gene and pseudogene cause various intragenic deletions and inversions in patients with mucopolysaccharidosis type II. Eur J Hum Genet. 1998;6(5):492–500. doi: 10.1038/sj.ejhg.5200213.
- 35. Brusius-Facchin AC, Schwartz IV, Zimmer C, et al. Mucopolysaccharidosis type II: identification of 30 novel mutations among Latin American patients. Mol Genet Metab. 2014;111(2):133–138. doi: 10.1016/j.ymgme.2013.08.011.
- 36. Kosuga M, Mashima R, Hirakiyama A, et al. Molecular diagnosis of 65 families with mucopolysaccharidosis type II (Hunter syndrome) characterized by 16 novel mutations in the IDS gene: Genetic, pathological, and structural studies on iduronate-2-sulfatase. Mol Genet Metab. 2016;118(3):190–197. doi: 10.1016/j.ymgme.2016.05.003.
- 37. Chiong MA, Canson DM, Abacan MA, Baluyot MM, Cordero CP, Silao CL. Clinical, biochemical and molecular characteristics of Filipino patients with mucopolysaccharidosis type II - Hunter syndrome. Orphanet J Rare Dis. 2017;12(1):7. doi: 10.1186/s13023-016-0558-0.
- 38. Dvorakova L, Vlaskova H, Sarajlija A, et al. Genotype-phenotype correlation in 44 Czech, Slovak, Croatian and Serbian patients with mucopolysaccharidosis type II. Clin Genet. 2017;91(5):787–796. doi: 10.1111/cge.12927.
- 39. Zanetti A, D'Avanzo F, Rigon L, et al. Molecular diagnosis of patients affected by mucopolysaccharidosis: a multicenter study. Eur J Pediatr. 2019;178(5):739–753. doi: 10.1007/s00431-019-03341-8.
- 40. Zhang W, Xie T, Sheng H, et al. Genetic analysis of 63 Chinese patients with mucopolysaccharidosis type II: Functional characterization of seven novel IDS variants. Clin Chim Acta. 2019;491:114–120. doi: 10.1016/j.cca.2019.01.009.
- 41. Kaler SG, Ferreira CR, Yam LS. Estimated birth prevalence of Menkes disease and ATP7A-related disorders based on the Genome Aggregation Database (gnomAD). Mol Genet Metab Rep. 2020;5(24):100602. doi: 10.1016/j.ymgmr.2020.100602.