Abstract
The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson’s correlation coefficients of up to 0.61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population-scale analysis of disease epidemiology and rare variant association analysis.
Keywords: alpha-N-acetylglucosaminidase, Sanfilippo syndrome, variants of unknown significance, machine learning, CAGI, enzymatic activity, critical assessment
Introduction
The exponential increase in genetic data over the past decade has confronted researchers with an unprecedented number of rare variants of unknown disease significance (VUS) detected in the human population. Such data present both a challenge and an opportunity. In the context of newborn screening, a clinician might be asked to interpret only a handful of mutations in a specific gene of relevance, but that given gene might have hundreds of missense VUS in databases such as gnomAD (Karczewski et al., 2019). Although the sheer number of VUS may be high, having a-priori knowledge of their likely disease relevance can facilitate the pre-screening of such mutations. Next to observing a mutation in a confidently diagnosed patient, experimental characterization remains a valuable method for validating the impact of detected variants. However, this process can be time-consuming, costly, and impractical. In an attempt to bridge this gap, many computational methods have been developed to predict the impact of missense variants on protein function (Gallion et al., 2017; Tang & Thomas, 2016). As part of the effort to test and independently evaluate such algorithms, the Critical Assessment of Genome Interpretation (CAGI) creates challenges using unpublished experimental data to evaluate the performance of blinded phenotype prediction algorithms (Hoskins et al., 2017).
Sanfilippo syndrome, also known as Mucopolysaccharidosis type III (MPS III), is a rare autosomal recessive inherited metabolic disease caused by a deficiency in one of four lysosomal enzymes catalyzing distinct steps in the sequential degradation of heparan sulfate (Coutinho, Lacerda, & Alves, 2012). Each enzyme deficiency defines a separate subtype: IIIA, IIIB, IIIC, IIID, although symptoms and disease progression are largely indistinguishable between types. The resultant accumulation of heparan sulfate within lysosomes, particularly in the brain and liver, leads to a severe neurological phenotype and death in the second decade. Mutations leading to type IIIB (MIM# 252920), one of the more commonly diagnosed types, are located in the gene encoding the lysosomal hydrolase, α-N-acetylglucosaminidase (NAGLU; MIM# 609701).
The accumulation of heparan sulfate due to partial or complete loss of NAGLU enzyme activity occurs in various tissues and cells; however, the clinical signs are mostly associated with the central nervous system (Birrane et al., 2019), causing severe cognitive disabilities, behavioral problems and developmental regression, leading to death in adolescence or early adulthood. The age of onset of Sanfilippo Type B is 1–4 years (Andrade, Aldamiz-Echevarria, Llarena, & Couce, 2015) and the estimate for lifetime risk at birth (number of patients per 100,000 live births) varies substantially in European populations from 0.05 in Sweden to 0.78 in Greece (Zelei, Csetneki, Voko, & Siffel, 2018). To date, no effective treatment for Sanfilippo syndrome exists although several promising approaches are being developed, including enzyme replacement therapy, gene therapy, bone marrow stem cell transplantation and small molecules (Aoyagi-Scharber et al., 2017; Gaffke, Pierzynowska, Piotrowska, & Wegrzyn, 2018). Because newborns are asymptomatic at birth, early diagnosis is critical for improved management and outcome of therapeutic trials. The development of algorithms capable of reliably distinguishing between pathogenic and benign NAGLU alleles is an important step in this direction.
For the NAGLU challenge of the fourth edition of the CAGI experiment (CAGI4) in 2016, participants were asked to predict the impact of VUS on the enzymatic activity of NAGLU. Variants were selected for testing based on being present in ExAC version v0.3 and not being present in HGMD (Lek et al., 2016). The enzymatic activity of these missense mutations in NAGLU had been previously measured in transfected cell lysates (Clark, Yu, Aoyagi-Scharber, & LeBowitz, 2018). Of the 163 VUS tested, 41 (25%) decreased the activity of NAGLU to levels consistent with known Sanfilippo Type B pathogenic alleles. Previous analysis of variants that were found to decrease activity to levels consistent with disease found that they were more likely to be buried and close to the active site of the protein.
This challenge attracted 17 submissions from ten groups (Table 1). Most of the models utilized sequence information (n=16); one third of the methods also used structure-based features (n=8). To the best of our knowledge, this is the largest assessment of predicted enzyme activity for rare population missense variants in CAGI.
Table 1.
PI | Model Name | PubMed | PolyPhen/SIFT/Provean Based Features | Structure Based Features | PSSM/MSA Based Features | ML Method | Training Database |
---|---|---|---|---|---|---|---|
Bromberg | SNAP-1 | Bromberg & Rost, 2007 | Yes | Yes | Yes | Neural Network | PMD |
Bromberg | SNAP-2 | Bromberg & Rost, 2007 | Yes | Yes | Yes | Neural Network | PMD |
Moult | Moult Consensus | Yin, Kundu, Pal, & Moult, 2017 | Yes | Yes | Yes | Support Vector Regression | |
Lichtarge | Evolutionary Action | Katsonis & Lichtarge, 2014 | No | No | Yes | None | |
Wei | iFish | Wang & Wei, 2016 | Yes | Yes | Yes | SVM | |
Mooney | MutPred | Li et al., 2009 | Yes | No | Yes | Random Forest | HGMD |
Mooney | MutPred2 w/o homology | Pejaver et al., 2017 | Yes | No | Yes | Neural Network Ensemble | HGMD 2013 |
Mooney | MutPred2 w homology | Pejaver et al., 2017 | Yes | No | Yes | Neural Network Ensemble | HGMD 2013 |
Jones | HHblits w/ real contacts | Remmert, Biegert, Hauser, & Soding, 2011 | No | Yes | Yes | Logistic regression | |
Jones | HHblits w/ predicted contacts | Remmert et al., 2011 | No | No | Yes | Logistic regression | |
Jones | HHblits w/o contacts | Remmert et al., 2011 | No | No | No | Logistic regression | |
Jones | PAM250 PSSM | | No | No | Yes | Logistic regression | |
Ford | PolyPhen2 Random Forest | Ford, Uppal, Nodzak, & Shi, 2019 | Yes | No | Yes | Random Forest | 1000 Genomes, NCBI, wANNOVAR |
Casadio | INPS3D | Savojardo, Fariselli, Martelli, & Casadio, 2016 | No | Yes | Yes | SVM | |
Casadio | SNPs&GO | Capriotti et al., 2013 | No | Yes | Yes | SVM | |
Zhou | EASE-MM | Folkman, Stantic, Sattar, & Zhou, 2016 | No | No | Yes | Support Vector Regression | ProTherm |
Dunbrack | Dunbrack-SVM | Wei, Xu, & Dunbrack, 2013 | Yes | Yes | Yes | SVM | |
Methods
Selection and testing of NAGLU variants
For the CAGI challenge we attempted to select missense variants that were both observed in the population and which were of unknown disease significance (Clark et al., 2018). In order to do this, we relied on the v0.3 release of the Exome Aggregation Consortium’s (ExAC) collection of exome sequencing data comprising 60,706 individuals as a source for observed missense mutations (Lek et al., 2016). As a source of disease associated variants we relied on the 2016 v1 version of the Human Gene Mutation Database (HGMD) (Stenson et al., 2003). All missense mutations are reported for Ensembl protein ENSP00000225927.1 for NAGLU. Supp. Figure S1 shows a schematic of the observed activity of tested variants and their amino acid positions.
Predictor performance evaluation
We calculated a number of metrics in order to give a robust view of the performance of each team’s submissions. For analysis, percent wild-type (%WT) activity values were converted to fraction wild-type (fwt) activity values. Our analysis treated the experiment both as a binary classification problem, and as one with a continuous valued target variable.
We calculated the Pearson and Spearman correlation coefficients and the root mean squared error (RMSE) against observed enzymatic activity values, as well as the area under the receiver operating characteristic curve (AUC), for each set of predictions (Figure 1, Supp. Figure S2, Supp. Table S1). Predictor values submitted through the CAGI challenge were not normalized. Although linear transformations of predicted values, like z-score normalization, will not impact Pearson’s r or Spearman’s rho, they would impact RMSE values. RMSE represents the most stringent metric that we used to evaluate predictions as it requires predictions to be properly scaled. As a supplement, precision and recall curves were generated (Supp. Figure S3).
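The scale dependence of RMSE noted above can be illustrated with a minimal sketch. It is not the assessors' code: the helper names and the toy activity values are hypothetical, and the linear transformation is assumed to have a positive slope (a negative slope would flip the sign of both correlation coefficients).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Hypothetical toy values (fraction wild-type activity), not challenge data.
observed = [0.02, 0.10, 0.45, 0.80, 1.05]
predicted = [0.10, 0.05, 0.50, 0.70, 0.90]
rescaled = [2.0 * p + 3.0 for p in predicted]  # positive-slope linear transform

print(pearson_r(predicted, observed), pearson_r(rescaled, observed))
print(spearman_rho(predicted, observed), spearman_rho(rescaled, observed))
print(rmse(predicted, observed), rmse(rescaled, observed))
```

In this toy case both correlation coefficients are unchanged by the rescaling, while RMSE grows substantially, which is why unnormalized submissions are penalized only under RMSE.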
Some metrics, such as sensitivity, specificity, and AUC, assume a binary target variable. In these cases we designated pathogenic variants as true positives, and benign variants as true negatives. We used 0.15 fwt activity as the threshold for distinguishing pathogenic from benign variants. This level of fwt activity is consistent with what we observed from previously identified pathogenic mutations as described in Clark et al., 2018. We also calculated AUC and F-max values for thresholds ranging from 0.05 to 0.95 in increments of 0.05 (Supp. Table S2, Supp. Table S3). For each predictor a sliding decision threshold was varied from the highest predictor score to its lowest. Because, in this instance, low predictor scores designate positives, each predicted mutation with a score below the threshold was chosen as a predicted positive. All others were designated as predicted negative, or benign, data-points. A simple way to achieve the same result would be to multiply the predictions of each model by −1 and proceed as one normally would when calculating binary metrics. We also generated ROC curves (Figure 2). Optimal positions on the ROC curve, designated by red dots in Figure 2, were determined as the point with the lowest square root of the sum of the squared false positive rate and squared false negative rate.
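The binary evaluation described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the assessment: the function names and toy values are hypothetical, variants are labeled pathogenic when observed fwt activity is strictly below 0.15 (the text does not say whether the comparison is strict), and predictions are multiplied by −1 so that a higher score means a more likely positive.

```python
import math

def roc_points(scores, labels):
    """(FPR, TPR) pairs as the threshold sweeps from the highest score down.

    A higher score means more likely positive; tied scores are added together.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def auc(points):
    """Trapezoidal area under the ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def optimal_point(points):
    """ROC point minimising sqrt(FPR^2 + FNR^2), where FNR = 1 - TPR."""
    return min(points, key=lambda p: math.hypot(p[0], 1.0 - p[1]))

# Hypothetical toy values (fraction wild-type activity), not challenge data.
observed_fwt = [0.02, 0.10, 0.45, 0.80, 1.05, 0.12, 0.60]
predicted_fwt = [0.10, 0.60, 0.50, 0.70, 0.90, 0.05, 0.80]

labels = [1 if v < 0.15 else 0 for v in observed_fwt]  # assumption: strict "<"
scores = [-p for p in predicted_fwt]                    # low activity = positive

points = roc_points(scores, labels)
print(auc(points), optimal_point(points))
```

Repeating the labeling step with thresholds from 0.05 to 0.95 would give the kind of threshold sweep reported in the supplemental tables.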
Determining statistical significance of correlation coefficients
We calculated pairwise correlation coefficients for all models (Figure 3, Supp. Table S4, Supp. Table S5). For any two models, Xi and Xj, we used bootstrap simulation to test whether the correlation of model Xi with the experimentally observed enzymatic activity values, EA, was statistically significantly different from its correlation with model Xj (Supp. Table S6). To do this, 10,000 random samples of 163 data-points were generated by sampling the original data with replacement. For each pair of predictors, each sample, sk, was used to calculate the correlation coefficient of Xi with Xj and of Xi with EA: r(Xi, Xj)k and r(Xi, EA)k, respectively. We then calculated the percentage of samples in which r(Xi, Xj)k was greater than r(Xi, EA)k, and deemed the correlation of model Xi with EA to be significantly greater than its correlation with model Xj if this value was less than 5%. Supp. Figure S4 shows a heatmap of pairwise correlation coefficients for all models, including off-the-shelf methods.
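A minimal sketch of this bootstrap comparison is given below. The function names, the random seed, and the synthetic data are assumptions made for illustration; the data are simulated so that two models share a common signal and are noisier with respect to the simulated activity values, and they are not the challenge data.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation; returns NaN if either sample has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else float("nan")

def fraction_model_pair_higher(xi, xj, ea, n_boot=10_000, seed=0):
    """Fraction of bootstrap samples in which r(Xi, Xj) > r(Xi, EA)."""
    rng = random.Random(seed)
    n = len(ea)
    higher = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        a = [xi[i] for i in idx]
        r_models = pearson_r(a, [xj[i] for i in idx])
        r_activity = pearson_r(a, [ea[i] for i in idx])
        if r_models > r_activity:
            higher += 1
    return higher / n_boot

# Synthetic stand-in for 163 variants: two models share a signal,
# while the simulated activity values carry more independent noise.
rng = random.Random(1)
signal = [rng.random() for _ in range(163)]
x_i = [s + rng.gauss(0, 0.1) for s in signal]
x_j = [s + rng.gauss(0, 0.1) for s in signal]
ea = [s + rng.gauss(0, 0.5) for s in signal]

# Fewer resamples than the 10,000 in the text, to keep the example fast.
fraction = fraction_model_pair_higher(x_i, x_j, ea, n_boot=1000)
print(fraction)
```

Under the rule stated above, a fraction below 0.05 would indicate that Xi is significantly more correlated with EA than with Xj; a fraction near 1 corresponds to the opposite situation, in which the two models agree with each other more than with the measurements.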
Determining the uniqueness of predictions using a linear regression model
To estimate the specific contribution of each prediction model to the proportion of variance in the experimental results that it explains (R²), a multiple linear regression model was applied (Figure 4). First, a separate linear regression model was built for each individual prediction model. The top model from each group was chosen based on the highest adjusted R² value (Supp. Table S7). Next, each model was combined with the best performing model, and the linear regression was recalculated to evaluate the contribution of each model to the explained variance (Supp. Table S8).
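To make the role of adjusted R² concrete, the sketch below uses two standard identities: for an ordinary least-squares fit with an intercept and two predictors, R² can be written in terms of the three pairwise Pearson correlations, and adjusted R² penalizes R² for the number of predictors p given n observations. The correlation values are hypothetical and chosen only for illustration; n = 163 matches the number of variants in the evaluation set.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R^2 of an OLS fit of y on two predictors plus an intercept.

    r_y1, r_y2: correlations of each predictor with y.
    r_12: correlation between the two predictors (must not be +/-1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 163  # number of variants in the evaluation set

# Hypothetical correlations, not values from the challenge.
single = adjusted_r2(0.6 ** 2, n, p=1)                        # best model alone
combined = adjusted_r2(r2_two_predictors(0.6, 0.5, 0.8), n, p=2)

print(round(single, 4), round(combined, 4))
```

With these illustrative numbers, adding a second predictor that is strongly correlated with the first raises R² only marginally, and the adjusted R² does not improve.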
Additional predictions used for evaluation
We compared the predictions submitted to the NAGLU challenge to several off-the-shelf methods. As a simple method we considered Grantham scores (Grantham, 1974). Quantitative scores for PolyPhen and SIFT were obtained from the ExAC VCF file and were generated using VEP v81 (Adzhubei et al., 2010; Kumar et al., 2009). We previously analyzed categorical predictions produced by SIFT and PolyPhen. Here we only considered quantitative scores for both predictors. CADD annotations were obtained from CADD v1.4 (Rentzsch, Witten, Cooper, Shendure, & Kircher, 2019). REVEL predictions were taken from the June 3, 2016 release of predictions (Ioannidis et al., 2016). Because quantitative scores produced by Grantham, PolyPhen, REVEL, and CADD are negatively correlated with enzymatic activity (a higher score indicates a higher likelihood of being pathogenic), scores were inverted by subtracting them from 1. This is a linear transformation that will only impact the sign of correlation values, but will allow a fair comparison of the RMSE values produced by these predictors to those of other models. In the case of CADD raw and phred scores, this was done after normalizing those scores to the range [0, 1] by subtracting the minimum raw or phred score from a prediction and then dividing by the maximum value minus the minimum. Again, this is only a linear transformation that will not impact correlation or AUC values, but will facilitate a fair comparison of RMSE values between models. Grantham scores were normalized in a similar fashion to CADD scores, but a minimum value of 0 was assumed.
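The normalization and inversion steps can be sketched as below. The function names and score values are hypothetical; in particular, taking the maximum from the scores at hand is an assumption, since the text specifies only that the Grantham minimum was fixed at 0.

```python
def min_max(values, lo=None, hi=None):
    """Rescale values to [0, 1]; lo and hi default to the observed min and max."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [(v - lo) / (hi - lo) for v in values]

def invert(unit_scores):
    """Flip scores in [0, 1] so that higher means higher predicted activity."""
    return [1.0 - s for s in unit_scores]

# Hypothetical scores, not values from the challenge.
cadd_phred = [5.0, 20.0, 35.0]
grantham = [50.0, 100.0, 200.0]

cadd_activity_like = invert(min_max(cadd_phred))
grantham_activity_like = invert(min_max(grantham, lo=0.0))  # minimum fixed at 0

print(cadd_activity_like)
print(grantham_activity_like)
```

Scores that already lie in [0, 1], such as PolyPhen or REVEL scores, would need only the `invert` step.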
Relative solvent accessibility was calculated by first calculating the solvent accessibility of each amino acid in a monomer of PDB structure 4XWH using DSSP. Raw solvent accessibility values were then normalized by dividing by maximum solvent accessibility values (Rost & Sander, 1994). Because residues 1 through 23 are a signal peptide and are proteolytically cleaved, they are not present in the PDB structure for NAGLU. This means there is no solvent accessibility value for the p.Arg16Val missense mutation. We replaced this missing value with the average relative solvent accessibility value for the remaining 162 amino acids in the evaluation set. A final, ad-hoc model was generated by taking the average of normalized Grantham scores and relative solvent accessibility.
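A minimal sketch of the imputation and averaging steps follows. The function names and all numbers are hypothetical placeholders: the raw accessibilities stand in for DSSP output, and the maximum accessibilities stand in for the per-residue-type values of Rost & Sander (1994), which are not reproduced here.

```python
def relative_accessibility(raw_acc, max_acc):
    """Divide raw accessibility by the per-residue maximum.

    Entries of raw_acc that are None (residue absent from the structure)
    are replaced by the mean relative accessibility of the other residues.
    """
    rel = [None if a is None else a / m for a, m in zip(raw_acc, max_acc)]
    known = [r for r in rel if r is not None]
    mean_rel = sum(known) / len(known)
    return [mean_rel if r is None else r for r in rel]

def ad_hoc_model(norm_grantham, rel_acc):
    """Average of normalized Grantham score and relative accessibility."""
    return [(g + r) / 2 for g, r in zip(norm_grantham, rel_acc)]

# Hypothetical placeholder values; the first residue has no structural data.
raw_acc = [None, 50.0, 100.0]
max_acc = [200.0, 200.0, 200.0]
norm_grantham = [0.5, 0.25, 1.0]

rel_acc = relative_accessibility(raw_acc, max_acc)
print(rel_acc)
print(ad_hoc_model(norm_grantham, rel_acc))
```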
All results for such methods are included along with all submitted models in Supp. Table S9.
Results
Participation in the NAGLU CAGI Challenge
There were 17 submitted sets of predictions from 10 individual teams for the CAGI NAGLU Challenge (Table 1, model descriptions available in supplemental data). Of these 17 submissions, six models (Moult Consensus, iFish, HHblits w/ real contacts, INPS3D, SNPs&GO, and Dunbrack-SVM) utilized one of the NAGLU protein structures. Nine models utilized the output of commonly used predictors of variant functional effect such as PolyPhen, SIFT, or PROVEAN (Adzhubei et al., 2010; Choi & Chan, 2015; Kumar, Henikoff, & Ng, 2009). All but one model (n=16) utilized information from multiple sequence alignments (MSAs) or position specific scoring matrices (PSSMs). Three models utilized HGMD as a source of training data.
Analysis of predicted enzymatic activities
We utilized several metrics when assessing performance in the CAGI4 challenge. Averages and standard deviations for each metric were obtained by randomly sampling the 163 variants with replacement 10,000 times. When calculating AUC for our primary analysis, we used a threshold of 0.15 fwt activity to designate variants as either neutral or disease-causing. This threshold was based on the upper limit of fwt activity measured in previously observed pathogenic mutations (Clark et al., 2018). We also calculated AUC and F-max for thresholds ranging from 0.05 to 0.95 in increments of 0.05 (Supp. Table S2, Supp. Table S3). Although AUC and F-max values were generated by sampling mutations, ROC and precision-recall curves were generated from unsampled data. In cases where more than one model for a team ranked highly according to a particular metric, we only mention the top performing model, although all results for all models are shown in Supp. Table S1. Precision and recall curves were generated as a supplemental figure (Supp. Figure S3).
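The bootstrap averages and standard deviations can be sketched as follows, using RMSE as the example metric. The function names, seed, and toy values are assumptions for illustration, and far fewer variants and resamples are used than in the assessment.

```python
import math
import random
import statistics

def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def bootstrap_mean_sd(metric, pred, obs, n_boot=10_000, seed=0):
    """Mean and standard deviation of a metric over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(obs)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        values.append(metric([pred[i] for i in idx], [obs[i] for i in idx]))
    return statistics.mean(values), statistics.stdev(values)

# Hypothetical toy values (fraction wild-type activity), not challenge data.
observed = [0.02, 0.10, 0.45, 0.80, 1.05]
predicted = [0.10, 0.05, 0.50, 0.70, 0.90]

mean_rmse, sd_rmse = bootstrap_mean_sd(rmse, predicted, observed, n_boot=1000)
print(mean_rmse, sd_rmse)
```

Any metric with the same `(pred, obs)` signature could be passed in place of `rmse`, provided it is defined on every resample.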
Figure 1 shows several metrics (Pearson’s r, Spearman’s ⍴, AUC, RMSE) used to evaluate the performance of each predictor. We found that the MutPred2 w/ homology model performed the best in terms of Pearson’s r (r=0.60), followed by the Evolutionary Action model (r=0.56) and Moult Consensus (r=0.55). The same three teams performed the best in terms of Spearman’s ⍴ as well (MutPred2 w/ homology (⍴=0.61), Moult Consensus (⍴=0.57), and Evolutionary Action (⍴=0.55)). It should be noted that Spearman and Pearson correlation coefficients were highly correlated across all models. RMSE represents the most stringent measure of model performance that we utilized. MutPred obtained the lowest RMSE (0.30), followed by Moult Consensus (0.30) and Dunbrack-SVM (0.32). Figure 2 shows ROC curves for the top 10 performing submissions according to AUC, and Figure 1 shows the obtained AUC values for these models. We found that MutPred2 w/ homology performed the best in terms of AUC (AUC=0.85), followed by Evolutionary Action (AUC=0.85) and HHblits w/ real contacts (AUC=0.84).
Although each of these metrics measures a different aspect of a predictor’s performance, we found a large amount of agreement between metrics in the overall ranking of models. For example, MutPred2 w/ homology performed the best according to Pearson’s r, Spearman’s ⍴, and AUC. In terms of RMSE, MutPred2 w/ homology only slightly underperformed compared to the MutPred model.
Easy and difficult to predict mutations
We determined whether any mutations were easy or difficult to predict (Supp. Table S10). This was done by measuring the average RMSE 1) across all predictors, and 2) for the top 5 models (MutPred, MutPred2 w/o homology, Moult Consensus, Dunbrack-SVM, and Evolutionary Action). We found several mutations whose experimentally observed activities were easy for models to predict and several whose activities were difficult to predict.
The majority of easiest to predict deleterious mutations involved non-conservative substitutions of buried residues. Within the NAGLU structure, these variants are predicted to affect protein stability via disruption of aromatic clusters or stacking, salt bridges and hydrogen bonding networks, as well as through proximity to the active site or interference with the binding site pocket.
The majority of hardest to predict mutations involved moderate or conservative substitutions of partially or fully solvent-exposed residues. Interpretation of their effects within the NAGLU structure was not immediately obvious. One possibility involves an effect on protein solubility, especially in the context of the enzyme’s trimerization. The hardest to predict mutation (p.Pro283Leu) was predicted to have low activity by most predictors but was shown to actually increase activity. Both this variant and p.Gly596Cys, another benign variant predicted to have low activity, involve non-conservative substitutions and are buried in the NAGLU structure.
Correlation between predictive models
While it may be easy to focus on which model performed the best in terms of a particular measure, we observed that top models from each team were significantly more correlated with at least one other model from another team than they were with fwt values (Figure 3). Furthermore, we observed that the 6 top performing models as measured by Pearson’s correlation coefficients were all more correlated with each other than with fwt activity values, and that these correlations were found to be statistically significant through bootstrapping simulation (Methods).
For example, although the MutPred2 w/ homology and Evolutionary Action models were correlated with observed activity values with coefficients of 0.60 and 0.56 respectively, they were correlated with each other with a Pearson’s r of 0.82. For none of the 10,000 bootstrap samples generated did we observe that the two models were more correlated with fwt activity values than they were with each other. This suggests that although these models perform reasonably well at predicting fwt activities for NAGLU, they are better at recapitulating each other’s behavior, even though they are presumably based on distinct, and rather different, methodologies.
In light of the high correlation between models, we did not observe that combining the best performing model with any other tool improved correlation with fwt activity values. In order to determine this, all models were fit to a linear regression model and the best tool out of all submissions from the same group was chosen based on the adjusted R² values (shown in black in Figure 4). As R² values in this case should be equivalent to squared Pearson’s correlation coefficients, MutPred2 w/o homology was found to explain the highest proportion of variance (36%). The best combination (MutPred2 w/o homology and HHblits w/ real contacts) increased the adjusted R² value only by 0.02%. This implies that MutPred2 w/o homology itself can represent all of the other tools.
Comparison to Supplemental Models
We also evaluated the performance of several commonly used off-the-shelf tools as supplemental models, including REVEL, PolyPhen, SIFT and CADD (Supp. Figure S2, Methods, Supp. Table S9). REVEL performed the best out of all off-the-shelf methods (Pearson’s r = 0.56), although it was not as correlated with observed fwt activity values as MutPred2 w/ homology (Pearson’s r = 0.60), and the two models were highly correlated with each other (Pearson’s r = 0.89) (Supp. Table S4). We observed that for both models, prediction scores were statistically significantly more correlated with each other than they were with observed fwt activity values; for none of the 10,000 bootstrap simulations did we observe a higher correlation between either model and fwt activity values than between the two models (Methods). It is important to point out that REVEL uses predictions from the previous version of MutPred as features. Furthermore, MutPred2 w/ homology was trained using an older version of HGMD (June 2013) than REVEL (2015.2 version of HGMD).
While we observed that top methods performed better than PolyPhen scores, in the case of MutPred2 w/ homology, we did not find the difference to be significant in terms of Pearson’s correlation coefficients. Solvent accessibility and Grantham scores performed the poorest in terms of Pearson’s and Spearman’s correlation coefficients, but Grantham scores had lower RMSE than PolyPhen scores.
Out of all supplemental models, the proportion of variance explained based on a linear regression model was the highest for REVEL (adjusted R²=30.1%), and there was no improvement when additional models were added (Supp. Figure S5). This indicates that REVEL itself can be a good representative of all commonly used models that were selected.
Conclusions/Discussion
For the CAGI NAGLU challenge, we asked participants to predict the impact of missense mutations on the enzymatic activity of NAGLU. This task is different than predicting whether a mutation is pathogenic. A model could perform poorly if it is not able to distinguish between a benign mutation that has 60% wild-type activity and one that has 90%, or a pathogenic mutation with 0% activity and one with 10%. Although this is a different task than predicting pathogenicity, we found that participants in the 2016 NAGLU CAGI challenge performed well. This performance was obtained despite the fact that many models were not explicitly trained for the task of predicting enzymatic activity, instead being designed for the slightly different task of distinguishing pathogenic from benign variants.
Although models performed well, we did observe that the top methods were significantly more correlated with each other than they were with observed activity values. In many cases, such as with MutPred2 and Evolutionary Action, methods were highly correlated in spite of having relatively distinctive methodologies; one being a supervised machine learning model, the second being one based on a calculus of evolutionary variations. The starkly different methodologies of these two models suggest that a common feature type is the primary driver behind the high level of correlation between methods. Of all the feature types employed by participating models, sequence conservation was the most common. While we observed that sequence conservation was a unifying feature across almost all methods, we were unable to observe any relationship between the training data used for supervised methods and their performance. In fact, one of the top methods, Evolutionary Action, did not use a training dataset of known mutations at all, instead focusing on evolutionary conservation. There are technical details, such as the phylogenetic depth at which one should measure conservation and the choice of alignment algorithm, that must be considered when using homology to infer the impact of mutations (Katsonis et al., 2014).
The high level of correlation between in-silico models also has implications for the interpretation of variants in a clinical setting. As noted by the ACMG guidelines for variant interpretation, many tools rely on the same underlying data to make predictions, and single predictors should not be counted individually as evidence that a variant is pathogenic (Richards et al., 2015). Considering predictions from multiple tools will not necessarily add additional information regarding a particular mutation.
Our current in vitro enzyme activity assay is limited to testing missense coding variants, and as shown by Clark et al., there is very good agreement between observed activity and pathogenicity of known and well annotated disease variants. However, the in vitro assay may not always correlate with enzyme activities tested directly from patient samples. Mutations were introduced into a vector containing the NAGLU cDNA, so splicing, promoter/enhancer, and epigenetic mutations will be missed. Also, protein is being expressed at supraphysiological levels. Some mutations may result in protein aggregation at high concentrations, but not at endogenous levels. Furthermore, proteins being expressed in cell lines from different organisms, or even different tissues, can exhibit variability in activity (Meijer et al., 2017). We can point to at least one mutation, p.Arg464Gln, whose activity we were surprised by. p.Arg464Gln was generated in multiple, sequence-confirmed independent constructs, and its activity was checked by several independent transfections and was consistently found to have 3% wild-type activity. This particular mutation has a Non-Finnish European allele frequency in gnomAD v2.1 of 1.74 × 10⁻⁴, whereas the most common known disease causing mutation, p.Ser612Gly, has an allele frequency of 1.23 × 10⁻⁴. Given the high frequency of this variant compared to known pathogenic mutations, it is surprising it has not yet appeared in a patient.
Functional data from genes with clear functional readouts are important. While genes such as BRCA1 are important in the context of cancer, determining the impact of a missense mutation on its function is not a simple task (Carvalho, Couch, & Monteiro, 2007). Functional screening data on more genes like NAGLU can help train better models, which, in turn, can produce better predictions on genes that are more difficult to assay. More data will also allow researchers to determine trends amongst mutations that are easy and difficult to predict, as well as those that might not produce accurate activity read-outs in similar over-expression based cell line systems.
Acknowledgments
Contract grant sponsor: NIH (U41 HG007346, R13 HG006650, R01 MH105524, R01 LM009722, GM079656 and GM066099), National Institute of Aging (R01-AG061105), National Health and Medical Research Council of Australia (1059775, 1083450, 1121629), NIGMS grant (1 U01 GM115486 01), Estonian Research Council (IUT34-12).
Data Availability
The data that support the findings of this study are openly available as supplementary files. Submitted predictions for each model are available to registered users from the CAGI web site at:
References
- Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, … Sunyaev SR. (2010). A method and server for predicting damaging missense mutations. Nat Methods, 7(4), 248–249. doi: 10.1038/nmeth0410-248 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Andrade F, Aldamiz-Echevarria L, Llarena M, & Couce ML (2015). Sanfilippo syndrome: Overall review. Pediatr Int, 57(3), 331–338. doi: 10.1111/ped.12636 [DOI] [PubMed] [Google Scholar]
- Aoyagi-Scharber M, Crippen-Harmon D, Lawrence R, Vincelette J, Yogalingam G, Prill H, … Bunting S (2017). Clearance of Heparan Sulfate and Attenuation of CNS Pathology by Intracerebroventricular BMN 250 in Sanfilippo Type B Mice. Mol Ther Methods Clin Dev, 6, 43–53. doi: 10.1016/j.omtm.2017.05.009 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Birrane G, Dassier AL, Romashko A, Lundberg D, Holmes K, Cottle T, … Meiyappan M (2019). Structural characterization of the alpha-N-acetylglucosaminidase, a key enzyme in the pathogenesis of Sanfilippo syndrome B. J Struct Biol, 205(3), 65–71. doi: 10.1016/j.jsb.2019.02.005
- Bromberg Y, & Rost B (2007). SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res, 35(11), 3823–3835. doi: 10.1093/nar/gkm238
- Capriotti E, Calabrese R, Fariselli P, Martelli PL, Altman RB, & Casadio R (2013). WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics, 14 Suppl 3, S6. doi: 10.1186/1471-2164-14-S3-S6
- Carvalho MA, Couch FJ, & Monteiro AN (2007). Functional assays for BRCA1 and BRCA2. Int J Biochem Cell Biol, 39(2), 298–310. doi: 10.1016/j.biocel.2006.08.002
- Choi Y, & Chan AP (2015). PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics, 31(16), 2745–2747. doi: 10.1093/bioinformatics/btv195
- Clark WT, Yu GK, Aoyagi-Scharber M, & LeBowitz JH (2018). Utilizing ExAC to assess the hidden contribution of variants of unknown significance to Sanfilippo Type B incidence. PLoS One, 13(7), e0200008. doi: 10.1371/journal.pone.0200008
- Coutinho MF, Lacerda L, & Alves S (2012). Glycosaminoglycan storage disorders: a review. Biochem Res Int, 2012, 471325. doi: 10.1155/2012/471325
- Folkman L, Stantic B, Sattar A, & Zhou Y (2016). EASE-MM: Sequence-Based Prediction of Mutation-Induced Stability Changes with Feature-Based Multiple Models. J Mol Biol, 428(6), 1394–1405. doi: 10.1016/j.jmb.2016.01.012
- Ford CT, Uppal A, Nodzak CM, & Shi X (2019). Prediction of the Effect of Naturally Occurring Missense Mutations on Cellular N-Acetyl-Glucosaminidase Enzymatic Activity. bioRxiv, 598870. doi: 10.1101/598870
- Gaffke L, Pierzynowska K, Piotrowska E, & Wegrzyn G (2018). How close are we to therapies for Sanfilippo disease? Metab Brain Dis, 33(1), 1–10. doi: 10.1007/s11011-017-0111-4
- Gallion J, Koire A, Katsonis P, Schoenegge AM, Bouvier M, & Lichtarge O (2017). Predicting phenotype from genotype: Improving accuracy through more robust experimental and computational modeling. Hum Mutat, 38(5), 569–580. doi: 10.1002/humu.23193
- Grantham R (1974). Amino acid difference formula to help explain protein evolution. Science, 185(4154), 862–864.
- Hoskins RA, Repo S, Barsky D, Andreoletti G, Moult J, & Brenner SE (2017). Reports from CAGI: The Critical Assessment of Genome Interpretation. Hum Mutat, 38(9), 1039–1041. doi: 10.1002/humu.23290
- Ioannidis NM, Rothstein JH, Pejaver V, Middha S, McDonnell SK, Baheti S, … Sieh W (2016). REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am J Hum Genet, 99(4), 877–885. doi: 10.1016/j.ajhg.2016.08.016
- Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, … MacArthur DG (2019). Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. bioRxiv, 531210. doi: 10.1101/531210
- Katsonis P, Koire A, Wilson SJ, Hsu TK, Lua RC, Wilkins AD, & Lichtarge O (2014). Single nucleotide variations: biological impact and theoretical interpretation. Protein Sci, 23(12), 1650–1666. doi: 10.1002/pro.2552
- Katsonis P, & Lichtarge O (2014). A formal perturbation equation between genotype and phenotype determines the Evolutionary Action of protein-coding variations on fitness. Genome Res, 24(12), 2050–2058. doi: 10.1101/gr.176214.114
- Kumar P, Henikoff S, & Ng PC (2009). Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc, 4(7), 1073–1081. doi: 10.1038/nprot.2009.86
- Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, … Exome Aggregation Consortium (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature, 536(7616), 285–291. doi: 10.1038/nature19057
- Li B, Krishnan VG, Mort ME, Xin F, Kamati KK, Cooper DN, … Radivojac P (2009). Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics, 25(21), 2744–2750. doi: 10.1093/bioinformatics/btp528
- Meijer OLM, Te Brinke H, Ofman R, IJlst L, Wijburg FA, & van Vlies N (2017). Processing of mutant N-acetyl-alpha-glucosaminidase in mucopolysaccharidosis type IIIB fibroblasts cultured at low temperature. Mol Genet Metab, 122(1–2), 100–106. doi: 10.1016/j.ymgme.2017.07.005
- Pejaver V, Urresti J, Lugo-Martinez J, Pagel KA, Lin GN, Nam H-J, … Radivojac P (2017). MutPred2: inferring the molecular and phenotypic impact of amino acid variants. bioRxiv, 134981. doi: 10.1101/134981
- Remmert M, Biegert A, Hauser A, & Soding J (2011). HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods, 9(2), 173–175. doi: 10.1038/nmeth.1818
- Rentzsch P, Witten D, Cooper GM, Shendure J, & Kircher M (2019). CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res, 47(D1), D886–D894. doi: 10.1093/nar/gky1016
- Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, … ACMG Laboratory Quality Assurance Committee (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med, 17(5), 405–424. doi: 10.1038/gim.2015.30
- Rost B, & Sander C (1994). Conservation and prediction of solvent accessibility in protein families. Proteins, 20(3), 216–226. doi: 10.1002/prot.340200303
- Savojardo C, Fariselli P, Martelli PL, & Casadio R (2016). INPS-MD: a web server to predict stability of protein variants from sequence and structure. Bioinformatics, 32(16), 2542–2544. doi: 10.1093/bioinformatics/btw192
- Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NS, … Cooper DN (2003). Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat, 21(6), 577–581. doi: 10.1002/humu.10212
- Tang H, & Thomas PD (2016). Tools for Predicting the Functional Impact of Nonsynonymous Genetic Variation. Genetics, 203(2), 635–647. doi: 10.1534/genetics.116.190033
- Wang M, & Wei L (2016). iFish: predicting the pathogenicity of human nonsynonymous variants using gene-specific/family-specific attributes and classifiers. Sci Rep, 6, 31321. doi: 10.1038/srep31321
- Wei Q, Xu Q, & Dunbrack RL Jr. (2013). Prediction of phenotypes of missense mutations in human proteins from biological assemblies. Proteins, 81(2), 199–213. doi: 10.1002/prot.24176
- Yin Y, Kundu K, Pal LR, & Moult J (2017). Ensemble variant interpretation methods to predict enzyme activity and assign pathogenicity in the CAGI4 NAGLU (Human N-acetyl-glucosaminidase) and UBE2I (Human SUMO-ligase) challenges. Hum Mutat, 38(9), 1109–1122. doi: 10.1002/humu.23267
- Zelei T, Csetneki K, Voko Z, & Siffel C (2018). Epidemiology of Sanfilippo syndrome: results of a systematic literature review. Orphanet J Rare Dis, 13(1), 53. doi: 10.1186/s13023-018-0796-4