PLOS One. 2022 May 4;17(5):e0267084. doi: 10.1371/journal.pone.0267084

Comparative analysis of web-based programs for single amino acid substitutions in proteins

Arunabh Choudhury 1, Taj Mohammad 2, Farah Anjum 3, Alaa Shafie 3, Indrakant K Singh 4, Bekhzod Abdullaev 5, Visweswara Rao Pasupuleti 6,7,8, Mohd Adnan 9, Dharmendra Kumar Yadav 10,*, Md Imtaiyaz Hassan 2,*
Editor: Timir Tripathi 11
PMCID: PMC9067658  PMID: 35507592

Abstract

A single amino acid substitution in a protein can affect its structure and function, and such changes underlie many complex diseases. Analyzing single point mutations in a protein is crucial to assess their impact and to understand the disease mechanism. This need has produced many resources, including databases and web-based tools, for exploring the effects of mutations on the structure and function of human proteins. For a given mutation, each tool provides a score-based outcome that indicates the probability that the mutation is deleterious. In recent years, improvements to existing programs and the introduction of new prediction algorithms have transformed the state of the art in protein mutation analysis. In this study, we systematically compared the prediction efficiency of the most commonly used mutational analysis programs (10 sequence-based and 5 structure-based). We carried out extensive mutational analyses using these tools for previously known pathogenic single point mutations of five different proteins. These analyses suggested that the sequence-based tools PolyPhen2, PROVEAN, and PMut, and the structure-based web tool mCSM, have better prediction accuracy. This study indicates that using more than one program, based on different approaches, should significantly improve the prediction power of the available methods.

Introduction

A non-synonymous single nucleotide polymorphism (nsSNP) in the genome introduces a single amino acid change in the protein sequence, which may or may not affect the protein's structure and subsequent function. An amino acid substitution in a protein can have several effects, including loss or gain of function, alteration of the catalytic site, structural instability, protein aggregation, or abnormal folding [1]. Missense mutations can also impact pre-translational and post-translational processes. Many human genetic disorders arise because of amino acid substitutions [2]. Over the past few decades, considerable emphasis has been placed on analyzing single-point protein mutations to determine their effects and to understand the molecular mechanism [3–7]. This has produced many resources, including several databases and web-based tools that mostly focus on human mutations [8].

Many databases have been created to store information about mutations in the genomes of humans and other organisms, and they serve as a starting point for mutation analysis. Most SNP data is deposited in the Single Nucleotide Polymorphism Database (dbSNP, http://www.ncbi.nlm.nih.gov/SNP/) [9], which serves as the primary source for retrieval of single nucleotide polymorphisms. Ensembl (https://www.ensembl.org/) [10] is another large database that stores information about genetic variations in humans and other organisms, and it also gives information about the pathogenesis of the variations. Other databases include the Human Gene Mutation Database (HGMD, http://www.hgmd.cf.ac.uk/ac/index.php) [11], ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) [12], Online Mendelian Inheritance in Man (OMIM, http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim) [13], and the Pharmacogenetics Knowledge Base (PharmGKB, http://www.pharmgkb.org/) [14].

Many bioinformatics tools have been developed to analyze the impact of missense mutations, and different approaches have been applied in their development (Fig 1). The two broad categories for mutation analysis are the sequence-based and structure-based approaches. Both approaches use several factors that affect protein structure and function. Sequence-based analysis draws on various features, including cellular localization, aggregation, disorder, functional effects, and stability [1]. The structure-based approach is mainly based on free energy calculation. It considers electrostatic changes, steric effects, inter-residue contacts, disorder, functional effects, and stability. Sequence conservation is another vital component of SNP analysis, as disease-causing mutations frequently occur in evolutionarily conserved regions [15–17]. Substitution of an amino acid can increase the probability of protein aggregation, which is involved in several neurodegenerative diseases [18, 19]. Amino acid substitution can also introduce disorder in the protein structure. Disorder can be estimated using amino acid composition, energy profiles and physicochemical properties, specific sequence patterns, missing X-ray coordinates, and B-factors. Changes in the electrostatic potential due to a substitution can affect the ligand-binding ability and folding mechanism. Phylogenetic information is also a key component in the prediction process.
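To make the role of sequence conservation concrete, the following minimal sketch scores each column of a multiple sequence alignment by its Shannon entropy, where an entropy of zero means the position is fully conserved. This is a generic conservation measure chosen for illustration, not the scoring scheme of any tool reviewed here; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of one alignment column; gaps are ignored."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(alignment: list[str]) -> list[float]:
    """Entropy per column; 0.0 means the column is fully conserved."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_entropy("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]


# Toy alignment, made up for illustration (not real homologs).
toy_alignment = ["MKTAY", "MKSAY", "MKTAF", "MRTAY"]
profile = conservation_profile(toy_alignment)
for position, entropy in enumerate(profile, start=1):
    print(f"column {position}: entropy = {entropy:.3f} bits")
```

A substitution at a low-entropy column would, under this simple view, be a stronger candidate for a damaging effect than one at a variable column.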

Fig 1. Graphical representation of tools used in this study for comparison.

The tools use different scoring matrices (such as BLOSUM62) [20] to calculate whether or not a mutation has a functional or structural impact on a protein [17]. For a given mutation, each tool provides a score that indicates the probability of damage. In our study, we performed a comparative analysis of 15 different web tools, of which 10 are sequence-based and five are structure-based. To compare the tools, we took previously confirmed disease-causing mutations and some functionally impactful/damaging mutations of five proteins and analyzed them with all fifteen tools (Table 1).

Table 1. Tools for the analysis of single amino acid substitutions.

| Tool | URL | Prediction | Reference |
|---|---|---|---|
| PolyPhen2 | http://genetics.bwh.harvard.edu/pph2/ | Damaging or benign | [39] |
| PROVEAN | http://provean.jcvi.org/ | Deleterious or neutral | [24] |
| SIFT | http://sift.jcvi.org/ | Damaging or tolerated | [26] |
| FATHMM | http://fathmm.biocompute.org.uk | Damaging or tolerated | [40] |
| Mutation Assessor | http://mutationassessor.org/r3/ | Functionally impactful or neutral | [27] |
| PON-P2 | http://structure.bmc.lu.se/PON-P2/ | Pathogenicity prediction | [28] |
| MutPred2 | http://mutpred.mutdb.org | Pathogenicity prediction | [41] |
| SNPs & GO | https://snps-and-go.biocomp.unibo.it | Disease-causing or neutral | [42] |
| PhD-SNP | https://snps.biofold.org/phd-snp/phd-snp.html | Disease-causing or neutral | [43] |
| PMut | http://mmb.irbbarcelona.org/PMut | Disease-causing or neutral | [44] |
| mCSM | http://biosig.unimelb.edu.au/mcsm/ | Stability prediction | [45] |
| SDM | http://marid.bioc.cam.ac.uk/sdm2 | Stability prediction | [34] |
| MAESTROweb | https://pbwww.che.sbg.ac.at/maestro/web | Stability prediction | [46] |
| CUPSAT | http://cupsat.tu-bs.de/index.jsp | Stability prediction | [36] |
| DynaMut2 | http://biosig.unimelb.edu.au/dynamut2/ | Stability prediction | [37] |

Materials and methods

We carried out an extensive analysis of single point mutations in five different proteins using the 15 sequence-based and structure-based tools. The proteins are Parkinson’s disease protein 7 (PARK7), E3 ubiquitin-protein ligase parkin (PARK2), Presenilin-1 (PSEN1), GTPase HRas (HRAS), and Runt-related transcription factor 1 (RUNX1). We included only mutations that had already been found to affect protein function and structure. Sequences and non-synonymous mutations of these proteins were collected from the UniProt database [21]. All the mutations are associated with a disease or have an altering effect on the protein, i.e., all of them are damaging to the protein. The structure of each protein was downloaded from the RCSB Protein Data Bank (PDB) [22]. For PARK7, PARK2, PSEN1, HRAS and RUNX1, we analyzed 18, 38, 179, 32 and 16 single point mutations, respectively, which are the counts consistent with the percentages reported in Table 3. The UniProt and PDB IDs of the five proteins are given in Table 2.
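Most of the web tools take a protein sequence or identifier plus a list of substitutions, so it helps to check each substitution against the wild-type sequence before submission. The sketch below parses the common one-letter notation (wild-type residue, 1-based position, mutant residue) and applies it to a sequence. The helper names, the toy sequence and the example substitution are hypothetical and are not taken from the study's data set.

```python
import re
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    wild_type: str
    position: int  # 1-based residue number
    mutant: str


def parse_substitution(text: str) -> Substitution:
    """Parse a string such as 'A4V' into its three components."""
    match = _PATTERN.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a single amino acid substitution: {text!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return the mutant sequence after checking the wild-type residue."""
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.mutant + sequence[index + 1:]


# Toy sequence for illustration only (not one of the five proteins).
toy_sequence = "MKTAYIAKQR"
mutant_sequence = apply_substitution(toy_sequence, parse_substitution("A4V"))
print(mutant_sequence)
```

The wild-type check catches off-by-one numbering and isoform mismatches before they silently produce predictions for the wrong residue.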

Table 2. UniProt IDs and PDB IDs of each protein.

| Protein | UniProt ID | PDB ID |
|---|---|---|
| Parkinson’s disease protein 7 (PARK7) | Q99497 | 1P5F |
| E3 ubiquitin-protein ligase parkin (PARK2) | O60260 | 5C1Z |
| Presenilin-1 (PSEN1) | P49768 | 6IYC |
| GTPase HRas (HRAS) | P01112 | 4Q21 |
| Runt-related transcription factor 1 (RUNX1) | Q01196 | 1E50 |

We reviewed fifteen different tools, of which 10 are sequence-based and 5 are structure-based. We have also performed single-point mutation analysis to estimate their performance. PolyPhen2, PROVEAN, FATHMM, SIFT, Mutation Assessor, PON-P2, SNPs & GO, PhD-SNP, MutPred2 and PMut are sequence-based and mCSM, SDM, MAESTROweb, CUPSAT and DynaMut2 are structure-based tools.

PolyPhen2

Polymorphism Phenotyping v2 (PolyPhen-2) is a sequence-based tool. The protein sequence is supplied as a FASTA file [23]. To calculate the probability that a mutation is damaging, it compares the physical properties of the wild-type and mutant variants. It incorporates multiple sequence alignment and a machine-learning-based classifier developed for high-throughput NGS data analysis. PolyPhen2 derives a Position-Specific Independent Counts (PSIC) score for the variant and then estimates the difference in PSIC between the mutant and the wild-type. For a PSIC score greater than 0.09, the tool considers a mutation to be deleterious.

PROVEAN

The Protein Variation Effect Analyzer (PROVEAN) predicts the functional consequence of a single amino acid substitution in a protein [24]. PROVEAN categorizes mutations as deleterious or neutral; a mutation with a PROVEAN score of <-2.5 is deleterious, whereas mutations with scores >-2.5 are considered neutral. The PROVEAN web server comprises three tools: PROVEAN Protein (for any species), PROVEAN Protein Batch, and PROVEAN Genome Variants (specifically for mouse and human). The PROVEAN Protein Batch tool can process a large number of protein variants and also returns the SIFT result. The program takes amino acid substitutions as input and supports public-domain protein identifiers from NCBI RefSeq, UniProt, and Ensembl.

FATHMM

Functional Analysis through Hidden Markov Models (FATHMM) is a web-based tool for predicting the functional consequences of coding and non-coding variants in the human genome [25]. The coding variants can be analyzed for inherited diseases, cancer and specific diseases. FATHMM comprises two algorithms: unweighted and weighted. The unweighted method is based on sequence conservation, and the weighted method combines sequence conservation with pathogenicity weights. The unweighted method identifies conserved residues from the amino acid probabilities of Hidden Markov Models (HMMs) that represent alignments of conserved protein domains and homologous sequences. The weighted method additionally assigns pathogenicity weights that correlate with disease-causing amino acid substitutions, alongside the sequence conservation found by searching the HMMs.

SIFT

Sorting Intolerant from Tolerant (SIFT) is a web-server that determines whether single amino acid substitutions on a protein are deleterious or not. The tool considers sequence similarity and physical properties of the amino acid to calculate the damaging probability. A SIFT score of less than or equal to 0.05 indicates an intolerable mutation [26].

Mutation assessor

Mutation Assessor is a sequence-based tool to predict the functional consequences of amino-acid substitutions in proteins. The Mutation Assessor depends upon multiple sequence alignment and amino acid residues that are evolutionarily conserved. The input of this tool includes UniProt protein accession or NCBI Refseq protein ID. It categorizes the protein variants as high, medium, low or neutral for damaging impacts. It returns the FI score for each variant. A variant with an FI score greater than 2.00 is predicted as a deleterious variant [27].

PON-P2

PON-P2 is another web-based classifier for protein variants, and it uses a machine-learning-based approach. This tool sorts amino acid substitutions into pathogenic, neutral and unknown classes. It is fast and can analyze a large amount of variant data efficiently. This tool considers evolutionary sequence conservation and the biochemical and physical attributes of a protein. It also uses functional annotations and Gene Ontology (GO) annotations when they are available. PON-P2 takes as input the amino acid substitutions and one of an Ensembl, Entrez or UniProtKB/Swiss-Prot accession ID [28].

MutPred2

MutPred2 is a web-server that categorizes a mutation as disease-associated or neutral [29]. It estimates the molecular mechanism of pathogenicity of an amino acid substitution using a machine-learning-based technique. This tool considers fifty different protein properties to calculate the effect of the substitutions. For a pathogenic mutation, the MutPred2 score is greater than 0.5.

SNPs & GO

SNPs & GO is a support vector machine (SVM) based web server to identify deleterious single amino acid substitutions [30]. The classifier consists of a single SVM that takes protein sequence, profile and functional information as input. It uses GO annotations to classify a missense variant as disease-related or neutral. It requires the amino acid sequence or SwissProt code, GO terms and amino acid substitutions as input. An SNPs & GO score of more than 0.5 indicates a disease-causing mutation. This tool also returns the results of PANTHER and PhD-SNP.

PhD-SNP

Predictor of human Deleterious Single Nucleotide Polymorphisms (PhD-SNP) also uses an SVM-based classifier to identify disease-associated variants [31]. Sequence and profile information is used to classify amino acid substitutions as neutral or disease-associated. The sequence profile is calculated using an input vector derived from wild-type (WT) and mutant amino acid frequencies, the number of aligned sequences, and the conservation score at the substituted position. A PhD-SNP score of more than 0.5 indicates a disease-causing mutation.

PMut

The PMut web server identifies mutations that are associated with a disease phenotype. The PMut classifier is trained with a neural network-based method on manually curated protein sequence data from the SwissProt database. Sequence conservation and physicochemical attributes of amino acids are the main features. In the new version of the tool, users can also generate predictors for specific protein families, and previously predicted results are stored on the web server. If the PMut score for an amino acid substitution is greater than 0.5, the variant is considered pathogenic [32].
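The score cutoffs quoted in the tool descriptions above can be collected into one lookup so that raw scores from different servers are turned into damaging/neutral calls in a uniform way. The following is a minimal sketch covering a subset of the sequence-based tools, using only the cutoffs stated in the text; the dictionary, function names and example scores are hypothetical, and each server's own documentation should be treated as the authority on how boundary values are handled.

```python
from typing import Callable, Dict, Sequence

# Each rule returns True when a score falls on the damaging or pathogenic
# side of the cutoff quoted in the corresponding tool description.
DAMAGING_RULES: Dict[str, Callable[[float], bool]] = {
    "SIFT": lambda s: s <= 0.05,
    # The text leaves a score of exactly -2.5 unspecified; it is treated
    # as neutral here, which is an assumption of this sketch.
    "PROVEAN": lambda s: s < -2.5,
    "Mutation Assessor": lambda s: s > 2.00,
    "MutPred2": lambda s: s > 0.5,
    "SNPs&GO": lambda s: s > 0.5,
    "PhD-SNP": lambda s: s > 0.5,
}


def is_damaging(tool: str, score: float) -> bool:
    """Apply the quoted cutoff for one tool to one raw score."""
    try:
        rule = DAMAGING_RULES[tool]
    except KeyError:
        raise ValueError(f"no cutoff recorded for tool {tool!r}") from None
    return rule(score)


def percent_damaging(tool: str, scores: Sequence[float]) -> float:
    """Percentage of scores that the tool's cutoff classifies as damaging."""
    if not scores:
        raise ValueError("at least one score is required")
    hits = sum(is_damaging(tool, s) for s in scores)
    return 100.0 * hits / len(scores)


# Hypothetical SIFT scores for four variants of one protein.
example_sift_scores = [0.00, 0.02, 0.05, 0.31]
print(percent_damaging("SIFT", example_sift_scores))
```

Note that the direction of the comparison differs between tools: low SIFT and PROVEAN scores indicate damage, whereas high scores do so for the other tools listed.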

mCSM

mCSM predicts the stability of an amino acid substitution using a graph-based approach. The prediction method is trained with the environment derived from the atomic distance patterns of all the amino acid residues. It can estimate destabilizing probabilities for various protein structures and understand disease-associated variants. For a mutation that destabilizes a protein structure, the mCSM score (ΔΔG) is less than 0 [33].

SDM

Site-Directed Mutator (SDM) evaluates protein stability upon single point mutations. It uses environment-specific amino acid substitution tables, with parameters such as packing density and residue depth, together with PDB coordinate files to determine the stability of a mutant protein. SDM was tested with 2690 amino acid substitutions from 132 different 3D protein structures. For a destabilizing amino acid substitution, the predicted ΔΔG is less than 0 [34].

MAESTROweb

MAESTRO is a multi-agent stability prediction web tool that calculates the free energy change on protein unfolding. The free energy difference (ΔΔG) between the WT and mutant protein is calculated to determine the change in stability upon amino acid substitution. The tool can evaluate both experimentally determined and modeled PDB coordinate files, although prediction accuracy for modeled structures is lower. For a mutation that has a destabilizing impact on a protein structure, the MAESTRO score is less than zero [35].

CUPSAT

Cologne University Protein Stability Analysis Tool (CUPSAT) is a web server to estimate changes in protein stability upon mutation [36]. The tool uses a prediction model based on torsion angle distributions and amino acid atom potentials. It assesses the amino acid environment around the substituted position. Secondary structure specificity and solvent accessibility are also used to determine the amino acid environment. In CUPSAT, the potentials of 40 amino acid atom types from Melo-Feytmans are used to construct the radial pair distribution function. CUPSAT gives stability predictions for all possible amino acid substitutions at a specific position. It can also make predictions for custom PDB structures.

DynaMut2

DynaMut2 is a protein stability prediction tool that combines Normal Mode Analysis (NMA) techniques to capture protein motion and graph‐based signatures to represent the WT environment [37]. The data for the amino acid substitutions were taken from ProTherm. For stability prediction upon single point mutation, each mutation was modeled using many properties, including WT residue environment, protein dynamics (NMA), substitution propensities and contact potential scores, interatomic interactions and graph‐based signatures method. These methods were then used to train the machine learning algorithm. DynaMut2 can give predictions for single and multiple mutations. We have used the single mutation prediction feature for our analysis.
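The structure-based tools report a ΔΔG value per mutation, and the stability call depends on its sign. mCSM and MAESTROweb are described above as scoring destabilizing mutations with values below zero. Because sign conventions can differ between servers, the sketch below keeps the convention as an explicit parameter rather than hard-coding it. The function names and ΔΔG values are hypothetical.

```python
from typing import Sequence


def stability_call(ddg: float, negative_is_destabilizing: bool = True) -> str:
    """Label one ddG value as 'destabilizing', 'stabilizing' or 'neutral'."""
    if ddg == 0:
        return "neutral"
    is_negative = ddg < 0
    if is_negative == negative_is_destabilizing:
        return "destabilizing"
    return "stabilizing"


def percent_destabilizing(
    ddg_values: Sequence[float], negative_is_destabilizing: bool = True
) -> float:
    """Percentage of mutations labelled destabilizing under the convention."""
    if not ddg_values:
        raise ValueError("at least one ddG value is required")
    hits = sum(
        stability_call(v, negative_is_destabilizing) == "destabilizing"
        for v in ddg_values
    )
    return 100.0 * hits / len(ddg_values)


# Hypothetical ddG values (kcal/mol) for four mutations of one protein.
example_ddg = [-1.2, -0.3, 0.4, -2.0]
print(percent_destabilizing(example_ddg))
print(percent_destabilizing(example_ddg, negative_is_destabilizing=False))
```

Making the convention a parameter means the same counting code can be reused for a server that reports the opposite sign, once that server's documentation has been checked.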

Results and discussion

We performed mutation analysis of all five proteins with the 15 sequence-based and structure-based tools to estimate their performance (S1 Data). For PolyPhen2, we used the batch query feature, in which several mutations can be predicted at once. PolyPhen2 categorizes the mutations into damaging and benign classes. The percentages of mutations predicted as damaging for PARK7, PARK2, PSEN1, HRAS and RUNX1 were 66.67%, 92.11%, 96.09%, 78.13% and 100%, respectively, giving an average of 86.60% across the five proteins. The percentages predicted as damaging by PROVEAN for PARK7, PARK2, PSEN1, HRAS and RUNX1 were 72.22%, 71.05%, 88.83%, 100%, and 100%, respectively. The percentages predicted as damaging by SIFT for PARK7, PARK2, PSEN1, HRAS and RUNX1 were 66.67%, 78.95%, 90.50%, 90.63%, and 100%, respectively. PROVEAN predicted an average of 86.42% of variants as damaging, and SIFT an average of 85.35%, across the five proteins (Table 3). For the FATHMM analysis, we used the inherited disease feature under coding variants with the unweighted algorithm. FATHMM predicted an average of 64.39% of mutations as damaging for the five proteins. Mutation Assessor predicted an average of 76.46% of mutations as functionally impactful for the five proteins.

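Table 3. Percentage of deleterious/pathogenic/destabilizing single point mutations predicted by all fifteen tools for PARK7, PARK2, PSEN1, HRAS and RUNX1.

| Approach | Prediction | Tool | PARK7 | PARK2 | PSEN1 | HRAS | RUNX1 | Average |
|---|---|---|---|---|---|---|---|---|
| Sequence-based | Deleterious/Damaging | PolyPhen2 | 66.67% | 92.11% | 96.09% | 78.13% | 100% | 86.60% |
| Sequence-based | Deleterious/Damaging | PROVEAN | 72.22% | 71.05% | 88.83% | 100% | 100% | 86.42% |
| Sequence-based | Deleterious/Damaging | SIFT | 66.67% | 78.95% | 90.50% | 90.63% | 100% | 85.35% |
| Sequence-based | Deleterious/Damaging | FATHMM | 50% | 63.16% | 68.16% | 53.13% | 87.5% | 64.39% |
| Sequence-based | Deleterious/Damaging | Mutation Assessor | 66.67% | 86.84% | 94.41% | 71.88% | 62.5% | 76.46% |
| Sequence-based | Pathogenicity | PON-P2 | 44.44% | 28.95% | 74.86% | 81.25% | 68.75% | 59.65% |
| Sequence-based | Pathogenicity | MutPred2 | 61.11% | 76.32% | 94.97% | 100% | 93.75% | 85.23% |
| Sequence-based | Pathogenicity | SNPs & GO | 61.11% | 39.47% | 73.74% | 53.13% | 93.75% | 64.24% |
| Sequence-based | Pathogenicity | PhD-SNP | 55.56% | 60.53% | 86.03% | 75% | 100% | 75.42% |
| Sequence-based | Pathogenicity | PMut | 94.44% | 63.16% | 94.97% | 96.88% | 100% | 89.89% |
| Structure-based | Stability | mCSM | 88.89% | 94.74% | 87.71% | 93.75% | 100% | 93.02% |
| Structure-based | Stability | SDM | 83.33% | 81.58% | 63.69% | 62.50% | 62.5% | 70.72% |
| Structure-based | Stability | MAESTROweb | 88.89% | 63.16% | 79.33% | 84.38% | 81.25% | 79.40% |
| Structure-based | Stability | CUPSAT | 66.67% | 73.68% | 64.25% | 68.75% | 56.25% | 65.92% |
| Structure-based | Stability | DynaMut2 | 83.33% | 89.47% | 82.68% | 84.38% | 87.5% | 85.47% |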
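The Average column of Table 3 can be reproduced as the unweighted mean of the five per-protein percentages, which means each protein counts equally regardless of how many mutations it contributed. The short check below illustrates this for three rows copied from the table; the variable names are hypothetical.

```python
from statistics import mean

# Per-protein percentages copied from Table 3, in the column order
# PARK7, PARK2, presenilin-1, HRAS, RUNX1 (three tools shown for brevity).
table3_rows = {
    "PolyPhen2": [66.67, 92.11, 96.09, 78.13, 100.0],
    "mCSM": [88.89, 94.74, 87.71, 93.75, 100.0],
    "CUPSAT": [66.67, 73.68, 64.25, 68.75, 56.25],
}

# Averages as reported in the last column of Table 3.
reported_average = {"PolyPhen2": 86.60, "mCSM": 93.02, "CUPSAT": 65.92}

# Unweighted mean of the five percentages, rounded to two decimals.
computed_average = {
    tool: round(mean(row), 2) for tool, row in table3_rows.items()
}

for tool in table3_rows:
    print(
        f"{tool}: computed {computed_average[tool]:.2f}%, "
        f"reported {reported_average[tool]:.2f}%"
    )
```

A pooled rate, obtained by dividing the total number of flagged mutations by the total number analyzed, would weight the larger proteins more heavily and would generally give a different figure.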

The next five sequence-based tools predict the disease phenotype (pathogenicity) of a single amino acid substitution. PON-P2, SNPs&GO, PhD-SNP, MutPred2 and PMut classify mutations as pathogenic or neutral. For PARK7, PARK2, PSEN1, HRAS and RUNX1, PON-P2 predicted 44.44%, 28.95%, 74.86%, 81.25% and 68.75% of the mutations as pathogenic, respectively (S1–S5 Figs). Through PON-P2 identifier submission, we obtained an average of 59.65% pathogenic mutations for the five proteins. MutPred2 identifies pathogenic mutations and reports the altering impact of a particular mutation on the protein structure. For PARK7, PARK2, PSEN1, HRAS and RUNX1, MutPred2 estimated 61.11%, 76.32%, 94.97%, 100% and 93.75% of the mutations to be pathogenic, respectively, an average of 85.23% for the five proteins. SNPs&GO uses Gene Ontology terms to predict disease-associated mutations. It also returns the PhD-SNP results along with the SNPs&GO results. Both tools predict the disease association through functional annotation. SNPs&GO predicted an average of 64.24% of the mutations as pathogenic, and PhD-SNP predicted an average of 75.42% as disease-associated. The PMut web server predicted 94.44%, 63.16%, 94.97%, 96.88% and 100% pathogenic mutations for PARK7, PARK2, PSEN1, HRAS and RUNX1, respectively.

After the sequence-based analysis, we performed the structure-based analysis of the mutations with five tools, namely mCSM, SDM, MAESTROweb, CUPSAT and DynaMut2. These tools provide Gibbs free energy change values (ΔΔG) for each mutation; this ΔΔG value describes the change in the free energy of unfolding of a kinetically stable protein. A mutation can alter the free energy landscape of the mutant relative to the WT protein, and this difference is why the mutation affects the stability of a protein. Thermodynamically, the Gibbs free energy difference between the unfolded (Gu) and folded (Gf) protein is calculated as ΔG = Gu − Gf. The change in protein stability between the mutant and the WT is calculated as ΔΔG = ΔGm − ΔGw, where ΔGm and ΔGw are the unfolding free energies of the mutant and the WT protein. With these definitions, a negative ΔΔG value indicates a destabilizing mutation, and a positive ΔΔG indicates a stabilizing one [38]. mCSM predicted an average of 93% of mutations as destabilizing and 7% as stabilizing across the five proteins. Predicted destabilizing mutations for PARK7, PARK2, PSEN1, HRAS and RUNX1 by SDM were 83.33%, 81.58%, 63.69%, 62.50% and 62.5%, respectively, with an average of 70.72%. MAESTROweb estimated an average of 79.40% destabilizing mutations, whereas CUPSAT predicted an average of 65.92% destabilizing amino acid substitutions. The predicted destabilizing mutations for PARK7, PARK2, PSEN1, HRAS and RUNX1 by DynaMut2 were 83.33%, 89.47%, 82.68%, 84.38% and 87.5%, respectively, with an average of 85.47% (Table 3).

The sequence-based analysis using ten different tools allowed a comparative assessment of the tools. PolyPhen2, PROVEAN, FATHMM, SIFT and Mutation Assessor are the five sequence-based tools that categorize mutations as damaging/deleterious or tolerated. PolyPhen2, PROVEAN and SIFT showed almost equal prediction accuracy, whereas FATHMM showed a marked drop, with an average of 64.39% of mutations predicted as deleterious. The other five sequence-based tools, PON-P2, SNPs&GO, PhD-SNP, MutPred2 and PMut, predict the disease phenotype or pathogenicity of a single point mutation. PMut showed the highest average pathogenicity prediction with 89.9%. On the other hand, PON-P2 gave the lowest average with 59.6%. MutPred2 showed higher accuracy than SNPs&GO and PhD-SNP. After the sequence-based tools, we compared the results of the five structure-based tools mCSM, SDM, MAESTROweb, CUPSAT and DynaMut2. The structure-based tools predict stabilizing or destabilizing mutations based on ΔΔG. mCSM predicted the highest percentage of mutations as destabilizing, whereas CUPSAT predicted the lowest (Fig 2).

Fig 2. Distribution of average deleterious/destabilizing single amino acid substitutions predicted by all fifteen tools for PARK7, PARK2, PSEN1, HRAS and RUNX1.

Conclusion

Single point amino acid substitutions are associated with several human diseases, including cancer and neurodegenerative diseases, and are considered one of the most recurrent types of genetic variant. Detailed analysis of single point amino acid substitutions can help us understand the impact of a mutation and the disease-causing mechanism. With a growing number of known genetic variations, it is critical to predict the impact of a mutation through computational approaches in a fast and reliable manner. Several computational methods are available to analyze the molecular consequences of single point mutations. We performed a detailed analysis of several mutations with 15 different tools to determine their prediction accuracy based on previously available data. Among the sequence-based tools that estimate deleterious/damaging mutations, PolyPhen2 and PROVEAN showed higher prediction accuracy. In sequence-based pathogenicity prediction, PMut showed the highest prediction accuracy. Among the structure-based web tools, mCSM classified the highest percentage of mutations as destabilizing and showed higher prediction power than the others. The results of this study may be used to select the most suitable program for mutational analysis. An advanced platform could then be developed that automatically selects the program likely to give the most precise predictions.
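One simple way to combine several programs, as suggested above, is a majority vote over their individual calls. The sketch below is only an illustration of that idea under stated assumptions; it is not a method used or evaluated in this study, and the function name, the threshold parameter and the example calls are hypothetical.

```python
from typing import Dict


def consensus_call(calls: Dict[str, bool], min_fraction: float = 0.5) -> bool:
    """Return True when more than `min_fraction` of tools flag the variant.

    `calls` maps a tool name to True (damaging, pathogenic or destabilizing)
    or False (neutral or stabilizing) for a single variant.
    """
    if not calls:
        raise ValueError("at least one tool call is required")
    if not 0.0 <= min_fraction < 1.0:
        raise ValueError("min_fraction must be in the range [0, 1)")
    flagged = sum(calls.values())
    return flagged / len(calls) > min_fraction


# Hypothetical calls for one variant from tools using different approaches.
example_calls = {
    "PolyPhen2": True,
    "PROVEAN": True,
    "PMut": False,
    "mCSM": True,
}
print(consensus_call(example_calls))
print(consensus_call(example_calls, min_fraction=0.75))
```

A stricter `min_fraction` trades sensitivity for specificity; which setting is appropriate would need to be assessed on a data set that also contains known neutral variants.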

Supporting information

S1 Data

(XLSX)

S1 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for PARK7.

(DOCX)

S2 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for PARK2.

(DOCX)

S3 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for PSEN1.

(DOCX)

S4 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for HRAS.

(DOCX)

S5 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for RUNX1.

(DOCX)

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

This work was supported by Taif University Researchers Supporting Project Number (TURSP-2020/131), Taif University, Taif, Saudi Arabia.

References

1. Thusberg J, Vihinen M (2009) Pathogenic or not? And if so, then how? Studying the effects of missense mutations using bioinformatics methods. Hum Mutat 30: 703–714. doi: 10.1002/humu.20938
2. Ng PC, Henikoff S (2006) Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet 7: 61–80. doi: 10.1146/annurev.genom.7.080505.115630
3. Mohammad T, Amir M, Prasad K, Batra S, Kumar V, et al. (2020) Impact of amino acid substitution in the kinase domain of Bruton tyrosine kinase and its association with X-linked agammaglobulinemia. International journal of biological macromolecules 164: 2399–2408. doi: 10.1016/j.ijbiomac.2020.08.057
4. Choudhury A, Mohammad T, Samarth N, Hussain A, Rehman MT, et al. (2021) Structural genomics approach to investigate deleterious impact of nsSNPs in conserved telomere maintenance component 1. Scientific reports 11: 1–13. doi: 10.1038/s41598-020-79139-8
5. Umair M, Khan S, Mohammad T, Shafie A, Anjum F, et al. (2021) Impact of single amino acid substitution on the structure and function of TANK‐binding kinase‐1. Journal of cellular biochemistry 122(10), 1475–1490. doi: 10.1002/jcb.30070
6. Habib I, Khan S, Mohammad T, Hussain A, Alajmi MF, et al. (2021) Impact of non-synonymous mutations on the structure and function of telomeric repeat binding factor 1. Journal of Biomolecular Structure and Dynamics: 1–14. doi: 10.1080/07391102.2021.1922313
7. Amir M, Ahmad S, Ahamad S, Kumar V, Mohammad T, et al. (2020) Impact of Gln94Glu mutation on the structure and function of protection of telomere 1, a cause of cutaneous familial melanoma. Journal of Biomolecular Structure and Dynamics 38(5):1514–1524. doi: 10.1080/07391102.2019.1610500
8. Mooney SD, Krishnan VG, Evani US (2010) Bioinformatic tools for identifying disease gene and SNP candidates. Methods Mol Biol 628: 307–319. doi: 10.1007/978-1-60327-367-1_17
9. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, et al. (2001) dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 29: 308–311. doi: 10.1093/nar/29.1.308
10. Hubbard T, Barker D, Birney E, Cameron G, Chen Y, et al. (2002) The Ensembl genome database project. Nucleic Acids Res 30: 38–41. doi: 10.1093/nar/30.1.38
11. Cooper DN, Stenson PD, Chuzhanova NA (2006) The Human Gene Mutation Database (HGMD) and its exploitation in the study of mutational mechanisms. Curr Protoc Bioinformatics Chapter 1: Unit 1.13.
12. Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, et al. (2014) ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res 42: D980–985. doi: 10.1093/nar/gkt1113
13. Hamosh A, Scott AF, Amberger J, Valle D, McKusick VA (2000) Online Mendelian Inheritance in Man (OMIM). Hum Mutat 15: 57–61.
14. Altman RB (2007) PharmGKB: a logical home for knowledge relating genotype to drug response phenotype. Nat Genet 39: 426. doi: 10.1038/ng0407-426
15. Miller MP, Kumar S (2001) Understanding human disease mutations through the use of interspecific genetic variation. Hum Mol Genet 10: 2319–2328. doi: 10.1093/hmg/10.21.2319
16. Mooney SD, Klein TE (2002) The functional importance of disease-associated mutation. BMC Bioinformatics 3: 24. doi: 10.1186/1471-2105-3-24
17. Ng PC, Henikoff S (2001) Predicting deleterious amino acid substitutions. Genome Res 11: 863–874. doi: 10.1101/gr.176601
18. Sami N, Rahman S, Kumar V, Zaidi S, Islam A, et al. (2017) Protein aggregation, misfolding and consequential human neurodegenerative diseases. Int J Neurosci 127: 1047–1057. doi: 10.1080/00207454.2017.1286339
19. Kumar V, Sami N, Kashav T, Islam A, Ahmad F, et al. (2016) Protein aggregation and neurodegenerative diseases: From theory to therapy. Eur J Med Chem 124: 1105–1120. doi: 10.1016/j.ejmech.2016.07.054
20. Henikoff S, Henikoff JG (1992) Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89: 10915–10919. doi: 10.1073/pnas.89.22.10915
21. (2007) The Universal Protein Resource (UniProt). Nucleic Acids Res 35: D193–197. doi: 10.1093/nar/gkl929
22. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, et al. (2000) The Protein Data Bank. Nucleic Acids Res 28: 235–242. doi: 10.1093/nar/28.1.235
23. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, et al. (2010) A method and server for predicting damaging missense mutations. Nat Methods 7: 248–249. doi: 10.1038/nmeth0410-248
24. Choi Y, Chan AP (2015) PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics 31: 2745–2747. doi: 10.1093/bioinformatics/btv195
25. Shihab HA, Gough J, Cooper DN, Stenson PD, Barker GL, et al. (2013) Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models. Hum Mutat 34: 57–65. doi: 10.1002/humu.22225
26. Ng PC, Henikoff S (2003) SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 31: 3812–3814. doi: 10.1093/nar/gkg509
27. Reva B, Antipin Y, Sander C (2011) Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res 39: e118. doi: 10.1093/nar/gkr407
28. Niroula A, Urolagin S, Vihinen M (2015) PON-P2: prediction method for fast and reliable identification of harmful variants. PloS one 10: e0117380. doi: 10.1371/journal.pone.0117380
29. Pejaver V, Urresti J, Lugo-Martinez J, Pagel KA, Lin GN, et al. (2020) Inferring the molecular and phenotypic impact of amino acid variants with MutPred2. Nat Commun 11: 5918. doi: 10.1038/s41467-020-19669-x
30. Capriotti E, Calabrese R, Fariselli P, Martelli PL, Altman RB, et al. (2013) WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics 14 Suppl 3: S6.
31. Capriotti E, Calabrese R, Casadio R (2006) Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics 22: 2729–2734. doi: 10.1093/bioinformatics/btl423
32. López-Ferrando V, Gazzo A, de la Cruz X, Orozco M, Gelpí JL (2017) PMut: a web-based tool for the annotation of pathological variants on proteins, 2017 update. Nucleic Acids Research 45: W222–W228. doi: 10.1093/nar/gkx313
  • 33.Pires DEV, Ascher DB, Blundell TL (2013) mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics 30: 335–342. doi: 10.1093/bioinformatics/btt691 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Pandurangan AP, Ochoa-Montaño B, Ascher DB, Blundell TL (2017) SDM: a server for predicting effects of mutations on protein stability. Nucleic Acids Res 45: W229–w235. doi: 10.1093/nar/gkx439 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Laimer J, Hofer H, Fritz M, Wegenkittl S, Lackner P (2015) MAESTRO—multi agent stability prediction upon point mutations. BMC Bioinformatics 16: 116. doi: 10.1186/s12859-015-0548-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Parthiban V, Gromiha MM, Schomburg D (2006) CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34: W239–242. doi: 10.1093/nar/gkl190 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Rodrigues CHM, Pires DEV, Ascher DB (2021) DynaMut2: Assessing changes in stability and flexibility upon single and multiple point missense mutations. Protein Sci 30: 60–69. doi: 10.1002/pro.3942 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Quan L, Lv Q, Zhang Y (2016) STRUM: structure-based prediction of protein stability changes upon single-point mutation. Bioinformatics 32: 2936–2946. doi: 10.1093/bioinformatics/btw361 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Adzhubei I, Jordan DM, Sunyaev SR (2013) Predicting functional effect of human missense mutations using PolyPhen‐2. Current protocols in human genetics 76: 7.20. 21–27.20. 41. doi: 10.1002/0471142905.hg0720s76 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Rogers MF, Shihab HA, Mort M, Cooper DN, Gaunt TR, et al. (2018) FATHMM-XF: accurate prediction of pathogenic point mutations via extended features. Bioinformatics 34: 511–513. doi: 10.1093/bioinformatics/btx536 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Pejaver V, Urresti J, Lugo-Martinez J, Pagel KA, Lin GN, et al. (2017) MutPred2: inferring the molecular and phenotypic impact of amino acid variants. bioRxiv: 134981. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Calabrese R, Capriotti E, Fariselli P, Martelli PL, Casadio R (2009) Functional annotations improve the predictive score of human disease‐related mutations in proteins. Human mutation 30: 1237–1244. doi: 10.1002/humu.21047 [DOI] [PubMed] [Google Scholar]
  • 43.Capriotti E, Fariselli P (2017) PhD-SNPg: a webserver and lightweight tool for scoring single nucleotide variants. Nucleic acids research 45: W247–W252. doi: 10.1093/nar/gkx369 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Ferrer-Costa C, Gelpí JL, Zamakola L, Parraga I, De La Cruz X, et al. (2005) PMUT: a web-based tool for the annotation of pathological mutations on proteins. Bioinformatics 21: 3176–3178. doi: 10.1093/bioinformatics/bti486 [DOI] [PubMed] [Google Scholar]
  • 45.Pires DE, Ascher DB, Blundell TL (2014) mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics 30: 335–342. doi: 10.1093/bioinformatics/btt691 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Laimer J, Hiebl-Flach J, Lengauer D, Lackner P (2016) MAESTROweb: a web server for structure-based protein stability prediction. Bioinformatics 32: 1414–1416. doi: 10.1093/bioinformatics/btv769 [DOI] [PubMed] [Google Scholar]

Decision Letter 0

Timir Tripathi

26 Dec 2021

PONE-D-21-35921

Comparative analysis of web-based programs for single amino acid substitutions in proteins

PLOS ONE

Dear Dr. Hassan,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

The manuscript has been reviewed by three independent reviewers. They find merit in this manuscript but have highlighted several areas in which improvement and corrections are necessary. These areas include the organization of the text as well as technical details, and must be addressed for the manuscript to be considered for publication. A few language issues also need to be resolved by the authors.

==============================

Please submit your revised manuscript by Feb 09 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Timir Tripathi, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

3. We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 3 in your text; if accepted, production will need this reference to link the reader to the Table.

4. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information. 

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. Introduction should be comprehensive. A discussion on the significance of these tools should be addressed.

2. The rationale for this kind of review and analytical comparison must be described.

3. The calculations were performed on 15 different predictors. Are these tools based on the same prediction methods? Discuss this in the manuscript.

4. The authors used five different proteins in the study. What would be the results of calculations upon only one protein?

5. Were the average values estimated for all the parameters of each tool?

6. The explanation of the observed differences in the prediction power of each tool should be elaborated in light of their algorithms and resultant decision.

7. The authors should double-check the manuscript for the abbreviations used.

8. Language editing is required to improve the quality of the manuscript. The author should recheck this manuscript carefully and remove typos and grammatical errors.

9. All references should be thoroughly checked; in particular, the authors must confirm that only relevant publications are cited.

Reviewer #2: This study compares various available bioinformatics tools for predicting the impact of mutations in human proteins on their structure and function. The scope of the study is wide, with over a dozen computational web-based tools to evaluate their prediction power. The current form of the manuscript, unfortunately, ignores the following points which need to be resolved during revision.

· Despite interesting findings this paper lacks sound rationale and experimental support. The drawn conclusion should be focused and crisp.

· What is the basis of selecting five different proteins in this study? Is there any evidence that SNPs under consideration in this study are associated with disease? Discuss the rationale for this.

· Is there any clinical evidence showing that the destabilizing/deleterious nsSNPs are associated with protein dysfunction?

· How was the comparison of different predictors batched? A little detail of each predictor must be addressed.

· Method section may be shortened.

· Discussion should be improved in light of the author's findings and previous literature.

· The authors have not explained any relation between the prediction strategies of different predictors. This should be highlighted in the text.

· It is necessary to state the reasons why five different proteins were used as the benchmark.

· In conclusion, the authors write that “Out of the structure-based web tools, mCSM showed a higher number of mutations as destabilizing”. What does it mean? This statement should be elaborated.

· The results section is redundant. Please revise, focusing on the specific outcomes and their importance. The strength of the author’s findings should be highlighted.

· A uniform presentation is required. The author should proofread the manuscript before final submission.

Reviewer #3: 1. Abstract: Could be more concise and trimmed down; repetitive statements should be avoided.

2. The scientific problem is described well, however, there are a few language mistakes in the text. Therefore, language editing is required to improve the quality of the manuscript. Grammatical mistakes and typographical errors should be corrected.

3. Introduction: Some of the English terminology used is odd which needs to be updated during the revision.

4. Introduction, second paragraph, first sentence, needs citation.

5. The results section has some redundancy in many parts. Authors should update this, focusing on the specific outcomes and their significance.

6. It is not clear how the destabilizing parameters were compared with the values estimated by different tools based on totally different approaches.

7. The terms deleterious/pathogenic/destabilizing should be described clearly, and their correlations with the prediction should be discussed.

8. Table 1, a column mentioning the reference for the corresponding tool, should be added.

9. Discussion should be focused. The author should adhere to the results obtained from experiments.

10. Conclusion should be crisp and focused. Outcome must be highlighted in the conclusion section. Conclusions should provide more details and further highlight the work and its potential importance/application.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Tooba Naz Shamsi

Reviewer #2: No

Reviewer #3: Yes: Dr. Ethayathulla Abdul samath

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 May 4;17(5):e0267084. doi: 10.1371/journal.pone.0267084.r002

Author response to Decision Letter 0


27 Mar 2022

Journal: PLOS ONE

Manuscript Title: Comparative analysis of web-based programs for single amino acid substitutions in proteins

Manuscript ID: PONE-D-21-35921

Reviewer #1:

1. Introduction should be comprehensive. A discussion on the significance of these tools should be addressed.

Response: Thank you for your valuable suggestion. We have now expanded the introduction to cover the significance of each tool more comprehensively.

2. The rationale for this kind of review and analytical comparison must be described.

Response: The rationale of the study has been described now.

3. The calculations were performed on 15 different predictors. Are these tools based on the same prediction methods? Discuss this in the manuscript.

Response: Different tools are not based on the same method. We have discussed this in the revised manuscript.

4. The authors used five different proteins in the study. What would be the results of calculations upon only one protein?

Response: Five different proteins were used to improve accuracy by including diverse datasets in the calculation. This minimizes false predictions, since results from a single protein can be less coherent.

5. Were the average values estimated for all the parameters of each tool?

Response: No. Each prediction was treated as a binary (yes/no) call for a destabilizing mutation.

6. The explanation of the observed differences in the prediction power of each tool should be elaborated in light of their algorithms and resultant decision.

Response: Thank you for your valuable suggestion. We have now discussed the prediction power of all the tools and their algorithms more comprehensively.

7. The authors should double-check the manuscript for the abbreviations used.

Response: The manuscript has been thoroughly checked for abbreviations and updated during this revision. Thanks!

8. Language editing is required to improve the quality of the manuscript. The author should recheck this manuscript carefully and remove typos and grammatical errors.

Response: The manuscript has now been thoroughly checked and corrected for typos and language errors.

9. All references should be thoroughly checked; in particular, the authors must confirm that only relevant publications are cited.

Response: The reference section has been updated now.

Reviewer #2:

This study compares various available bioinformatics tools for predicting the impact of mutations in human proteins on their structure and function. The scope of the study is wide, with over a dozen computational web-based tools to evaluate their prediction power. The current form of the manuscript, unfortunately, ignores the following points which need to be resolved during revision.

• Despite interesting findings this paper lacks sound rationale and experimental support. The drawn conclusion should be focused and crisp.

Response: Thank you for your valuable suggestion. Now we have highlighted the rationale and discussion part of the revised manuscript.

• What is the basis of selecting five different proteins in this study? Is there any evidence that SNPs under consideration in this study are associated with disease? Discuss the rationale for this.

Response: Multiple proteins were used to identify disease-associated mutations, since studying only one protein can yield false positives. Multiple datasets were used to reduce false predictions and improve the accuracy of the outcomes. Yes, several lines of evidence indicate that the SNPs under consideration are associated with disease progression. The text has been updated in the revised manuscript.

• Is there any clinical evidence showing that the destabilizing/deleterious nsSNPs are associated with protein dysfunction?

Response: Several clinical findings indicate that the destabilizing/deleterious nsSNPs of the selected proteins are associated with protein dysfunction, resulting in disease progression. We have updated the discussion part of the revised manuscript.

• How was the comparison of different predictors batched? A little detail of each predictor must be addressed.

Response: The prediction was considered as an independent decision of each tool. Now we have described each predictor during this revision.

• Method section may be shortened.

Response: The method section is already written briefly. Shortening it further may make the methods harder for readers to follow.

• Discussion should be improved in light of the author's findings and previous literature.

Response: Thank you for this valuable suggestion; the discussion section has now been updated.

• The authors have not explained any relation between the prediction strategies of different predictors. This should be highlighted in the text.

Response: We have used different tools for detecting the pathogenicity of the variations. The relation between the prediction strategies of these predictors has been highlighted in the revised manuscript.

• It is necessary to state the reasons why five different proteins were used as the benchmark.

Response: Five different proteins were used to reduce false predictions, since a single protein can yield false positives. The reason for this has now been added to the revised manuscript.

• In conclusion, the authors write that “Out of the structure-based web tools, mCSM showed a higher number of mutations as destabilizing”. What does it mean? This statement should be elaborated.

Response: We studied only destabilizing/disease-associated mutations to compare the predictive power of each tool. Because every mutation in the set is already known to be disease-associated, the statement that mCSM classified a higher number of mutations as destabilizing means that it has higher prediction power than the other structure-based tools. We have discussed this in more detail in this revision.
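The comparison described in this response can be sketched as follows. This is a minimal illustration, assuming each tool's output has already been reduced to a yes/no call per mutation; the tool names and calls are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List


def detection_rate(calls: List[bool]) -> float:
    """Fraction of known disease-associated mutations that a tool flags
    as deleterious/destabilizing."""
    if not calls:
        raise ValueError("no calls supplied")
    return sum(calls) / len(calls)


# Hypothetical yes/no calls for four known pathogenic mutations
# (illustrative only; not taken from the manuscript).
example_calls: Dict[str, List[bool]] = {
    "tool_A": [True, True, True, False],
    "tool_B": [True, False, False, False],
}

# Rank tools by how many of the known pathogenic mutations they flag.
rates = {tool: detection_rate(calls) for tool, calls in example_calls.items()}
ranking = sorted(rates, key=rates.get, reverse=True)
```

Under these assumptions the rate is a sensitivity-style measure: with only known pathogenic mutations in the set, it shows how many true positives a tool recovers but says nothing about how it handles benign variants.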

• The results section is redundant. Please revise, focusing on the specific outcomes and their importance. The strength of the author’s findings should be highlighted.

Response: The results section has been updated in light of the reviewer's comment.

• A uniform presentation is required. The author should proofread the manuscript before final submission.

Response: The manuscript has been thoroughly checked and updated during this revised submission.

Reviewer #3:

Comments to the Authors:

1. Abstract: Could be more concise and trimmed down; repetitive statements should be avoided.

Response: Thank you for your suggestion. Now the abstract of the manuscript has been updated during this revision.

2. The scientific problem is described well, however, there are a few language mistakes in the text. Therefore, language editing is required to improve the quality of the manuscript. Grammatical mistakes and typographical errors should be corrected.

Response: The manuscript has now been thoroughly checked and corrected for typos and language errors.

3. Introduction: Some of the English terminology used is odd which needs to be updated during the revision.

Response: Now, the introduction section has been checked for any odd terminology and updated in this revised submission.

4. Introduction, second paragraph, first sentence, needs citation.

Response: The citation has now been added to the respective section.

5. The results section has some redundancy in many parts. Authors should update this, focusing on the specific outcomes and their significance.

Response: The result section has been revised as per the reviewer's suggestion. We have updated this in a more comprehensive way now.

6. It is not clear how the destabilizing parameters were compared with the values estimated by different tools based on totally different approaches?

Response: Each tool computes its own score and, from that score, predicts whether or not a mutation in a protein is destabilizing. This has been discussed in the revised submission.
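One way to picture how scores on different scales become comparable yes/no predictions is sketched below. The cutoffs are assumptions based on commonly quoted defaults (SIFT score at or below 0.05, PROVEAN score at or below -2.5, negative mCSM ΔΔG); they should be checked against each tool's documentation and are not taken from the manuscript. The example scores are hypothetical.

```python
from typing import Callable, Dict

# Assumed decision rules (commonly quoted defaults; verify before use).
RULES: Dict[str, Callable[[float], bool]] = {
    "SIFT": lambda score: score <= 0.05,     # low score -> deleterious
    "PROVEAN": lambda score: score <= -2.5,  # strongly negative -> deleterious
    "mCSM": lambda ddg: ddg < 0.0,           # negative ddG -> destabilizing
}


def binary_call(tool: str, score: float) -> bool:
    """Convert a tool-specific score into a yes/no deleterious call."""
    return RULES[tool](score)


# Hypothetical scores for a single mutation (illustrative only).
scores = {"SIFT": 0.01, "PROVEAN": -4.2, "mCSM": 0.3}
calls = {tool: binary_call(tool, s) for tool, s in scores.items()}
```

Reducing every output to a binary call is what lets tools built on different approaches be tallied side by side, at the cost of discarding how far each score lies from its cutoff.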

7. The terms deleterious/pathogenic/destabilizing should be described clearly, and their correlations with the prediction should be discussed.

Response: Thank you for your suggestion. Now we have described these terms and discussed their correlation with the prediction in the revised submission.

8. Table 1, a column mentioning the reference for the corresponding tool, should be added.

Response: Added

9. Discussion should be focused. The author should adhere to the results obtained from experiments.

Response: The discussion part of the manuscript has been revised now.

10. Conclusion should be crisp and focused. Outcome must be highlighted in the conclusion section. Conclusions should provide more details and further highlight the work and its potential importance/application.

Response: Thank you for your suggestion. Now we have updated the conclusion of the revised manuscript.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Timir Tripathi

4 Apr 2022

Comparative analysis of web-based programs for single amino acid substitutions in proteins

PONE-D-21-35921R1

Dear Dr. Hassan,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Timir Tripathi, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have addressed all the suggestions and fixed them very well. The revised manuscript should be accepted as is.

Reviewer #3: The authors have addressed all the concerns and queries in the manuscript. According to my review, it can be accepted in its current state.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Tooba Naz Shamsi

Reviewer #3: Yes: Ethayathulla Abdul samath

Acceptance letter

Timir Tripathi

25 Apr 2022

PONE-D-21-35921R1

Comparative analysis of web-based programs for single amino acid substitutions in proteins

Dear Dr. Hassan:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Timir Tripathi

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Data

    (XLSX)

    S1 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for PARK7.

    (DOCX)

    S2 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for PARK2.

    (DOCX)

    S3 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for PSN1.

    (DOCX)

    S4 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for HRAS.

    (DOCX)

    S5 Fig. Distribution of deleterious/destabilizing mutations predicted by all 15 tools for RUNX1.

    (DOCX)

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
