Contrast Media & Molecular Imaging. 2022 Jul 31;2022:4202623. doi: 10.1155/2022/4202623

In silico Prediction of Deleterious Single Nucleotide Polymorphism in S100A4 Metastatic Gene: Potential Early Diagnostic Marker

Aisha Farhana 1, Sangeetha Kothandan 2, Abdullah Alsrhani 1, Pooi Ling Mok 1,3, Suresh Kumar Subbiah 4,5,6, Yusuf Saleem Khan 7
PMCID: PMC9357733  PMID: 35965620

Abstract

S100A4 protein overexpression has been reported in different types of cancer, where it plays a key role by interacting with the tumor suppressor protein Tp53. Single nucleotide polymorphisms (SNPs) in S100A4 could directly influence its biomolecular interaction with Tp53 through aberrant protein conformations. Hence, this study was designed to predict deleterious SNPs and their effects on S100A4 protein structure and function. Twenty-one SNP data sets were screened for nonsynonymous mutations and subsequently subjected to deleterious mutation prediction using different computational tools. The screened deleterious mutations were analyzed for changes in functionality and for their interaction with Tp53 by protein-protein docking analysis. The structural effects were studied using the 3DMissense mutation tool to estimate the solvation energy and torsion angle of the screened mutations on the predicted structures. In our study, 21 deleterious nonsynonymous mutations were screened; among them, F72V, E74G, L5P, D25E, N65S, A28V, A8D, S20L, L58P, and K26N were found to be highly conserved and to interact with either the EF-hand 1 or EF-hand 2 domain. The solvation and torsion values deviated significantly for the mutant structures carrying the S20L, N65S, and F72L mutations, which also showed a marked reduction in binding affinity with the Tp53 protein. Hence, these deleterious mutations might serve as prospective targets for diagnosing and developing personalized treatments for cancer and other related diseases.

1. Introduction

S100A4 is a calcium-binding protein belonging to the S100 family of proteins and contributes to the metastasis of different cancers. The increased expression of the S100A4 protein is associated with poor prognosis in patients with various cancer types and is a predictive marker for colorectal and breast cancer [1–6].

S100A4 exists in intracellular and extracellular forms and possesses no enzymatic activity. However, it has been shown to interact with numerous tumor-related proteins, promoting tumor progression through increased motility, invasion, and apoptosis inhibition, and promoting metastasis through the induction of prometastatic activities such as angiogenesis stimulation [7–9]. The stimulation of S100A4 attracts immune cells to the cancerous regions and promotes cytokine and growth factor secretion towards the tumor niche. S100A4 stimulates T-lymphocyte chemotaxis by forming a complex with PGLYRP1, resulting in lymphocyte migration through the CCR5 and CXCR3 receptors.

Each S100 monomer contains two EF-hand calcium-binding domains (helix-loop-helix motifs). The N-terminal EF-hand comprises 14 amino acids and binds calcium weakly through backbone carbonyl oxygen atoms, whereas the C-terminal EF-hand is composed of 12 amino acids and binds calcium with higher affinity through side-chain carboxylate oxygen atoms [10].

A striking conformational change occurs after calcium binding to the protein, resulting in the exposure of a hydrophobic binding pocket in each monomer. The interaction of calcium with the monomeric molecule paves the way for the binding of other intracellular or extracellular proteins [11, 12]. Upon dimerization, p53 binds with S100A4, resulting in the degradation of p53. The proapoptotic function of Tp53 is also modulated by the binding of its C-terminal transactivation domain to S100A4, which leads to a reduction in Tp53 protein levels [13].

S100A4 acts as a metastasin, playing a role in tumor progression by interacting with proteins that include the p53 tumor suppressor protein, annexin, nonmuscle myosin, and liprin β-1 [13–15]. Mutations, which may be deleterious or neutral, can affect protein structure or function, gene regulation, and downstream interactions with other proteins [16]. Deleterious mutations are genetic alterations that harm the organism; those that give rise to the cancer phenotype are termed driver alterations, or simply drivers. Drivers influence cancer-related pathways and therefore recur in the same genes and loci in different patients, whereas neutral mutations are believed to produce no significant phenotypic change in neoplastic cells [17].

Among the SNPs, 50% of the mutations are consequences of nsSNPs and are reported in autoimmune, genetic, and inflammatory diseases. The changes in the amino acids due to SNPs could alter the protein structures as reflected by the changes in protein dynamics, geometry, charge, hydrophobicity, and finally, the interaction of the protein with other proteins or factors.

Hence, in the present study, the deleterious nonsynonymous SNPs of the S100A4 gene were identified, and their structural and functional effects were analyzed in silico. The detection of these deleterious SNPs could help propose the development of personalized treatments.

2. Materials and Methods

2.1. Retrieval of nsSNPs

The information on nsSNPs was retrieved from the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov/snp/?term=S100A4), and the respective protein sequences were retrieved from UniProt (Figure 1). The SNP ID, residue alteration, and location of each nsSNP were collected and used in the subsequent analyses.

Figure 1. Pie chart indicating the SNP distribution in the S100A4 gene as retrieved from the dbSNP database.

2.2. Identification of Deleterious SNPs

The bioinformatics tools SIFT (https://sift.jcvi.org/www/SIFT_seq_submit2.html), PANTHER (Protein Analysis Through Evolutionary Relationship) (https://www.pantherdb.org/tools), PolyPhen-2 (Polymorphism Phenotyping v2) (https://genetics.bwh.harvard.edu/pph2/), PROVEAN (https://provean.jcvi.org/index.php), and PredictSNP (https://loschmidt.chemi.muni.cz/predictsnp) were used to predict the deleterious nsSNPs [18–21].

2.3. Identification of nsSNPs in the Domains of Protein S100A4

The software InterPro (https://www.ebi.ac.uk/interpro/) was used to locate the nsSNPs on the conserved domains of the S100A4 protein. The motif regions, domains, and functional characteristics of the protein were identified with this tool [22].

2.4. Evaluating the Effect of the nsSNPs on Protein Stability

The impact of the mutations on the structure and stability of the protein was investigated with the I-Mutant 2.0 tool (https://folding.biofold.org/i-mutant/i-mutant2.0.html), to which the S100A4 protein sequence and its nsSNPs were submitted in FASTA format [23].

2.5. Analyzing Protein Evolutionary Conservation

The ConSurf (https://consurf.tau.ac.il) tool was employed to identify the evolutionary conservation of amino acids. The analysis was based on the phylogenetic relationships between homologous sequences [24] (Figure 2). The nsSNPs located at highly conserved residues were listed and analyzed further.

Figure 2. Evolutionary conservation of S100A4 produced by ConSurf. The conservation scale is shown for each amino acid, in which 'e' represents an exposed residue; 'b' represents a buried residue; 'f' is a functional residue that is highly conserved and exposed; 's' is a structural residue that is highly conserved and buried.

2.6. Structural Effect Prediction on Human S100A4 Protein

HOPE (https://www3.cmbi.umcn.nl/hope/) was utilized to predict the effect of each SNP on the protein structure. The S100A4 protein with the UniProt accession ID Q8WWW0 and 24 individual nsSNPs were used as input [25, 26]. The Swiss-PDB Viewer (https://spdbv.vital-it.ch/) was utilized to generate the mutated protein models with the corresponding amino acid substitutions. The 3D Missense mutation tool was used to estimate the solvation energy and torsion angle of the mutations on the predicted structures, and the values were compared with those of the wild-type structure to identify deviations [27, 28]. TM-align was used for the comparison of the native and mutated protein structures.

2.7. Post-Translational Modification Sites Prediction

Phosphorylation at serine, threonine, and tyrosine residues was predicted with NetPhos 3.1, and a residue with a NetPhos 3.1 score greater than 0.5 was considered a predicted phosphorylation site. The sites of ubiquitylation and SUMOylation were also predicted.

2.8. Detection of SNPs in miRNA Target Sites

The miRNA seed and target sites in the UTR regions were detected using the PolymiRTS database web server (https://compbio.uthsc.edu/miRSNP/), with the transcript NM_019554 used as the query sequence. The chromosome location chr1(-):153516094-153518282 and the SNP rs IDs were submitted to the analysis server.
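The coordinate check implied by this step can be sketched in Python as follows. The region is the one quoted above, and the three SNP positions are those later listed in Table 4; the function and variable names are hypothetical and are not part of the PolymiRTS server.

```python
# Region submitted to PolymiRTS, as quoted in the text: chr1(-):153516094-153518282
REGION_START = 153516094
REGION_END = 153518282

# SNP positions as listed in Table 4 of this paper
UTR_SNPS = {
    "rs113443697": 153516154,
    "rs1051044": 153516160,
    "rs143742855": 153516185,
}


def in_region(position, start=REGION_START, end=REGION_END):
    """Return True if a chromosome coordinate lies inside the inclusive region."""
    return start <= position <= end


# Distance of each SNP from the lower boundary of the submitted region
offsets = {rs_id: pos - REGION_START for rs_id, pos in UTR_SNPS.items()}

for rs_id, pos in UTR_SNPS.items():
    print(rs_id, pos, in_region(pos), offsets[rs_id])
```

All three positions fall within the first 100 bases above the lower boundary of the submitted region.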

2.9. Molecular Docking

The molecular interactions between the S100A4 protein of the selected deleterious mutations and the target protein Tp53 were studied using AutoDock Vina and ClusPro v2.0. The binding energy and the interactions of amino acid residues between the mutant models of S100A4 protein and Tp53 protein were analyzed.

2.10. Molecular Dynamics Simulation CABS-Flex 2.0

The CABS-flex 2.0 web server (http://biocomp.chem.uw.edu.pl/CABSflex2/) was used to study the dynamic simulation of the mutant proteins. The simulation was carried out with the default parameters of 50 cycles for 10 ns and a 1.0 fixed global weight for the modeled protein complexes.

2.11. Correlation of Identified SNPs in the COSMIC Database

The identified SNPs of the S100A4 gene were also verified in the COSMIC database to comprehend their effect on different malignancies. The COSMIC database provides comprehensive information on the somatic mutations of human cancer and their distribution (https://cancer.sanger.ac.uk/cosmic). The gene S100A4 was confirmed by searching for the missense mutation.

2.12. Protein-Protein Interaction (PPI) Networks and Functional Annotation

STRING v11 (http://www.string-db.org) was used to construct an interactome map of the S100A4 gene with the key genes involved in the EMT pathways (TGF-β, Wnt, Notch, and Hedgehog signaling pathways). The key genes involved in these pathways, namely TGF-β, Smad 2, Smad-3, ZEB1, Foxc-2, TWIST, Snail, slug, E-Cadherin, N-Cadherin, β-Cadherin, PTEN, PI3K, AKT2, GSK 3β, STAT-3, Cyclin D1, C-myc, Survivin, MUC-1, PRR–X1, ZNF488, VGLL4, and Gli 1, were curated and subjected to the interaction analysis. Cytoscape (version 3.6.1) was used to visualize the PPI network, and the pivotal nodes were recognized based on their connectivity degrees.

3. Results

3.1. SNP Annotation

The NCBI dbSNP database contained 1718 SNP records for S100A4, including 55 in-frame deletions, 57 initiator codon variants, 1232 intron variants, 54 noncoding transcript variants, and 137 missense SNPs. The 137 missense SNPs were subjected to further analysis.

3.2. Identification of Deleterious nsSNPs

Four different in silico nsSNP prediction tools predicted 24 SNPs of the protein S100A4 as deleterious and damaging (Table 1). The SIFT, PANTHER, and PolyPhen-2 scores of these SNPs were compared with those of the other 55 neutral mutations that were screened.

Table 1.

Identification of deleterious SNPs in S100A4.

SNP ID SNPs PROVEAN score PROVEAN SIFT Polyphen2 PANTHER
rs116208483 L62V −2.733 Deleterious Deleterious Damaging Damaging
rs147390231 T39I −2.735 Deleterious Tolerated Damaging Damaging
rs148291612 F72V −6.655 Deleterious Deleterious Damaging Damaging
rs199505533 G92A −2.645 Deleterious Tolerated Damaging Damaging
rs200099267 E74G −6.630 Deleterious Deleterious Damaging Damaging
rs368160023 E88K −2.842 Deleterious Tolerated Damaging Damaging
rs373367471  L5P −6.050 Deleterious Deleterious Damaging Damaging
rs377093845 D25E −3.493 Deleterious Deleterious Damaging Damaging
rs536309763 N65S −4.491 Deleterious Deleterious Damaging Damaging
rs566299932 A8V −3.698 Deleterious Deleterious Damaging Damaging
rs566299932 A8D −5.420 Deleterious Deleterious Damaging Damaging
rs576307674  R40W −4.053 Deleterious Deleterious Damaging Damaging
rs747430747  S20L −5.594 Deleterious Deleterious Damaging Damaging
rs747868513 V70A −3.450 Deleterious Deleterious Damaging Damaging
rs751051544 E88G −4.461 Deleterious Deleterious Damaging Damaging
rs754093018  K57E −2.525 Deleterious Deleterious Damaging Damaging
rs759858655 L58P −5.658 Deleterious Deleterious Damaging Damaging
rs762009722 A8S −2.619 Deleterious Deleterious Damaging Damaging
rs762542639 L62P −6.701 Deleterious Deleterious Damaging Damaging
rs762597174 D92V −4.467 Deleterious Deleterious Damaging Damaging
rs766589410 F89I −5.246 Deleterious Deleterious Damaging Damaging
rs772235092 E69L −6.263 Deleterious Deleterious Damaging Damaging
rs868262406 K26N −4.593 Deleterious Deleterious Damaging Damaging
rs899447674 F72L −5.865 Deleterious Deleterious Damaging Damaging
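
The way the calls in Table 1 can be combined is illustrated by the following Python sketch, which counts how many of the four tools flag a variant for a few rows copied from the table. The PROVEAN cutoff of −2.5 is the tool's documented default and is an assumption here, since the cutoff is not stated in this paper; the function names and the `min_votes` parameter are hypothetical and do not belong to any of the tools.

```python
# Assumed PROVEAN default: scores at or below -2.5 are called deleterious.
PROVEAN_CUTOFF = -2.5

# Rows copied from Table 1:
# (variant, PROVEAN score, SIFT call, PolyPhen-2 call, PANTHER call)
ROWS = [
    ("L62V", -2.733, "Deleterious", "Damaging", "Damaging"),
    ("T39I", -2.735, "Tolerated", "Damaging", "Damaging"),
    ("F72V", -6.655, "Deleterious", "Damaging", "Damaging"),
    ("E88K", -2.842, "Tolerated", "Damaging", "Damaging"),
]


def votes(row):
    """Count how many of the four tools flag the variant as harmful."""
    _, provean, sift, polyphen, panther = row
    return sum([
        provean <= PROVEAN_CUTOFF,
        sift == "Deleterious",
        polyphen == "Damaging",
        panther == "Damaging",
    ])


def consensus(rows, min_votes=4):
    """Return the variants flagged by at least `min_votes` tools."""
    return [row[0] for row in rows if votes(row) >= min_votes]


print(consensus(ROWS))               # variants flagged by all four tools
print(consensus(ROWS, min_votes=3))  # variants flagged by at least three tools
```

With these four rows, T39I and E88K drop out under the strict four-tool rule because SIFT reports them as tolerated, whereas a three-tool rule retains them.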

3.3. Identification of nsSNPs on the Domains of S100A4

InterPro predicted the two functional domains of the protein S100A4 as EF-Hand 1 and EF-Hand 2. The deleterious nsSNPs identified by different in silico nsSNP prediction tools were further subjected to identification of their location on the two domains, namely EF-hand 1 and EF-hand 2.

3.4. Determination of Protein Structural Stability

The RI (reliability index) and free energy change (DDG, delta delta G) values were predicted by the I-Mutant tool. The predicted stability changes are presented in Table 2.

Table 2.

Prediction of the effect of nsSNPs on protein stability by I-MUTANT 2.0.

SNP ID Amino acid substitution I-mutant DDG RI
rs116208483 L62V Decrease 1.75 8
rs147390231 T39I Decrease 1.09 5
rs148291612 F72V Decrease −3.56 9
rs199505533 G92V Decrease −0.83 2
rs200099267 E74G Decrease −1.65 6
rs368160023 E88K Decrease −0.82 7
rs373367471  L5P Decrease −1.152 7
rs377093845 D25E Decrease 0.05 1
rs536309763 N65S Decrease −0.88 8
rs566299932 A8V Decrease −0.07 2
rs566299932 A8D Decrease −1.12 4
rs576307674  R40W Decrease 0.20 3
rs747430747  S20L Decrease −1.05 1
rs747868513 V70A Decrease −2.17 9
rs751051544 E88G Decrease −0.74 6
rs754093018  K57E Decrease −0.43 5
rs759858655 L58P Decrease −0.43 8
rs762009722 A8S Decrease −0.82 8
rs762542639 L62P Decrease −2.32 7
rs762597174 D92V Decrease 0.76 2
rs766589410 F89I Decrease −1.04 8
rs772235092 E69L Decrease 0.08 4
rs868262406 K26N Decrease 0.16 1
rs899447674 F72L Decrease −2.52 8

3.5. Evolutionary Conservation Analysis

The ConSurf Analysis was carried out for 101 amino acid residues of S100A4 identified as SNPs. The highly conserved and exposed residues with the functional characteristics were identified as M1, E6, Y19, S20, G24, L29, E33, E41, E63, N65, D67, E74, N81, and K101. The residues L5, F16, L29, L62, and F72 were highly conserved structural residues buried within the protein structure.

3.6. Impact of nsSNPs on Human S100A4 Protein Structure

Among the mutations, F72V, E74G, L5P, D25E, N65S, A28V, A8D, S20L, L58P, and K26N were highly conserved and exhibited an interaction with the calcium-binding EF-hand 1 or EF-hand 2 domains. The deleterious mutations F72V and E74G were located within the EF-hand 2 domain and were predicted to disrupt the calcium ion interaction. In the deleterious mutation D25E, the wild-type residue is smaller than the mutant residue. In contrast, in the N65S deleterious mutation, the change to serine makes the occupied site significantly smaller and more hydrophobic. This mutation strongly affects the structure and causes destabilization because it is situated in EF-hand 2. Additionally, it leads to the loss of the cysteine bond. The mutations K26N and S20L are presented as examples in Figure 3.

Figure 3. Structural analysis of the human S100A4 protein. Panels (a) and (b) show the structural alterations caused by the K26N and S20L substitutions, respectively, as analyzed by Project HOPE. The wild-type residue is shown in green and the mutant residue in red. (a) K26N. (b) S20L.

Similarly, the mutation in A28V causes disturbance in the core structure of the domain as the mutant residues are buried. In A8D, the mutant residues are bigger and neutral, disturbing the domain core structure and binding properties. The impact of the other deleterious mutations is presented in Table 3.

Table 3.

Mutational effects on the structure and conservation of the S100A4 protein.

SNP ID Mutation Structure Conservation
rs116208483 L62V EF-hand 2 Not conserved
rs147390231 T39I EF-hand 1 Not conserved
rs148291612 F72V EF-hand domain Located near a highly conserved protein
rs199505533 G92V Surface of the domain Not conserved
rs200099267 E74G Surface of the domain Conserved
rs368160023 E88K Surface of the domain Not conserved
rs373367471  L5P Surface of the domain Conserved
rs377093845 D25E EF-hand 1 Very conserved
rs566299932 A8V EF-hand domain Very conserved
rs566299932 A8D Surface of the domain Very conserved
rs576307674  R40W EF-hand 1 Not conserved
rs747430747  S20L EF-hand 1 Very conserved
rs747868513 V70A EF-hand 2 Very conserved
rs754093018  K57E EF-hand 2 Conserved
rs759858655 L58P EF-hand 2 Not conserved
rs766589410 F89I EF domain hand pair Very conserved
rs868262406 K26N EF-hand 1 Very conserved
rs762542639 L62P EF-hand 2 Very conserved
rs536309763 N65S EF-hand 2 Very conserved
rs751051544 E88G EF-hand 2 Not conserved
rs762597174 D92V Present in turn Not conserved
rs772235092 E69L EF-hand 2 Very conserved
rs899447674 F72L EF-hand 2 Not conserved

3.7. Structure Analysis of Mutant and Wild Models

3D models were predicted for the 24 deleterious nsSNPs and compared with the wild-type model, which showed a solvation energy of −0.42 and a torsion angle of −1.10. The mutant models deviated from the wild type in both the solvation energy and the torsion angle.

3.8. Post-Translational Modification Site Prediction

Phosphorylation sites were predicted at 15T, 50T, 20S, 60S, 64S, and 80S; of these, only 60S was associated with a highly deleterious nsSNP. Ubiquitylation sites at 100K and 101K were predicted by UbPred. No SUMOylation sites were observed at any of the predicted highly deleterious nsSNPs.

3.9. Detection of SNPs in miRNA Target Sites

The PolymiRTS database detected four sites for miRNA binding due to 5 SNPs in the UTR region, and the sites were predicted to be abolished by these SNPs. The results are presented in Table 4.

Table 4.

Detection of SNPs effect in the S100A4 regulatory region by PolymiRTS database.

Location dbSNP ID Variant type Wobble base pair Ancestral allele Allele miR ID Conservation miRSite Function class Experimental support Context+ score change
153516154 rs113443697 SNP N G G hsa-miR-505-5p 7 acccTGGCTCCtt D N −0.217
C hsa-miR-1207-5p 5 aCCCTGCCtcctt C N −0.255
hsa-miR-1827 5 accCTGCCTCctt C N −0.169
hsa-miR-3612 7 acccTGCCTCCtt C N −0.216
hsa-miR-4695-5p 5 accctGCCTCCTt C N −0.238
hsa-miR-4763-3p 5 aCCCTGCCtcctt C N −0.255
hsa-miR-650 7 acccTGCCTCCtt C N −0.16
hsa-miR-6808-5p 5 acCCTGCCTcctt C N −0.189
hsa-miR-6893-5p 5 acCCTGCCTcctt C N −0.17
hsa-miR-940 5 acCCTGCCTcctt C N −0.133
153516160 rs1051044 SNP N A C hsa-miR-6798-5p 6 TtttcCCCCCTGg C N 0.226
153516185 rs143742855 SNP N C C hsa-miR-1273a 2 cccTGTCGCCAgt D N 0.437
T hsa-miR-196a-3p 2 cccTGTTGCCAgt C N 0.264
T hsa-miR-3128 2 ccctgTTGCCAGt C N 0.093

N in the wobble base pair column represents 'No', indicating that the particular SNP cannot form a G:U wobble base pair with the miRNA. N in the experimental support column also represents 'No', indicating that no experimental data have so far been reported for the predicted target site. In the function class column, D indicates that the derived allele disrupts a conserved miRNA site (ancestral allele with support ≥2), and N indicates that the derived allele disrupts a nonconserved miRNA site (ancestral allele with support ≥2).

3.10. Molecular Docking

All the mutant models of S100A4 interacted with the Tp53 protein with a very low binding affinity, as observed in ClusPro, with scores ranging from −564.4 to 670. Deviations in the hydrogen bond interactions with the target were also observed. Energy minimization in the Swiss-PDB Viewer revealed large variation across the mutants; in particular, for E88G, the wild type showed −6525 kJ/mol and the mutant exhibited −7625.542 kJ/mol. Differences in the minimized energy were also observed for the other mutants: S20L (−6371.678 kJ/mol); A8D (−6392.843 kJ/mol); A8V (−6264.361 kJ/mol); D25E (−6578.558 kJ/mol); E74G (−6390.220 kJ/mol); F72V (−6420 kJ/mol); L5P (6319.368 kJ/mol).

3.11. Molecular Dynamics Simulations

Root mean square fluctuation (RMSF) was determined by molecular dynamics simulations to understand the atomic-level deviation of the mutant proteins under physiological conditions. The RMSF values of the mutant models E88G and D25E were lower than that of the wild-type model. In the mutant model E88G, a decrease in the RMSF value was observed (1.0650) compared to the wild type (1.540). In the mutant model D25E, the RMSF (3.5130) was very close to the mutant model with an RMSF value of 3.650. The alteration in the RMSF conferred a loss of versatility in the mutant protein structure, leading to changes in the dynamic behavior. More flexibility was observed between residues 30 and 70, which significantly affected the stability of the protein.
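
For readers unfamiliar with the quantity, the per-residue RMSF is the square root of the time-averaged squared deviation of a residue's position from its mean position. The following Python sketch computes it under two stated assumptions: each residue is represented by a single position (for example, its Cα atom), and the frames have already been superimposed. The function name and the toy trajectory are hypothetical; the coordinates are synthetic and are not CABS-flex output.

```python
import math


def rmsf(trajectory):
    """Per-residue RMSF.

    `trajectory` is a list of frames; each frame is a list of (x, y, z)
    positions, one per residue. Frames are assumed to be pre-aligned.
    """
    n_frames = len(trajectory)
    n_residues = len(trajectory[0])
    values = []
    for i in range(n_residues):
        # Mean position of residue i over all frames
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation from that mean position
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Synthetic two-frame, two-residue trajectory (illustration only):
# residue 0 never moves; residue 1 oscillates by 1 unit around x = 2.
TOY = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]

print(rmsf(TOY))
```

A residue with a lower RMSF than its wild-type counterpart moves less around its average position during the simulation, which is the sense in which the text describes a loss of flexibility.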

3.12. Correlation of Identified SNPs in the COSMIC Database

The search in the COSMIC database identified 4 hits with the following Ensembl IDs: S100A4_ENST00000368714; S100A4_ENST00000354332; S100A4_ENST00000368716.8; S100A4_ENST00000368715, with 388 reported mutations. Further analysis of the missense substitutions of S100A4_ENST00000368714 showed that, among the screened 21 nonsynonymous SNPs, E88K is reported in the COSMIC database. This E88K mutation has been identified as a somatic mutation reported in large intestine carcinoma and adenocarcinoma. The tissue distribution of sample 1 has been reported for this mutation, with a FATHMM prediction score of 0.82 (pathogenic) [13].

3.13. Protein-Protein Interaction (PPI) Networks and Functional Annotation

The PPI network comprised 16 nodes and 65 edges, with a local clustering coefficient of 0.734. An enrichment p value of less than 1.0e−16 and an average node degree of 8.12 were obtained. The S100A4 gene interacted directly with TWIST1, SNAI2, CDH2, CDH1, ZEB1, and SMAD2 and also with other interactors such as PTEN, C-MYC, MUC1, CCND1, AKT2, and TGIF-2. Some PPI network connections with the S100A4 protein, signifying known and predicted genes, are indicated in Figure 5.
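
The reported average node degree follows directly from the node and edge counts, since each edge contributes to the degree of two nodes. The short Python sketch below reproduces this arithmetic; the function names are hypothetical, and the density calculation is an additional illustration that is not reported in the paper.

```python
def average_degree(n_nodes, n_edges):
    """Average degree of an undirected network: each edge touches two nodes."""
    return 2 * n_edges / n_nodes


def density(n_nodes, n_edges):
    """Fraction of all possible undirected node pairs that are connected."""
    max_edges = n_nodes * (n_nodes - 1) / 2
    return n_edges / max_edges


# Counts reported for the S100A4 network: 16 nodes and 65 edges
print(average_degree(16, 65))  # 8.125, reported as 8.12
print(round(density(16, 65), 3))
```

With 16 nodes there are 120 possible pairs, so 65 edges correspond to slightly more than half of all possible connections.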

Figure 5. Protein-protein interaction of the genes involved in different signaling pathways with S100A4 in the STRING database v11.0. The colored nodes represent the first shell of interactors, and the green connecting lines represent the gene neighbourhood. The black lines between the genes represent gene coexpression.

4. Discussion

S100A4 interacts with the tumor suppressor protein Tp53 in various cancer pathways and with other transcription factors involved in the EMT pathways. Thus, mutations in S100A4 could influence its biomolecular interactions with target proteins through aberrant conformations and possibly affect their downstream functions [29–31]. Therefore, identifying the deleterious nsSNPs in S100A4 could help determine their influence, detrimental effects, and progression mechanisms of various cancers [32].

Among the 77 nsSNPs of S100A4 found in the NCBI database, we screened 56 mutations showing neutral effects and 21 significantly deleterious nsSNPs using the PROVEAN in silico SNP prediction tool. However, among the twenty-one mutations, rs147390231 (T39I), rs199505533 (G92A), and rs368160023 (E88K) were observed to be tolerated in the SIFT algorithm. Polyphen and PANTHER identified 21 nsSNPs as damaging mutations and, hence, these were reconfirmed and analyzed by different tools (Figure 1). The deleterious mutations predicted thus far in our study were further studied through the InterPro tool to identify the location of the nsSNPs on various domains present in S100A4. The 21 mutations were located in the protein's EF-hand 1 and EF-hand 2 domains. Five nsSNPs were positioned in the EF-hand 1 domain and nine nsSNPs in the EF-hand 2 domain, disturbing the interactions and calcium-binding properties (Figure 2).

Among the highly scored deleterious mutations, F72V, E74G, L5P, D25E, N65S, A8V, A8D, S20L, L58P, F89I, and K26N were found to be largely conserved and to affect the functionality of the protein (Figure 3). The mutation F72V, located in the EF-hand 2 domain, disturbs the interaction of the protein with the calcium ion. In E74G, the mutant residue is smaller, neutral, and more hydrophobic than the wild-type residue and consequently influences the interaction with the metal ion, calcium (Figure 4).

Figure 4. RMSF plots of different mutants in comparison with the wild type of the S100A4 protein. (a) E88G. (b) D25E. (c) S20L. (d) F72V. (e) Wild type.

Similarly, in the L5P mutation, the size difference of the mutated amino acid influences its structural interaction. The amino acid residue phenylalanine at position 5 occupies a larger space in the wild-type protein. It forms a hydrogen bond with phenylalanine at position 27 and a salt bridge with lysine at position 28, which is disrupted in the mutated protein. In the D25E mutation, the mutant fails to form bonds at the respective position, affecting the stability (Figure 4). Furthermore, in N65S, the mutant residue is smaller than the wild-type residue. The lack of cysteine bridge formation affects the protein stability in the mutant type, causing the loss of interaction, which produces a severe effect on the 3D structure of the protein. In the A8V mutation, the mutant residue is bigger and buried in the core in contrast to the wild-type residue (Figure 4). This greatly influences the multimeric interactions of the protein. Additionally, the wild-type alanine is located in an alpha-helix, which is changed to an unfavorable valine residue in the mutant disturbing the core structure of this domain and affecting the binding properties of the protein.

For the A8D mutation, the mutant aspartic acid is negatively charged and less hydrophobic than wild-type alanine. Thus, the mutation has introduced a bigger and more charged residue, disturbing the multimeric interactions and protein folding properties. In S20L, the mutant residue is bigger than the wild-type residue. This mutation causes the loss of hydrogen bonds in the core, resulting in the disturbance of the correct folding, which subsequently influences the protein structure and function in the EF-hand 2 domain. Likewise, for the K26N mutation, the size differences of the amino acid disturb the interaction with the calcium ion, which leads to the destabilization of the domain (Figures 3 and 4).

The F89I mutation is both deleterious and highly conserved, and the mutant residue, isoleucine, is located in a domain important for the binding of other molecules. The mutation could influence the interaction between the two domains, and a possible loss of external interactions was predicted. The mutant residue is too small to make multimer contacts, which could also affect the functionality of the protein.

As predicted by the I-Mutant tool, the observed deleterious mutations in S100A4 showed a decrease in stability and significant changes in the RI and free energy change values (DDG).

Protein structural stability is important for maintaining the native structure and function of proteins. The structural and functional parameters were estimated in this study based on the ΔG value. Energy minimization in the Swiss-PDB Viewer revealed large variation across the mutants; specifically, for E88G, the wild type showed −6525 kJ/mol and the mutant exhibited −7625.542 kJ/mol. Differences in the minimized energy were also observed for the other mutants: S20L (−6371.678 kJ/mol); A8D (−6392.843 kJ/mol); A8V (−6264.361 kJ/mol); D25E (−6578.558 kJ/mol); E74G (−6390.220 kJ/mol); F72V (−6420 kJ/mol); L5P (6319.368 kJ/mol). All the predicted SNPs showed a negative value, indicating a decrease in the stability of the structure, which becomes unfavorable owing to its unfolded/misfolded state (Figure 4 and Table 2).

Furthermore, the mutant structures were subjected to molecular simulations, and the RMSF was estimated to evaluate their flexibility under varying physiological conditions. Limited fluctuations and flexibility represent the stable structural state [33]. The RMSF of atomic residues of the mutant models was considerably different from the wild-type, which inferred the decrease in the thermodynamic stability of the mutant models. This further could impair the structural stability and the functions of the proteins. As S100A4 interacts directly with the tumor suppressor protein Tp53 and the other interactors of the EMT pathway in cancer, it may have a profound effect on tumor suppression.

About 14 different miRNAs whose target sites are altered by SNPs were identified in this study. These miRNAs were reported to be modulated in different cancers, namely, hsa-miR-505-5p (human cervical cancer); hsa-miR-1827 and hsa-miR-650 (colorectal cancer); hsa-miR-3612 and hsa-miR-940 (cervical cancer); and hsa-miR-4695-3p, hsa-miR-4763-3p, and hsa-miR-3128 (ERBB2/Her2 gene) [34–40]. However, these alterations were observed at sites in regions other than the nsSNP regions of the S100A4 gene (Table 4). It is noteworthy that these miRNAs can be explored further through proper clinical trials as potential treatment strategies.

The search for these deleterious nsSNPs identified that the E88K mutation was associated with the carcinoma of the large intestine and adenocarcinoma tissue samples. This signifies that studying these mutations in clinical samples and analyzing their possible effects on the interaction of S100A4 with the other proteins sought through in vitro and in vivo studies may lead to possible therapeutic interventions.

S100A4 has been implicated in the development of an aggressive metastatic phenotype progressing into cancer and metastasis. Also, the poor prognosis of cancer has been correlated with the upregulation of S100A4 in tumor cells, and its expression is regulated by other factors such as β-catenin, epidermal growth factor, tumor necrosis factor alpha (TNF-α), and methylation [41, 42]. In addition to its role in cancer metastasis, S100A4 is also reported in various pathophysiologies such as inflammation, fibrosis, angiogenesis, and neuroprotection [43–45]. Except for E88K, which is associated with colorectal carcinoma, the deleterious nonsynonymous SNPs identified in this study through the COSMIC database could be further explored in the tissue samples of various cancers and in different physiological conditions. Hence, further interaction studies could also help facilitate rational drug design through miRNAs and personalized treatment in patients.

5. Conclusion

Comprehensive bioinformatics analyses were performed to identify deleterious nonsynonymous mutations in the S100A4 gene, which is reported to play a significant role in cancer and other pathophysiological conditions. In this study, twenty-four deleterious mutations were identified by PROVEAN, SIFT, PolyPhen, and PANTHER. The SNPs E88G, S20L, A8D, A8V, D25E, E74G, F72V, and L5P were highly conserved and interacted with the EF-hand domain of the protein, showing significantly higher energy minimization and structural instability, ultimately affecting the functionality of the protein. The E88K mutation identified by our analysis has been reported in the COSMIC database. We conclude that the mutations identified in this study can be explored in tissue samples of various cancer types and physiological conditions to facilitate rational drug design through miRNAs for personalized cancer treatment.
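
As an illustration of how calls from the four predictors named above might be combined, the following minimal Python sketch applies a simple vote count across tools. It is not the pipeline used in this study. The cutoffs are commonly cited defaults for SIFT and PROVEAN plus an illustrative PolyPhen cutoff, and the variant names and scores are hypothetical placeholders, not results from this paper.

```python
from dataclasses import dataclass

# Cutoffs are assumptions for illustration, not values taken from this study.
SIFT_CUTOFF = 0.05      # SIFT: score below cutoff is treated as deleterious
PROVEAN_CUTOFF = -2.5   # PROVEAN: score at or below cutoff is treated as deleterious
POLYPHEN_CUTOFF = 0.85  # PolyPhen: score above cutoff is treated as damaging (illustrative)


@dataclass(frozen=True)
class Prediction:
    """Scores for one amino acid substitution (all values hypothetical)."""
    variant: str
    sift: float
    provean: float
    polyphen: float
    panther_damaging: bool  # categorical call, assumed already parsed to a boolean


def deleterious_votes(p: Prediction) -> int:
    """Count how many of the four tools flag the variant as deleterious."""
    return sum([
        p.sift < SIFT_CUTOFF,
        p.provean <= PROVEAN_CUTOFF,
        p.polyphen > POLYPHEN_CUTOFF,
        p.panther_damaging,
    ])


def consensus_deleterious(preds, min_votes=4):
    """Return the variants flagged by at least `min_votes` tools."""
    return [p.variant for p in preds if deleterious_votes(p) >= min_votes]


# Placeholder inputs; these are not S100A4 results.
predictions = [
    Prediction("variant_1", sift=0.00, provean=-5.1, polyphen=0.99, panther_damaging=True),
    Prediction("variant_2", sift=0.20, provean=-1.0, polyphen=0.10, panther_damaging=False),
    Prediction("variant_3", sift=0.01, provean=-3.2, polyphen=0.40, panther_damaging=True),
]

print(consensus_deleterious(predictions))               # ['variant_1']
print(consensus_deleterious(predictions, min_votes=3))  # ['variant_1', 'variant_3']
```

Requiring agreement from all four tools is the strictest choice. Lowering `min_votes` trades specificity for sensitivity, so the threshold should be chosen to suit the downstream validation budget.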

Acknowledgments

The authors acknowledge the Central Laboratory at Jouf University. The authors extend their appreciation to the Deputyship for Research and Innovation, Ministry of Education in Saudi Arabia for funding this research work through the project no. 375213500.

Data Availability

Data used in this study are available as hyperlinks in this paper.

Consent

This article does not contain any studies with human participants.

Disclosure

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Authors' Contributions

A.F. and S.K. were responsible for composing the manuscript. A.F. and S.K.S. were responsible for conceiving the experimental study design, analyzing the data, and editing the manuscript. A.F., S.K., and Y.S.K. analyzed the data, produced the figures, and performed the statistical analysis. A.A. and P.L.M. edited the manuscript. All authors were involved in reviewing the manuscript.

References

  • 1. Cho Y. G., Nam S. W., Kim T. Y., et al. Overexpression of S100A4 is closely related to the aggressiveness of gastric cancer. Acta Pathologica, Microbiologica et Immunologica Scandinavica. 2003;111(5):539–545. doi: 10.1034/j.1600-0463.2003.1110502.x.
  • 2. Cui J. F., Liu Y. K., Pan B. S., et al. Journal of Cancer Research and Clinical Oncology. 2004;130(10):615–622. doi: 10.1007/s00432-004-0576-5.
  • 3. Missiaglia E., Blaveri E., Terris B., et al. Analysis of gene expression in cancer cell lines identifies candidate markers for pancreatic tumorigenesis and metastasis. International Journal of Cancer. 2004;112(1):100–112. doi: 10.1002/ijc.20376.
  • 4. Rudland P. S., Platt-Higgins A., Renshaw C., et al. Prognostic significance of the metastasis-inducing protein S100A4 (p9Ka) in human breast cancer. Cancer Research. 2000;60(6):1595–1603.
  • 5. Mazzucchelli L. Protein S100A4: too long overlooked by pathologists? American Journal of Pathology. 2002;160(1):7–13. doi: 10.1016/s0002-9440(10)64342-8.
  • 6. Hernan R., Fasheh R., Calabrese C., et al. ERBB2 up-regulates S100A4 and several other prometastatic genes in medulloblastoma. Cancer Research. 2003;63(1):140–148.
  • 7. Ambartsumian N. S., Grigorian M. S., Larsen I. F., et al. Metastasis of mammary carcinomas in GRS/A hybrid mice transgenic for the mts 1 gene. Oncogene. 1996;13(8):1621–1630.
  • 8. Belot N., Pochet R., Heizmann C. W., Kiss R., Decaestecker C. Extracellular S100A4 stimulates the migration rate of astrocytic tumor cells by modifying the organization of their actin cytoskeleton. Biochimica et Biophysica Acta, Proteins and Proteomics. 2002;1600(1-2):74–83. doi: 10.1016/s1570-9639(02)00447-8.
  • 9. Schmidt-Hansen B., Ornas D., Grigorian M., et al. Extracellular S100A4 (mts1) stimulates invasive growth of mouse endothelial cells and modulates MMP-13 matrix metalloproteinase activity. Oncogene. 2004;23(32):5487–5495. doi: 10.1038/sj.onc.1207720.
  • 10. Garrett S. C., Varney K. M., Weber D. J., Bresnick A. R. S100A4, a mediator of metastasis. Journal of Biological Chemistry. 2006;281(2):677–680. doi: 10.1074/jbc.r500017200.
  • 11. Otterbein L. R., Kordowska J., Witte-Hoffmann C., Wang C. L., Dominguez R. Crystal structures of S100A6 in the Ca(2+)-free and Ca(2+)-bound states: the calcium sensor mechanism of S100 proteins revealed at atomic resolution. Structure. 2002;10(4):557–567. doi: 10.1016/s0969-2126(02)00740-2.
  • 12. Wright N. T., Varney K. M., Ellis K. C., et al. The three-dimensional solution structure of Ca2+-bound S100A1 as determined by NMR spectroscopy. Journal of Molecular Biology. 2005;353(2):410–426. doi: 10.1016/j.jmb.2005.08.027.
  • 13. Grigorian M., Andresen S., Tulchinsky E., et al. Tumor suppressor p53 protein is a new target for the metastasis-associated mts1/s100a4 protein: functional consequences of their interaction. Journal of Biological Chemistry. 2001;276(25):22699–22708. doi: 10.1074/jbc.m010231200.
  • 14. Kriajevska M., Fischer-Larsen M., Moertz E., et al. Liprin beta 1, a member of the family of LAR transmembrane tyrosine phosphatase-interacting proteins, is a new target for the metastasis-associated protein S100A4 (Mts1). Journal of Biological Chemistry. 2002;277(7):5229–5235. doi: 10.1074/jbc.m110976200.
  • 15. Semov A., Moreno M. J., Onichtchenko A., et al. Metastasis-associated protein S100A4 induces angiogenesis through interaction with Annexin II and accelerated plasmin formation. Journal of Biological Chemistry. 2005;280(21):20833–20841. doi: 10.1074/jbc.m412653200.
  • 16. Tokuriki N., Stricher F., Serrano L., Tawfik D. S. How protein stability and new functions trade off. PLoS Computational Biology. 2008;4(2):e1000002. doi: 10.1371/journal.pcbi.1000002.
  • 17. McFarland C. D., Korolev K. S., Kryukov G. V., Sunyaev S. R., Mirny L. A. Impact of deleterious passenger mutations on cancer progression. Proceedings of the National Academy of Sciences. 2013;110(8):2910–2915. doi: 10.1073/pnas.1213968110.
  • 18. Kumar P., Henikoff S., Ng P. C. Predicting the effects of coding nonsynonymous variants on protein function using the SIFT algorithm. Nature Protocols. 2009;4(7):1073–1081. doi: 10.1038/nprot.2009.86.
  • 19. Capriotti E., Calabrese R., Fariselli P., Martelli P. L., Altman R. B., Casadio R. WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics. 2013;14(Suppl 3):p. S6. doi: 10.1186/1471-2164-14-s3-s6.
  • 20. Choi Y., Chan A. P. PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics. 2015;31(16):2745–2747. doi: 10.1093/bioinformatics/btv195.
  • 21. Bendl J., Stourac J., Salanda O., et al. PredictSNP: robust and accurate consensus classifier for prediction of disease-related mutations. PLoS Computational Biology. 2014;10(1):e1003440. doi: 10.1371/journal.pcbi.1003440.
  • 22. Apweiler R., Attwood T. K., Bairoch A., et al. The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Research. 2001;29(1):37–40. doi: 10.1093/nar/29.1.37.
  • 23. Capriotti E., Fariselli P., Casadio R. I-Mutant 2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Research. 2005;33:W306–W310. doi: 10.1093/nar/gki375.
  • 24. Ashkenazy H., Abadi S., Martz E., et al. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Research. 2016;44(W1):W344–W350. doi: 10.1093/nar/gkw408.
  • 25. Venselaar H., Te Beek T. A., Kuipers R. K., Hekkelman M. L., Vriend G. Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces. BMC Bioinformatics. 2010;11(1):p. 548. doi: 10.1186/1471-2105-11-548.
  • 26. Ittisoponpisan S., Islam S. A., Khanna T., Alhuzimi E., David A., Sternberg M. J. E. Can predicted protein 3D structures provide reliable insights into whether missense variants are disease associated? Journal of Molecular Biology. 2019;431(11):2197–2212. doi: 10.1016/j.jmb.2019.04.009.
  • 27. Johansson M. U., Zoete V., Michielin O., Guex N. Defining and searching for structural motifs using DeepView/Swiss-PdbViewer. BMC Bioinformatics. 2012;13(1):p. 173. doi: 10.1186/1471-2105-13-173.
  • 28. Waterhouse A., Bertoni M., Bienert S., et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research. 2018;46(W1):W296–W303. doi: 10.1093/nar/gky427.
  • 29. Orre L. M., Panizza E., Kaminskyy V. O., et al. S100a4 interacts with p53 in the nucleus and promotes p53 degradation. Oncogene. 2013;32(49):5531–5540. doi: 10.1038/onc.2013.213.
  • 30. Ng P. C., Henikoff S. Predicting the effects of amino acid substitutions on protein function. Annual Review of Genomics and Human Genetics. 2006;7(1):61–80. doi: 10.1146/annurev.genom.7.080505.115630.
  • 31. Liu X., Yun F., Shi L., Li Z. H., Luo N. R., Jia Y.-F. Roles of signaling pathways in the epithelial-mesenchymal transition in cancer. Asian Pacific Journal of Cancer Prevention. 2015;16(15):6201–6206. doi: 10.7314/apjcp.2015.16.15.6201.
  • 32. Farhana A., Koh A. E. H., Kothandan S., Alsrhani A., Mok P. L., Subbiah S. K. Treatment of HT29 human colorectal cancer cell line with nanocarrier-encapsulated camptothecin reveals histone modifier genes in the Wnt signaling pathway as important molecular cues for colon cancer targeting. International Journal of Molecular Sciences. 2021;22(22):p. 12286. doi: 10.3390/ijms222212286.
  • 33. Livesay D. R., Dallakyan S., Wood G. G., Jacobs D. J. A flexible approach for understanding protein stability. FEBS Letters. 2004;576(3):468–476. doi: 10.1016/j.febslet.2004.09.057.
  • 34. Giannakis M., Mu X. J., Shukla S. A., et al. Genomic correlates of immune-cell infiltrates in colorectal carcinoma. Cell Reports. 2016;15(4):857–865. doi: 10.1016/j.celrep.2016.03.075.
  • 35. Bentwich I., Avniel A., Karov Y., et al. Identification of hundreds of conserved and nonconserved human microRNAs. Nature Genetics. 2005;37(7):766–770. doi: 10.1038/ng1590.
  • 36. Landgraf P., Rusu M., Sheridan R., et al. A mammalian microRNA expression atlas based on small RNA library sequencing. Cell. 2007;129(7):1401–1414. doi: 10.1016/j.cell.2007.04.040.
  • 37. Lui W. O., Pourmand N., Patterson B. K., Fire A. Patterns of known and novel small RNAs in human cervical cancer. Cancer Research. 2007;67(13):6031–6043. doi: 10.1158/0008-5472.can-06-0561.
  • 38. Fasihi A., M Soltani B., Atashi A., Nasiri S. Introduction of hsa-miR-103a and hsa-miR-1827 and hsa-miR-137 as new regulators of Wnt signaling pathway and their relation to colorectal carcinoma. Journal of Cellular Biochemistry. 2018;119(7):5104–5117. doi: 10.1002/jcb.26357.
  • 39. Persson H., Kvist A., Rego N., et al. Identification of new microRNAs in paired normal and tumor breast tissue suggests a dual role for the ERBB2/Her2 gene. Cancer Research. 2011;71(1):78–86. doi: 10.1158/0008-5472.can-10-1869.
  • 40. Cummins J. M., He Y., Leary R. J., et al. The colorectal microRNAome. Proceedings of the National Academy of Sciences. 2006;103:3687–3692. doi: 10.1073/pnas.0511155103.
  • 41. Missiaglia E., Blaveri E., Terris B., et al. Analysis of gene expression in cancer cell lines identifies candidate markers for pancreatic tumorigenesis and metastasis. International Journal of Cancer. 2004;112(1):100–112. doi: 10.1002/ijc.20376.
  • 42. Hashida H., Coffey R. J. Significance of a calcium-binding protein S100A14 expression in colon cancer progression. Journal of Gastrointestinal Oncology. 2022;13(1):149–162. doi: 10.21037/jgo-21-528.
  • 43. Stein U., Arlt F., Walther W., et al. The metastasis-associated gene S100A4 is a novel target of beta-catenin/T-cell factor signaling in colon cancer. Gastroenterology. 2006;131(5):1486–1500. doi: 10.1053/j.gastro.2006.08.041.
  • 44. Ellson C. D., Dunmore R., Hogaboam C. M., Sleeman M. A., Murray L. A. Danger-associated molecular patterns and danger signals in idiopathic pulmonary fibrosis. American Journal of Respiratory Cell and Molecular Biology. 2014;51(2):163–168. doi: 10.1165/rcmb.2013-0366TR.
  • 45. Rodrigues M. A., Gomes D. A., Cosme A. L., Sanches M. D., Resende V., Cassalia G. D. Inositol 1,4,5-trisphosphate receptor type 3 (ITPR3) is overexpressed in cholangiocarcinoma and its expression correlates with S100 calcium-binding protein A4 (S100A4). Biomedicine & Pharmacotherapy. 2022;145:p. 112403. doi: 10.1016/j.biopha.2021.112403.

Articles from Contrast Media & Molecular Imaging are provided here courtesy of Hindawi Ltd. and John Wiley and Sons, Inc.
