Abstract
Postaxial polydactyly (PAP) is a congenital limb abnormality characterized by extra posterior digits. Mutations in the C-terminal region of the zinc finger protein 141 (ZNF141) gene were recently linked with PAP type A. Zinc finger proteins exhibit similarity at their N-terminal regions due to C2-H2 type zinc finger domains, but their functional preferences vary significantly with their DNA-binding patterns. Methods: This study delineates the pathogenic association, misfold aggregation, and conformational paradigm of a missense variant (c.1420C > T; p.T474I) in the ZNF141 gene segregating with PAP through a molecular dynamics simulation approach. Results: In the ZNF141 protein, helices play a crucial role by attaching to three specific target DNA base pairs. In the ZNF141T474I protein, the H1, H3, and H6 helices attain more flexibility by acquiring loop conformations. The outward disposition of the proximal portion of the H9 helix in the mutant protein occurs due to the loss of the preceding beta-hairpins at the C-terminal region of the C2-H2 domain. The loss of hydrogen bonds, the exposure of hydrophobic residues to solvent, and the conversion of helices to loops cause dysfunction of the ZNF141 protein. These significant changes in the stability and conformation of the mutant protein were validated using essential dynamics and cross-correlation maps, which revealed that upon point mutation, the overall motions of the protein and the correlations between them were completely different, resulting in postaxial polydactyly type A. Conclusions: This study provides molecular insights into the structural association of the ZNF141 protein with PAP type A. Identification of active site residues and ligands offers new therapeutic targets for the ZNF141 protein. Further, it reiterates the functional importance of the last residue of a protein.
Keywords: ZNF141 gene, point mutation, molecular dynamics simulations, non-synonymous SNPs, postaxial polydactyly
1. Introduction
Polydactyly is a common congenital hand and foot deformity that refers to the duplication of fingers or the presence of supernumerary digits. It is a genetic disorder that commonly occurs in newborns [1]. It is characterized by abnormal joints, ligament insertion, anomalous tendons, and hypoplastic structures [2]. It ranges from a mild form to complete duplication of digits, thus showing great variety in recessive patterns of inheritance [3,4]. Polydactyly is classified into three major categories: postaxial, preaxial, and central polydactyly. Postaxial polydactyly is further classified into types A and B. It is an autosomal dominant disease with variable penetrance in diverse populations [5]. The reported rate of incidence is 1 per 531 newborn infants, and the condition is more frequent in African and American populations [6]. Kalsoom et al. (2013) reported a missense variant (c.1420C > T; p.T474I) in the ZNF141 gene segregating with PAP type A [7].
The ZNF141 protein is a member of the C2-H2 (Krüppel) family of zinc finger proteins, which makes up ∼2% of all genes and is thus the second-largest family of human genes [8]. The C2-H2 domain is among four abundant domains found in zinc finger proteins, i.e., C2-H2, Lin-11, Isl-1, Mec-3 (LIM) domains, plant homeodomain (PHD), and really interesting new gene (RING) domains [9]. A typical C2-H2 domain comprises two cysteines in one chain and two histidines in the other chain. C2H2-ZNF proteins contain an effector domain next to the zinc finger region [10]. These domains have C-x-C-x-H-x-H DNA-interacting motifs that bind to specific sequences like (T/A) (G/A) CAGAA (T/G/C) and repress target gene expression [11]. ZNFs have been reported to have a role in the repression of RNA polymerase II and III promoters and in RNA binding and splicing. The ZNF141 gene has an open reading frame of 1422 bp, which encodes a 474-residue protein characterized by a Krüppel-associated box (KRAB) domain. The KRAB domain is subdivided into box A and box B, and the protein additionally contains ten zinc finger motifs. The ZNF141 protein is expressed in the brain, kidney, lungs, liver, pancreas, placenta, spleen, skeletal muscle, and testis [12]. About 103 KRAB-ZNF genes are conserved across mammals, while 136 are conserved across primates [13,14]. This signifies the functional preferences of ZNF proteins.
Mutations in the ZNF141 gene were recently linked with limb development. Non-synonymous single nucleotide polymorphisms (nsSNPs) are among the most common SNPs within the human genome. An nsSNP causes the substitution of an amino acid within the protein, which may alter the structure, solubility, function, charge, and stability of the protein, resulting in changed phenotypes [15,16]. These properties make nsSNPs of particular concern for experimental studies [17]. Various complex diseases in humans are associated with nsSNPs. Bioinformatics studies have aided in the estimation of the molecular mechanisms and the possible clinical consequences of nsSNPs. The structural disposition and instability of a protein upon point mutation are effectively shown through computational approaches. The time-dependent behavior of the mutant and wild-type proteins reveals the differences between the two proteins upon single amino acid substitution. Solvent accessibility and hydrogen bond estimation give insights into the conformational changes that affect a protein’s globularity.
Bioinformatics studies have been instrumental in genome-wide association studies (GWAS) to identify mutations [15,18], novel drug targets, novel drugs [18,19,20,21], and the influence of neighboring proteins on protein function [22]. This study explored the conformational transitions and misfold aggregation, and linked the pathogenic association of the missense mutation (c.1420C>T; p.T474I) with postaxial polydactyly type A, using structural bioinformatics approaches. It unravels the causes of ZNF141 protein dysfunction by providing structural evidence of the protein’s behavior upon point mutation.
2. Methodologies
2.1. SNP Annotation
The ZNF141 protein FASTA sequence was retrieved from the UniProt database (https://www.uniprot.org/ accessed on 10 October 2022) using the ID Q15928. The variations present in ZNF141 were retrieved from ENSEMBL release 95 (https://asia.ensembl.org accessed on 10 October 2022) [23]. The canonical transcript was considered to investigate non-synonymous SNPs in coding regions. Sorting Intolerant from Tolerant (SIFT) and PolyPhen v2 were used to find the impact of SNPs on the protein. SIFT is a web server used to determine the effects of amino acid substitution on protein structure and function; it gives an output score in the range of 0–1, where scores of 0.05 or below are predicted to be deleterious and scores above 0.05 are predicted to be tolerated [24]. The Polymorphism Phenotyping v2 (PolyPhen-2) score ranges from 0.0 (tolerated) to 1.0 (deleterious): variants with a score of 0 are predicted to be benign, while values near 1.0 are more confidently predicted to be deleterious [25]. PROVEAN [26], REVEL [27], CADD [28] and MetaLR [29] were also employed to find the functional consequences of the variant ZNF141T474I. The InterPro server was utilized to annotate the ZNF141 protein [30].
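The score conventions described above can be illustrated with a minimal sketch. The cut-offs used here (SIFT ≤ 0.05 deleterious; PolyPhen-2 > 0.908 probably damaging and > 0.446 possibly damaging; PROVEAN ≤ −2.5 deleterious) are the commonly quoted defaults and are assumptions of this illustration rather than values stated by the study; the function names are hypothetical.

```python
def classify_sift(score: float) -> str:
    """SIFT scores lie in [0, 1]; low scores mean the substitution is not tolerated."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT score must be between 0 and 1")
    # Assumed default cut-off: 0.05
    return "deleterious" if score <= 0.05 else "tolerated"


def classify_polyphen2(score: float) -> str:
    """PolyPhen-2 scores lie in [0, 1]; high scores mean more likely damaging."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("PolyPhen-2 score must be between 0 and 1")
    # Assumed cut-offs: 0.908 and 0.446
    if score > 0.908:
        return "probably damaging"
    if score > 0.446:
        return "possibly damaging"
    return "benign"


def classify_provean(score: float) -> str:
    """PROVEAN scores are unbounded; strongly negative scores are deleterious."""
    # Assumed default cut-off: -2.5
    return "deleterious" if score <= -2.5 else "neutral"


if __name__ == "__main__":
    # Example scores chosen to match those reported for p.T474I in Section 3.1
    print(classify_sift(0.01), classify_polyphen2(0.997), classify_provean(-4.81))
```

Keeping each predictor in its own function makes the direction of each scale explicit, which matters because SIFT and PolyPhen-2 run in opposite directions.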
2.2. ZNF141WT 3D Structure Prediction
The protein multi-template structure modeling technique was employed to model the consistent and accurate ZNF141WT protein structure using the Modeller v 9.25 tool [31]. Four PDB structures (5V3G, 5V3J, 5V3M, and 5WJQ) ranked according to the Global Model Quality Estimate (GMQE) and Quaternary Structure Quality Estimate (QSQE) were utilized to model the best possible structure of the protein [32]. Python scripts for model refinement were used to pre-refine the modeled structure [31].
2.3. Protein Model Refinement and Validation
The predicted protein model was energy-minimized using the GROMOS 54a7 force field in GROMACS v5.1.4 [33]. A Ramachandran plot was generated with PROCHECK to check the phi and psi angles [34]. WinCoot v0.9.2 was used to manually correct outliers and unusual rotamers and to remove discontinuities and large variance in atomic B-factors [35]. The refined structure was validated with the ERRAT [36] and RAMPAGE tools [37]. After validation of the modeled structure, the active site residues and ligand-binding pockets were identified in the modeled protein’s structure using the COACH server [38].
2.4. ZNF141T474I Structure Prediction and Comparison with the ZNF141WT Structure
The 3D structure of the ZNF141T474I mutant was obtained using the amino acid swap technique in Chimera v1.15 [39]. The structure was then cleaned to remove any possible clashes. Before the comparison, the mutant structure was minimized with the same configuration using the GROMOS 54a7 force field in GROMACS [40]. The wild-type and mutant protein structures were then compared with the PyMOL tool [41] to show the structural differences in the side chains of the two amino acids. The HOPE server was used to draw a residue-to-residue comparison [42].
2.5. Conservation Analysis
The ConSurf web server (http://consurf.tau.ac.il accessed on 10 October 2022) examines the evolutionary trend of the macromolecule’s amino/nucleic acid sequences to identify sections that are significant for structure and/or function. The server automatically selects homologues from a query sequence or structure, infers their multiple sequence alignment, and reconstructs a phylogenetic tree that depicts their evolutionary relationships. These data are then utilized to estimate the evolutionary rates of each sequence position within a probabilistic framework. ConSurf provides the ability to homology-model query proteins, predict the secondary structure of query RNA molecules from sequence, see the biological assembly of a query (in addition to the single chain), and map the conservation grades onto 2D RNA models.
2.6. Molecular Dynamics Simulations
Protein MD simulations were run for 100 ns to gain structural insights into the ZNF141 protein using GROMACS v5.1.4 with the GROMOS 54a7 force field [33]. The mutant and ZNF141WT models were placed in a dodecahedral box under explicit solvent and periodic boundary conditions to perform the MD simulations. The solvated system was then neutralized by adding chloride ions. Energy minimization was performed with the steepest descent algorithm for 5000 steps, to a maximum force (Fmax) below 1000 kJ/mol/nm, to obtain stable conformations. Both systems were equilibrated in the canonical (NVT) and isothermal-isobaric (NPT) ensembles at a constant temperature of 300 K and a constant pressure of 1 atm for 300 ps. The production MD simulation was run at 300 K for 100 ns. The root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), solvent accessible surface area (SASA), and hydrogen bond analyses were performed using different GROMACS modules. Protein PDB files were generated after the 100 ns simulation and analyzed with the PDBsum tool [43]. The Chimera residue-to-residue (RR) distance map tool was used for RR distance and standard deviation comparison of the initial and final structures [44].
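As an illustration of the first of these metrics, the RMSD of a simulation frame relative to a reference structure can be sketched as below. This is a minimal sketch, not the GROMACS implementation: it assumes the frame has already been superimposed on the reference (no least-squares fitting is done here), and the function name `rmsd` and the tuple-based coordinate format are choices made for this example.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(reference: Sequence[Coord], frame: Sequence[Coord]) -> float:
    """Root mean square deviation between two pre-aligned coordinate sets.

    Both inputs must list the same atoms in the same order; the result is in
    the same length unit as the coordinates (nm in GROMACS output).
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(reference, frame):
        squared += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return math.sqrt(squared / len(reference))


if __name__ == "__main__":
    start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    later = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom shifted by 1 along z
    print(rmsd(start, later))
```

Evaluating this function for every saved frame against the starting structure gives the kind of RMSD time series used to judge whether a trajectory has stabilized.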
3. Results
3.1. SNP Annotation
Variants in the ZNF141 gene were retrieved from the ENSEMBL database, and non-synonymous SNPs were selected from among all variants. There were 272 missense variants, of which 145 were coding sequence variants; among them, only one SNP (rs587776959; T474I) was found to have clinical significance as pathogenic. Different tools based on different algorithms were used for the structural and functional annotation of the SNP. SIFT predicted the nsSNP to be deleterious, with a score of 0.01. PolyPhen-2 also indicated the nsSNP to be harmful, with a score of 0.997. The functional impact of the nsSNP on protein function predicted by PROVEAN was also deleterious, with a score of −4.810. The CADD score was 17, the REVEL score was 0.119 and the MetaLR score was 0.046 (Supplementary Table S1).
3.2. Structure Prediction of C2-H2 Domain (171-474) of ZNF141 Protein
The InterPro server annotated the protein as a member of the C2-H2 (Krüppel) family of zinc finger proteins, having ten zinc finger motifs and a Krüppel-associated box (KRAB) domain (Supplementary Figure S1), which is further divided into box A and box B. Since the mutation was mapped to the end of the C2-H2 domain, the 3D structure of the C2-H2 domain (residues 171–474) of the human ZNF141 protein was predicted by Modeller against multiple templates with PDB IDs 5V3G, 5V3J, 5V3M and 5WJQ.
3.3. Validation of 3D Structure
The predicted 3D structure (Figure 1A) was validated by different computational tools that use different protein properties for structure validation. Ramachandran plot shows that 96.6% of residues fall in the most favored regions and 3.31% residues in the allowed area while no residues fall in the outlier region (Supplementary Figure S2). ERRAT calculated the Quality factor as A: 92.49 (Supplementary Figure S3). COOT omega angle distortion analysis depicted no unexpected peptide bond and no unusual rotamers were found in rotamers analysis.
3.4. Structural Comparison of ZNF141WT and ZNF141T474I
The mutant T474I structure was modeled and then aligned with the ZNF141WT structure to show the differences between the residues. As evident in Supplementary Figure S4, the potential energies show that both structures were well minimized before comparison. The structural comparison depicted the differences in the side chains of the residues. The residue-to-residue comparison showed that the mutant isoleucine residue is bigger than the wild-type threonine residue, which leads to bumps (Supplementary Figure S6a). The mutant isoleucine residue is also more hydrophobic than the wild-type residue. Exposure of hydrophobic residues to solvent results in the loss of hydrogen bonds and disturbs correct folding (Supplementary Figure S6b).
3.5. ConSurf Conservation Analysis
A disease-causing mutation is often found in highly conserved locations. The conservation analysis of the PAP type A-associated mutation of the ZNF141 protein was performed based on the protein structure. Through homologous sequence alignment with the SWISS-PROT, UniProt, and UniRef90 protein databases, the substituted threonine residue was determined to be in a highly conserved amino acid region, with a conservation score of 8 out of a possible 9. Additionally, the threonine residue at position 474 was predicted to have a functional impact on the ZNF141 protein. Figure 2 depicts the key findings of the conservation analysis.
3.6. Molecular Dynamics Simulation Analysis
3.6.1. Stability of ZNF141 Protein
Molecular dynamics simulations of 100 ns revealed the disruptive effect of the T474I mutation compared to ZNF141WT. We did not find significant structural changes in the wild-type protein, as the RMSD was relatively stable after 30 ns, except for the loops connecting the zinc fingers, which were flexible during the simulation. The RMSD of ZNF141WT ended at 1.62 nm, with an average of 1.71 nm over the 100 ns simulation. In contrast, the T474I variant was less stable, with more obvious fluctuations in RMSD. The RMSD of variant T474I ended at 1.9 nm with a mean value of 1.76 nm, higher than that of the ZNF141WT protein. These findings suggest that the wild-type ZNF141 protein is more stable than the mutant (Figure 3A).
3.6.2. Flexibility Analysis of ZNF141 Protein
Residual flexibility was assessed by RMSF analysis; Figure 2C shows the fluctuations of the wild-type threonine and the substituted isoleucine residue. The mutation affected the residual flexibility at position 474 along with the overall flexibility of the protein. The RMSF of the wild-type threonine residue was 1.17 nm, while that of the mutated isoleucine was 2.39 nm, a difference attributable to the substitution at this position. The transition of parts of the H1, H3, H6, and H7 helices into loops increases the average RMSF of the protein from 0.91 nm to 1.11 nm. This transition might be supported by instability caused by unwrapping of the zinc fingers in addition to the substitution of threonine. Besides the collective increase, the RMSF in the proximity of T474I (residues 470–474) was much higher than in ZNF141WT (Figure 3B).
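For readers unfamiliar with the quantity, a per-atom RMSF can be sketched as follows. This is an illustrative calculation under stated assumptions, not the GROMACS module used in the study: frames are assumed to be already fitted to a common reference, and the function name `rmsf` and the nested-list trajectory format are hypothetical choices for this example.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsf(trajectory: Sequence[Sequence[Coord]]) -> List[float]:
    """Per-atom root mean square fluctuation around the time-averaged position.

    `trajectory` is a sequence of frames; each frame lists (x, y, z) for the
    same atoms in the same order.
    """
    if not trajectory or not trajectory[0]:
        raise ValueError("trajectory must contain at least one non-empty frame")
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement from that average
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


if __name__ == "__main__":
    frames = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    ]
    print(rmsf(frames))  # first atom is static, second atom oscillates
```

Averaging the returned values over all atoms or residues gives a single flexibility figure of the kind compared between the wild-type and mutant proteins.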
3.6.3. Gyration Analysis of ZNF141 Protein
The gyration analysis showed a change in the overall compactness of the mutant T474I compared to the ZNF141WT structure (Figure 4A). The wild-type protein was more compact, with an Rg of 2.86 nm, than the mutant protein, with an Rg of 3.39 nm, indicating a significant decrease in the globularity of the protein. At the end of the simulation, ZNF141WT showed an Rg of 2.9 nm while T474I showed 3.4 nm, reflecting that the mutant T474I protein exhibited more variation. Interestingly, the Rg of mutant T474I was 2.93 nm at the start of the simulation and remained elevated thereafter, reaching up to 4.02 nm. This indicates that the mutant protein has lost compactness, which may result in protein dysfunction.
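The radius of gyration discussed above is the mass-weighted root mean square distance of the atoms from the centre of mass. A minimal sketch of that definition follows; the function name `radius_of_gyration` and the input format are assumptions made for this example, and the study itself relied on GROMACS for the calculation.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Coord], masses: Sequence[float]) -> float:
    """Mass-weighted radius of gyration of one structure (same unit as coords)."""
    if len(coords) != len(masses) or not coords:
        raise ValueError("coords and masses must be non-empty and of equal length")
    total_mass = sum(masses)
    # Centre of mass
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total_mass for k in range(3)]
    # Mass-weighted sum of squared distances from the centre of mass
    weighted = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3)) for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


if __name__ == "__main__":
    # Two equal-mass atoms 2 units apart: each lies 1 unit from the centre of mass
    print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [12.0, 12.0]))
```

A larger value for the same set of atoms means the mass is spread further from the centre, which is why an increase in Rg is read as a loss of compactness.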
3.6.4. Solvent Accessible Surface Area Analysis
Solvent accessible surface area (SASA) analysis was performed to investigate the hydrophobic core regions of the wild-type and mutant proteins. Significant changes in SASA were observed for both proteins (Figure 4B). The average SASA value was 226.19 nm² for the wild-type ZNF141 protein and 235.66 nm² for the mutant protein (Table 1). The SASA values vary significantly because of the unwinding of protein folds, which exposes more of the surface to solvent (Figure 3B,C). This leads to the exposure of hydrophobic residues to the solvent, subsequently unfolding the protein and causing protein dysfunction.
Table 1.
S.No. | Protein | RMSD (nm) (Average) | RMSF (nm) (Average) | RMSF (T474I) | H-Bonds (Average) | Rg (nm) | SASA (nm²) (Average) |
---|---|---|---|---|---|---|---|
1 | wt-ZNF141 | 1.71 | 0.91 | 1.17 | 207 | 2.9 | 226.19 |
2 | T474I | 1.76 | 1.11 | 2.39 | 200 | 3.4 | 235.66 |
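To put the differences in Table 1 on a common scale, the relative change of each metric from wild type to mutant can be computed as below. This is a small arithmetic sketch using only the values printed in the table; the dictionary layout and the helper name `percent_change` are choices made for this illustration.

```python
# (wild-type, mutant) values copied from Table 1
table1 = {
    "RMSD (nm), average": (1.71, 1.76),
    "RMSF (nm), average": (0.91, 1.11),
    "RMSF at residue 474 (nm)": (1.17, 2.39),
    "H-bonds, average": (207, 200),
    "Rg (nm)": (2.9, 3.4),
    "SASA (nm^2), average": (226.19, 235.66),
}


def percent_change(wild_type: float, mutant: float) -> float:
    """Relative change of the mutant value with respect to the wild type, in percent."""
    return 100.0 * (mutant - wild_type) / wild_type


if __name__ == "__main__":
    for metric, (wt, mut) in table1.items():
        print(f"{metric}: {percent_change(wt, mut):+.1f}%")
```

Expressed this way, the shifts in average RMSD, hydrogen bonds, and SASA are a few percent, while the RMSF at residue 474 roughly doubles.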
3.6.5. Intra-Protein Hydrogen Bond Analysis
Hydrogen bond analysis provides an essential understanding of the intramolecular hydrogen bond network of the ZNF141WT and ZNF141T474I variants. Figure 4C provides insights into the H-bond network of both models. Careful evaluation of the number of hydrogen bonds with length ≤ 3.5 Å revealed that the number of hydrogen bonds varied significantly upon the single point mutation in the ZNF141 protein. ZNF141WT had an average of 207 H-bonds, while variant ZNF141T474I had an average of 200 (Table 1). The final wild-type structure had 212 H-bonds, while mutant ZNF141T474I had 214 at the end of the simulation. The lower average number of H-bonds reflects reduced stability and compactness of the structure. To check that our simulation is physically valid and free of systematic drift, the total energies of both systems were calculated and compared over the course of the simulation.
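The distance criterion mentioned above (length ≤ 3.5 Å) can be sketched as a simple pair count. This is a deliberately simplified illustration: geometric hydrogen bond analyses usually add a donor-hydrogen-acceptor angle criterion, which is omitted here, and the function name `count_hbonds` and the separate donor and acceptor coordinate lists are assumptions of this example.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def count_hbonds(
    donors: Sequence[Coord], acceptors: Sequence[Coord], cutoff: float = 3.5
) -> int:
    """Count donor-acceptor pairs whose separation is within `cutoff` (in Å).

    Distance-only criterion; no angle check is applied in this sketch.
    """
    count = 0
    for donor in donors:
        for acceptor in acceptors:
            if math.dist(donor, acceptor) <= cutoff:
                count += 1
    return count


if __name__ == "__main__":
    donors = [(0.0, 0.0, 0.0)]
    acceptors = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]  # only the first is within 3.5 Å
    print(count_hbonds(donors, acceptors))
```

Applying such a count to every frame and averaging over the trajectory yields per-system averages comparable to those reported in Table 1.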
3.7. Secondary Structure Analysis
Secondary structural changes were evaluated for both the wild-type and mutant ZNF141 proteins. We noticed that mutant ZNF141T474I experienced significant changes in secondary structure compared to the ZNF141WT structure (Supplementary Figure S5A,B). The H1, H3, H6 and H7 helices were partially converted into loop regions, making the protein more flexible (Figure 4B). The beta hairpins were converted into loops, resulting in the outward disposition of the proximal region after the H9 helix. This outward disposition also contributed to the increase in the solvent-accessible surface area of the mutant protein. These results were consistent with those obtained from the other analyses, which indicated that decreased stability and increased solvent accessibility of ZNF141 are the causes of PAP.
3.8. Residue to Residue Distance Map
The RR distance map generated for the average structure of the systems showed distance-based differences and standard deviations among residues, presented as a color-coded map in Figure 5. Equivalent residue pairs were investigated by subtracting the distances in the initial structure from those in the final one. For the wild-type protein, the pairwise comparison of the structure before and after simulation showed only minor changes. An evident residual intensity was found, indicating the stability of the system with no conformational changes. However, in the mutant protein structure, besides the loop regions, significant conformational changes were observed in the helix regions, resulting in a drastic increase in the residue-to-residue distances. The standard deviation between the final structures is shown in Figure 5B, which further delineates the conformational changes in both proteins. The open conformation drives these changes upon point mutation.
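The subtraction of pairwise distances described above can be sketched as follows. This is a minimal illustration assuming one representative coordinate per residue (for example the Cα atom); it is not the Chimera RR distance map tool, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def distance_matrix(coords: Sequence[Coord]) -> List[List[float]]:
    """All pairwise distances between residues (one coordinate per residue)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def distance_difference_map(
    initial: Sequence[Coord], final: Sequence[Coord]
) -> List[List[float]]:
    """Final minus initial residue-residue distance for every residue pair."""
    if len(initial) != len(final):
        raise ValueError("both structures must contain the same residues")
    d_initial = distance_matrix(initial)
    d_final = distance_matrix(final)
    n = len(initial)
    return [[d_final[i][j] - d_initial[i][j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    before = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    after = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]  # the two residues move apart
    print(distance_difference_map(before, after))
```

Positive entries mark residue pairs that moved apart during the simulation and negative entries mark pairs that moved closer; because only internal distances are used, no superposition of the two structures is needed.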
3.9. Dynamic Cross Correlation Map (DCCM)
We created and analyzed a dynamic cross-correlation map (DCCM) to explore the functional displacements of ZNF141 atoms as a function of time. When the mutant system was compared to the wild-type system, different patterns of correlated movements were discovered (Figure 6). The correlation of atomic displacements in ZNF141WT differed significantly from that in mutant ZNF141T474I, which displayed differently correlated movements (Figure 6B). T474I had weak, negatively correlated movements for residues R406-K426 and V447-F460, although these residues showed partial correlations among themselves that are opposite to the correlations observed for the same residues of ZNF141WT. Helix 9 of ZNF141T474I displayed more distantly linked motions than in the wild-type protein. The movement of loops varied between the protein models, which corresponds to the relevant RMSF (Figure 2). Helices 1 and 3 showed comparable linked motions in both systems. In summary, the mutant had different correlated movements than the ZNF141WT protein, with most of the residues showing negative correlations.
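A dynamic cross-correlation matrix is conventionally defined as the normalized covariance of atomic displacements, C_ij = ⟨Δr_i · Δr_j⟩ / sqrt(⟨Δr_i²⟩⟨Δr_j²⟩). The sketch below implements that textbook definition under stated assumptions: frames are taken to be already fitted to a common reference, one coordinate is used per residue, and the function name `dccm` is hypothetical. It is not the specific tool used to produce Figure 6.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def dccm(trajectory: Sequence[Sequence[Coord]]) -> List[List[float]]:
    """Normalized cross-correlation of displacements; entries lie in [-1, 1]."""
    if not trajectory or not trajectory[0]:
        raise ValueError("trajectory must contain at least one non-empty frame")
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    # Time-averaged position of every atom
    means = [
        [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        for i in range(n_atoms)
    ]
    # Displacement of every atom from its average, per frame
    disp = [
        [[frame[i][k] - means[i][k] for k in range(3)] for i in range(n_atoms)]
        for frame in trajectory
    ]

    def avg_dot(i: int, j: int) -> float:
        return sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp) / n_frames

    variances = [avg_dot(i, i) for i in range(n_atoms)]
    matrix = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        for j in range(n_atoms):
            denom = math.sqrt(variances[i] * variances[j])
            # Atoms that never move have undefined correlation; report 0.0
            matrix[i][j] = avg_dot(i, j) / denom if denom > 0.0 else 0.0
    return matrix


if __name__ == "__main__":
    frames = [
        [(-1.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    ]
    print(dccm(frames))
```

Values near +1 indicate residues moving in the same direction, values near −1 indicate anti-correlated motion, and values near 0 indicate uncorrelated motion.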
3.10. Essential Dynamics of ZNF141 Protein
These findings may be supported by examining the graphical depiction of total system mobility along PC1 and PC2, which allows us to investigate the direction and amplitude of the motions that contribute to total system mobility. According to the projections, the WT system moves in the opposite direction, causing an uneven expansion of the structure (Figure 7), which is consistent with the Rg findings; that is, the R406-K426 and V447-F460 regions moved in the opposite direction with greater amplitude. However, in comparison to WT, the direction of movement changes for ZNF141T474I, and the amplitude of movement is greater for the C2-H2 region, specifically R406-K426 and V447-F460 (Figure 7B), which correlates with the cluster distribution of the PC1 versus PC2 projection. This mutation not only enhances the flexibility of the C2-H2 domain but also alters its orientation, potentially increasing deformation and altering the protein structure.
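Essential dynamics amounts to a principal component analysis of the coordinate covariance matrix. As a hedged illustration of how the first principal component (PC1) arises, the sketch below builds the covariance matrix of flattened coordinates and extracts its dominant eigenvector by power iteration. Power iteration is substituted here for the full diagonalization a simulation package would perform; frames are assumed to be pre-fitted, and the function name `first_principal_component` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def first_principal_component(
    trajectory: Sequence[Sequence[Coord]], iterations: int = 200
) -> Tuple[float, List[float]]:
    """Return (eigenvalue, eigenvector) of the largest mode of the covariance matrix.

    Caveat of this sketch: the uniform start vector fails if it happens to be
    orthogonal to PC1; a production implementation would diagonalize fully.
    """
    if not trajectory or not trajectory[0]:
        raise ValueError("trajectory must contain at least one non-empty frame")
    n_frames = len(trajectory)
    # Flatten each frame to one row of 3N coordinates
    rows = [[value for atom in frame for value in atom] for frame in trajectory]
    dim = len(rows[0])
    mean = [sum(row[k] for row in rows) / n_frames for k in range(dim)]
    centered = [[row[k] - mean[k] for k in range(dim)] for row in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / n_frames for b in range(dim)]
        for a in range(dim)
    ]
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # no variance along the current direction
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the variance captured along the final direction
    cov_vec = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
    eigenvalue = sum(v * cv for v, cv in zip(vec, cov_vec))
    return eigenvalue, vec


if __name__ == "__main__":
    # One atom oscillating along x only: PC1 should point along x
    frames = [[(-1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(-2.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
    print(first_principal_component(frames))
```

Projecting each centered frame onto PC1 and PC2 gives the two-dimensional scatter typically used to compare the conformational space sampled by two systems.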
3.11. Identification of Ligand, Active Sites Residues, and Enzyme Commission Number
The COACH server predicted the active site residues that are possible attachment sites for the ligand. The ligand ZN had the highest C-score (0.17) among all the predicted ligands, and the residue with the highest probability of being an active site residue was arginine 378 (Table 2). The EC number of ZN is EC 7.2.2.12.
Table 2.
Rank | C-Score | Cluster Size | PDB Hit | Lig Name | Binding Residues |
---|---|---|---|---|---|
1 | 0.17 | 14 | 2lt7A | ZN | 257,260,273,277 |
2 | 0.05 | 6 | 1a1iA | Nuc.acid | 337,346,348,350,353,356,357,360,374,376,377, 378,381,385,388,402,406,409,412 |
3 | 0.04 | 4 | 3g0bD | NAG | 229,232,236,245 |
4 | 0.04 | 3 | 2wbsA | ZN | 285,288,301,305 |
5 | 0.03 | 3 | 2i13B | Nuc.acid | 337,346,348,350,353,356,357,374,376,377,378, 381,385,388,402,404,406,409,412,413,416 |
6 | 0.03 | 3 | 1tf6A | Nuc.acid | 262,264,265,266,269,273,276,292,296,297,300, 304,318,320,322,325,329,332,348,350,353,357, 360,376,378,381,385,388,391 |
7 | 0.02 | 2 | 4kx7A | ZN | 313,315,316,329,334 |
4. Discussion
Postaxial polydactyly (PAP) type A is a rare, mainly autosomal dominant anomaly of fifth digit duplication in the hands and/or feet. Autosomal recessive inheritance of PAP type A has been linked with mutations in the ZNF141 gene at different chromosomal locations. The ZNF141 protein is a member of the C2-H2 type zinc finger protein family, which contains two cysteines and two histidines in two separate chains coordinated by a zinc ion [8]. Classical zinc finger domains have two β-sheets and one α-helix, as revealed by crystallographic studies [10]; however, non-classical types of zinc finger proteins have different combinations of C2-H2, C2-CH, and C2-C2. The C2-H2 domain is among four abundant domains found in zinc finger proteins, i.e., C2-H2, Lin-11, Isl-1, Mec-3 (LIM) domains, plant homeodomain (PHD), and really interesting new gene (RING) domains [9].
The C2-H2 zinc finger family contains the more prominent transcription factors, with C-x-C-x-H-x-H motifs that interact with DNA sequences. Some C2-H2 members like ZNF217 include multiple domains that bind to specific DNA sequences, consequently repressing the expression of target genes [11]. Each zinc finger domain contacts three base pairs of target DNA through its α-helix. The specificity of DNA sequence recognition depends upon the amino acid residues at the contact site, so any change in these amino acids can change the binding specificity for the three base pairs of DNA [45]. In addition, ZNF141 contains ten ZNF motifs, each of which binds to a different set of DNA sequences, and disruption of these motifs results in unbinding of the DNA sequences [45], causing dysfunction of the protein. Previously, another C2-H2 type zinc finger protein, KLF4, which is expressed in the suprabasal layers of the epidermis, was reported to have a role in keratinocyte differentiation. It regulates the expression of keratinocyte differentiation genes like SPINK5, ECM1, CDSN, LCE3, and FLG. Ectopic expression of the KLF4 protein accelerates the differentiation process in the epidermis, which results in epidermal barrier formation [46]. A mutation in the C-terminal region of ZNF141 has been recently linked with abnormal limb development; however, the structural basis of this association was still unknown [7]. Bioinformatics studies have been instrumental in genome-wide association studies (GWAS) to identify mutations [19,47], their annotation [15,48,49], and impact [16], and in the identification of novel and potent drugs [50,51,52] and their targets [18]. In silico methods are important for defining the position of a gene, predicting its transcripts and interactions with neighbors [22], and determining the function and structure of the protein generated from that gene within the cell.
In silico studies also help to differentiate neutral from deleterious SNPs through various algorithms and the information accessible in databases [52,53]. Compared to small molecule inhibitors, peptide inhibitors have less toxicity and better target selectivity. Bioinformatics analysis adds valuable insights into macromolecular structure-function correlations using molecular dynamics simulations of wild-type and mutant models under explicit solvent conditions. Several past studies have emphasized computational tools, especially molecular dynamics simulations, which have broadened the understanding of disease-causing SNPs in different proteins [53].
In this study, we have established a structural association between the dysfunction of the ZNF141 protein and postaxial polydactyly, as reported by Kalsoom et al. [7]. The 100 ns all-atom MD simulations revealed the time-dependent behavior of the wild-type and mutant ZNF141 proteins. The structural investigation revealed that the ZNF141 protein binds to the target DNA sequence through its alpha-helices; however, the mutation T474I disrupts the structure of the ZNF141 protein, resulting in the partial transition of the H1, H3, H6, and H7 helices into loops. Additionally, beta-hairpins in the structure disappear, increasing the flexibility of the structure. Several hydrogen bonds were lost in the mutant protein as it lost its compactness. The solvent accessible surface area increased due to the point mutation, resulting in the exposure of hydrophobic residues to the solvent and causing protein malfunction. Secondary structure analysis shows that three beta hairpins are lost along with the partial transition of helices into loops, resulting in a loss of protein stability. The wild-type ZNF141 protein maintains a more rigid and conformationally stable zinc-bound configuration than its mutant counterpart. These significant changes in the mutant protein’s stability and conformation were confirmed using essential dynamics and cross-correlation maps, which revealed that after point mutation, the overall motions of the protein and their correlations were completely different, resulting in postaxial polydactyly type A. Besides providing the structural basis of the association of ZNF141 with postaxial polydactyly, this study also emphasizes the role of the last residue of the chain in the correct folding of the protein, its stability, and its flexibility. The identification of active site residues and ligands of the ZNF141 protein provides a basis for novel gene therapies and targets for drug discovery strategies to restore the standard functionality of the ZNF141 protein.
5. Conclusions
The characterization of disease-associated SNPs has a crucial role in modern genetic analysis, gene association studies, and studies of protein structure and stability. We have investigated the structural impact of a point mutation in the ZNF141 gene using a molecular dynamics simulation approach. We have concluded that the H1, H3, H6, and H7 α-helices play a crucial role in DNA binding. The loss of beta hairpins leads to bumps and to a loss of stability, compactness, and hydrogen bonds, which results in incorrect folding and exposure of hydrophobic residues to solvent. Subsequently, it causes protein dysfunction leading to postaxial polydactyly type A during fetal development. Altogether, this study shows the structural association of ZNF141T474I with PAP and may contribute to the inventory of novel inhibitors that can compete with DNA for binding based on intrinsic properties.
Acknowledgments
The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through General Research Project under Grant number (RGP. 1/289/43). The authors are thankful to all members of the Department of Computer Science and Bioinformatics, Khushal Khan Khattak University, Karak, Khyber-Pakhtunkhwa, Pakistan.
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/bioengineering9120749/s1, Figure S1: InterPro server annotation of the ZNF141 protein showing the C2-H2 domain (IPR013087) from residue 171 to 474 with ten zinc finger motifs; Figure S2: Ramachandran plot of the ZNF141 C2-H2 domain (residues 171–474) with all the residues in preferred and allowed regions and no residue in the disallowed region; Figure S3: ERRAT validation graph showing an overall quality score of 92.491 for the structure, with a relatively low error confidence score for the rest of the residues; Figure S4: The potential energy of both wt-ZNF141 and T474I models after energy minimization; Figure S5: Secondary structure analysis of (A) wt-ZNF141 and (B) mutant T474I protein after 100 ns molecular dynamics simulations; Figure S6: (a) Schematic structures of the original (left; threonine) and the mutant (right; isoleucine) amino acid. The backbone, which is the same for each amino acid, is colored red. The side chain, unique for each amino acid, is colored black; (b) Superimposition of wild-type and mutant residues at position 474. The wild-type threonine is shown as sticks while the mutant isoleucine is shown as lines; Table S1: Functional impact of the ZNF141 T474I mutation predicted by different tools.
Author Contributions
Conceptualization, Y.A. and F.A.; methodology, M.F.U.; software, M.I.K.; validation, M.F.U., Y.A. and A.A.; formal analysis A.A. and N.U.H.; investigation, M.F.U. and M.I.U.H.; resources, F.Z.; data curation, F.A. and S.M.E.; writing—original draft preparation, Y.A., F.A. and M.F.U.; writing—review and editing, M.I.K. and A.A.; visualization, F.Z. and M.I.U.H.; supervision, A.A. and M.F.U.; project administration, S.M.E.; funding acquisition, S.M.E. and F.Z. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
All data relating to this research are presented in the article. Some data are available from the authors upon request.
Conflicts of Interest
The authors declare no conflict of interest.
Funding Statement
This study was completed without funding support from any organization.
Footnotes
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Zhang Z., Sui P., Dong A., Hassell J., Cserjesi P., Chen Y.T., Behringer R.R., Sun X. Preaxial polydactyly: Interactions among ETV, TWIST1 and HAND2 control anterior-posterior patterning of the limb. Development. 2010;137:3417–3426. doi: 10.1242/dev.051789.
- 2. Faust K.C., Kimbrough T., Oakes J.E., Edmunds J.O., Faust D.C. Polydactyly of the hand. Am. J. Orthop. (Belle Mead NJ) 2015;44:E127–E134.
- 3. Castilla E.E., Lugarinho R., da Graca Dutra M., Salgado L.J. Associated anomalies in individuals with polydactyly. Am. J. Med. Genet. 1998;80:459–465. doi: 10.1002/(SICI)1096-8628(19981228)80:5<459::AID-AJMG5>3.0.CO;2-G.
- 4. Zguricas J., Bakker W.F., Heus H., Lindhout D., Heutink P., Hovius S.E. Genetics of limb development and congenital hand malformations. Plast. Reconstr. Surg. 1998;101:1126–1135. doi: 10.1097/00006534-199804040-00039.
- 5. Kozin S.H. Upper-extremity congenital anomalies. J. Bone Jt. Surg. Am. 2003;85:1564–1576. doi: 10.2106/00004623-200308000-00021.
- 6. Watson B.T., Hennrikus W.L. Postaxial type-B polydactyly. Prevalence and treatment. J. Bone Jt. Surg. Am. 1997;79:65–68. doi: 10.2106/00004623-199701000-00007.
- 7. Kalsoom U.E., Klopocki E., Wasif N., Tariq M., Khan S., Hecht J., Krawitz P., Mundlos S., Ahmad W. Whole exome sequencing identified a novel zinc-finger gene ZNF141 associated with autosomal recessive postaxial polydactyly type A. J. Med. Genet. 2013;50:47–53. doi: 10.1136/jmedgenet-2012-101219.
- 8. Messina D.N., Glasscock J., Gish W., Lovett M. An ORFeome-based analysis of human transcription factor genes and the construction of a microarray to interrogate their expression. Genome Res. 2004;14:2041–2047. doi: 10.1101/gr.2584104.
- 9. Gray K.A., Yates B., Seal R.L., Wright M.W., Bruford E.A. Genenames.org: The HGNC resources in 2015. Nucleic Acids Res. 2015;43:D1079–D1085. doi: 10.1093/nar/gku1071.
- 10. Zhang W., Xu C., Bian C., Tempel W., Crombet L., MacKenzie F., Min J., Liu Z., Qi C. Crystal structure of the Cys2His2-type zinc finger domain of human DPF2. Biochem. Biophys. Res. Commun. 2011;413:58–61. doi: 10.1016/j.bbrc.2011.08.043.
- 11. Nunez N., Clifton M.M.K., Funnell A.P.W., Artuz C., Hallal S., Quinlan K.G.R., Font J., Vandevenne M., Setiyaputra S., Pearson R.C.M., et al. The multi-zinc finger protein ZNF217 contacts DNA through a two-finger domain. J. Biol. Chem. 2011;286:38190–38201. doi: 10.1074/jbc.M111.301234.
- 12. Tommerup N., Aagaard L., Lund C.L., Boel E., Baxendale S., Bates G.P., Lehrach H., Vissing H. A zinc-finger gene ZNF141 mapping at 4p16.3/D4S90 is a candidate gene for the Wolf-Hirschhorn (4p-) syndrome. Hum. Mol. Genet. 1993;2:1571–1575. doi: 10.1093/hmg/2.10.1571.
- 13. Bellefroid E.J., Marine J.C., Matera A.G., Bourguignon C., Desai T., Healy K.C., Bray-Ward P., Martial J.A., Ihle J.N., Ward D.C. Emergence of the ZNF91 Krüppel-associated box-containing zinc finger gene family in the last common ancestor of anthropoidea. Proc. Natl. Acad. Sci. USA. 1995;92:10757–10761. doi: 10.1073/pnas.92.23.10757.
- 14. Hamilton A.T., Huntley S., Kim J., Branscomb E., Stubbs L. Lineage-specific expansion of KRAB zinc-finger transcription factor genes: Implications for the evolution of vertebrate regulatory networks. Cold Spring Harb. Symp. Quant. Biol. 2003;68:131–140. doi: 10.1101/sqb.2003.68.131.
- 15. Ijaz A., Shah K., Aziz A., Rehman F.U., Ali Y., Tareen A.M., Khan K., Ayub M., Wali A. Novel frameshift mutations in XPC gene underlie xeroderma pigmentosum in Pakistani families. Indian J. Dermatol. 2021;66:220. doi: 10.4103/ijd.IJD_63_20.
- 16. Ahmad S.U., Ali Y., Jan Z., Rasheed S., Nazir N.u.A., Khan A., Rukh Abbas S., Wadood A., Rehman A.U. Computational screening and analysis of deleterious nsSNPs in human p 14ARF (CDKN2A gene) protein using molecular dynamic simulation approach. J. Biomol. Struct. Dyn. 2022:1–12. doi: 10.1080/07391102.2022.2059570.
- 17. Purohit R. Role of ELA region in auto-activation of mutant KIT receptor: A molecular dynamics simulation insight. J. Biomol. Struct. Dyn. 2014;32:1033–1046. doi: 10.1080/07391102.2013.803264.
- 18. Jan Z., Ahmad S.U., Amara Qadus Y.A., Sajjad W., Rais F., Tanveer S., Khan M.S., Haq I. Insilico structural and functional assessment of hypothetical protein L345_13461 from Ophiophagus hannah. Pure Appl. Biol. (PAB) 2021;10:1109–1118.
- 19. Khattak S., Rauf M.A., Zaman Q., Ali Y., Fatima S., Muhammad P., Li T., Khan H.A., Khan A.A., Ngowi E.E. Genome-wide analysis of codon usage patterns of SARS-CoV-2 virus reveals global heterogeneity of COVID-19. Biomolecules. 2021;11:912. doi: 10.3390/biom11060912.
- 20. Rafique R., Khan K.M., Arshia, Kanwal, Chigurupati S., Wadood A., Rehman A.U., Karunanidhi A., Hameed S., Taha M., et al. Synthesis of new indazole based dual inhibitors of α-glucosidase and α-amylase enzymes, their in vitro, in silico and kinetics studies. Bioorg. Chem. 2020;94:103195. doi: 10.1016/j.bioorg.2019.103195.
- 21. Ajmal A., Ali Y., Khan A., Wadood A., Rehman A.U. Identification of novel peptide inhibitors for the KRas-G12C variant to prevent oncogenic signaling. J. Biomol. Struct. Dyn. 2022:1–10. doi: 10.1080/07391102.2022.2138550.
- 22. Ali Y., Yasin M., Aziz A., Khan A.W., ur Rahman S., Haq N.U. In-silico analysis of 2-cysteine peroxiredoxin genes in arabidopsis thaliana with possible role in carbon dioxide fixation through carbonic anhydrase regulation. Pak. J. Biochem. Biotechnol. 2022;3:175–189. doi: 10.52700/pjbb.v3i1.126.
- 23. Hunt S.E., McLaren W., Gil L., Thormann A., Schuilenburg H., Sheppard D., Parton A., Armean I.M., Trevanion S.J., Flicek P., et al. Ensembl variation resources. Database. 2018;2018:bay119. doi: 10.1093/database/bay119.
- 24. Sim N.L., Kumar P., Hu J., Henikoff S., Schneider G., Ng P.C. SIFT web server: Predicting effects of amino acid substitutions on proteins. Nucleic Acids Res. 2012;40:W452–W457. doi: 10.1093/nar/gks539.
- 25. Adzhubei I.A., Schmidt S., Peshkin L., Ramensky V.E., Gerasimova A., Bork P., Kondrashov A.S., Sunyaev S.R. A method and server for predicting damaging missense mutations. Nat. Methods. 2010;7:248–249. doi: 10.1038/nmeth0410-248.
- 26. Choi Y., Chan A.P. PROVEAN web server: A tool to predict the functional effect of amino acid substitutions and indels. Bioinformatics. 2015;31:2745–2747. doi: 10.1093/bioinformatics/btv195.
- 27. Ioannidis N.M., Rothstein J.H., Pejaver V., Middha S., McDonnell S.K., Baheti S., Musolf A., Li Q., Holzinger E., Karyadi D., et al. REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am. J. Hum. Genet. 2016;99:877–885. doi: 10.1016/j.ajhg.2016.08.016.
- 28. Rentzsch P., Witten D., Cooper G.M., Shendure J., Kircher M. CADD: Predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019;47:D886–D894. doi: 10.1093/nar/gky1016.
- 29. Dong C., Wei P., Jian X., Gibbs R., Boerwinkle E., Wang K., Liu X. Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies. Hum. Mol. Genet. 2015;24:2125–2137. doi: 10.1093/hmg/ddu733.
- 30. Hunter S., Apweiler R., Attwood T.K., Bairoch A., Bateman A., Binns D., Bork P., Das U., Daugherty L., Duquenne L., et al. InterPro: The integrative protein signature database. Nucleic Acids Res. 2009;37:D211–D215. doi: 10.1093/nar/gkn785.
- 31. Sali A., Blundell T.L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
- 32. Ko J., Park H., Heo L., Seok C. GalaxyWEB server for protein structure prediction and refinement. Nucleic Acids Res. 2012;40:W294–W297. doi: 10.1093/nar/gks493.
- 33. Schmid N., Eichenberger A.P., Choutko A., Riniker S., Winger M., Mark A.E., van Gunsteren W.F. Definition and testing of the GROMOS force-field versions 54A7 and 54B7. Eur. Biophys. J. EBJ. 2011;40:843–856. doi: 10.1007/s00249-011-0700-9.
- 34. Laskowski R.A., MacArthur M.W., Moss D.S., Thornton J.M. PROCHECK: A program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 1993;26:283–291. doi: 10.1107/S0021889892009944.
- 35. Emsley P., Lohkamp B., Scott W.G., Cowtan K. Features and development of Coot. Acta Crystallogr. Sect. D. 2010;66:486–501. doi: 10.1107/S0907444910007493.
- 36. Colovos C., Yeates T.O. Verification of protein structures: Patterns of nonbonded atomic interactions. Protein Sci. A Publ. Protein Soc. 1993;2:1511–1519. doi: 10.1002/pro.5560020916.
- 37. Lovell S.C., Davis I.W., Arendall W.B., 3rd, de Bakker P.I., Word J.M., Prisant M.G., Richardson J.S., Richardson D.C. Structure validation by Calpha geometry: Phi, psi and Cbeta deviation. Proteins. 2003;50:437–450. doi: 10.1002/prot.10286.
- 38. Yang J., Roy A., Zhang Y. Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics. 2013;29:2588–2595. doi: 10.1093/bioinformatics/btt447.
- 39. Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., Ferrin T.E. UCSF Chimera—A visualization system for exploratory research and analysis. J. Comput. Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084.
- 40. Abraham M.J., Murtola T., Schulz R., Páll S., Smith J.C., Hess B., Lindahl E.J.S. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19–25. doi: 10.1016/j.softx.2015.06.001.
- 41. Rigsby R.E., Parker A.B. Using the PyMOL application to reinforce visual understanding of protein structure. Biochem. Mol. Biol. Educ. A Bimon. Publ. Int. Union Biochem. Mol. Biol. 2016;44:433–437. doi: 10.1002/bmb.20966.
- 42. Venselaar H., Te Beek T.A., Kuipers R.K., Hekkelman M.L., Vriend G. Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces. BMC Bioinform. 2010;11:548. doi: 10.1186/1471-2105-11-548.
- 43. Laskowski R.A., Jabłońska J., Pravda L., Vařeková R.S., Thornton J.M. PDBsum: Structural summaries of PDB entries. Protein Sci. A Publ. Protein Soc. 2018;27:129–134. doi: 10.1002/pro.3289.
- 44. Chen J.E., Huang C.C., Ferrin T.E. RRDistMaps: A UCSF Chimera tool for viewing and comparing protein distance maps. Bioinformatics. 2015;31:1484–1486. doi: 10.1093/bioinformatics/btu841.
- 45. Peng Y., Clark K.J., Campbell J.M., Panetta M.R., Guo Y., Ekker S.C. Making designer mutants in model organisms. Development. 2014;141:4042–4054. doi: 10.1242/dev.102186.
- 46. Segre J.A., Bauer C., Fuchs E. Klf4 is a transcription factor required for establishing the barrier function of the skin. Nat. Genet. 1999;22:356–360. doi: 10.1038/11926.
- 47. Ahmad S.U., Hafeez Kiani B., Abrar M., Jan Z., Zafar I., Ali Y., Alanazi A.M., Malik A., Rather M.A., Ahmad A., et al. A comprehensive genomic study, mutation screening, phylogenetic and statistical analysis of SARS-CoV-2 and its variant omicron among different countries. J. Infect. Public Health. 2022;15:878–891. doi: 10.1016/j.jiph.2022.07.002.
- 48. Shah A.A., Amjad M., Hassan J.-U., Ullah A., Mahmood A., Deng H., Ali Y., Gul F., Xia K. Molecular Insights into the Role of Pathogenic nsSNPs in GRIN2B Gene Provoking Neurodevelopmental Disorders. Genes. 2022;13:1332. doi: 10.3390/genes13081332.
- 49. Wadood A., Shareef A., Ur Rehman A., Muhammad S., Khurshid B., Khan R.S., Shams S., Afridi S.G. In Silico Drug Designing for ala438 Deleted Ribosomal Protein S1 (RpsA) on the Basis of the Active Compound Zrl15. ACS Omega. 2022;7:397–408. doi: 10.1021/acsomega.1c04764.
- 50. Khan M.S., Mehmood B., Yousafi Q., Bibi S., Fazal S., Saleem S., Sajid M.W., Ihsan A., Azhar M., Kamal M.A. Molecular docking studies reveal rhein from rhubarb (rheum rhabarbarum) as a putative inhibitor of ATP-binding cassette super-family G member 2. Med. Chem. 2021;17:273–288. doi: 10.2174/1573406416666191219143232.
- 51. Essadssi S., Krami A.M., Elkhattabi L., Elkarhat Z., Amalou G., Abdelghaffar H., Rouba H., Barakat A. Computational Analysis of nsSNPs of ADA Gene in Severe Combined Immunodeficiency Using Molecular Modeling and Dynamics Simulation. J. Immunol. Res. 2019;2019:5902391. doi: 10.1155/2019/5902391.
- 52. Yue P., Moult J. Identification and analysis of deleterious human SNPs. J. Mol. Biol. 2006;356:1263–1274. doi: 10.1016/j.jmb.2005.12.025.
- 53. Bibi S., Sakata K. An integrated computational approach for plant-based protein tyrosine phosphatase non-receptor type 1 inhibitors. Curr. Comput.-Aided Drug Des. 2017;13:319–335. doi: 10.2174/1573409913666170406145607.