Bioinformation. 2011 Mar 2;6(1):23–30. doi: 10.6026/97320630006023

Comparative analysis of human matrix metalloproteinases: Emerging therapeutic targets in diseases

Astha Jaiswal 1, Aastha Chhabra 1, Umang Malhotra 1, Shrey Kohli 1, Vibha Rani 1,*
PMCID: PMC3064848  PMID: 21464841

Abstract

The identification of specific target proteins for any diseased condition involves extensive characterization of the potentially involved proteins. Members of a protein family that share comparable features may show certain unusual features when implicated in a pathological condition. Advances in computational biology and the use of bioinformatics tools can help researchers understand their system of interest in the early stages of research. This initial screening can reduce the time and cost of laboratory testing and experimentation. The human matrix metalloproteinase (MMP) family of endopeptidases is one such family, with 23 members responsible for remodeling the extracellular matrix (ECM) by degrading ECM proteins. Although MMPs have been implicated in various pathological conditions such as arthritis, atherosclerosis, cancer, liver fibrosis, cardiovascular and neurodegenerative disorders, little is known about the specific involvement of individual members of this large family in diseases. A comparative in silico characterization of the MMP protein family was carried out to analyze their physico-chemical, secondary structural and functional properties. Based on the observed patterns of occurrence of atypical features, we hypothesize that cysteine-rich and highly thermostable MMPs might be key players in diseased conditions. Disease-responsive MMPs may thus be plausibly grouped and considered as promising clinical targets. This study can be used as a fundamental approach to characterize, analyze and screen large protein families for the identification of signature patterns.

Background

The past two decades have seen an exponential rise in the accumulation of genomic and proteomic data, stored as vast numbers of nucleotide and protein sequences in data banks. Thousands of research scientists are making massive efforts to annotate the structures and functions of proteins in biological organisms. In turn, the understanding of the function of newly discovered proteins that may play a designated role in normal or diseased conditions is greatly aided by this growing collection of data [1]. The systematic annotation of these protein sequences with the help of bioinformatics tools is one of the major thrust areas of applied biology today. Progress in bioinformatics is marked by the development of numerous tools that have made the classification and identification of significant proteins systematic and easier, thus saving the time and cost of repeated trial and error in the laboratory. Such a prior analysis may also provide direction to wet laboratory studies and thus help to integrate in silico and experimental work.

Protein families consist of proteins that have evolved over time from a common ancestor and exhibit a threshold level of relationship [2]. MMPs are a family of zinc-containing endopeptidases that forms a subset of the metzincin superfamily of metalloproteinases. These regulatory proteases are the extracellular matrix (ECM) remodelers, characterized by their substrate specificity in degrading ECM proteins. On this basis, they have been classified as collagenases, gelatinases, stromelysins, matrilysins, membrane type MMPs (MT-MMPs) and other unclassified MMPs [3]. Structurally, MMPs consist of four domains: an amino terminal hydrophobic pro-domain, a Zn2+ containing catalytic domain, a flexible hinge region and a carboxy terminal hemopexin-like domain responsible for their substrate specificity [4].
The activity of MMPs is regulated by Tissue Inhibitors of Matrix Metalloproteinases (TIMPs). Under normal conditions, this control is responsible for maintenance of the ECM. An imbalance in this regulation may thus disrupt the integrity of the ECM [5]. MMPs have been implicated as clinical targets in numerous physiological and pathological conditions, such as arthritis, atherosclerosis, cancer, eye diseases, skin diseases, cardiovascular and neurodegenerative disorders [6].

Of the 26 MMPs reported to date, 23 have been identified in humans [7]. Our study reports an in silico comparative characterization and analysis of human MMPs using various bio-computational tools, covering their physico-chemical, secondary structural and functional features. Any atypical but significant feature may have implications for the role of MMPs in pathological conditions. The aim here is to identify potential disease-responsive MMPs that might be implicated in diseases. Moreover, such in-depth knowledge of all human MMPs would greatly help researchers to identify the MMPs relevant to their respective working systems. This would further set a precedent for similar comparative characterization studies of other large protein families, using the numerous resources available in computational biology.

Methodology

Protein sequence retrieval

UniProtKB/Swiss-Prot, a high quality, manually annotated and non-redundant protein sequence database, was used to retrieve the complete sequences of the 23 human MMPs [8]. These sequences were then analyzed using various online bio-computational tools.

Physico-chemical analysis

Various physical and chemical parameters, such as amino acid composition, molecular weight, isoelectric point (pI), total number of negatively and positively charged residues, extinction coefficient, instability index, aliphatic index and Grand Average of Hydropathy (GRAVY), were computed using ExPASy's ProtParam tool (http://us.expasy.org/tools/protparam.html). ExPASy's ProtScale tool (http://us.expasy.org/tools/protscale.html) was used to analyze the number of codons, bulkiness, polarity, refractivity, recognition factors, hydrophobicity, transmembrane tendency, percent buried residues, percent accessible residues, average area buried, average flexibility and relative mutability [9].
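Two of the parameters listed above have simple closed forms and can be sketched locally. The following is a minimal illustration, not the ProtParam implementation: it assumes the standard definitions of the aliphatic index (mole percent of Ala, plus 2.9 times Val, plus 3.9 times Ile and Leu) and of GRAVY (mean Kyte-Doolittle hydropathy per residue). The function names and the toy sequence are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale, the scale underlying GRAVY.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)

    def mole_percent(residue):
        return 100.0 * counts[residue] / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def gravy(seq):
    """Grand average of hydropathy: sum of hydropathy values divided by length.
    Raises KeyError for letters outside the 20 standard amino acids."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


if __name__ == "__main__":
    toy_sequence = "MKALVILLAG"  # hypothetical fragment, not a real MMP sequence
    print(round(aliphatic_index(toy_sequence), 2), round(gravy(toy_sequence), 4))
```

In practice the full-length UniProtKB sequences would be passed in place of the toy fragment; the instability index is omitted here because it depends on a dipeptide weight table that is not reproduced in this article.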

Secondary structural analysis

The SOPMA (Self-Optimized Prediction Method with Alignment) tool of the NPS@ (Network Protein Sequence Analysis) server was used to characterize the secondary structural features of the proteins, such as alpha helix, 3₁₀ helix, pi helix, beta bridge, extended strand, beta turn, bend region, random coil, ambiguous states and other states [10].

Functional analysis

The MMP motifs were analyzed with the Motif Scan tool (http://myhits.isb-sib.ch/cgi-bin/motif_scan) [11]. The SOSUI server was used to predict the transmembrane regions of the human MMPs, which were accordingly classified as membrane-bound or soluble proteins [12].

Results and Discussion

MMPs are secreted in latent form as pro-MMPs, and these zymogens must be cleaved for activation. They are found to exhibit pro and active forms, characterized by a difference in molecular weights (see Table 1 in supplementary material). The exceptional behavior of MMP-12, with two active forms (45 kDa and 22 kDa), is due to an internal autolytic processing mechanism that cleaves its carboxy terminal domain from its catalytic domain, thus yielding three products [13].

Analysis of amino acid composition indicates that, while the percentage of cysteine residues in the majority of MMPs lies in the range of 0.6-1.3%, MMP-2, 9 and 23 show markedly higher values of 2.9%, 2.7% and 2.8%, respectively (Figure 1; see Table 2 in supplementary material). The high percentage of cysteine residues in MMP-2 and 9 might be correlated with the presence of the cysteine switch motif and the role of these MMPs in pathological conditions. These gelatinases have previously been implicated in carcinomas and cardiovascular disorders. The high cysteine content of the unclassified MMP-23 might be attributed to the presence of a cysteine array in its structure. The markedly high cysteine content suggests that cysteine is a critical residue for MMP activity, and these MMPs may therefore be investigated for a possible role in diseased conditions. Further analysis of the amino acid composition can help to identify amino acids present at unusual levels, which may be correlated with specific pathological conditions [14].
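The screening step described above amounts to computing a residue percentage per sequence and flagging values outside the typical range. A minimal sketch follows, assuming the upper bound of 1.3% reported for the majority of MMPs as the cutoff; the function names are hypothetical, and the "MMP-X" entry is a placeholder for a typical family member, not a measured value.

```python
def residue_percent(seq, residue="C"):
    """Percentage of positions in the sequence occupied by the given residue."""
    seq = seq.upper()
    return 100.0 * seq.count(residue.upper()) / len(seq)


def flag_high(percent_by_protein, upper=1.3):
    """Return, in sorted order, the names whose percentage exceeds the cutoff."""
    return sorted(name for name, pct in percent_by_protein.items() if pct > upper)


if __name__ == "__main__":
    # Cysteine percentages quoted in the text for MMP-2, 9 and 23.
    # "MMP-X" is a hypothetical placeholder inside the 0.6-1.3% range.
    cysteine_percent = {"MMP-2": 2.9, "MMP-9": 2.7, "MMP-23": 2.8, "MMP-X": 1.0}
    print(flag_high(cysteine_percent))
```

The same two functions apply to any other residue by changing the `residue` argument and the cutoff.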

Figure 1.

Percentage of cysteine residues in human MMPs computed by ExPASy's ProtParam tool. The amino acid composition of the 23 human MMPs was analyzed. Cysteine showed an abnormal trend as the percentage of cysteine residues in MMP-2, 9 and 23 was found to be exceptionally high as compared to other MMPs.

Other physico-chemical parameters also reflect the behavior of MMPs in different conditions (see Table 3(a) and Table 3(b) in supplementary material). pI values for the majority of MMPs (MMP-7, 12, 14, 15, 16, 19, 20, 21, 23, 24, 25, 27 and 28) lie in the alkaline range (pI > 7), while for the others (MMP-1, 2, 3, 8, 9, 10, 11, 13, 17 and 26) they fall in the acidic range (pI < 7). In addition, the instability index classifies MMP-1, 2, 3, 7, 8, 10, 12, 13, 16, 19, 20, 26 and 27 as stable (instability index < 40) and the remaining MMPs as unstable metalloproteinases (instability index > 40) (Figure 2). Furthermore, the aliphatic index, which represents the relative volume of a protein occupied by aliphatic side chains, helps to assess the thermostability of an enzyme. It is found to span a range of 61.09 to 83.59 (Figure 3). This narrow range suggests that human MMPs are not stable over a wide temperature range, although MMP-23 is observed to be the most thermostable MMP. Moreover, high extinction coefficients are observed for MMP-2, 15, 16, 21, 24 and 28, which correlates with a high content of cysteine, tryptophan and tyrosine residues in the sequence and may be useful in protein-protein and protein-ligand interaction studies in solution. Hydrophobicity values range from -0.6720 for MMP-19 (most hydrophilic) to 0.4615 for MMP-14 (most hydrophobic).
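The extinction coefficient and the stable/unstable split discussed above can be illustrated with a short sketch. It assumes the commonly used estimate at 280 nm (1490 per tyrosine, 5500 per tryptophan and 125 per cystine, in M⁻¹ cm⁻¹) and the threshold of 40 for the instability index; the function names are hypothetical, and the instability index itself is taken as an input rather than computed.

```python
def extinction_coefficient(seq, cystines_formed=True):
    """Estimated molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumes 1490 per Tyr, 5500 per Trp and 125 per cystine (disulfide pair).
    With cystines_formed=True every pair of cysteines is assumed to form a
    cystine; with False all cysteines are assumed to be reduced.
    """
    seq = seq.upper()
    n_tyr = seq.count("Y")
    n_trp = seq.count("W")
    n_cystine = seq.count("C") // 2 if cystines_formed else 0
    return 1490 * n_tyr + 5500 * n_trp + 125 * n_cystine


def classify_instability(instability_index, threshold=40.0):
    """Label a protein 'stable' below the threshold and 'unstable' otherwise."""
    return "stable" if instability_index < threshold else "unstable"


if __name__ == "__main__":
    toy_sequence = "MYWCCAY"  # hypothetical fragment, not a real MMP sequence
    print(extinction_coefficient(toy_sequence))
    print(extinction_coefficient(toy_sequence, cystines_formed=False))
```

Because tryptophan and tyrosine dominate the estimate, sequences rich in these residues give high values regardless of their cysteine content.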

Figure 2.

Distribution plot of stable and unstable MMPs as computed by ExPASy's ProtParam tool. The instability index classified MMP-1, 2, 3, 7, 8, 10, 12, 13, 16, 19, 20, 26 and 27 as stable (instability index < 40) and MMP-9, 11, 14, 15, 17, 21, 23, 24, 25 and 28 as unstable (instability index > 40) metalloproteinases.

Figure 3.

Computation of aliphatic index by ExPASy's ProtParam tool. The aliphatic index indicates the thermostability of proteins. MMP-23 was found to be the most thermostable MMP with a high aliphatic index of 83.59.

Secondary structural analysis indicates a predominance of random coils, followed by α-helices, extended strands and β-turns, in 20 MMPs, while extended strands exceed α-helices in MMP-9, 11 and 19 (Figure 4) (see Table 4). This information is useful for predicting the three-dimensional structures of proteins and can help to approximate some aspects of protein function and their classification into families [15].
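The percentages compared above can be derived from a per-residue state string of the kind a secondary structure predictor returns. The sketch below is illustrative only: it assumes a four-state, one-letter encoding (h for α-helix, e for extended strand, t for β-turn, c for random coil), which should be checked against the actual SOPMA output, and the function name and example string are hypothetical.

```python
from collections import Counter

# Assumed one-letter state codes; verify against the predictor's own legend.
STATE_NAMES = {
    "h": "alpha helix",
    "e": "extended strand",
    "t": "beta turn",
    "c": "random coil",
}


def state_percentages(states):
    """Percentage of residues assigned to each of the four states.

    Characters outside STATE_NAMES still count toward the total length,
    so the four percentages may sum to less than 100.
    """
    states = states.lower()
    counts = Counter(states)
    total = len(states)
    return {name: 100.0 * counts[code] / total for code, name in STATE_NAMES.items()}


if __name__ == "__main__":
    predicted = "hhhheecctc"  # hypothetical prediction for a 10-residue peptide
    for name, pct in state_percentages(predicted).items():
        print(f"{name}: {pct:.1f}%")
```

Comparing the "alpha helix" and "extended strand" entries per protein reproduces the kind of ordering check reported for MMP-9, 11 and 19.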

Figure 4.

Analysis of secondary structural features through SOPMA. The computation for the 23 human MMPs showed a predominance of random coils, followed by α-helices, extended strands and β-turns, in 20 MMPs, while extended strands exceeded α-helices in MMP-9, 11 and 19. The figure shows an average plot of the data of all 23 human MMPs.

The Motif Scan tool predicts the presence of a cysteine switch, a zinc protease and a hemopexin motif in human MMPs, all of which have been widely discussed in the literature (see Table 5). The cysteine switch regulates the activity of MMPs via complex formation between a cysteine residue of the pro-domain and the zinc atom of the catalytic domain [16]. The hemopexin domain is an essential part of MMPs, performing multiple functions in activation and inhibition, homodimerization and multimerization, binding and cleavage of substrates, attachment to the cell surface and degradation of MMPs [17]. The primary sequence motif HExxH is present in the catalytic domain of zinc-dependent MMPs. While the two conserved histidine residues coordinate the zinc atom, the glutamic acid residue is part of the active site of the enzyme [18]. The zinc binding region signature has been defined as (uncharged)-(uncharged)-H-E-(uncharged)-(uncharged)-H-(uncharged)-(hydrophobic) [19]. Furthermore, an extra fibronectin type II domain is found in MMP-2 and 9 at three regions within the catalytic domain, playing a pivotal role in the collagen binding region of these enzymes. MMP-17, 19, 21, 23, 25, 26 and 28, classified as ‘other MMPs’, show the presence of the zinc protease motif only. In addition, transmembrane regions 20-23 amino acid residues in length are predicted in 14 MMPs using the SOSUI server (see Table 6). MMP-15, 16, 24 and 25 are found to possess two transmembrane regions.
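The HExxH motif described above can be located with a simple pattern search. The sketch below is a simplified stand-in for a profile-based tool such as Motif Scan, not a reimplementation of it: it matches only the bare H-E-x-x-H pattern and ignores the flanking constraints of the full zinc binding signature. The function name and the toy fragment are hypothetical.

```python
import re

# Lookahead so that overlapping occurrences of H-E-x-x-H are all reported.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(seq):
    """Return (1-based start position, matched five-residue string) for each hit."""
    seq = seq.upper()
    return [(match.start() + 1, match.group(1)) for match in HEXXH.finditer(seq)]


if __name__ == "__main__":
    toy_fragment = "VAAHEFGHSLGLSH"  # hypothetical catalytic-domain-like fragment
    print(find_hexxh(toy_fragment))
```

A hit from this pattern only indicates a candidate site; confirming a zinc protease motif would still require the full signature or a profile search.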

Conclusion

Intensive characterization and comparative analysis of the MMP family of proteins with the help of numerous bio-computational tools yielded new insights and perspectives that can be used to identify and group MMPs that play a crucial role in a pathological condition. In this study, physico-chemical, secondary structural and functional analysis of the large human MMP family was carried out. The findings may be used by researchers working on MMPs in the context of any experimental system. The amino acid composition shows a considerably high percentage of cysteine residues in MMP-23, along with MMP-2 and 9. MMP-23 is also found to be the most thermostable MMP. We thus hypothesize that MMP-23, along with MMP-2 and 9, might be a key player in pathological conditions. Further experimental research and testing are needed to validate this proposal. In the same manner, other groupings and clusters of disease-responsive MMPs can be derived by analyzing the various parameters of MMPs computed using bioinformatics tools. Additionally, this study may be taken as a prototype for similar in silico investigations of other large protein families, in which such comparative analysis might give direction to, and help streamline, experimentation.

Supplementary material

Data 1
97320630006023S1.pdf (307KB, pdf)

Footnotes

Citation: Jaiswal et al, Bioinformation 6(1): 23-30 (2011)

References

Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group
