MODexplorer: an integrated tool for exploring protein sequence, structure and function relationships

Jan Kosinski; Alessandro Barbato; Anna Tramontano

Bioinformatics. 2013 Feb 8;29(7):953–954. doi: 10.1093/bioinformatics/btt062

PMCID: PMC3605600 PMID: 23396123

Abstract

Summary: MODexplorer is an integrated tool for exploring the sequence, structural and functional diversity of protein families. It is useful both in homology modeling and in the analysis of protein families in general. It takes as input either the sequence or the structure of a protein and provides alignments with its homologs, along with a variety of structural and functional annotations, through an interactive interface. The annotations include sequence conservation, similarity scores, ligand-, DNA- and RNA-binding sites, secondary structure, disorder, crystallographic structure resolution and quality scores of models implied by the alignments to the homologs of known structure.

MODexplorer can be used to analyze sequence and structural conservation among the structures of similar proteins, to find structures of homologs solved in different conformational states or with different ligands and to transfer functional annotations. Furthermore, if the structure of the query is not known, MODexplorer can be used to select modeling templates taking all this information into account and to build a comparative model.

Availability and implementation: Freely available on the web at http://modorama.biocomputing.it/modexplorer. Website implemented in HTML and JavaScript with all major browsers supported.

Contact: anna.tramontano@uniroma1.it

Supplementary information: Supplementary data are available at Bioinformatics online.

1 INTRODUCTION

Exploring the sequence, structure and function relationships between a protein and its homologs is a powerful strategy for transferring functional annotations (for example, ligand-binding sites) and for detecting the existence of alternative conformational states. Likewise, in homology modeling, a thorough survey of all homologs of known structure can often lead to much better models than those obtained by automatically selecting the closest homolog as a template.

Current tools either provide only a general overview of sequence–structure–function relationships [e.g. GeneSilico Metaserver (Kurowski and Bujnicki, 2003) or MESSA (Cong and Grishin, 2012)] or focus on selected aspects, such as ligand-binding sites [e.g. firestar (Lopez et al., 2011)]. A platform for the in-depth integrated analysis of the sequence, structure and function relationships is still missing. Consequently, the information contained in remote relationships may be missed or exploited only through tedious procedures.

Here we describe MODexplorer, a web server that integrates sequence analysis and structure comparison with functional annotations. It provides a more complete and detailed view of the sequence, structural and functional diversity within and between protein families than existing tools. If the structure of the query protein is not known, it allows the user to select a template for homology modeling and to build the model automatically.

2 TOOL DESCRIPTION

A snapshot of the user interface of MODexplorer is shown in Figure 1. A detailed description of the integrated software and databases, along with the parameters and references, is included in Supplementary Data and available on the server. MODexplorer accepts as input a protein sequence, PDB code or PDB file. MODexplorer then:

Creates the multiple sequence alignment (MSA) of the query protein family using HHblits (Remmert et al., 2012).
Generates alignments to proteins of known structure using HHSearch (Söding, 2005) and the PDB database filtered at 70% sequence identity.
For every HHSearch hit, retrieves and aligns related PDB chains, i.e. chains with a sequence similarity >70%. Such ‘redundant’ chains can be useful. For example, they might have been solved in complex with different ligands, represent alternative conformational states or have a better structural quality.
Displays alignments to the HHSearch hits and their related PDB chains both schematically with a BLAST-like bar diagram and as multiple sequence alignments including query and hit sequences, and their homologs.
Graphically shows annotations on the alignments. The annotations include ligand- and DNA-/RNA-binding sites, secondary structure (predicted for the query and calculated for chains of known structures), disorder (predicted for the sequence query and estimated by B-factor/missing residue annotations for the structure query and known hit structures), HHSearch similarity scores and QMEAN (Benkert et al., 2011) scores of models built based on the alignments.
Allows filtering of the hits based on the presence of nucleic acids and other ligands in the structures, on the HHSearch scores, the experimental technique used to solve the structures and the crystallographic resolution.
Allows modeling of the query based on any selected alignment. Models are built using Modeller (Sali and Blundell, 1993) and evaluated using QMEAN.
Provides visualization of structural superpositions of PDB chains and model structures in Jmol (http://www.jmol.org/).
Enables superposition of PDB chains based on pairwise alignments inferred from their alignments to the query. This is useful, for example, to verify whether a low-scoring hit has a structure similar, in the aligned region, to that of a higher-scoring one.
Allows the assessment and modification of the alignments via our interactive alignment editor MODalign (Barbato et al., 2012).
Provides management utilities such as eliminating hits and creating a ‘favorites’ list of relevant hits.
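The superposition step above relies on a pairwise alignment between two hits that is inferred from their separate alignments to the query. The following Python sketch illustrates one way such an inference could work; it is not the server's implementation. The function names, the 0-based residue indexing and the toy sequences are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

GAP = "-"


def query_to_hit_map(query_aln: str, hit_aln: str,
                     query_start: int = 0, hit_start: int = 0) -> Dict[int, int]:
    """Map query residue indices to hit residue indices for aligned columns.

    Both strings are rows of one pairwise alignment (same length, '-' for gaps).
    Indices are 0-based positions in the ungapped sequences (an assumption).
    """
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned strings must have equal length")
    mapping: Dict[int, int] = {}
    q, h = query_start, hit_start
    for q_char, h_char in zip(query_aln, hit_aln):
        if q_char != GAP and h_char != GAP:
            mapping[q] = h  # residue-to-residue column
        if q_char != GAP:
            q += 1
        if h_char != GAP:
            h += 1
    return mapping


def infer_pairwise(map_a: Dict[int, int], map_b: Dict[int, int]) -> List[Tuple[int, int]]:
    """Pair residues of hit A and hit B that are aligned to the same query residue."""
    shared = sorted(set(map_a) & set(map_b))
    return [(map_a[q], map_b[q]) for q in shared]


# Toy example with made-up sequences (not real data).
map_a = query_to_hit_map("ACDE-FG", "AC-EKFG")  # query vs. hit A
map_b = query_to_hit_map("ACDEFG", "-CDEF-")    # query vs. hit B
pairs = infer_pairwise(map_a, map_b)
print(pairs)  # [(1, 0), (2, 2), (4, 3)]
```

The resulting residue pairs could then be passed to any structural superposition routine as the set of equivalent positions. Only query residues aligned in both hits contribute, so two hits covering different regions of the query yield few or no pairs.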

Fig. 1. Snapshot of the MODexplorer interface in ‘Ligands’ display mode, where ligand-binding sites are marked on the alignments. The interface is composed of three panels. The filtering panel allows filtering of the hits by functional and structural annotations. The overview panel displays the hits as a BLAST-like diagram. The detail view panel displays the alignment of the query to the currently selected hit along with the MSAs of their families. In this example (query: C-terminal domain of PMS2 protein), users can easily find that one of the structures (3KDK) related to one of the two top-scoring templates (3KDG) contains metal ions associated with conserved motifs (see detail view panel).
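The kind of hit filtering offered by the filtering panel can be sketched in a few lines of Python. This is a minimal illustration, not the server's code: the `Hit` record, its field names, the use of an HHSearch probability as the score and the example hits are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass
class Hit:
    """Hypothetical record for one template hit (field names are assumptions)."""
    pdb_chain: str
    probability: float            # HHSearch probability in percent (assumed score)
    method: str                   # experimental technique
    resolution: Optional[float]   # in angstroms; None if not applicable (e.g. NMR)
    ligands: Set[str] = field(default_factory=set)
    has_nucleic_acid: bool = False


def filter_hits(hits: Iterable[Hit],
                min_probability: float = 0.0,
                methods: Optional[Set[str]] = None,
                max_resolution: Optional[float] = None,
                require_ligand: bool = False,
                require_nucleic_acid: bool = False) -> List[Hit]:
    """Keep only the hits that satisfy every active criterion."""
    kept = []
    for hit in hits:
        if hit.probability < min_probability:
            continue
        if methods is not None and hit.method not in methods:
            continue
        # A resolution cutoff excludes hits without a reported resolution.
        if max_resolution is not None and (
                hit.resolution is None or hit.resolution > max_resolution):
            continue
        if require_ligand and not hit.ligands:
            continue
        if require_nucleic_acid and not hit.has_nucleic_acid:
            continue
        kept.append(hit)
    return kept


# Made-up hits for illustration only.
hits = [
    Hit("hit1_A", 99.5, "X-RAY DIFFRACTION", 1.8, {"ZN"}),
    Hit("hit2_A", 98.0, "X-RAY DIFFRACTION", 3.2),
    Hit("hit3_B", 60.0, "SOLUTION NMR", None, {"ATP"}, True),
]
well_resolved_with_ligand = filter_hits(
    hits, min_probability=90.0, max_resolution=2.5, require_ligand=True)
print([h.pdb_chain for h in well_resolved_with_ligand])  # ['hit1_A']
```

Treating each criterion as optional means unused filters impose no constraint, so any combination of criteria can be applied at once.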

3 CONCLUSIONS

MODexplorer can be used both to explore the protein sequence, structural and functional diversity and to help in template selection in homology modeling. Thanks to tight integration of sequence, structure and function information, MODexplorer provides a comprehensive overview of the features of the target protein and of its homologs, helps in detecting remote homologs and facilitates the selection of templates for modeling.

ACKNOWLEDGEMENT

The authors would like to thank all members of the Biocomputing group for useful feedback.

Funding: KAUST Award No. KUK-I1-012-43 made by King Abdullah University of Science and Technology (KAUST), Fondazione Roma, FIRB Italbionet and RBIN06E9Z8_005, and European Molecular Biology Organization (EMBO) Long-Term Fellowship to J.K.

Conflict of Interest: none declared.

REFERENCES

Barbato A, et al. Improving your target-template alignment with MODalign. Bioinformatics. 2012;28:1038–1039. doi: 10.1093/bioinformatics/bts070.
Benkert P, et al. Toward the estimation of the absolute quality of individual protein structure models. Bioinformatics. 2011;27:343–350. doi: 10.1093/bioinformatics/btq662.
Cong Q, Grishin NV. MESSA: MEta-server for protein sequence analysis. BMC Biol. 2012;10:82. doi: 10.1186/1741-7007-10-82.
Kurowski MA, Bujnicki JM. GeneSilico protein structure prediction meta-server. Nucleic Acids Res. 2003;31:3305–3307. doi: 10.1093/nar/gkg557.
Lopez G, et al. Firestar–advances in the prediction of functionally important residues. Nucleic Acids Res. 2011;39:W235–W241. doi: 10.1093/nar/gkr437.
Remmert M, et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods. 2012;9:173–175. doi: 10.1038/nmeth.1818.
Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
Söding J. Protein homology detection by HMM-HMM comparison. Bioinformatics. 2005;21:951–960. doi: 10.1093/bioinformatics/bti125.
