Abstract
The VoroMQA (Voronoi tessellation-based Model Quality Assessment) web server is dedicated to the estimation of protein structure quality, a common step in selecting realistic and most accurate computational models and in validating experimental structures. As an input, the VoroMQA web server accepts one or more protein structures in PDB format. Input structures may be either monomeric proteins or multimeric protein complexes. For every input structure, the server provides both global and local (per-residue) scores. Visualization of the local scores along the protein chain is enhanced by providing secondary structure assignment and information on solvent accessibility. A unique feature of the VoroMQA server is the ability to directly assess protein-protein interaction interfaces. If this type of assessment is requested, the web server provides interface quality scores, interface energy estimates, and local scores for residues involved in inter-chain interfaces. VoroMQA, the underlying method of the web server, was extensively tested in recent community-wide CASP and CAPRI experiments. During these experiments VoroMQA showed outstanding performance both in model selection and in estimation of accuracy of local structural regions. The VoroMQA web server is available at http://bioinformatics.ibt.lt/wtsam/voromqa.
INTRODUCTION
Knowledge of three-dimensional (3D) structures of proteins and protein complexes is essential for comprehensive understanding of protein function, interactions and dynamics. Experimentally determined protein structures are accumulating at a steady pace; however, due to the flood of sequence data, the gap between the known protein sequences and structures is only widening. Not surprisingly, computationally derived structural models of proteins and protein complexes are gaining importance. The usefulness of a computational structural model for a specific application is largely determined by the model accuracy (1). Therefore, protein structure assessment methods that are able to provide reliable estimates of both overall model accuracy and accuracy of local structural regions are becoming of prime importance.
One of the practical applications of methods for protein structure assessment is to help users make an informed selection of computational models. At present, there are multiple automatic modeling pipelines, often implemented as web servers, that can computationally derive structural model(s) for virtually any protein sequence (2). However, given a set of computational models, often significantly different from each other, it may not be at all obvious which model to select and which regions in the selected model are reliable. Effective structure assessment methods can help answer such questions.
Assessment of protein structure quality is also a key component in protein structure prediction and refinement. Community-wide CASP experiments that monitor the state of the art in protein structure prediction have recognized that the estimation of model accuracy continues to be an important bottleneck (3). Model accuracy estimation is also part of CAMEO, a platform for continuous testing of the computational methods (4).
Protein structures solved using experimental techniques (X-ray crystallography, NMR or cryo-EM) are often considered as the standard of truth. However, it is important to keep in mind that these structures are also models, even though they are derived from experimental data. Although experimental structures deposited in PDB undergo careful validation (5), some of them, especially at low resolution, may occasionally have significant errors such as incorrect chain topology or a shift in the residue register. Structure quality assessment methods can help identify such cases.
Here, we describe the VoroMQA web server devoted to the assessment of protein structure quality. VoroMQA (6), which stands for Voronoi tessellation-based Model Quality Assessment, is a method for the estimation of protein structure quality with an all-atom knowledge-based statistical potential at its core. However, in contrast to traditional statistical potentials based on interatomic distances, VoroMQA characterizes interactions through interatomic contact areas derived from the Voronoi tessellation of atomic balls (7). The VoroMQA scoring function can assess both monomeric proteins and multisubunit complexes. It produces quality scores at the level of atoms, residues and the entire structure. Since VoroMQA is based on contact areas, it can also provide scores for interaction surfaces in a straightforward way. Thus, VoroMQA can directly assess a protein-protein interaction interface, a surface defined by contact areas between atoms from different subunits. Although a number of structure quality assessment methods can score protein complexes, to our knowledge, the ability to provide scores specifically for the protein-protein interface is a unique feature among such methods. The server is designed to provide users not only with easy access to all the major functionalities of VoroMQA, but also with the ability to interactively control the extent and type of data displayed.
MATERIALS AND METHODS
Contact areas
A protein structure can be represented as a set of atomic balls, each ball having a van der Waals radius corresponding to the atom type. A ball can be assigned a region of space that contains all the points that are closer (or equally close) to that ball than to any other. Such a region is called a Voronoi cell (Figure 1A) and the partitioning of space into Voronoi cells is called Voronoi tessellation or Voronoi diagram. Two adjacent Voronoi cells share a set of points that form a surface called a Voronoi face (Figure 1B). A Voronoi face can be viewed as a geometric representation of a contact between two atoms. The Voronoi cells of atomic balls may be constrained inside the boundaries defined by the solvent accessible surface of the same balls. Combining constrained contacts can be used to precisely define complex interaction interfaces (Figure 1C). The procedure to construct the described contact surfaces is implemented in Voronota (7); it uses triangulated representations of Voronoi faces and spherical surfaces. Contact areas are calculated as the areas of the corresponding triangulations.
Figure 1.

Using Voronoi tessellation to define contacts. (A) An example of a Voronoi cell, drawn with edges and with faces. (B) Defining a contact as a Voronoi face between two adjacent Voronoi cells. (C) Constraining contacts by a solvent accessible surface and describing an inter-chain interface.
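The definition of a Voronoi cell given above can be illustrated with a minimal sketch. It assumes that the distance from a point to a ball is measured to the ball surface (distance to the center minus the radius), and it uses a brute-force comparison over all balls. The function names and the radii are illustrative only; this is not how Voronota constructs Voronoi faces or computes contact areas.

```python
import math

def distance_to_ball(point, center, radius):
    """Signed distance from a point to the surface of a ball (negative inside the ball)."""
    return math.dist(point, center) - radius

def voronoi_cell_owner(point, balls):
    """Return the index of the ball whose Voronoi cell contains the point.

    balls is a list of (center, radius) pairs; ties go to the lowest index.
    """
    return min(
        range(len(balls)),
        key=lambda i: distance_to_ball(point, balls[i][0], balls[i][1]),
    )

# Two balls with different (illustrative) radii, centers 3.0 apart on the x axis.
balls = [((0.0, 0.0, 0.0), 1.7), ((3.0, 0.0, 0.0), 1.2)]
owner_near_middle = voronoi_cell_owner((1.6, 0.0, 0.0), balls)
owner_further_right = voronoi_cell_owner((1.9, 0.0, 0.0), balls)
```

In this example the point at x = 1.6 is nearer to the center of the second ball, yet it belongs to the cell of the first, larger ball, because the boundary between the two cells is shifted toward the smaller ball.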
Scoring using contact areas
VoroMQA (6) evaluates the quality of protein structural models using inter-atomic and solvent contact areas and employing the idea of a knowledge-based statistical potential. A contact type (ai, aj, ck) is described by two atom types (ai, aj) and a contact category ck and can be assigned a pseudo-energy value E(ai, aj, ck) calculated from the corresponding expected and observed probabilities (the probability values are estimated empirically using the contact area values calculated for a set of high-quality experimentally determined protein structures):
$$E(a_i, a_j, c_k) = \log \frac{P_{\mathrm{exp}}(a_i, a_j, c_k)}{P_{\mathrm{obs}}(a_i, a_j, c_k)} \tag{1}$$
Given a single atom ϕ and the set of associated contacts Ωϕ, a normalized pseudo-energy value En(Ωϕ) is computed using the contact types and areas known for each contact ω ∈ Ωϕ:
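$$E_n(\Omega_\phi) = \frac{\sum_{\omega \in \Omega_\phi} E(\mathrm{type}_\omega) \cdot \mathrm{area}_\omega}{\sum_{\omega \in \Omega_\phi} \mathrm{area}_\omega} \tag{2}$$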
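En(Ωϕ) is transformed into an atom quality score Qa(Ωϕ) ∈ [0, 1] using the Gauss error function with atom type-dependent μ (mean) and σ (standard deviation) values:
$$Q_a(\Omega_\phi) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{\mu_{\mathrm{type}_\phi} - E_n(\Omega_\phi)}{\sigma_{\mathrm{type}_\phi}\sqrt{2}}\right)\right) \tag{3}$$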
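The three scoring steps described above can be sketched in a few lines of Python. This is a hedged illustration, not the VoroMQA implementation: it assumes that the pseudo-energy is the logarithm of the expected-to-observed probability ratio, that the normalized pseudo-energy is an area-weighted mean, and that the error function is oriented so that lower energies give higher quality scores. All function names and numeric values are hypothetical.

```python
import math

def pseudo_energy(p_expected, p_observed):
    """Pseudo-energy of a contact type (assumed form: log of expected over observed)."""
    return math.log(p_expected / p_observed)

def normalized_energy(contacts):
    """Area-weighted mean pseudo-energy over the contacts of one atom.

    contacts is a list of (energy, area) pairs.
    """
    total_area = sum(area for _, area in contacts)
    return sum(energy * area for energy, area in contacts) / total_area

def atom_quality(e_n, mu, sigma):
    """Map a normalized pseudo-energy to [0, 1] with the Gauss error function.

    mu and sigma stand for the atom type-dependent mean and standard deviation.
    """
    return 0.5 * (1.0 + math.erf((mu - e_n) / (sigma * math.sqrt(2.0))))

# Hypothetical atom with two contacts: (pseudo-energy, contact area).
contacts = [(pseudo_energy(0.01, 0.02), 12.0), (pseudo_energy(0.03, 0.02), 4.0)]
e_n = normalized_energy(contacts)
quality = atom_quality(e_n, mu=0.0, sigma=0.5)
```

A contact type that is observed more often than expected receives a negative pseudo-energy, so an atom dominated by such contacts obtains a quality score above 0.5 under these assumptions.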
Given a set of all the atoms in a protein structure, the global structure quality score is defined as a weighted arithmetic mean of the atomic quality scores with weights indicating how deep each atom is buried inside a structure. The quality score of a residue is defined as an average of the quality scores of its atoms.
The score of an inter-chain interface is defined as an average of the quality scores of all the atoms that participate in the inter-chain contacts. Another VoroMQA-based interface assessment measure, called ‘interface energy’ (8), is defined as a total sum of the interface contact areas multiplied by the corresponding pseudo-energy values.
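The aggregation rules in the two preceding paragraphs can be sketched as follows. The data layout (a dictionary of per-atom scores and contact tuples carrying chain names, area and pseudo-energy) is a hypothetical choice made for illustration, and the burial weights are simply taken as given because the text does not specify how they are derived.

```python
def residue_score(atom_scores):
    """Residue quality score: plain average of the scores of its atoms."""
    return sum(atom_scores) / len(atom_scores)

def global_score(atom_scores, burial_weights):
    """Global score: weighted arithmetic mean of atom scores (weights reflect burial depth)."""
    weighted = sum(q * w for q, w in zip(atom_scores, burial_weights))
    return weighted / sum(burial_weights)

def interface_atoms(contacts):
    """Atoms taking part in at least one contact between different chains.

    contacts holds (atom_a, chain_a, atom_b, chain_b, area, energy) tuples.
    """
    atoms = set()
    for atom_a, chain_a, atom_b, chain_b, _area, _energy in contacts:
        if chain_a != chain_b:
            atoms.update((atom_a, atom_b))
    return atoms

def interface_score(scores_by_atom, contacts):
    """Interface score: average quality score of the interface atoms."""
    atoms = interface_atoms(contacts)
    return sum(scores_by_atom[a] for a in atoms) / len(atoms)

def interface_energy(contacts):
    """Interface energy: sum of inter-chain contact areas times their pseudo-energies."""
    return sum(
        area * energy
        for _a, chain_a, _b, chain_b, area, energy in contacts
        if chain_a != chain_b
    )

# Hypothetical toy data: three atoms, one intra-chain and one inter-chain contact.
scores_by_atom = {"a1": 0.9, "a2": 0.5, "b1": 0.3}
contacts = [("a1", "A", "a2", "A", 10.0, -0.4), ("a2", "A", "b1", "B", 5.0, -0.2)]
toy_interface_score = interface_score(scores_by_atom, contacts)
toy_interface_energy = interface_energy(contacts)
toy_global_score = global_score([0.9, 0.5, 0.3], [1.0, 2.0, 1.0])
```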
WEB SERVER DESCRIPTION
Input
As an input, the VoroMQA web server accepts one or more protein structure files in the PDB format. All non-protein atoms (ligands, nucleic acids, etc.) in input files are ignored. A user can specify either of the two ways to read each of the uploaded PDB files: as plain structures (read all the protein atoms and, if there are multiple ‘MODEL’ blocks, split the input into multiple structures); or as biological assemblies (read all the protein atoms and, if there are multiple ‘MODEL’ blocks, combine the input into a single multimeric structure). Alternatively, a user may request the VoroMQA server to download structures directly from the Protein Data Bank (9) by specifying a PDB ID.
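The two reading modes can be sketched as below. This is a simplified illustration under stated assumptions: protein atoms are recognized as 'ATOM' records whose residue name is one of the 20 standard amino acids (modified residues are not handled), and in the assembly mode the atoms of all 'MODEL' blocks are simply concatenated without renaming chains. The function names are hypothetical and the server may treat these details differently.

```python
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

def read_models(pdb_text):
    """Collect protein ATOM lines, one list per MODEL block (one list if no blocks)."""
    models, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            models.append(current)
            current = []
        elif record == "ATOM" and line[17:20].strip() in STANDARD_AMINO_ACIDS:
            current.append(line)
    if current:  # atoms outside any MODEL/ENDMDL pair
        models.append(current)
    return models

def load_structures(pdb_text, as_assembly=False):
    """Plain mode: one structure per MODEL block. Assembly mode: all blocks combined."""
    models = read_models(pdb_text)
    if as_assembly:
        return [[line for model in models for line in model]]
    return models
```

For a file with two 'MODEL' blocks, `load_structures(text)` yields two structures, whereas `load_structures(text, as_assembly=True)` yields a single structure containing the atoms of both blocks.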
An input structure can be a single subunit or a protein complex comprised of multiple chains. The VoroMQA server performs the same whole structure assessment for both single-chain and multi-chain structures, that is, inter-chain contacts are treated in the same way as contacts within a single chain. However, if requested by a user (via the designated check-box) the server can additionally evaluate all the inter-chain interfaces found in the input structures. In that case the results of interface assessment are appended to the whole structure assessment results.
In the case of uploading files as plain structures, a user can optionally provide an amino acid sequence to filter and renumber the residues in the input files. If this option is used, then, for each input structure, the VoroMQA server aligns the sequence extracted from the structure with the user-specified sequence. In the case of a multi-chain input structure, sequences of the individual chains are concatenated into a single sequence which is then aligned with the user-specified sequence. The residues that are left unmatched in the resulting alignment are discarded from the input structure (Figure 2). The remaining residues are renumbered according to the user-specified sequence, but the chain names are left unchanged. The user-specified sequence is not used in the scoring process; it is only used to alter (cut and renumber) the input structures. This option removes the need to edit PDB files in cases such as the assessment of a common structural core of models having heterogeneous tails.
Figure 2.
Truncating input structures according to the user-provided sequence. Unaligned regions (red) are removed from the input structures.
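The filter-and-renumber step can be sketched with Python's standard `difflib.SequenceMatcher` standing in for the sequence alignment. The alignment method used by the server is not described here, so this is a swapped-in technique: it matches only identical residues and has no substitution scoring. The residue tuples and the function name are hypothetical.

```python
from difflib import SequenceMatcher

def filter_and_renumber(residues, target_sequence):
    """Keep residues matched to the target sequence and renumber them accordingly.

    residues is a list of (chain, number, one_letter_code) tuples in file order,
    with all chains concatenated. Chain names are left unchanged.
    """
    structure_sequence = "".join(code for _chain, _number, code in residues)
    matcher = SequenceMatcher(None, structure_sequence, target_sequence, autojunk=False)
    kept = []
    for block in matcher.get_matching_blocks():  # the final block has size 0
        for offset in range(block.size):
            chain, _old_number, code = residues[block.a + offset]
            kept.append((chain, block.b + offset + 1, code))
    return kept

# Hypothetical two-chain model with a two-residue tag that is absent from the target.
residues = [("A", 101 + i, code) for i, code in enumerate("GSMKT")]
residues += [("B", 201 + i, code) for i, code in enumerate("AYIAK")]
trimmed = filter_and_renumber(residues, "MKTAYIAK")
```

Here the leading "GS" residues are discarded and the remaining eight residues are numbered 1 to 8 following the target sequence, while keeping their original chain names.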
Structure scoring output (default)
The output information provided by the VoroMQA server has two parts: global and local. The default output page shows both parts side by side (Figure 3).
Figure 3.
Default output page with global and local scores: default view (A); compacted view (B).
For every processed structure, the global output contains the global VoroMQA score and the numbers of residues and atoms. To aid users in interpreting the global output, the output scores are placed on a plot that summarizes the distribution of VoroMQA scores of high-quality X-ray structures (Figure 5B). This provides a context for judging the level of realism of the processed structures. It is not uncommon to see the global scores below the red line (indicating poor quality) for computational models, but in the case of experimental structures this calls for caution. The structure may have unusual properties or it may have serious flaws (see Discussion for an example).
Figure 5.
(A) Structures of the Rad9-Rad1-Hus1 complex, solved by three different groups, colored by smoothed per-residue VoroMQA scores using a red-white-blue color gradient (lower scores are red, higher scores are blue). (B) Plot for interpreting global scores for a structure of a given size. 90% of VoroMQA scores for high-quality X-ray structures fall between the red and blue lines. The gray line indicates the median of the scores. Only 5% of scores fall below the red line and 5% are above the blue line.
The local output contains local VoroMQA scores (per-atom and per-residue) and additional per-residue information on secondary structure and solvent accessibility. There are two types of local per-residue scores: raw (detailed) and smoothed via a sliding window to reduce noise. The local scores are provided in three forms: (i) as temperature factor values written into PDB files that can be either downloaded or viewed in 3D with JSmol (http://www.jmol.org), (ii) as interactive (clickable) color-coded profiles that show per-residue scores and (iii) as an interactive Cartesian plot that shows both raw and smoothed local scores together. The secondary structure and solvent accessibility information is also presented as interactive color-coded profiles. Various visualizations can be turned off and on, allowing a user to focus on some of the features without being distracted by the others, which is particularly useful when viewing results for multiple models on a single page (Figure 3B).
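Two of the operations mentioned above, smoothing per-residue scores with a sliding window and writing scores into the temperature factor field of a PDB file, can be sketched as follows. The window shape (a uniform mean), the window size and the handling of chain ends are assumptions; the text does not state the settings used by the server. The temperature factor occupies columns 61-66 of an 'ATOM' record in the PDB format.

```python
def smooth_scores(scores, half_window=2):
    """Uniform sliding-window mean; the window is truncated at both ends of the list."""
    smoothed = []
    for i in range(len(scores)):
        start = max(0, i - half_window)
        stop = min(len(scores), i + half_window + 1)
        window = scores[start:stop]
        smoothed.append(sum(window) / len(window))
    return smoothed

def set_temperature_factor(atom_line, score):
    """Return the ATOM line with the score written into columns 61-66 (format 6.2f)."""
    padded = atom_line.ljust(66)
    return padded[:60] + f"{score:6.2f}" + padded[66:]

raw = [0.0, 1.0, 0.0, 1.0, 0.0]
smoothed = smooth_scores(raw, half_window=1)
line = set_temperature_factor("ATOM      1  CA  ALA A   1", 0.5)
```

Applying the smoothing per chain rather than over the whole concatenated residue list would avoid mixing scores across chain boundaries; the sketch leaves that choice to the caller.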
Interface scoring output
If the evaluation of inter-chain interactions was requested, then the output for each processed multisubunit structure is enhanced (Figure 4). The global part is augmented with several values: the numbers of the interface atoms and the interface residues (the atoms and the residues that participate in the inter-chain contacts); the total area of the inter-chain contacts; the interface quality VoroMQA score (average VoroMQA score of the interface atoms); the total VoroMQA pseudo-energy of the inter-chain contacts.
Figure 4.
Output page with both whole structure and interface quality assessment results: (A) default view; (B) compacted view.
By default the per-model outputs are sorted by the whole-structure VoroMQA scores. However, there also are other ordering options, including a tournament-based sorting procedure that accounts for both the whole-structure and the interface-based scores (8). Another way for a user to analyze several scores at once is through the interactive chart in which, for every processed complex structure, the whole-structure and the interface-based scores are plotted against the total pseudo-energy of the inter-chain interface.
The local output is augmented with the local VoroMQA scores of the interface atoms and the interface residues. The per-atom scores are written into PDB files and can be viewed in JSmol. The per-residue scores (which are the averages of the per-atom scores) are presented as an interactive color-coded profile. This profile is best viewed in conjunction with the profile that visualizes the inter-chain contact areas of the scored residues (Figure 4B).
DISCUSSION
The server provides a convenient interface to the VoroMQA method while at the same time enabling a user to perform advanced analysis of scoring results. The VoroMQA server includes the major capabilities exhibited by some of the previously developed protein structure quality assessment servers: producing both global and local quality scores (eQuant (10), ModFOLD (11), ProQ2 (12), ProQ3/ProQ3D (13,14), ProSA-web (15), QMEAN (16,17)); processing multiple structures (eQuant, ModFOLD, ProQ2, ProQ3/ProQ3D); processing structures with multiple chains (DFIRE (18), eQuant, QMEAN, SBROD (19)); providing means to interpret global scores (eQuant, ModFOLD, ProSA-web, QMEAN); providing interactive visualizations of local scores (eQuant, ModFOLD, ProSA-web, QMEAN); taking less than a minute to fully analyze an average-sized input structure (DFIRE, eQuant, ProSA-web, SBROD). In addition to all the aforementioned capabilities, the VoroMQA server allows scoring of inter-chain interfaces. Thus, the server provides quality assessment scores on four levels: atom, residue, interface and whole-structure. This makes the VoroMQA server a uniquely versatile tool for the quality assessment of both monomeric and multimeric protein structures.
An additional factor, contributing to the practical value of the VoroMQA server, is that VoroMQA does not use any additional evolutionary or structural information (e.g. sequence conservation, predicted secondary structure, etc.) that may change with time. Therefore, the same protein structure always produces the same VoroMQA score, a feature important for reproducibility.
Versatility of the server would be of little value without the robust performance of the underlying scoring method. The performance of VoroMQA was tested extensively in recent CASP and CAPRI experiments. In CASP12 the ‘VoroMQA-select’ group, which used VoroMQA to identify the best models generated by automated servers, outperformed all but one group in the template-based modeling category (20). The interface scoring by VoroMQA was a key element in achieving the best performance in modeling protein assemblies by the ‘Venclovas’ group during the CASP12-CAPRI experiment (8,21). Most recently, during CASP13, VoroMQA was identified as one of the best methods for detection of unreliable local regions, emphasizing its excellent local scoring capabilities (http://predictioncenter.org/casp13/doc/presentations/Assessment_EMA_andRoundTable_Redacted.pdf). The best performance in protein assembly modeling by the ‘Venclovas’ group in CASP13 has reaffirmed the value of interface scoring provided by VoroMQA (http://predictioncenter.org/casp13/doc/presentations/Assessment_assembly_JDuarte.pdf).
Scoring computational models is only one of the possible application areas of the VoroMQA server. The server may also be used for independent assessment of experimental structures prior to their deposition into the PDB or for helping to avoid utilizing PDB structures that have serious flaws. A case in point is illustrated in Figure 5. In the PDB there are three independently solved structures of the human Rad9–Rad1–Hus1 (9–1–1) DNA damage checkpoint complex (22–24). One of these structures, PDB entry 3GGR (24), is an obvious outlier according to the global VoroMQA scores (Figure 5B). Superposition of 3GGR onto either of the other two structures reveals multiple regions displaying register-shift errors that can be seen as poor (red) local VoroMQA scores (Figure 5A). Unfortunately, this grossly incorrect structure has been used as the basis for other studies, including molecular dynamics simulations (25). VoroMQA and perhaps other methods of a similar nature could have easily prevented selection of this incorrect structure as the basis for subsequent studies.
CONCLUSIONS
The VoroMQA server provides a straightforward way to assess any protein structure of interest, be it experimental or theoretical, monomeric or multimeric. The user-friendly interface provides both global and local scores and enables visualization of these scores both in the context of 3D structures and along the sequence. The server therefore might be useful for various tasks such as flagging suspicious experimental structures and pinpointing problematic regions, selecting the most accurate computational models and estimating the accuracy of local regions, as well as assessing the interaction interfaces in protein complexes.
ACKNOWLEDGEMENTS
The authors thank Justas Dapkūnas, Darius Kazlauskas and Kęstutis Timinskas for insightful suggestions on the server development.
FUNDING
Research Council of Lithuania [S-MIP-17-60]. Funding for open access charge: Research Council of Lithuania.
Conflict of interest statement. None declared.
REFERENCES
- 1. Baker D., Sali A. Protein structure prediction and structural genomics. Science. 2001; 294:93–96.
- 2. Lam S.D., Das S., Sillitoe I., Orengo C. An overview of comparative modelling and resources dedicated to large-scale modelling of genome sequences. Acta Crystallogr. D Struct. Biol. 2017; 73:628–640.
- 3. Moult J., Fidelis K., Kryshtafovych A., Schwede T., Tramontano A. Critical assessment of methods of protein structure prediction (CASP)-Round XII. Proteins. 2018; 86:7–15.
- 4. Haas J., Barbato A., Behringer D., Studer G., Roth S., Bertoni M., Mostaguir K., Gumienny R., Schwede T. Continuous Automated Model EvaluatiOn (CAMEO) complementing the critical assessment of structure prediction in CASP12. Proteins. 2018; 86:387–398.
- 5. Gore S., Sanz García E., Hendrickx P.M.S., Gutmanas A., Westbrook J.D., Yang H., Feng Z., Baskaran K., Berrisford J.M., Hudson B.P. et al. Validation of Structures in the Protein Data Bank. Structure. 2017; 25:1916–1927.
- 6. Olechnovič K., Venclovas Č. VoroMQA: assessment of protein structure quality using interatomic contact areas. Proteins. 2017; 85:1131–1145.
- 7. Olechnovič K., Venclovas Č. Voronota: a fast and reliable tool for computing the vertices of the Voronoi diagram of atomic balls. J. Comput. Chem. 2014; 35:672–681.
- 8. Dapkūnas J., Olechnovič K., Venclovas Č. Modeling of protein complexes in CAPRI Round 37 using template-based approach combined with model selection. Proteins. 2018; 86:292–301.
- 9. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E. The Protein Data Bank. Nucleic Acids Res. 2000; 28:235–242.
- 10. Bittrich S., Heinke F., Labudde D. eQuant - a server for fast protein model quality assessment by integrating high-dimensional data and machine learning. In: Kozielski S., Mrozek D., Kasprowski P., Małysiak-Mrozek B., Kostrzewa D. (eds). Beyond Databases, Architectures and Structures. Advanced Technologies for Data Mining and Knowledge Discovery. 2016; 613:Cham: Springer International Publishing; 419–433.
- 11. Maghrabi A.H.A., McGuffin L.J. ModFOLD6: an accurate web server for the global and local quality estimation of 3D protein models. Nucleic Acids Res. 2017; 45:W416–W421.
- 12. Ray A., Lindahl E., Wallner B. Improved model quality assessment using ProQ2. BMC Bioinformatics. 2012; 13:224.
- 13. Uziela K., Shu N., Wallner B., Elofsson A. ProQ3: improved model quality assessments using Rosetta energy terms. Sci. Rep. 2016; 6:33509.
- 14. Uziela K., Menéndez Hurtado D., Shu N., Wallner B., Elofsson A. ProQ3D: improved model quality assessments using deep learning. Bioinformatics. 2017; 33:1578–1580.
- 15. Wiederstein M., Sippl M.J. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 2007; 35:W407–W410.
- 16. Benkert P., Tosatto S.C.E., Schomburg D. QMEAN: a comprehensive scoring function for model quality assessment. Proteins. 2008; 71:261–277.
- 17. Waterhouse A., Bertoni M., Bienert S., Studer G., Tauriello G., Gumienny R., Heer F.T., de Beer T.A.P., Rempfer C., Bordoli L. et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018; 46:W296–W303.
- 18. Yang Y., Zhou Y. Ab initio folding of terminal segments with secondary structures reveals the fine difference between two closely related all-atom statistical energy functions. Protein Sci. 2008; 17:1212–1219.
- 19. Karasikov M., Pagès G., Grudinin S. Smooth orientation-dependent scoring function for coarse-grained protein quality assessment. Bioinformatics. 2018; doi:10.1093/bioinformatics/bty1037.
- 20. Kryshtafovych A., Monastyrskyy B., Fidelis K., Moult J., Schwede T., Tramontano A. Evaluation of the template-based modeling in CASP12. Proteins. 2018; 86:321–334.
- 21. Lensink M.F., Velankar S., Baek M., Heo L., Seok C., Wodak S.J. The challenge of modeling protein assemblies: the CASP12-CAPRI experiment. Proteins. 2018; 86:257–273.
- 22. Sohn S.Y., Cho Y. Crystal structure of the human rad9-hus1-rad1 clamp. J. Mol. Biol. 2009; 390:490–502.
- 23. Doré A.S., Kilkenny M.L., Rzechorzek N.J., Pearl L.H. Crystal structure of the rad9-rad1-hus1 DNA damage checkpoint complex–implications for clamp loading and regulation. Mol. Cell. 2009; 34:735–745.
- 24. Xu M., Bai L., Gong Y., Xie W., Hang H., Jiang T. Structure and functional implications of the human rad9-hus1-rad1 cell cycle checkpoint complex. J. Biol. Chem. 2009; 284:20457–20461.
- 25. Querol-Audí J., Yan C., Xu X., Tsutakawa S.E., Tsai M.-S., Tainer J.A., Cooper P.K., Nogales E., Ivanov I. Repair complexes of FEN1 endonuclease, DNA, and Rad9-Hus1-Rad1 are distinguished from their PCNA counterparts by functionally important stability. Proc. Natl. Acad. Sci. U.S.A. 2012; 109:8528–8533.







