Keywords: Complexes, Docking, Molecular representations, Force field, Software
Abstract
The computational modeling field has evolved vastly over the past decades. The early development of simplified protein systems was a stepping stone towards more efficient approaches to sample intricate conformational landscapes. Downscaling the resolution of biomolecules to coarser representations allows the study of protein structure, dynamics and interactions that are not accessible by classical atomistic approaches. The combination of different resolutions, namely hybrid modeling, has also proven to be an alternative when mixed levels of detail are required. In this review, we provide an overview of coarse-grained/hybrid models, focusing on their applicability to the modeling of biomolecular interactions. We give a detailed list of ready-to-use modeling software for studying biomolecular interactions that allows various levels of coarse-graining, and provide examples of complexes determined by integrative coarse-grained/hybrid approaches in combination with experimental information.
1. Introduction
The chemistry that supports life is extremely sophisticated. Despite advances over the past decades, the scientific community still lacks the fundamental knowledge needed to fully understand the biology of the cell at the atomic level. We know that basic atoms (i.e. carbon, oxygen, hydrogen and nitrogen) can combine to form complex molecules such as lipids, carbohydrates, nucleic acids and proteins. These biomolecules in turn associate to create more intricate assemblies that adopt specific three-dimensional (3D) structures, essential for their biological functions. Their interactions mediate a wide range of biological functions such as signal transduction, molecular recognition or transport. Indeed, roughly 80% of proteins might function upon association with other biomolecules [1]. It is therefore of great importance to understand how these macromolecules interact. Next to experimental methods, complementary computational approaches have been developed, with so-called integrative modeling emerging as the most promising strategy [2]. In short, integrative modeling aims at obtaining structural insights into a given system that cannot be revealed by a single approach alone. To do so, it combines data from multiple information sources (e.g. nuclear magnetic resonance (NMR) spectroscopy, cryo-electron microscopy (cryo-EM), mass spectrometry (MS), small angle X-ray scattering (SAXS), bioinformatics analysis…) [3] into computational approaches to model the assemblies. Integrative modeling has been used extensively to model increasingly larger systems in the recent past [4]. In this sense, we are probably closer than ever to constructing a predictive model of an entire cell [5].
Classical atomistic computational modeling of interactions remains inefficient for many molecular assemblies. Larger systems often require longer simulations, and their complex conformational landscapes cannot be efficiently and thoroughly sampled by atomistic approaches. The simplification of large systems to coarser representations offers a valuable approach to alleviate those limitations. There is already a huge body of literature on this topic and, in the present work, we do not aspire to give the most comprehensive review covering all possible contributions, but will focus on the modeling of biomolecular interactions, i.e. complexes, involving proteins, peptides and nucleic acids (DNA and RNA). The remainder of the text is organized as follows. We first give a brief historical overview of the development of coarse-graining. We then describe several representative designs of simplified systems and parametrization strategies and discuss how these can be implemented in the modeling of biomolecular complexes, both for the generation of possible conformations (sampling) and the discrimination between native and non-native models (scoring). Finally, we provide an overview of currently available software that supports coarse-grained modeling of biomolecular complexes and highlight several representative applications.
2. Historical perspective
The structural characterization of lysozyme in 1967 [6] spurred Arieh Warshel to study enzymatic reaction mechanisms. His developments in this field, under the supervision of Martin Karplus, inaugurated the now well-established quantum mechanics/molecular mechanics (QM/MM) methods [7]. In parallel, Michael Levitt, a PhD student at the Medical Research Council at that time, was making significant advances in the study of molecular conformations by computational approaches. Together with Shneior Lifson in 1972 at the Weizmann Institute in Israel, Levitt and Warshel started working on a simplified representation of a protein, where spheres would represent amino acids. This project resulted, in 1975, in the very first computer simulation of a protein system (pancreatic trypsin inhibitor) using a coarse-grained model [8]. These simulations suggested that the protein folding process has a relatively small number of conformations, and challenged the so-called “Levinthal paradox” [9]. In this work, each residue was represented by only two beads: the Cα atom and the centroid of the side chain. Non-bonded interactions were assumed to occur only between side chains. As a result, only torsion angles between four consecutive Cα atoms were considered, considerably reducing the conformational space (one degree of freedom per residue). For these early findings, Karplus, Levitt and Warshel were awarded the Nobel Prize in Chemistry in 2013.
In 1975, Chothia and Janin established the structural basis of the hydrophobic effect as fundamental to the stabilization of protein association [10]. All these pioneering findings were used as a basis for the first computational analysis of a protein–protein complex: In 1978, Wodak and Janin studied the association of BPTI and trypsin using a coarse-grained representation of the system [11]. They used a combination of a simple averaged potential energy function including non-bonded (van der Waals) and residue-solvent interactions. Whilst encouraging, this early model totally neglected electrostatic interactions and was thus unable to describe hydrogen bonds and salt bridges, which, later on in 1984, were suggested to provide the specificity of the association [12]. In spite of the incompleteness of this work, they shed light on the idea that a simplified protein model could be an effective alternative to screen a relatively large number of possible interfaces, which constituted the first coarse-grained docking simulation. Ever since, coarse-grained/hybrid modeling approaches have gained importance in the computational structural biology field [13] and have become central in the study of folding, dynamics and association mechanisms of biomolecules.
3. Coarse-grained/Hybrid modeling of biomolecular interactions
In this section, we will focus on macromolecular docking approaches allowing some level of coarse-grained/hybrid representations for the modeling of interactions. These usually include two different steps: The generation of possible complex conformations, referred to as sampling, and the discrimination between biologically and non-biologically relevant models referred to as scoring. The latter might also be an integral part of the sampling process, especially when experimental or predicted information is included to bias the sampling (e.g. restraints-driven sampling). We first describe various strategies to simplify the representation of polypeptides and nucleic acids and discuss existing parametrization strategies and force fields. We then focus on how coarse-grained/hybrid approaches can be applied during the sampling and scoring steps for modeling biomolecular interactions and end with a short discussion of backmapping approaches to restore full atomistic representations.
3.1. Simplified representations and topologies
In general, a coarse-grained model aims at decreasing the complexity of a system by grouping several atoms into larger “pseudo-atoms” or “beads”, thereby reducing the number of degrees of freedom. This results both in more efficient computations and in a possible smoothing of the energy landscape that might facilitate the identification of relevant states of the system. In the context of proteins, the simplest models introduced are the hydrophobic/polar (HP) models (see Fig. 1). These simplify the representation of a polypeptide chain [14] by considering only two types of beads (H and P), which, to some extent, approximate two types of residues: hydrophobic (H) and polar (P) [15]. Albeit very minimalistic, HP representations have proven useful to study larger conformational changes and longer time scales. These models, and their variants, have been extensively studied in the past decade [16], [17], [18], [19] and reviewed elsewhere [20]. Another example of a low-resolution model to represent proteins is SICHO (Side CHain Only) [21]. In the model developed by Kolinski and Skolnick [21], each amino acid is represented as a single interaction site, located at the center of the side chain. It is thus computationally very efficient but completely neglects backbone conformations (φ/ψ dihedrals) [22].
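As an illustration of how minimal an HP representation is, the following Python sketch maps a sequence onto H/P beads and counts non-bonded H-H contacts on a 2D square lattice. The hydrophobic residue set, the function names and the contact energy of -1 per H-H contact are illustrative assumptions, not the definitions used in the cited works.

```python
# Hypothetical hydrophobic set, chosen only for illustration.
HYDROPHOBIC = set("AVLIMFWC")


def to_hp(sequence):
    """Map a one-letter amino acid sequence onto H (hydrophobic) / P (polar) beads."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence.upper())


def hp_energy(hp_string, coords, eps=-1.0):
    """Sum eps over non-bonded H-H contacts on a 2D square lattice.

    coords holds one (x, y) integer lattice point per bead, in chain order.
    Two beads are in contact when they are lattice neighbours but are not
    consecutive along the chain.
    """
    if len(coords) != len(hp_string):
        raise ValueError("one lattice point per bead is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    energy = 0.0
    n = len(hp_string)
    for i in range(n):
        for j in range(i + 2, n):  # skip chain neighbours
            if hp_string[i] == "H" and hp_string[j] == "H":
                dx = abs(coords[i][0] - coords[j][0])
                dy = abs(coords[i][1] - coords[j][1])
                if dx + dy == 1:
                    energy += eps
    return energy


# A four-bead chain folded into a square brings its two terminal H beads into contact.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(to_hp("AKLE"), hp_energy("HPPH", square))
```

With only two bead types and a contact count as the energy, exhaustive enumeration of short chains becomes feasible, which is the property that makes such models attractive for exploring large conformational changes.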
In order to overcome the inaccuracies of very simplistic representations, higher resolution models have been developed. PRIMO/PRIMONA, for proteins and nucleic acids, was proposed as a reduced quasi-atomistic resolution model [23]. Feig and co-workers [23] represent polypeptide backbones with three beads (Cα, N and a combined carbonyl site) and side chains as a combination of up to five different particles. In the case of nucleotides, adenine, cytosine and uracil are represented by four coarse-grained particles, and guanine and thymine by five. The sugar-phosphate backbone of the PRIMONA model consists of eight different CG beads. In contrast, the HiRE-RNA model designed by Pasquali and Derreumaux [24] only considers three of the seven backbone torsional angles (α, β and γ); each RNA nucleotide is represented by six (pyrimidine bases) or seven (purine bases) beads, allowing for a reduction of ~70% in the number of particles compared to a fully atomistic structure. Similar to PRIMO, in the SIRAH model [25] the positions of the nitrogen, α carbon and oxygen of the peptide bonds are kept at pseudo-atomistic resolution, while side chains are treated at a lower degree of detail (from one to five different beads). This model also allows for the study of protein-DNA interactions by molecular dynamics through the use of an explicit/CG solvation scheme [26], [27].
Other coarse-grained models have been designed to be easily transferable and applicable to multiple systems. Among those, MARTINI is probably the most popular one. The current “MARTINIdome” includes lipids [28], proteins [29], polymers [30], [31], carbohydrates [32], water [33], glycolipids [34], nucleotides [35], [36] and nanoparticles [37]. The systems are represented by four basic particle types, nonpolar (N), polar (P), apolar (C) and charged (Q), that are further classified based on their degree of polarity and hydrogen bonding properties, giving a total of eighteen unique “building blocks”. The MARTINI force field for proteins, in its latest official release (2.2p), includes off-center charges for polar and charged residues [38]. These represent a good proxy for hydrogen bond and salt bridge formation and thus for molecular recognition. For nucleic acids, much like PRIMONA, the MARTINI model specifically accounts for Watson-Crick base pairing (eight additional beads) to stabilize the DNA double helix structure.
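The grouping of atoms into beads that underlies all of the models above can be sketched in a few lines of Python. The sketch below places each bead at the mass-weighted center of its atoms; this is only one possible convention (some models place beads on specific atoms instead), and the atom names, masses, coordinates and mapping are made up for illustration.

```python
def map_to_beads(atoms, mapping):
    """Place each bead at the mass-weighted center of the atoms assigned to it.

    atoms:   dict atom name -> (mass, (x, y, z))
    mapping: dict bead name -> list of atom names (hypothetical mapping scheme)
    """
    beads = {}
    for bead, names in mapping.items():
        total_mass = sum(atoms[name][0] for name in names)
        beads[bead] = tuple(
            sum(atoms[name][0] * atoms[name][1][axis] for name in names) / total_mass
            for axis in range(3)
        )
    return beads


# Made-up three-atom fragment mapped onto two beads.
atoms = {
    "C1": (12.0, (0.0, 0.0, 0.0)),
    "C2": (12.0, (2.0, 0.0, 0.0)),
    "O1": (16.0, (0.0, 4.0, 0.0)),
}
mapping = {"B1": ["C1", "C2"], "B2": ["O1"]}
beads = map_to_beads(atoms, mapping)
reduction = 1.0 - len(beads) / len(atoms)  # fraction of particles removed
```

The `reduction` value is the same kind of figure as the ~70% particle reduction quoted for HiRE-RNA: it depends only on how many atoms each bead absorbs.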
3.2. Parametrization of coarse-grained force fields
3.2.1. Classical parametrization strategies
In the context of molecular modeling, the set of parameters and functions used to calculate the potential energy of a system is commonly referred to as a force field. Atomistic force fields usually provide parameters for every type of atom in a system (hydrogen included), but united-atom representations, in which non-polar hydrogens are neglected, are also often used. In contrast, coarse-grained potentials are a cruder representation of the inter- and intra-molecular interactions. Their parametrization follows two main routes: hierarchical (bottom-up) and pragmatic (top-down) coarse-graining [39].
The key idea of hierarchical coarse-graining is that the interactions at a less detailed level are the result of the collective interactions at the more detailed level [40]. As an illustration, in the above-mentioned 1975 study by Levitt and Warshel [8], the interactions between coarse-grained sites were derived in a bottom-up way by explicitly summing up all microscopic interactions of an atomistic model. One obvious limitation is that the quality of the coarse-grained model depends strongly on the accuracy of the underlying atomistic one. Similarly, the seminal force-matching (FM) method proposed by Ercolessi and Adams [41] and further developed by Voth and co-workers [42], [43] under the name of MS-CG (multiscale coarse-graining) uses atomistic-level interactions to derive coarse-grained potentials. In short, those potentials are systematically fitted to atomistic forces by minimizing the mean-square error between the two. Much like models derived by iterative Boltzmann inversion (IBI) [44], these force fields are usually more accurate than more generic ones. However, they are typically less transferable and require more parametrization effort. These methods, and their extensions [45], have recently been applied to coarse-grained models for proteins such as the UNRES model [46].
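The least-squares idea behind force matching can be illustrated with a deliberately small Python sketch. It fits a single harmonic bead-bead bond, F(r) = -k (r - r0), to reference forces. The function name and the synthetic data are assumptions made for this example; actual MS-CG implementations fit many basis functions simultaneously over large sets of atomistic configurations.

```python
def fit_harmonic_bond(distances, reference_forces):
    """Least-squares fit of F(r) = -k * (r - r0) to reference forces.

    The model is linear in (a, b) when written as F = a + b * r, so the
    mean-square error has a closed-form minimum; then k = -b and r0 = a / k.
    """
    n = len(distances)
    if n < 2 or n != len(reference_forces):
        raise ValueError("need at least two matching (distance, force) samples")
    sx = sum(distances)
    sy = sum(reference_forces)
    sxx = sum(r * r for r in distances)
    sxy = sum(r * f for r, f in zip(distances, reference_forces))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("distances must not all be identical")
    b = (n * sxy - sx * sy) / denom
    a = (sy - b * sx) / n
    k = -b
    if k == 0:
        raise ValueError("fitted force constant is zero; r0 is undefined")
    return k, a / k


# Synthetic "atomistic" forces projected onto the bead-bead distance (made-up numbers
# generated from k = 10 and r0 = 2.5, which the fit should recover).
k, r0 = fit_harmonic_bond([1.0, 2.0, 3.0, 4.0], [15.0, 5.0, -5.0, -15.0])
```

The dependence on the reference data is explicit here: any error in the atomistic forces propagates directly into `k` and `r0`, which is the limitation of bottom-up models noted above.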
Pragmatic force fields, however, are designed in such a way that they reproduce a given chosen (experimental) property [47]. The earlier lattice models (such as HP) represent a well-studied example of top-down coarse-graining. These models are typically cheaper to parametrize, easily transferable (to similar systems) and use rather simple analytical potentials [48]. In a similar way and as shown in Fig. 1, methodologies based on reproducing thermodynamical properties have been extensively applied in different branches of chemistry such as physical and organic chemistry. Equations of State (EoS), which are mathematical relationships between the thermodynamic variables of a given system, have been shown appropriate to accurately link the macroscopic properties of the system and the force field parameters [49]. As an example, the powerful SAFT-γ EoS, a variation of the Statistical Associating Fluid Theory (SAFT), has been used to estimate the coarse-grained potentials of the Mie force field [50]. This force field has been recently used to calculate solvation free energies of aromatic compounds, which are broadly used in the pharmaceutical industry for drug design purposes [51].
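For reference, the Mie potential generalizes the Lennard-Jones form by making the repulsive and attractive exponents adjustable. The following Python sketch evaluates it for a single pair distance; the function name and default exponents are choices made for this example, and the parameters that SAFT-γ would supply (ε, σ and the exponents) are simply passed in as arguments.

```python
def mie_potential(r, epsilon, sigma, n=12.0, m=6.0):
    """Mie pair potential with repulsive exponent n and attractive exponent m.

    The prefactor is chosen so that the well depth equals epsilon; with
    n = 12 and m = 6 it equals 4 and the Lennard-Jones form is recovered.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    if n <= m:
        raise ValueError("repulsive exponent must exceed attractive exponent")
    prefactor = (n / (n - m)) * (n / m) ** (m / (n - m))
    return prefactor * epsilon * ((sigma / r) ** n - (sigma / r) ** m)


# Lennard-Jones limit: zero at r = sigma, minimum of -epsilon at r = 2^(1/6) sigma.
u_sigma = mie_potential(1.0, epsilon=1.0, sigma=1.0)
u_min = mie_potential(2.0 ** (1.0 / 6.0), epsilon=1.0, sigma=1.0)
```

Softer repulsive exponents than 12 are often better suited to large beads, which is one reason the extra freedom of the Mie form is useful in top-down coarse-graining.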
3.2.2. Machine learning-based parametrization
Machine learning, and especially deep learning, has in recent years been revolutionizing many areas of science and technology. Certainly, the most significant breakthrough of the decade in the field of protein folding has been the development of AlphaFold [52]. DeepMind, an artificial intelligence company affiliated with Google, has designed a deep learning-based method that represents a substantial advance compared to classical modeling techniques [53], [54]. Machine learning methods have also been applied to the development of force fields and are usually based purely on existing data. A general approach to design a machine (deep) learning-based force field typically includes the generation of reference atomic configurations and forces (QM calculations), the identification of specific signatures, the selection of training and test datasets, the mapping of selected signatures to forces using specific algorithms, and the assessment of the resulting predictive model [55]. Deep neural networks [56], adversarial machine learning models [57] and genetic algorithms [58] have recently been shown to be appropriate for the development of coarse-grained force fields. Altogether, machine learning-based parametrization methodologies represent an emerging trend to automate analytical model building from more complex data, which can deliver faster and perhaps more accurate results with minimal human intervention.
3.3. Combining different levels of resolution
An exhaustive, yet accurate, sampling of the conformational landscape is crucial in attempts to model biomolecular interactions and evaluate the underlying energetics. The use of simplified representations offers an effective way of sampling the landscape. However, the reduced accuracy due to the inherent simplifications still limits the systems and processes that can be studied by CG approaches. Hybrid approaches, which typically couple coarse-grained and atomistic-level representations, aim to overcome these limitations by combining different levels of resolution [59]. These combined approaches might be very helpful for quantitative studies (e.g. free energy calculations of large systems [60], [61]), while still reducing the computational cost. They are also particularly useful to include components of a system for which no or only low-resolution structural data are available. A key challenge in hybrid modeling is to integrate the different levels of resolution and to describe the AA/CG interactions. Standard mixing rules [62] have been historically very successful for this task. In short, Lennard-Jones and electrostatic interactions for mixed systems can be averaged and combined with an optimal scaling parameter depending on the size of the system [63]. Besides energetics, it still remains unclear how the interaction between two atoms might be affected by a coarse-grained surrounding as compared to its “native” environment and vice versa [64].
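To make the notion of mixing rules concrete, the Python sketch below combines the Lennard-Jones parameters of two particle types using the Lorentz-Berthelot rules (arithmetic mean for σ, geometric mean for ε). This is one common convention and is used here as an assumption; the `scale` argument is a hypothetical stand-in for the system-dependent scaling parameter mentioned above, not a value taken from the cited works.

```python
import math


def lorentz_berthelot(sigma_i, eps_i, sigma_j, eps_j, scale=1.0):
    """Combine Lennard-Jones parameters of two particle types.

    sigma is averaged arithmetically, epsilon geometrically; `scale` is a
    hypothetical factor to tune the strength of cross-resolution interactions.
    """
    sigma_ij = 0.5 * (sigma_i + sigma_j)
    eps_ij = scale * math.sqrt(eps_i * eps_j)
    return sigma_ij, eps_ij


def lennard_jones(r, sigma, eps):
    """12-6 Lennard-Jones energy for a pair at distance r."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * eps * (ratio6 * ratio6 - ratio6)


# Made-up parameters for one atomistic site and one coarse-grained bead.
sigma_mix, eps_mix = lorentz_berthelot(0.3, 1.0, 0.5, 4.0)
energy_at_contact = lennard_jones(sigma_mix, sigma_mix, eps_mix)
```

Because a bead stands in for several atoms, unscaled cross terms can over- or under-count interactions, which is why an additional scaling parameter is typically tuned for AA/CG pairs.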
Several hybrid schemes have been proposed in the literature, with MARTINI as a popular choice for the coarse-grained representation. One example is the PACE force field [65], [66], which pairs MARTINI (water and lipids) with a united-atom protein model. In this case, the AA/CG parameters are optimized against specific thermodynamic data, which limits its direct applicability to other systems. GROMOS/MARTINI coupling [64] has also been described as a potential alternative. In this work, cross-resolution interactions are calculated via virtual interaction sites on relevant atomistic groups and the standard CG beads, an approach that might lead to unbalanced electrostatic behavior. For this reason, Wassenaar and coworkers [67] introduced an explicit electrostatic AA/CG coupling on the coarse-grained side. More recently, the CHARMM/PRIMO coupling has been proposed for single hybrid simulation purposes [68]. In the model proposed by Kar and Feig [68], the atomistic segment of the hybrid model was found to deviate structurally more than the corresponding segment in a fully atomistic model. This suggests that proper mixing of resolutions remains a difficult problem.
In the context of integrative modeling, the integration of experimental data at the various possible levels might have a crucial role for hybrid representations of the system. At the sampling level, data can be used to narrow the conformational search so that binding incompetent and/or irrelevant regions are discarded a priori. This strategy has been shown to be best suited compared to post-simulation filtering approaches. It not only outperforms the scenario where data is solely used to discard models with a high degree of uncertainty, but also reduces significantly the computational cost [69]. Data can be also incorporated at the scoring level via a numerical penalty term or as restraining energy potential [70]. As an example, in HADDOCK [71] the distance restraints are incorporated into the scoring scheme via a soft-harmonic potential where the potential becomes linear for violations longer than 2 Å [72], effectively avoiding large forces for high restraints violations. Therefore, the incorporation of data in the modeling might work as a firewall and somewhat reduce the impact of inaccuracies of hybrid schemes in terms of intra- and inter-molecular interactions.
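The behavior of such a restraint term, harmonic for small violations and linear beyond a switching distance, can be sketched in Python as follows. This is a simplified illustration and not the functional form implemented in HADDOCK; the function name, the force constant `k` and the choice of matching value and slope at the switch are assumptions made for this example.

```python
def restraint_energy(distance, upper, k=1.0, switch=2.0):
    """Upper-bound distance restraint: zero when satisfied, harmonic for small
    violations, linear once the violation exceeds `switch` (in Å).

    The linear branch continues the harmonic one with the same value and
    slope, so the force stays bounded for badly violated restraints.
    """
    violation = distance - upper
    if violation <= 0.0:
        return 0.0
    if violation <= switch:
        return k * violation ** 2
    return k * switch ** 2 + 2.0 * k * switch * (violation - switch)


# A restraint violated by 5 Å costs 16 energy units here instead of the 25 a
# purely harmonic term would give, and its gradient no longer grows.
satisfied = restraint_energy(4.0, upper=5.0)
small_violation = restraint_energy(6.0, upper=5.0)
large_violation = restraint_energy(10.0, upper=5.0)
```

Capping the gradient in this way keeps a few incorrect or ambiguous restraints from dominating the energy and distorting the model during sampling.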
3.4. Sampling and scoring schemes
Decreasing the computational cost, as well as the complexity of the system, is a major goal of coarse-grained modeling. By lowering the resolution, the energy landscape becomes smoother and it is therefore, in principle, easier to identify the global minimum. In the context of integrative modeling with HADDOCK, we recently showed that introducing the MARTINI coarse-grained force field results in a substantial increase (8–30%) in the number of near-native models generated [73]. CG sampling schemes are also found in ATTRACT [74], [75], [76] (also hybrid scoring), CABS-dock [77], [78] (also scoring), FRODOCK2.0 [79], InterEvDock2 [80], [81] (also scoring), LZerD [82], [83], MAXDo [84], MCDNA [85] (also scoring), MDockPP [86] and RosettaDock [87] (also scoring in RosettaDock 4.0 [88]). The methods used by these programs to sample the conformational landscape include rigid-body energy minimization, Fast Fourier Transform (FFT) searches, and molecular dynamics or Monte Carlo simulations. For the purpose of scoring, coarse-grained molecular dynamics simulations have also been evaluated on a heterogeneous benchmark of protein–protein docking models [89]. Other modeling software, such as DOCK/PIERR [90], GALAXY [91], [92], LightDock [93], MEGADOCK 4.0 [94], [95], PPI3D [96], [97], pyDock [98], [99] and V-D2OCK [100], incorporates, to some extent, coarse-grained/hybrid scoring approaches for (quasi)atomistic models.
IMP [101] and PyRy3D (genesilico.pl/pyry3d) are examples of ready-to-use hybrid modeling software for predicting (sampling and scoring) biomolecular assemblies that allow experimental data to be incorporated into the calculations. The Integrative Modeling Platform builds on the concept that the resolution of the representation depends on the quantity and quality of the available information. This information is also encoded in a scoring function, whose ultimate goal is to evaluate the uncertainty of the generated models. Andrej Sali and co-workers [2] view modeling as an endless cyclic process driven by the continuous acquisition of data. In IMP, the different subunits are represented as a combination of spherical beads of varying sizes (different levels of coarseness). The same subunits can also be represented as 3D Gaussians (for EM map fitting) and thus combine different resolution scales simultaneously [102]. During the conformational sampling, the relative distances between the CG beads and Gaussians are either constrained (in rigid bodies) or restrained (in flexible bodies) by the sequence connectivity. For very high degrees of coarse-graining, only geometric considerations, e.g. excluded volume, might be used in the computations. PyRy3D allows for building low-resolution models of large macromolecular assemblies. In the software developed by Kasprzak and Bujnicki (genesilico.pl/pyry3d), proteins and nucleic acids can be represented as rigid bodies or as flexible shapes. A spatial restraints-driven Monte Carlo approach is used to bring the components together, followed by an evaluation via a simple scoring function. For a more detailed list of software for building structural models of multi-subunit macromolecular complexes, refer to Table 1.
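The excluded-volume consideration mentioned above for highly coarse-grained bead models can be sketched as a simple penalty on overlapping spheres. The Python code below is a generic illustration and does not use or reproduce the IMP or PyRy3D interfaces; the function name, the quadratic penalty and the bead data are assumptions made for this example.

```python
import math


def excluded_volume_score(beads, k=1.0):
    """Sum a quadratic penalty over all pairs of overlapping spherical beads.

    beads: list of ((x, y, z), radius) tuples, one per bead.
    Two beads overlap when their center distance is smaller than the sum of
    their radii; the penalty grows with the square of the overlap depth.
    """
    score = 0.0
    for i in range(len(beads)):
        center_i, radius_i = beads[i]
        for j in range(i + 1, len(beads)):
            center_j, radius_j = beads[j]
            overlap = radius_i + radius_j - math.dist(center_i, center_j)
            if overlap > 0.0:
                score += k * overlap ** 2
    return score


# Made-up beads of different sizes: only the first two overlap (by 0.5 Å).
beads = [
    ((0.0, 0.0, 0.0), 1.0),
    ((1.5, 0.0, 0.0), 1.0),
    ((10.0, 0.0, 0.0), 2.0),
]
clash_score = excluded_volume_score(beads)
```

Because bead radii can differ, the same term works unchanged for representations that mix one-residue beads with beads covering longer stretches of sequence.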
Table 1.
Modeling platform | System(s) | Characteristics | Link | Reference(s) |
---|---|---|---|---|
ATTRACT | Protein, peptide and DNA | CG sampling and hybrid scoring | attract.ph.tum.de | [74], [75], [76] |
CABS-dock | Peptide | CG sampling and scoring | biocomp.chem.uw.edu.pl | [77], [78] |
DOCK/PIERR | Protein | Hybrid scoring | clsbweb.oden.utexas.edu * | [90] |
FRODOCK2.0 | Protein | 3D grid potential maps | frodock.chaconlab.org | [79] |
GALAXY | Peptide | Hybrid scoring | galaxy.seoklab.org | [91], [92] |
HADDOCK | Protein, peptide and nucleic acids | CG sampling | bianca.science.uu.nl/haddock2.4 | [73], [103], [104] |
IMP | Protein and DNA | Hybrid sampling and scoring | integrativemodeling.org | [101] |
InterEvDock2 | Protein | Sampling by FRODOCK2.0 and CG scoring | bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock2 | [80], [81] |
LightDock | Protein, peptide and DNA | Hybrid scoring | lightdock.org | [93] |
LZerD** | Protein and peptide | 3DZD representation and hybrid scoring | kiharalab.org/proteindocking | [82], [83] |
MAXDo | Protein | CG sampling | lcqb.upmc.fr/CCDMintseris | [84] |
MCDNA | Protein and DNA | CG sampling and scoring | mmb.irbbarcelona.org/MCDNA | [85] |
MDockPP | Protein | CG sampling | zoulab.dalton.missouri.edu | [86] |
MEGADOCK 4.0 | Protein | Hybrid scoring | bi.cs.titech.ac.jp | [94], [95] |
PPI3D | Protein | Voronoi tessellation-based scoring | bioinformatics.ibt.lt/ppi3d | [96], [97] |
pyDock | Protein | CG scoring | life.bsc.es/pid/pydockweb | [98], [99] |
PyRy3D | Protein and DNA | Hybrid sampling and scoring | genesilico.pl/pyry3d | – |
RosettaDock | Protein | CG sampling and scoring | rosettacommons.org | [87], [88] |
V-D2OCK | Protein | CG scoring | bioinsilico.org/cgi-bin/VD2OCK/ | [100] |
* Submission to the DOCK/PIERR webserver is no longer supported.
** LZerD has a specific protocol for modeling unstructured protein–protein interactions [83].
3.5. Backmapping from coarse-grained to atomistic resolution
The inherent loss of accuracy of coarser representations is a limiting factor when analyzing integrative models of biomolecular complexes. Atomic details, such as specific contacts, are usually essential to understand molecular recognition, and it is therefore crucial to accurately reconstruct atomistic models from their CG counterparts [105]. This process is commonly referred to in the literature as reverse transformation, inverse mapping or backmapping. A number of different backmapping protocols have been proposed, most of which follow two stages: (1) the generation of an atomistic structure based on the coarse-grained coordinates, and (2) a relaxation step of the generated AA structure.
For the first step, geometrical interpolation [23], [106], [107], random placement [108] and fragment-based methods [87], [109], [110], [111] are the most widely used. All these methods perform sufficiently well in terms of backbone deviations (<1.0 Å in general), but side chain reconstruction seems more problematic [112]. Side chain optimization has been studied extensively as it applies directly to protein design. The most successful methods discretize possible side chain conformations into rotamers and usually require an exhaustive search algorithm (e.g. Monte Carlo, simulated annealing…) and an effective scoring function for selecting the proper side chain conformation. The backmapped atomistic structures can then be further improved by energy minimization [73], [113] and/or more sophisticated molecular dynamics-based approaches [114]. In HADDOCK, the CG generated models are converted to atomistic resolution by using distance restraints between the atoms and their corresponding coarse-grained beads. Using those restraints, the all-atom models of the individual components of a complex are morphed onto the coarse-grained complex by a series of energy minimizations and Cartesian molecular dynamics [73].
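The role of atom-to-bead restraints during backmapping can be illustrated with a small Python sketch that measures how far the geometric center of each group of atoms lies from its target bead. This is a simplified stand-in for the restraint terms used in practice (it is not the HADDOCK protocol); the function names, the harmonic form and the coordinates are assumptions made for this example.

```python
def centroid(points):
    """Geometric center of a list of (x, y, z) points."""
    count = len(points)
    return tuple(sum(p[axis] for p in points) / count for axis in range(3))


def backmapping_restraint_energy(atom_coords, mapping, bead_coords, k=1.0):
    """Harmonic penalty pulling each atom group towards its coarse-grained bead.

    atom_coords: dict atom name -> (x, y, z)
    mapping:     dict bead name -> list of atom names (hypothetical mapping)
    bead_coords: dict bead name -> (x, y, z) from the coarse-grained model
    """
    energy = 0.0
    for bead, names in mapping.items():
        center = centroid([atom_coords[name] for name in names])
        target = bead_coords[bead]
        energy += k * sum((center[axis] - target[axis]) ** 2 for axis in range(3))
    return energy


# Made-up backbone fragment: the restraint is satisfied when the atoms straddle the bead.
atom_coords = {"N": (0.0, 0.0, 0.0), "CA": (2.0, 0.0, 0.0)}
mapping = {"BB": ["N", "CA"]}
matched = backmapping_restraint_energy(atom_coords, mapping, {"BB": (1.0, 0.0, 0.0)})
displaced = backmapping_restraint_energy(atom_coords, mapping, {"BB": (1.0, 1.0, 0.0)})
```

Minimizing such a term together with the atomistic force field lets the all-atom structure follow the coarse-grained model while the force field restores local geometry.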
4. Application examples of integrative modeling of protein interactions
Ultimately, the true value of any biomolecular model lies in the structural information and insights that it provides. When speaking about integrative modeling here, we refer to the branch of structural biology that aims to gain structural insights into biomolecular complexes by integrating a wide variety of experimental information into computational calculations. There are various challenges associated with the incorporation and use of that information for the modeling of assemblies. A detailed overview of those is beyond the scope of this manuscript; they have been reviewed in depth elsewhere [115], [116], [117]. The relevance of integrative models is underscored by the fact that the Protein Data Bank [118], [119] has now started to collect them in a new integrative model database (PDB-dev; pdb-dev.wwpdb.org) [120], [121], which ultimately should be merged into the current PDB database. Since 2014, it has been possible to archive structural models obtained by combining computational methods with traditional structural experimental techniques such as NMR spectroscopy, electron microscopy (3DEM), small angle scattering (SAS), atomic force microscopy (AFM), chemical cross-linking, Förster resonance energy transfer (FRET), electron paramagnetic resonance (EPR), mass spectrometry (MS), hydrogen/deuterium exchange (HDX) and various bioinformatic approaches. In this section we highlight several examples of integrative structures of protein complexes that have been determined by combining coarse-grained/hybrid computational approaches with experimental information.
Among all archived structures, we find a number of them determined by coarse-grained/hybrid computational methods in combination with a wide variety of structural data (see Fig. 2). Integrative structures derived from chemical cross-linking data are by far the most abundant ones, including models of the heptameric module of NPC [122], the exosome complex [123], the Complement C3(H2O) [124], the E6AP/UBE3A-p53 enzyme-substrate complex [125], Pol II(G) [126], the Proteasome-Ecm29 complex [127] and the canonical/non-canonical human COP9 Signalosome [128]. Protein cross-links have been also combined with other types of experimental information such as three/two-dimensional Electron Microscopy (2DEM/3DEM) and/or SAS to determine structures like the yeast Mediator complex [129] or the native BBSome [130]. Other sources of information such as mutagenesis and NMR data [131] and single molecule FRET data [132] have been also used.
There are also multiple examples of integrative structures, not deposited in the PDB-dev database, which have been modeled by integrative coarse-graining methods. One of those is the ATP synthase membrane motor. Leone and Faraldo-Gómez [133] proposed a computational integrative model based on chemical cross-links, a cryo-EM map (~7 Å resolution) and evolutionary couplings. The initial homology models of the subunits were refined against the experimentally determined cryo-EM map using Rosetta, which starts its conformational exploration at coarse-grained resolution. The computationally generated models were further validated with co-evolutionary and cross-linking data and revealed important mechanistic insights into the function of the ATP synthase. Another representative example is the ISWI ATPase complex. Using upper-bound distance restraints based on BS3, BS2G and UV cross-links, Harrer and coworkers [134] modeled the complex with ATTRACT, which performs a rigid-body energy minimization driven by a coarse-grained force field [109] and the distance restraints provided. The top-scoring ISWI models were validated against SAXS data.
The Nuclear Pore Complex (NPC) is probably the largest protein assembly determined by an integrative structural approach to date. It constitutes an eight-fold symmetrical cylindrical complex of 552 copies of 32 different nucleoporin proteins (Nups) [135]. With respect to the computational modeling, the NPC was represented in a multiscale fashion including multiple levels of coarseness. As an illustration, all rigid bodies derived from X-ray, NMR and integrative structures were coarse-grained at two different resolutions: either single residues or consecutive stretches of up to ten amino acids were mapped onto individual beads. The modeling was performed using the Integrative Modeling Platform software (IMP) (integrativemodeling.org) [101]. The experimental information available included chemical cross-links, a cryo-ET density map, immuno-electron microscopy localizations, excluded volume, sequence connectivity, the shape of the pore membrane, symmetry and SAXS data, which were used to guide the sampling, to improve the scoring, to filter out inconsistent models and/or for validation. By putting all these data together, the authors were able to describe, at sub-nanometer precision, the structure of the entire NPC.
5. Concluding remarks
Over the past decades, coarse-grained/hybrid modeling has proven to be a powerful approach to model biomolecules and their interactions. It extends the capabilities of traditional atomistic protocols. There are multiple models to simplify the three-dimensional representation of biomolecules, each designed to answer a specific research question. The choice between different representations directly affects the sampling and scoring capabilities of current modeling approaches. In general, the smaller the number of pseudo-atoms or beads, the larger the gain in speed but the lower the accuracy of the resulting models. For cases where a higher level of resolution is required, multiscale/hybrid modeling might help to alleviate the inherent loss of accuracy of pure coarse-grained models as demonstrated, for instance, in the modeling of the nuclear pore complex. Nevertheless, there is still an urgent need for improving interaction schemes. Coarse-grained force fields derived from classical molecular mechanics are not easily transferable and are therefore highly system-dependent. In contrast to bottom-up strategies, top-down approaches aim to generalize structural patterns that have been seen in thousands of known structures and/or to reproduce thermodynamic quantities. A combination of bottom-up and top-down approaches is likely a better option. In other words, improving top-down models by inferring additional interaction terms derived by bottom-up coarse-graining might have the most impact in future designs, increasing both their accuracy and their range of applicability to larger and more complex assemblies. We are now approaching a time when, taking advantage of all scientific and technological advances, one might expect to build reasonable three-dimensional models of cells, which might provide insights into still unknown cellular mechanisms.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Dutch Foundation for Scientific Research (NWO) (TOP Grant 718.015.001) and by the BioExcel CoE (www.bioexcel.eu), a project funded by the European Union Horizon 2020 program under grant agreements 675728 and 823820.
The authors acknowledge all members from the Computational Structural Biology group at Utrecht University with special mention to Dr. Brian Jiménez-García for fruitful discussions and Dr. Zuzana Jandová for carefully reviewing this manuscript.
Footnotes
There is no supplementary data associated with the present work.
References
- 1. Berggård T., Linse S., James P. Methods for the detection and analysis of protein-protein interactions. Proteomics. 2007;7:2833–2842. doi: 10.1002/pmic.200700131.
- 2. Rout M.P., Sali A. Principles for Integrative Structural Biology Studies. Cell. 2019;177:1384–1403. doi: 10.1016/j.cell.2019.05.016.
- 3. Koukos P.I., Bonvin A.M.J.J. Integrative Modelling of Biomolecular Complexes. J Mol Biol. 2019; in press.
- 4. Braitbard M., Schneidman-Duhovny D., Kalisman N. Integrative Structure Modeling: Overview and Assessment. Annu Rev Biochem. 2019;88:113–135. doi: 10.1146/annurev-biochem-013118-111429.
- 5. Singla J. Opportunities and challenges in building a spatiotemporal multi-scale model of the human pancreatic β Cell. Cell. 2018;173:11–19. doi: 10.1016/j.cell.2018.03.014.
- 6. Phillips D.C. Symposium on Three-Dimensional Structure of Macromolecules of Biological Origin. By Invitation of the Committee on Arrangements for the Autumn Meeting. Presented before the Academy on October 19, 1966. Chairman, Walter Kauzmann. Proc Natl Acad Sci. 1967;57:483–495.
- 7. Warshel A. Multiscale modeling of biological functions: From enzymes to molecular machines (Nobel Lecture). Angew Chemie – Int Ed. 2014;53:10020–10031. doi: 10.1002/anie.201403689.
- 8. Levitt M., Warshel A. Computer simulation of protein folding. Nature. 1975;253:694–698. doi: 10.1038/253694a0.
- 9. Levinthal C. How to fold graciously. Mössbauer Spectrosc Biol Syst Proc. 1969;24:22–24.
- 10. Chothia C., Janin J. Principles of protein-protein recognition. Nature. 1975;256:705–708. doi: 10.1038/256705a0.
- 11. Wodak S.J., Janin J. Computer analysis of protein-protein interaction. J Mol Biol. 1978;124:323–342. doi: 10.1016/0022-2836(78)90302-9.
- 12. Fersht A.R. Analysis of Enzyme Structure and Activity by Protein Engineering. Angew Chemie Int Ed English. 1984;23:467–473.
- 13. Kmiecik S. Coarse-grained protein models and their applications. Chem Rev. 2016;116:7898–7936. doi: 10.1021/acs.chemrev.6b00163.
- 14. Lau K.F., Dill K.A. A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules. 1989;22:3986–3997.
- 15. Dill K.A. Principles of protein folding – A perspective from simple exact models. Protein Sci. 1995;4:561–602. doi: 10.1002/pro.5560040401.
- 16. Šali A., Shakhnovich E., Karplus M. Kinetics of protein folding: A lattice model study of the requirements for folding to the native state. J Mol Biol. 1994;235:1614–1636. doi: 10.1006/jmbi.1994.1110.
- 17. Dinner A.R., Šali A., Karplus M. The folding mechanism of larger model proteins: Role of native structure. Proc Natl Acad Sci U S A. 1996;93:8356–8361. doi: 10.1073/pnas.93.16.8356.
- 18. Locker C.R., Hernandez R. A minimalist model protein with multiple folding funnels. Proc Natl Acad Sci U S A. 2001;98:9074–9079. doi: 10.1073/pnas.161438898.
- 19. Kaya H., Chan H.S. Towards a consistent modeling of protein thermodynamic and kinetic cooperativity: How applicable is the transition state picture to folding and unfolding? J Mol Biol. 2002;315:899–909. doi: 10.1006/jmbi.2001.5266.
- 20. Kolinski A., Skolnick J. Reduced models of proteins and their applications. Polymer (Guildf). 2004;45:511–524.
- 21. Kolinski A., Skolnick J. Monte carlo simulations of protein folding. I. Lattice model and interaction scheme. Proteins Struct Funct Bioinforma. 1994;18:338–352. doi: 10.1002/prot.340180405.
- 22. MacKerell A.D., Feig M., Brooks C.L. Improved treatment of the protein backbone in empirical force fields. J Am Chem Soc. 2004;126:698–699. doi: 10.1021/ja036959e.
- 23. Gopal S.M., Mukherjee S., Cheng Y.M., Feig M. PRIMO/PRIMONA: A coarse-grained model for proteins and nucleic acids that preserves near-atomistic accuracy. Proteins Struct Funct Bioinforma. 2010;78:1266–1281. doi: 10.1002/prot.22645.
- 24. Pasquali S., Derreumaux P. HiRE-RNA: A high resolution coarse-grained energy model for RNA. J Phys Chem B. 2010;114:11957–11966. doi: 10.1021/jp102497y.
- 25. Darré L. SIRAH: A structurally unbiased coarse-grained force field for proteins with aqueous solvation and long-range electrostatics. J Chem Theory Comput. 2015;11:723–739. doi: 10.1021/ct5007746.
- 26. Dans P.D., Zeida A., Machado M.R., Pantano S. A coarse grained model for atomic-detailed DNA simulations with explicit electrostatics. J Chem Theory Comput. 2010;6:1711–1725. doi: 10.1021/ct900653p.
- 27. Darré L., Machado M.R., Dans P.D., Herrera F.E., Pantano S. Another coarse grain model for aqueous solvation: WAT FOUR? J Chem Theory Comput. 2010;6:3793–3807.
- 28. Marrink S.J., Risselada H.J., Yefimov S., Tieleman D.P., De Vries A.H. The MARTINI force field: Coarse grained model for biomolecular simulations. J Phys Chem B. 2007;111:7812–7824. doi: 10.1021/jp071097f.
- 29. Monticelli L. The MARTINI coarse-grained force field: Extension to proteins. J Chem Theory Comput. 2008;4:819–834. doi: 10.1021/ct700324x.
- 30. Lee H., De Vries A.H., Marrink S.J., Pastor R.W. A coarse-grained model for polyethylene oxide and polyethylene glycol: Conformation and hydrodynamics. J Phys Chem B. 2009;113:13186–13194. doi: 10.1021/jp9058966.
- 31. Gobbo C. MARTINI model for physisorption of organic molecules on graphite. J Phys Chem C. 2013;117:15623–15631.
- 32. López C.A. Martini coarse-grained force field: Extension to carbohydrates. J Chem Theory Comput. 2009;5:3195–3210. doi: 10.1021/ct900313w.
- 33. Yesylevskyy S.O., Schäfer L.V., Sengupta D., Marrink S.J. Polarizable water model for the coarse-grained MARTINI force field. PLoS Comput Biol. 2010;6:1–17. doi: 10.1371/journal.pcbi.1000810.
- 34. López C.A., Sovova Z., Van Eerden F.J., De Vries A.H., Marrink S.J. Martini force field parameters for glycolipids. J Chem Theory Comput. 2013;9:1694–1708. doi: 10.1021/ct3009655.
- 35. Uusitalo J.J., Ingólfsson H.I., Akhshi P., Tieleman D.P., Marrink S.J. Martini coarse-grained force field: extension to DNA. J Chem Theory Comput. 2015;11:3932–3945. doi: 10.1021/acs.jctc.5b00286.
- 36. Uusitalo J.J., Ingólfsson H.I., Marrink S.J., Faustino I. Martini coarse-grained force field: extension to RNA. Biophys J. 2017;113:246–256. doi: 10.1016/j.bpj.2017.05.043.
- 37. López C.A. MARTINI coarse-grained model for crystalline cellulose microfibers. J Phys Chem B. 2015;119:465–473. doi: 10.1021/jp5105938.
- 38. De Jong D.H. Improved parameters for the martini coarse-grained protein force field. J Chem Theory Comput. 2013;9:687–697. doi: 10.1021/ct300646g.
- 39. Noid W.G. Perspective: Coarse-grained models for biomolecular systems. J Chem Phys. 2013;139. doi: 10.1063/1.4818908.
- 40. Saunders M.G., Voth G.A. Coarse-graining methods for computational biology. Annu Rev Biophys. 2013;42:73–93. doi: 10.1146/annurev-biophys-083012-130348.
- 41. Ercolesi F., Adams J.B. Interatomic potentials from first-principles calculations: The force-matching method. EPL. 1994;26:583–588.
- 42. Izvekov S., Parrinello M., Burnham C.J., Voth G.A. Effective force fields for condensed phase systems from ab initio molecular dynamics simulation: A new method for force-matching. J Chem Phys. 2004;120:10896–10913. doi: 10.1063/1.1739396.
- 43. Izvekov S., Voth G.A. A Multiscale Coarse-Graining Method for Biomolecular Systems. J Phys Chem B. 2005;109:2469–2473. doi: 10.1021/jp044629q.
- 44. Soper A.K. Empirical potential Monte Carlo simulation of fluid structure. Chem Phys. 1996;202:295–306.
- 45. Lu L., Dama J.F., Voth G.A. Fitting coarse-grained distribution functions through an iterative force-matching method. J Chem Phys. 2013;139. doi: 10.1063/1.4811667.
- 46. Liwo A., Czaplewski C. Extension of the force-matching method to coarse-grained models with axially symmetric sites to produce transferable force fields: Application to the UNRES model of proteins. J Chem Phys. 2020;152. doi: 10.1063/1.5138991.
- 47. Ingólfsson H.I. The power of coarse graining in biomolecular simulations. Wiley Interdiscip Rev Comput Mol Sci. 2014;4:225–248. doi: 10.1002/wcms.1169.
- 48. Potoyan D., Papoian G.A. The need for computational speed: State of the art in DNA coarse graining. In: Coarse-Grained Modeling of Biomolecules. 1st ed. Boca Raton: CRC Press; 2017. p. 271–297.
- 49. Mejía A., Herdes C., Müller E.A. Force fields for coarse-grained molecular simulations from a corresponding states correlation. Ind Eng Chem Res. 2014;53:4131–4141.
- 50. Müller E.A., Jackson G. Force-field parameters from the SAFT-γ equation of state for use in coarse-grained molecular simulations. Annu Rev Chem Biomol Eng. 2014;5:405–427. doi: 10.1146/annurev-chembioeng-061312-103314.
- 51. Matos I.Q., Abreu C.R.A. Evaluation of the SAFT-γ Mie force field with solvation free energy calculations. Fluid Phase Equilib. 2019;484:88–97.
- 52. Senior A.W. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577:706–710. doi: 10.1038/s41586-019-1923-7.
- 53. Senior A.W. Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13). Proteins Struct Funct Bioinforma. 2019;87:1141–1148. doi: 10.1002/prot.25834.
- 54. Kryshtafovych A., Schwede T., Topf M., Fidelis K., Moult J. Critical assessment of methods of protein structure prediction (CASP) – Round XIII. Proteins Struct Funct Bioinf. 2019;87:1011–1020. doi: 10.1002/prot.25823.
- 55. Botu V., Batra R., Chapman J., Ramprasad R. Machine learning force fields: Construction, validation, and outlook. J Phys Chem C. 2017;121:511–522.
- 56. Wang J. Machine learning of coarse-grained molecular dynamics force fields. ACS Cent Sci. 2019;5:755–767. doi: 10.1021/acscentsci.8b00913.
- 57. Durumeric A.E.P., Voth G.A. Adversarial-residual-coarse-graining: Applying machine learning theory to systematic molecular coarse-graining. J Chem Phys. 2019;151. doi: 10.1063/1.5097559.
- 58. Chan H. Machine learning coarse grained models for water. Nat Commun. 2019;10:379. doi: 10.1038/s41467-018-08222-6.
- 59. Ayton G.S., Noid W.G., Voth G.A. Multiscale modeling of biomolecular systems: in serial and in parallel. Curr Opin Struct Biol. 2007;17:192–198. doi: 10.1016/j.sbi.2007.03.004.
- 60. König G., Hudson P.S., Boresch S., Woodcock H.L. Multiscale free energy simulations: An efficient method for connecting classical MD simulations to QM or QM/MM free energies using non-Boltzmann Bennett reweighting schemes. J Chem Theory Comput. 2014;10:1406–1419. doi: 10.1021/ct401118k.
- 61. Lee S., Liang R., Voth G.A., Swanson J.M.J. Computationally efficient multiscale reactive molecular dynamics to describe amino acid deprotonation in proteins. J Chem Theory Comput. 2016;12:879–891. doi: 10.1021/acs.jctc.5b01109.
- 62. Scott R., Allen M.P., Tildesley D.J. Computer simulation of liquids. Math Comput. 1991;57:442.
- 63. Michel J., Orsi M., Essex J.W. Prediction of partition coefficients by multiscale hybrid atomic-level/coarse-grain simulations. J Phys Chem B. 2008;112:657–660. doi: 10.1021/jp076142y.
- 64. Rzepiela A.J., Louhivuori M., Peter C., Marrink S.J. Hybrid simulations: Combining atomistic and coarse-grained force fields using virtual sites. Phys Chem Chem Phys. 2011;13:10437–10448. doi: 10.1039/c0cp02981e.
- 65. Wan C.K., Han W., Wu Y.D. Parameterization of PACE force field for membrane environment and simulation of helical peptides and helix-helix association. J Chem Theory Comput. 2012;8:300–313. doi: 10.1021/ct2004275.
- 66. Ward M.D., Nangia S., May E.R. Evaluation of the hybrid resolution PACE model for the study of folding, insertion, and pore formation of membrane associated peptides. J Comput Chem. 2017;38:1462–1471. doi: 10.1002/jcc.24694.
- 67. Wassenaar T.A., Ingólfsson H.I., Prieß M., Marrink S.J., Schäfer L.V. Mixing MARTINI: Electrostatic coupling in hybrid atomistic-coarse-grained biomolecular simulations. J Phys Chem B. 2013;117:3516–3530. doi: 10.1021/jp311533p.
- 68. Kar P., Feig M. Hybrid all-atom/coarse-grained simulations of proteins by direct coupling of CHARMM and PRIMO force fields. J Chem Theory Comput. 2017;13:5753–5765. doi: 10.1021/acs.jctc.7b00840.
- 69. Roel-Touris J., Bonvin A.M.J.J., Jiménez-García B. LightDock goes information-driven. Bioinformatics. 2020;36:950–952. doi: 10.1093/bioinformatics/btz642.
- 70. Vangone A. Sense and simplicity in HADDOCK scoring: Lessons from CASP-CAPRI round 1. Proteins Struct Funct Bioinforma. 2017;85:417–423. doi: 10.1002/prot.25198.
- 71. Dominguez C., Boelens R., Bonvin A.M.J.J. HADDOCK: A protein-protein docking approach based on biochemical or biophysical information. J Am Chem Soc. 2003;125:1731–1737. doi: 10.1021/ja026939x.
- 72. Brünger A.T. Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr Sect D Biol Crystallogr. 1998;54:905–921. doi: 10.1107/s0907444998003254.
- 73. Roel-Touris J., Don C.G., Honorato R.R., Rodrigues J.P.G.L.M., Bonvin A.M.J.J. Less Is More: coarse-grained integrative modeling of large biomolecular assemblies with HADDOCK. J Chem Theory Comput. 2019;15:6358–6367. doi: 10.1021/acs.jctc.9b00310.
- 74. De Vries S., Zacharias M. Flexible docking and refinement with a coarse-grained protein model using ATTRACT. Proteins Struct Funct Bioinforma. 2013;81:2167–2174. doi: 10.1002/prot.24400.
- 75. De Vries S.J., Rey J., Schindler C.E.M., Zacharias M., Tuffery P. The pepATTRACT web server for blind, large-scale peptide-protein docking. Nucleic Acids Res. 2017;45:W361–W364. doi: 10.1093/nar/gkx335.
- 76. Setny P., Bahadur R.P., Zacharias M. Protein-DNA docking with a coarse-grained force field. BMC Bioinf. 2012;13. doi: 10.1186/1471-2105-13-228.
- 77. Kolinski A. Protein modeling and structure prediction with a reduced representation. Acta Biochim Pol. 2004;51:349–371.
- 78. Kurcinski M., Jamroz M., Blaszczyk M., Kolinski A., Kmiecik S. CABS-dock web server for the flexible docking of peptides to proteins without prior knowledge of the binding site. Nucleic Acids Res. 2015;43:W419–W424. doi: 10.1093/nar/gkv456.
- 79. Ramírez-Aportela E., López-Blanco J.R., Chacón P. FRODOCK 2.0: Fast protein-protein docking server. Bioinformatics. 2016;32:2386–2388. doi: 10.1093/bioinformatics/btw141.
- 80. Andreani J., Faure G., Guerois R. InterEvScore: A novel coarse-grained interface scoring function using a multi-body statistical potential coupled to evolution. Bioinformatics. 2013;29:1742–1749. doi: 10.1093/bioinformatics/btt260.
- 81. Quignot C. InterEvDock2: An expanded server for protein docking using evolutionary and biological information from homology models and multimeric inputs. Nucleic Acids Res. 2018;46:W408–W416. doi: 10.1093/nar/gky377.
- 82. Esquivel-Rodriguez J., Filos-Gonzalez V., Li B., Kihara D. Pairwise and multimeric protein–protein docking using the LZerD program suite. Methods Mol Biol. 2014;1137:209–234. doi: 10.1007/978-1-4939-0366-5_15.
- 83. Peterson L.X., Roy A., Christoffer C., Terashi G., Kihara D. Modeling disordered protein interactions from biophysical principles. PLoS Comput Biol. 2017;13. doi: 10.1371/journal.pcbi.1005485.
- 84. Sacquin-Mora S., Carbone A., Lavery R. Identification of protein interaction partners and protein-protein interaction sites. J Mol Biol. 2008;382:1276–1289. doi: 10.1016/j.jmb.2008.08.002.
- 85. Walther J. A multi-modal coarse grained model of DNA flexibility mappable to the atomistic level. Nucleic Acids Res. 2020;48. doi: 10.1093/nar/gkaa015.
- 86. Huang S.Y., Zou X. MDockPP: A hierarchical approach for protein-protein docking and its application to CAPRI rounds 15–19. Proteins Struct Funct Bioinforma. 2010;78:3096–3103. doi: 10.1002/prot.22797.
- 87. Gray J.J. Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J Mol Biol. 2003;331:281–299. doi: 10.1016/s0022-2836(03)00670-3.
- 88. Roy Burman S.S. Novel sampling strategies and a coarse-grained score function for docking homomers, flexible heteromers, and oligosaccharides using Rosetta in CAPRI rounds 37–45. Proteins Struct Funct Bioinforma. 2019; in press.
- 89. Hou Q., Lensink M.F., Heringa J., Feenstra K.A. CLUB-MARTINI: Selecting favourable interactions amongst available candidates, a coarse-grained simulation approach to scoring docking decoys. PLoS One. 2016;11. doi: 10.1371/journal.pone.0155251.
- 90. Viswanath S., Ravikant D.V.S., Elber R. DOCK/PIERR: Web server for structure prediction of protein-protein complexes. Methods Mol Biol. 2014;1137:199–207. doi: 10.1007/978-1-4939-0366-5_14.
- 91. Shin W.-H., Lee G.R., Heo L., Lee H., Seok C. Prediction of protein structure and interaction by GALAXY protein modeling programs. Biol Des. 2014;2:01–11.
- 92. Lee H., Heo L., Lee M.S., Seok C. GalaxyPepDock: A protein-peptide docking tool based on interaction similarity and energy optimization. Nucleic Acids Res. 2015;43:W431–W435. doi: 10.1093/nar/gkv495.
- 93. Jiménez-García B. LightDock: A new multi-scale approach to protein-protein docking. Bioinformatics. 2018;34:49–55. doi: 10.1093/bioinformatics/btx555.
- 94. Ohue M., Matsuzaki Y., Ishida T., Akiyama Y. Improvement of the protein-protein docking prediction by introducing a simple hydrophobic interaction model: An application to interaction pathway analysis. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2012;7632 LNBI:178–187.
- 95. Ohue M. MEGADOCK 4.0: An ultra-high-performance protein-protein docking software for heterogeneous supercomputers. Bioinformatics. 2014;30:3281–3283. doi: 10.1093/bioinformatics/btu532.
- 96. Olechnovič K., Venclovas Č. Voronota: A fast and reliable tool for computing the vertices of the Voronoi diagram of atomic balls. J Comput Chem. 2014;35:672–681. doi: 10.1002/jcc.23538.
- 97. Dapkunas J. The PPI3D web server for searching, analyzing and modeling protein-protein interactions in the context of 3D structures. Bioinformatics. 2017;33:935–937. doi: 10.1093/bioinformatics/btw756.
- 98. Jiménez-García B., Pons C., Fernández-Recio J. pyDockWEB: A web server for rigid-body protein-protein docking using electrostatics and desolvation scoring. Bioinformatics. 2013;29:1698–1699. doi: 10.1093/bioinformatics/btt262.
- 99. Solernou A., Fernandez-Recio J. PyDockCG: New coarse-grained potential for protein-protein docking. J Phys Chem B. 2011;115:6032–6039. doi: 10.1021/jp112292b.
- 100. Segura J., Marín-López M.A., Jones P.F., Oliva B., Fernandez-Fuentes N. VORFFIP-driven dock: V-D2OCK, a fast, accurate protein docking strategy. PLoS One. 2015;10. doi: 10.1371/journal.pone.0118107.
- 101. Russel D. Putting the pieces together: Integrative modeling platform software for structure determination of macromolecular assemblies. PLoS Biol. 2012;10(1). doi: 10.1371/journal.pbio.1001244.
- 102. Webb B. Integrative structure modeling with the Integrative Modeling Platform. Protein Sci. 2018;27:245–258. doi: 10.1002/pro.3311.
- 103. Van Zundert G.C.P. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol. 2016;428:720–725. doi: 10.1016/j.jmb.2015.09.014.
- 104. Honorato R.V., Roel-Touris J., Bonvin A.M.J.J. MARTINI-Based Protein-DNA Coarse-Grained HADDOCKing. Front Mol Biosci. 2019;6. doi: 10.3389/fmolb.2019.00102.
- 105. Badaczewska-Dawid A.E., Kolinski A., Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J. 2020;18:162–176. doi: 10.1016/j.csbj.2019.12.007.
- 106. Wassenaar T.A., Pluhackova K., Böckmann R.A., Marrink S.J., Tieleman D.P. Going backward: A flexible geometric approach to reverse transformation from coarse grained to atomistic models. J Chem Theory Comput. 2014;10:676–690. doi: 10.1021/ct400617g.
- 107. Machado M.R., Pantano S. SIRAH tools: Mapping, backmapping and visualization of coarse-grained models. Bioinformatics. 2016;32:1568–1570. doi: 10.1093/bioinformatics/btw020.
- 108. Rzepiela A.J. Software news and update reconstruction of atomistic details from coarse-grained structures. J Comput Chem. 2010;31:1333–1343. doi: 10.1002/jcc.21415.
- 109. Zacharias M. Protein-protein docking with a reduced protein model accounting for side-chain flexibility. Protein Sci. 2003;12:1271–1282. doi: 10.1110/ps.0239303.
- 110. Shimizu M., Takada S. Reconstruction of atomistic structures from coarse-grained models for Protein-DNA complexes. J Chem Theory Comput. 2018;14:1682–1694. doi: 10.1021/acs.jctc.7b00954.
- 111. Heath A.P., Kavraki L.E., Clementi C. From coarse-grain to all-atom: Toward multiscale analysis of protein landscapes. Proteins Struct Funct Genet. 2007;68:646–661. doi: 10.1002/prot.21371.
- 112. Lombardi L.E., Martí M.A., Capece L. CG2AA: Backmapping protein coarse-grained structures. Bioinformatics. 2016;32:1235–1237. doi: 10.1093/bioinformatics/btv740.
- 113. Joosten R.P., Joosten K., Murshudov G.N., Perrakis A. PDB-REDO: Constructive validation, more than just looking for errors. Acta Crystallogr Sect D Biol Crystallogr. 2012;68:484–496. doi: 10.1107/S0907444911054515.
- 114. Peng J., Yuan C., Ma R., Zhang Z. Backmapping from multiresolution coarse-grained models to atomic structures of large biomolecules by restrained molecular dynamics simulations using bayesian inference. J Chem Theory Comput. 2019;15:3344–3353. doi: 10.1021/acs.jctc.9b00062.
- 115. London N., Raveh B., Schueler-Furman O. Peptide docking and structure-based characterization of peptide binding: from knowledge to know-how. Curr Opin Struct Biol. 2013;23:894–902. doi: 10.1016/j.sbi.2013.07.006.
- 116. Rodrigues J.P.G.L.M., Bonvin A.M.J.J. Integrative computational modeling of protein interactions. FEBS J. 2014;281:1988–2003. doi: 10.1111/febs.12771.
- 117. Nithin C., Ghosh P., Bujnicki J. Bioinformatics tools and benchmarks for computational docking and 3D structure prediction of RNA-Protein complexes. Genes (Basel). 2018;9:432. doi: 10.3390/genes9090432.
- 118. Joosten R.P. A series of PDB related databases for everyday needs. Nucleic Acids Res. 2011;39. doi: 10.1093/nar/gkq1105.
- 119. Touw W.G. A series of PDB-related databanks for everyday needs. Nucleic Acids Res. 2015;43:D364–D368. doi: 10.1093/nar/gku1028.
- 120. Sali A. Outcome of the first wwPDB hybrid/integrative methods task force workshop. Structure. 2015;23:1156–1167. doi: 10.1016/j.str.2015.05.013.
- 121. Berman H.M. Federating structural models and data: outcomes from a workshop on archiving integrative structures. Structure. 2019;27:1745–1759. doi: 10.1016/j.str.2019.11.002.
- 122. Shi Y. Structural characterization by cross-linking reveals the detailed architecture of a coatomer-related heptameric module from the nuclear pore complex. Mol Cell Proteomics. 2014;13:2927–2943. doi: 10.1074/mcp.M114.041673.
- 123. Shi Y. A strategy for dissecting the architectures of native macromolecular assemblies. Nat Methods. 2015;12:1135–1138. doi: 10.1038/nmeth.3617.
- 124. Chen Z.A. Structure of complement C3(H2O) revealed by quantitative cross-linking/mass spectrometry and modeling. Mol Cell Proteomics. 2016;15:2730–2743. doi: 10.1074/mcp.M115.056473.
- 125. Sailer C. Structural dynamics of the E6AP/UBE3A-E6-p53 enzyme-substrate complex. Nat Commun. 2018;9:4441. doi: 10.1038/s41467-018-06953-0.
- 126. Jishage M. Architecture of Pol II(G) and molecular mechanism of transcription regulation by Gdown1. Nat Struct Mol Biol. 2018;25:859–867. doi: 10.1038/s41594-018-0118-5.
- 127. Wang X. The proteasome-interacting Ecm29 protein disassembles the 26S proteasome in response to oxidative stress. J Biol Chem. 2017;292:16310–16320. doi: 10.1074/jbc.M117.803619.
- 128. Gutierrez C. Structural dynamics of the human COP9 signalosome revealed by cross-linking mass spectrometry and integrative modeling. Proc Natl Acad Sci U S A. 2020;117:4088–4098. doi: 10.1073/pnas.1915542117.
- 129. Robinson P.J. Molecular architecture of the yeast Mediator complex. Elife. 2015;4. doi: 10.7554/eLife.08719.
- 130. Chou H.T. The molecular architecture of native BBSome obtained by an integrated structural approach. Structure. 2019;27:1384–1394.e4. doi: 10.1016/j.str.2019.06.006.
- 131. Bender B.J. Structural model of ghrelin bound to its G protein-coupled receptor. Structure. 2019;27:537–544.e4. doi: 10.1016/j.str.2018.12.004.
- 132. Dai G., Aman T.K., DiMaio F., Zagotta W.N. The HCN channel voltage sensor undergoes a large downward motion during hyperpolarization. Nat Struct Mol Biol. 2019;26:686–694. doi: 10.1038/s41594-019-0259-1.
- 133. Leone V., Faraldo-Gómez J.D. Structure and mechanism of the ATP synthase membrane motor inferred from quantitative integrative modeling. J Gen Physiol. 2016;148:441–457. doi: 10.1085/jgp.201611679.
- 134. Harrer N. Structural architecture of the nucleosome remodeler ISWI determined from cross-linking, mass spectrometry, SAXS, and modeling. Structure. 2018;26:282–294.e6. doi: 10.1016/j.str.2017.12.015.
- 135. Kim S.J. Integrative structure and functional anatomy of a nuclear pore complex. Nature. 2018;555:475–482. doi: 10.1038/nature26003.
- 136. Goddard T.D. UCSF ChimeraX: Meeting modern challenges in visualization and analysis. Protein Sci. 2018;27:14–25. doi: 10.1002/pro.3235.