Key words: Docking, force field, peptide binding, peptide–protein interaction, machine learning, molecular dynamics simulation, scoring
Abstract
Peptides mediate up to 40% of protein interactions, and their high specificity and ability to bind in places where small molecules cannot make them potential drug candidates. However, predicting peptide–protein complexes remains more challenging than predicting protein–protein or protein–small molecule interactions, in part due to the high flexibility of peptides. In this review, we look at advances in docking, molecular simulations and machine learning that tackle peptide-related problems such as predicting structures, binding affinities or even kinetics. We focus on explaining the range of docking programmes and force fields used in molecular simulations, so that a prospective user can make an educated choice of modelling tool for their scientific question.
Introduction
Protein–protein interactions (PPIs) are a vital component of pathways regulating the behaviour of cells. In disease, some of these pathways become aberrant; therefore, identifying ways of inhibiting them is of great therapeutic importance. Inhibiting PPIs with small molecules is not always possible due to the large interface region, lack of binding cavities and specificity of the interaction. However, between 15 and 40% of PPIs (London et al., 2013) are mediated by peptide epitopes, giving rise to opportunities for peptide-based inhibition of PPIs. There is a growing market for peptide-based therapeutic agents (Martins et al., 2021; Wang et al., 2022), and there are already more than 60 peptide drugs approved in the United States (Usmani et al., 2017; Lau and Dunn, 2018). Peptides have well-known degradation pathways, lower toxicity than small molecules and are highly specific. Some of the challenges for peptide-based therapeutics, such as rapid degradation by proteases or limited ability to cross membranes (Fosgerau and Hoffmann, 2015), can be overcome by using modified amino acids and cyclisation techniques (Bechtler and Lamers, 2021). Others, such as the need for efficient delivery strategies (e.g., oral delivery), still limit broader interest (Ganesh et al., 2021).
The rational design of peptide-based therapeutics requires structural knowledge of peptide–protein complexes at the atomistic level. Experimental studies are challenging and expensive as peptides degrade fast, and the spectrum of target candidates is too broad to explore experimentally. While the use of computational pipelines is well-established in the early stages of drug discovery for small molecules, peptides present some unique challenges that have limited the success of computational pipelines for inhibitory peptide design. Structurally, peptides interact with proteins in different ways (Arkin et al., 2014): 1) as coils through specific amino acid interactions; 2) by adopting well-defined secondary structures (e.g. hairpins or helices) and 3) through discontinuous interactions along the peptide chain. In these interactions, peptide flexibility is important, as peptides are often intrinsically disordered in their free form and adopt well-defined structures upon binding – unlike small molecules, where flexibility is more limited.
The reader is referred to several excellent and exhaustive reviews on peptides as therapeutics, and the role of docking to identify PPIs (Fosgerau and Hoffmann, 2015; Ciemny et al., 2018; Apostolopoulos et al., 2021). In this review, we describe three major classes of computational methods that are routinely used to elucidate different aspects of PPIs: 1) docking; 2) molecular dynamics (MD) simulations and 3) machine learning approaches. Docking approaches have traditionally been the most successful at exploring the possible orientations and interaction sites between proteins and peptides. Their use for peptide systems has evolved from protein–small molecule and protein–protein docking tools, and they face different challenges when accounting for the highly flexible nature of peptides. Hence, these methods are described in most detail. MD methods draw from the wealth of enhanced simulation methods available in the literature. Their purpose is to provide detail about either the binding energy landscape or the binding mechanism. They are generally not high-throughput methods and complement studies where the bound structure is known. Finally, machine learning approaches are rapidly evolving, thanks to AlphaFold’s recent success in protein structure prediction. These methods have rapidly been ported to other applications such as peptide–protein structure prediction, with success rates that match those of the best performing docking programmes. Although different in methodology, the three classes of methods need to account for the challenges in sampling bound conformations as well as identifying them through some scoring function. Our goal with this review is to identify the current strategies to approach these challenges and provide a broad understanding of the advances and limitations in the field.
Docking section
Docking remains an efficient approach to sample bound conformations given the receptor structure and the ligand. Its success in small molecule-protein docking and its ease of use through webservers and standalone software have popularised this method for virtual screening in the early stages of drug discovery (Taylor et al., 2002). Some of this success has been translated into the protein–protein docking field, as seen from the evolution of predictions in the CAPRI (Critical Assessment of PRediction of Interaction) competition (Lensink et al., 2017, 2020). Despite these successes, peptide–protein docking remains a more challenging problem (Ciemny et al., 2018). The success of docking relies on the ability to sample bound conformations, and the ability to identify native-like poses using a scoring function. The flexible nature of peptides significantly enlarges the sampling problem with respect to small molecule docking and limits the sampling of native-like poses (Rentzsch and Renard, 2015). Similarly, this flexibility challenges the ability to transfer standard scoring functions to identify peptide–protein complexes. For example, despite the poly-aminoacidic nature of peptides, a straightforward application of protein scoring functions has limited success. Thus, modifications to protein–protein scoring functions are needed to correctly identify native-like poses (Agrawal et al., 2019; Weng et al., 2020). Innovation in the peptide docking field comes from strategies for more efficient handling of flexibility and overcoming limitations in scoring. Along with these improvements, the curation of peptide–protein databases is crucial for systematic testing, benchmarking and assessing these docking methods (Hauser and Windshügel, 2016). There are already excellent reviews and benchmark studies assessing the performance of different methods (Wang et al., 2016; Ciemny et al., 2018; Agrawal et al., 2019; A. C.-L. Lee et al., 2019; Weng et al., 2020). 
Hence, we will limit this section to the nature of the databases and different sampling/scoring strategies prevalent in the field.
Databases
Many peptide docking methods have evolved from either protein–protein or protein–small molecule docking tools. Their modification includes better handling of flexibility and specific scoring functions. To test their performance on peptide–protein complexes, several efforts distil the structural information from the PDB, identifying sets of peptide–protein complexes amongst the ~150,000 structures deposited in the RCSB-PDB (Berman et al., 2002). Thus, the emergence of databases for peptide–protein complexes streamlines the process and accelerates the advancement of the field. Several databases are available, each with a specific purpose in mind – ranging from properties such as length of the peptides to types of binding motifs. Here, we provide an overview of the widely used databases. Table 1 provides a quick reference summary.
Table 1.
Summary of the popular protein–peptide complex datasets that are widely used for testing and benchmarking different docking tools
Dataset | Number of complexes | Length of peptide | Special Features | Specific application | Availability |
---|---|---|---|---|---|
LEADS-PEP | 53 | 3–12 residues | Diverse sequence of peptides, complexes do not interact with nucleic acids | Due to smaller peptide size, suitable for testing tools adapted from small molecule docking tools | www.leads-x.org |
PeptiDB | 103 | 5–15 residues | Diverse secondary structure of peptides including conformational change upon binding, complexes with diverse biological functions | Suitable for testing tools that tackle peptide flexibility | RCSB code of the complexes: https://ars.els-cdn.com/content/image/1-s2.0-S096921260900478X-mmc1.pdf |
PPDbench | 133 | 9–15 residues | Diverse in terms of peptide sequences (<40% sequence similarity) and biological functionalities | Suitable for testing docking tools on complexes categorised by functionality | https://webs.iiitd.edu.in/raghava/ppdbench/ |
PepPro | 89 | 5–30 residues | Contains 58 unbound receptor structures | Useful for testing whether tools can predict apo-holo conformational change | http://zoulab.dalton.missouri.edu/PepPro_benchmark |
Propedia | ~20000 | 2–50 residues | Contains subsets of complexes based on clustering on different features such as sequence, interface structure or binding site | The broad range of peptide lengths allows different types of docking tools to be tested. The different subsets also give users flexibility in testing their tools | https://bioinfo.dcc.ufmg.br/propedia |
PixelDB | 1966 | NA | Uses machine learning to identify protein and peptide. This helps to overcome the issue of incorrectly identifying them when the peptide is larger than the receptor | The broad range of peptide lengths allows any docking tool to be tested | https://github.com/KeatingLab/PixelDB |
LEADS-PEP (Hauser and Windshügel, 2016) consists of 53 peptide–protein complexes, solved at resolutions below 2 Å, with peptides ranging from 3 to 12 amino acids. The entries originate from a clustering on sequence space, retaining only complexes that are diverse in terms of sequence and excluding those that interact with DNA or RNA. Due to the short length of the peptides in this database, it is suitable for benchmarking docking tools originating from small molecule docking programmes.
PeptiDB (London et al., 2010) is a non-redundant database of 103 high resolution peptide–protein complexes. These peptides are 5–15 amino acids long with diverse bound conformations (helix, β-strand and coil) and functionalities (such as signal-transduction, antibody binding, protein trafficking and transporting). The set includes complexes with a significant conformational change upon binding. These characteristics make PeptiDB appropriate for benchmarking docking tools that account for peptide flexibility.
The PPDbench (Agrawal et al., 2019) database contains 133 peptide–protein complexes with less than 40% sequence similarity and has been used to benchmark 6 common docking programmes. The set is diverse with respect to functionality, but the range of peptide lengths is narrower (9–15 amino acids). The benchmark study with this database by Agrawal et al. showed that different docking methods perform best on different classes of peptides, classified in terms of their functionality such as enzymatic, signalling and many others (Agrawal et al., 2019). In another study, Weng et al. created and used the PepSet (Weng et al., 2020) database to benchmark 14 docking programmes. This database contains 185 high resolution peptide–protein complexes with less than 30% sequence similarity and peptide lengths ranging from 5 to 20 amino acids.
Both LEADS-PEP and PeptiDB have been widely used for benchmarking sampling and scoring ability, but they are limited by peptide length. Peptides longer than 20 amino acids are very common in nature. The PepPro (Xu and Zou, 2020) database contains 89 non-redundant peptide–protein complexes with longer peptide sequences (5–30 residues) and diverse peptide secondary structures. As a useful feature, PepPro contains 58 structures of the unbound receptor proteins, making it ideal for benchmarking docking methods that predict apo to holo conformational changes.
While the above databases have a limited number of complexes, a few databases are more inclusive in their search parameters. For example, PepX (Vanhee et al., 2010) contains 1431 non-redundant complexes from the PDB with peptide sizes ranging from 5 to 35 amino acids and resolution below 2.5 Å. Some redundancy remains in the database, and clustering on the interaction interface reduces the number of complexes to 505 unique cluster centres. The PepBind (Das et al., 2013) database is built on similar principles to PepX, without accounting for sequence/structural redundancy, and contains 3100 protein–peptide complexes. While larger databases may help in assessing the applicability of docking methods to longer peptides, these databases are not well curated (e.g. complexes in the databases might contain non-interacting chains, small molecule ligands or ions, which might lead to erroneous assessment of docking tools (Wen et al., 2018)). To overcome this limitation, a curated database, PepBDB (Wen et al., 2018), was developed, containing nearly 13,000 complexes with peptides of up to 50 amino acids. A more recent database, Propedia (Martins et al., 2021), contains over 20,000 high resolution complexes with peptides ranging from 2 to 50 residues. Propedia features a hybrid clustering based on sequence, interface structure or binding site that retains a lower number of clusters (1,845, 1,891 or 1,466, respectively), giving the user flexibility for benchmarking purposes. Finally, PixelDB (Frappier et al., 2018) contains close to 2,000 high resolution and non-redundant complexes. Unlike previous databases, this one relies on a machine learning algorithm along with a chain length cutoff to identify the receptor and peptide in a complex. This overcomes the issue of defining protein and peptide in cases where the receptor is smaller than its peptide binder.
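The selection criteria that distinguish these databases (peptide length window and resolution cutoff) can be illustrated with a small filter. This is only a sketch under stated assumptions: the `Complex` record, its field names and the four example entries are hypothetical and do not correspond to the format or content of any of the databases above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    """Hypothetical record for one peptide-protein complex."""
    entry_id: str        # made-up identifier, not a real PDB code
    peptide_length: int  # number of peptide residues
    resolution: float    # crystallographic resolution in Angstrom


def select_benchmark(entries, min_len, max_len, max_resolution):
    """Keep complexes inside the peptide-length window and resolution cutoff."""
    return [
        e for e in entries
        if min_len <= e.peptide_length <= max_len
        and e.resolution <= max_resolution
    ]


# Made-up entries, for illustration only.
entries = [
    Complex("cplx_a", 4, 1.8),
    Complex("cplx_b", 12, 1.9),
    Complex("cplx_c", 28, 2.4),
    Complex("cplx_d", 9, 3.1),
]

# A short-peptide window similar in spirit to LEADS-PEP (3-12 residues, 2 A).
short_set = select_benchmark(entries, 3, 12, 2.0)
# A wider window similar in spirit to PepX (5-35 residues, 2.5 A).
long_set = select_benchmark(entries, 5, 35, 2.5)
```

A real curation pipeline would add further steps described in the text, such as redundancy removal by sequence or interface clustering, which are omitted here.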
Sampling
Docking methods can be classified in two major categories depending on their use of templates for modelling the complex (template-based and template-free docking).
Template-based docking
Template-based docking methods take advantage of known structures of either protein monomers or complexes and extract structural features to predict the unknown peptide–protein complex structure. We summarise these methods in Table 2 and provide further detail below. Template-based methods are grounded on the premise that the PPI interface is conserved and similar to either the PPI interface or the different interacting fragments in a protein. Based on conserved interfaces, these methods build a modelling scaffold for the target systems. Indeed, 80% of the peptide–protein interfaces can be derived from fragment interactions in monomeric proteins (Obarska-Kosinska et al., 2016). The first step in this class of methods is finding suitable templates for the target system in different databases. The most popular template-based docking method is GalaxyPepDock (H. Lee et al., 2015). It takes a peptide sequence and a receptor structure as inputs to search for structural similarity in the PepBind database. It then assigns a score (S_complex) to each hit in the database, calculated by combining the TM score of the receptor in the database with respect to the target receptor and an interaction similarity score based on the protein structure, peptide sequence and the interacting residue pairs (H. Lee et al., 2015; ‘Modeling Peptide–Protein Interactions, Methods and Protocols’, 2017). The top 10 scoring templates with scores higher than 90% of the maximum score in the database are selected. These templates are then used to build models with GalaxyTBM, predicting an estimated accuracy for each model. When tested on the PeptiDB database, it predicted 37 out of 57 complexes with acceptable or higher quality. In general, the GalaxyPepDock approach is quite reliable when the receptor TM score is greater than 0.7 (H. Lee and Seok, 2017). 
GalaxyPepDock takes 2–3 hours to complete a prediction, making it fast compared to methods that rely on computationally expensive molecular simulations. The recently developed InterPep2 (and InterPep2-Refined) also belongs to the template-based category, with a similar performance to GalaxyPepDock when tested on the unbound receptor set and slightly better performance when tested on the bound dataset (Johansson-Åkhe et al., 2020).
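The template selection rule described above (score every hit, keep at most 10 templates scoring above 90% of the best score) can be sketched as follows. The text does not give the formula that combines the TM score with the interaction similarity score, so the equal-weight sum used here is purely an assumption, as are the function name, the parameter names and the example hits; this is not the GalaxyPepDock implementation.

```python
def select_templates(hits, w_structure=0.5, top_n=10, fraction_of_max=0.9):
    """Rank template hits and apply the 'top N above a fraction of the maximum' rule.

    hits: iterable of (template_id, tm_score, interaction_similarity).
    The weighted sum below is an ASSUMED stand-in for S_complex.
    """
    scored = [
        (tid, w_structure * tm + (1.0 - w_structure) * sim)
        for tid, tm, sim in hits
    ]
    if not scored:
        return []
    best = max(score for _, score in scored)
    # Keep only templates scoring above the chosen fraction of the best score.
    kept = [(tid, s) for tid, s in scored if s > fraction_of_max * best]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_n]


# Hypothetical hits: (template id, receptor TM score, interaction similarity).
example_hits = [
    ("t1", 0.9, 0.8),
    ("t2", 0.8, 0.8),
    ("t3", 0.5, 0.6),
]
selected = select_templates(example_hits)
selected_ids = [tid for tid, _ in selected]
```

With these made-up numbers the third hit falls below 90% of the best score and is discarded, while the first two are passed on to model building.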
Table 2.
Summary of highlighted template-based docking tools
Tool | Input | Link to Server/ Standalone | Peptide Flexibility | Receptor Flexibility | Specific applications/ Best cases to apply on |
---|---|---|---|---|---|
GalaxyPepDock | Protein structure + peptide sequence | Server: https://galaxy.seoklab.org/cgi-bin/submit.cgi?type=PEPDOCK | Full flexibility at the refinement stage | Full flexibility at the refinement stage | Tested on the PepBind dataset. Predictions are reliable when templates can be found with TM score > 0.7 |
PepComposer | Binding site information | Server: http://biocomputing.it/pepcomposer/webserver | Sidechain rotamer and small change in backbone | Sidechain rotamer and small change in backbone | Suitable for small peptides; 50% success rate when tested on the LEADS-PEP dataset. Can also be used as an inhibitor peptide design tool. |
InterPep2-Refined | Protein structure + peptide sequence | Standalone: http://wallnerlab.org/InterPep2 | SC flexibility at the refinement stage | Full flexibility at the refinement stage | Predictions are reliable when templates can be found with TM score > 0.7. Overall performs slightly better than GalaxyPepDock |
PepComposer (Obarska-Kosinska et al., 2016) is an example of a tool that uses structural knowledge of complexes to design a novel peptide sequence and dock it to the given receptor. In the first step, this method finds a structurally similar fragment based on a given binding site and retrieves continuous backbone fragments from a structural database based on contacts to the prior fragment. In the next step, it predicts novel peptide sequences and bound complex structures using Monte Carlo moves embedded in a Python-based tool (PyRosetta (Chaudhury et al., 2010)). Testing the method on the LEADS-PEP dataset returned a 50% success rate considering only the top model. However, the caveat here is that the designed peptides are generally shorter than the native peptide, which decreases the RMSD (Root Mean Square Deviation) value (Obarska-Kosinska et al., 2016).
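Because RMSD is the quality measure used throughout the benchmarks in this section, a minimal sketch of its calculation may be useful. It assumes that model and native peptide coordinates are already expressed in the same frame (for example after superimposing the receptors) and that atoms are listed in matching order; no superposition is performed, and the function name and toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z).

    Assumes both coordinate sets are in the same reference frame and that
    atom i in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom of the model is displaced by 1 A along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(model, native)
```

Since the value is an average over the atoms compared, RMSD values obtained for peptides of different lengths are not directly comparable, which is the caveat noted above for shorter designed peptides.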
Template-Free Docking
This class of methods samples different peptide–protein orientations and positions as well as generates a diverse set of peptide conformations. Depending on the available knowledge of the binding site, these methods can be further divided into two subclasses: local and global docking. In local docking, the binding site is known, reducing search space. In global docking, no prior knowledge of the binding site is used, and the peptides explore the whole receptor surface (see Fig. 1). We will discuss these two subgroups in two different sections.
Fig. 1.
Pipeline in popular template-free docking methods. (A) Input peptide conformations are generated in 3 major ways: 1) a peptide builder is used to generate the 3 major conformations (alpha, polyproline II, extended); 2) molecular simulations are used to generate an ensemble of peptide conformations and 3) fragment pickers are used to select peptide fragments in the structural databases based on the peptide sequence. (B) If the binding site is known, peptides are guided towards the binding site (local docking); otherwise, peptides explore the whole protein surface (global docking). (C) Ensemble of docked poses. (D) Top scoring docked model representing the native structure.
Local Docking
As input, local docking methods need user-defined information about the binding site, restricting the search of the ligand to the vicinity of this region, so that only the peptide conformations and orientations need to be sampled. Success of this group of docking methods depends on the accuracy of the initial information (Ciemny et al., 2018). Knowledge of the binding site can come from diverse sources such as protein–protein complex interfaces, a docked pose from another docking tool, hotspot prediction or even experimental data. Each method has its own specific input and types of information it can handle, requiring the right fit between prior information and the local docking programme used. There are also limitations regarding how much sampling of the protein and peptide conformations is needed. As an example, HADDOCK uses Ambiguous Interaction Restraints based on hotspot residues (e.g. from NMR chemical shift perturbation data) on the protein surface to guide sampling – but requires different peptide conformations as input (e.g. alpha helix, extended and polyproline II) to limit sampling to the relative position of the protein/peptide without sampling the peptide conformations (Trellet et al., 2013, 2014; Geng et al., 2017). Once the preferred bound conformation is found, HADDOCK introduces peptide backbone flexibility to sample diverse conformations. HADDOCK has a 14.5% success rate when tested on the PeptiDB database. HPEPDOCK-local, using binding site hotspot information and shape complementarity followed by energy minimisation, produces fast and accurate predictions with a 33.9% success rate when tested under the same conditions as HADDOCK (P. Zhou et al., 2018; Johansson-Åkhe et al., 2019). HPEPDOCK relies on MODPEP to generate an ensemble of unbound peptide conformations. An advantage of HADDOCK over HPEPDOCK is its ability to handle ambiguous data for the binding site (Williamson, 2013; Deplazes et al., 2016). 
Finally, Rosetta FlexPepDock ab initio can also produce a diverse set of peptide conformations for binding, but this comes at a higher computational expense (Raveh et al., 2010, 2011).
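To illustrate how ambiguous binding-site data can guide sampling, the sketch below evaluates one ambiguous restraint through an effective distance that sums r^-6 contributions over all candidate atom pairs. This form is how such restraints are commonly described for HADDOCK-style docking, but it is stated here as an assumption rather than taken from the text above; the function names, the 2.0 Å default upper bound and the coordinates are illustrative only.

```python
import math


def effective_distance(atoms_a, atoms_b):
    """Effective distance (sum of d^-6 over all pairs)^(-1/6).

    atoms_a: (x, y, z) atoms of one hotspot residue on the receptor.
    atoms_b: (x, y, z) atoms of all candidate partner residues on the peptide.
    The result is never larger than the shortest pair distance, so a single
    close contact is enough to satisfy the restraint.
    """
    total = 0.0
    for a in atoms_a:
        for b in atoms_b:
            total += math.dist(a, b) ** -6
    return total ** (-1.0 / 6.0)


def restraint_violation(atoms_a, atoms_b, upper_bound=2.0):
    """Amount (in A) by which the effective distance exceeds the upper bound.

    The 2.0 A default is an assumed value for illustration.
    """
    return max(0.0, effective_distance(atoms_a, atoms_b) - upper_bound)


# Toy geometry: one receptor atom, two candidate peptide atoms.
receptor_atom = [(0.0, 0.0, 0.0)]
single_partner = [(3.0, 0.0, 0.0)]
two_partners = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
d_single = effective_distance(receptor_atom, single_partner)
d_double = effective_distance(receptor_atom, two_partners)
```

Because every candidate pair contributes to the sum, the restraint does not need to specify which residue makes the contact, which is what makes this kind of term tolerant of ambiguous experimental input.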
The second group of local docking methods is derived from small molecule docking programmes (such as AutoDock Vina (Rentzsch and Renard, 2015), GOLD (Verdonk et al., 2003) or Surflex-Dock (Spitzer and Jain, 2012) to name a few). Although no initial conformation of the peptide is required, the accuracy rapidly decays when sampling beyond 10 flexible bonds, limiting peptide size (Ciemny et al., 2018). In this approach, the peptide is placed in the binding site and peptide conformations are sampled using either Monte Carlo moves (AutoDock Vina and GOLD) or rotamer libraries (Surflex-Dock). AutoDock is most reliable with peptide lengths between 2 and 4 amino acids. Several methods have been developed to tackle longer peptides through incremental docking approaches (e.g. DINC 2.0 (Antunes et al., 2017), DLPepDock (Sun et al., 2021)). The incremental pipeline in DINC 2.0 has several stages: 1) dock a small fragment (preferably 6 rotatable bonds, roughly 2 amino acids) with AutoDock 4; 2) increase the peptide size by adding 3 more rotatable bonds and freezing 3 of the 6 previous rotatable bonds and 3) dock the peptide again (Antunes et al., 2017). Using this approach, DINC 2.0 has been successful with up to 25 flexible bonds. The selection of the initial fragment is based on heuristics, while the extension of the fragment follows the potential to maximise H-bonding with the receptor. The benchmark test included a custom dataset of 73 protein–peptide complexes with multiple successes, including the docking of a B2 chicken MHC class I receptor and an 8-mer chicken peptide (1.61 Å RMSD from the native structure). The Glide SP-PEP method also uses fragment-based docking with Iterative Residue Docking and Linking to dock peptides smaller than 8 amino acids (Diharce et al., 2019). It uses Glide’s SP-PEP module to dock each residue iteratively to the binding site and then uses the covalent module to create bonds between them. 
The success rate for this method was high in a custom-made benchmark set with 10 out of 11 successful docking examples.
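The incremental scheme described for DINC 2.0 (start with about 6 rotatable bonds, then repeatedly add 3 new bonds while freezing 3 of the previous 6) can be sketched as a schedule of active and frozen bonds per docking round. This is a bookkeeping sketch only: which of the previous bonds stay active is chosen here by a simple "most recently added" rule, which is an assumption, and the actual docking call and the heuristic fragment selection are not modelled.

```python
def incremental_schedule(total_bonds, initial=6, step=3):
    """Return one dict per docking round with the active and frozen bond indices.

    Bonds are indexed 0..total_bonds-1 in the order in which the fragment grows.
    Keeping the most recently added bonds active is an ASSUMED rule.
    """
    if total_bonds <= 0 or initial <= 0 or step <= 0:
        raise ValueError("bond counts must be positive")
    grown = min(initial, total_bonds)
    active = list(range(grown))
    rounds = [{"active": active, "frozen": []}]
    keep_n = max(initial - step, 0)  # previously active bonds that stay flexible
    while grown < total_bonds:
        new_end = min(grown + step, total_bonds)
        kept = active[-keep_n:] if keep_n else []
        active = kept + list(range(grown, new_end))
        frozen = [b for b in range(new_end) if b not in active]
        rounds.append({"active": active, "frozen": frozen})
        grown = new_end
    return rounds


# A hypothetical peptide with 12 rotatable bonds needs three docking rounds.
schedule = incremental_schedule(12)
```

The point of the scheme is that the number of simultaneously flexible bonds stays at or below 6 in every round, inside the regime where the underlying small molecule docking engine is reliable, while the total number of bonds in the docked peptide keeps growing.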
The third group of local docking methods can be termed refinement methods (DynaDock (Antes, 2010), PepCrawler (Donsky and Wolfson, 2011), Rosetta FlexPepDock (Raveh et al., 2010)) or peptide inhibitor design methods (PepCrawler) rather than strictly docking tools. These methods need input structures of either coarse peptide–protein complexes (for refinement) or protein–protein complexes (for inhibitor design) (Ciemny et al., 2018). In the first step, DynaDock generates broad sampling of the peptide conformation at the binding site by performing random rotations of backbone and sidechain torsions. In the next step, it uses an MD-based (OPMD) refinement of the bound modes, which allows full flexibility of the receptor (Antes, 2010). Rosetta FlexPepDock uses Monte Carlo moves to sample diverse peptide conformations with full receptor flexibility and on-the-fly energy minimisation (Raveh et al., 2010). Unlike these two methods, PepCrawler can be used in two ways – refinement and inhibitor peptide design. For refinement, it uses an initial protein–peptide complex structure as input and samples a diverse range of peptide conformations with a Rapidly exploring Random Trees (RRT) based algorithm. To design inhibitor peptides, it first uses the provided protein–protein complexes to generate the peptide fragment with the lowest binding energy, and then applies the RRT algorithm as a refinement step to dock diverse peptide conformations. PepCrawler allows sidechain flexibility for both the peptide and the protein, but backbone flexibility only for the peptide (Donsky and Wolfson, 2011). This group of methods produces the best results for short peptides (< 15 amino acids) and when the initial conformation of the peptide is below 5 Å RMSD from the native protein–peptide complex (Ciemny et al., 2018).
Some of the popular local docking tools are summarised in Table 3, together with suggested applications where each method is most successful. The first two groups of docking methods can be used when we only have binding site information and do not have any structural information on the complex. They are often used to generate initial models which can later be refined with other methods. Tools such as PeptiMap (Lavi et al., 2013), PepSite (Trabuco et al., 2012), PEP-SiteFinder (Saladin et al., 2014), SPRINT-str (Taherzadeh et al., 2017), ANCHORSmap (Ben-Shimon and Eisenstein, 2010) or InterPep (Johansson-Åkhe et al., 2019) can predict the binding site when no other structural information is available. The most recent method, InterPep, uses template-based knowledge and a machine learning-based model to predict the binding site, outperforming most of the other existing tools (Johansson-Åkhe et al., 2019). Experiments such as Chemical Shift Perturbation, alanine scan mutagenesis or ligand footprinting mass spectrometry provide information about the binding site and can be used as an alternative to binding site predictors.
Table 3.
Summary of highlighted ‘local docking’ tools. Here, acronyms are used as follows: Pstr, protein structure; pseq, peptide sequence; pconf, initial peptide conformation; BB, backbone; SC, sidechain
Tool | Input | Link to server/standalone | Receptor flexibility | Peptide flexibility | Specific applications/best cases to apply on |
---|---|---|---|---|---|
HADDOCK | Pstr+pconf ensemble + ambiguous information of binding site | Server: https://wenmr.science.uu.nl/haddock2.4/submit/1 | SC are flexible but can be extended to the BB of the provided binding site residues | Fully flexible | HADDOCK can use ambiguous information about binding residues on protein and/or peptide. Reliable when there is no significant peptide conformational change upon binding. |
HPEPDOCK-local | Pstr + pconf ensemble + information of binding site | Server: http://huanglab.phys.hust.edu.cn/hpepdock/ | Not flexible | Flexibility is considered by generating an ensemble of peptide conformations | 34% success rate on the PeptiDB database compared to HADDOCK’s 14.5%, but needs accurate information on the binding residues |
AutoDock Vina | Pstr+pseq+Binding site coordinate | Standalone: https://github.com/ccsb-scripps/AutoDock-Vina | SC flexibility is default but can be extended to the BB | Fully flexible | Reliable when binding peptide length is less than 5 residues |
DINC 2.0 | Pstr + pconf + Binding site coordinate | Server: http://dinc.kavrakilab.org | No flexibility | Fully flexible | AutoDock based method with fragmentation of peptide. This allows it to tackle peptides up to 8 residues |
PepCrawler | Initial coarse protein–peptide complex with peptide at the binding site/protein–protein complex | Server: http://bioinfo3d.cs.tau.ac.il/PepCrawler/php.php | SC flexibility | Fully flexible | Can be used as a refinement method. Predictions are reliable when the starting model is within 5 Å RMSD of the experimental structure, and the peptide is shorter than 15 residues |
Rosetta FlexPepDock | Initial coarse protein–peptide complex with peptide at the binding site | Server: https://www.sciencedirect.com/science/article/pii/S1359644617305937#bib0165 | SC flexibility but can be extended to the BB | Fully flexible | Can be used as a refinement method. Predictions are reliable when the starting model is within 5 Å RMSD of the experimental structure, and the peptide is shorter than 15 residues |
Global Docking
This class of template-free docking programmes becomes especially useful when there is no information about the binding site (see Fig. 1). It is the most general class of methods, as it requires the least information from the user. However, the additional computational effort required to simultaneously sample the binding site as well as peptide conformations limits the success rate when compared to the previous classes of methods. Many of these approaches use a two-step procedure composed of a fast rigid docking stage to identify the bound state followed by a refinement strategy. Thus, the local docking methods described above can be used as part of the refinement strategy.
There are several strategies to generate initial peptide conformations for the rigid docking stage: 1) methods such as MDockPeP (Yan et al., 2016), MDockPeP2 (Xu and Zou, 2022) and Cluspro PeptiDock (Porter et al., 2017) use a MODELLER (Webb and Sali, 2014) based algorithm, and PIPER-FlexPepDock (PFPD) (Alam et al., 2017) uses the Rosetta fragment picker, to extract a fragment with similar sequence from an interacting partner of a protein–protein complex. MDockPeP2 additionally considers the similarity of the physicochemical environment of the binding interface along with sequence similarity in the fragment picking stage; 2) pepATTRACT (Schindler et al., 2015; Vries et al., 2017) threads the sequence through three major secondary conformations (i.e. alpha, beta and coil) using PeptideBuilder (Tien et al., 2013); and 3) HPEPDOCK global uses MODPEP to generate an ensemble of peptide conformations (P. Zhou et al., 2018). Once the peptide conformation is identified, each method relies on its own rigid docking strategy. Cluspro PeptiDock and PFPD use a PIPER-based protocol (Kozakov et al., 2006); MDockPeP uses a modified version of AutoDock Vina, whereas MDockPeP2 uses ZDock (a protein–protein docking tool) to carry out the rigid docking step (Xu and Zou, 2022; Yan et al., 2016); pepATTRACT uses ATTRACT to carry out rigid body docking of the peptide with the ATTRACT coarse-grained representation of the protein and peptide; and HPEPDOCK performs rigid body docking with a version of MDock modified to suit protein–peptide systems. At the end of the rigid docking step, these methods have their own ways to include flexibility in the system and refine the docked structures. These strategies include using local docking methods, MD or MC simulations or other energy minimisers. 
For instance, PFPD uses Rosetta FlexPepDock; pepATTRACT uses iATTRACT and AMBER MD simulation for refinement and HPEPDOCK uses a SIMPLEX energy minimiser as the fully flexible refinement step (Schindler et al., 2015; Alam et al., 2017; P. Zhou et al., 2018).
Methods like CABS-dock (Kurcinski et al., 2015, 2019), AnchorDock (Ben-Shimon and Niv, 2015) and AutoDock CrankPep (ADCP) (Zhang and Sanner, 2019) allow peptide flexibility during the whole docking process. CABS-dock generates peptide conformations in explicit solvent in the presence of the interacting partner, allowing the peptide to adopt its bound conformation. This allows full flexibility on both the receptor and the peptide side. For computational efficiency, it uses a coarse-grained representation of each amino acid in which the backbone and the sidechain are each represented by two pseudo atoms (Kurcinski et al., 2015, 2019). The limitation in this case is the need for the peptide’s secondary structure, which is not available in most cases. PSIPRED (McGuffin et al., 2000) is used when the secondary structure is not known, even though it is not ideal for predicting the secondary structure of peptides, which are typically intrinsically disordered in their free form (Yan et al., 2016). Unlike other global docking tools, AnchorDock uses the prediction tool ANCHORSmap to identify the anchoring spot and then performs an anchor-guided MD simulation (Ben-Shimon and Niv, 2015). This strategy combines the speed-up afforded by the restraints with full flexibility of the peptide/protein system. A more recent method, ADCP, uses Monte Carlo moves to sample peptide conformations under the influence of the potential landscape generated by the receptor; this helps the peptide find the correct fold upon binding and makes the method highly successful at fully flexible docking (Zhang and Sanner, 2019).
Methods which use molecular simulations (extensive MD or MC), either in the docking stage or in the refinement stage, generally have higher accuracy than the other methods (Agrawal et al., 2019; J. Wang et al., 2019; Weng et al., 2020). For example, despite the global nature of pepATTRACT, it can be as successful as some of the local docking approaches, with a local version of the method (pepATTRACT-local) having a higher success rate (Schindler et al., 2015). PeptiDock+GaMD, which refines Cluspro PeptiDock results with Gaussian accelerated molecular dynamics (GaMD; discussed in the MD section), performs significantly better than the traditional Cluspro PeptiDock (J. Wang et al., 2019). AnchorDock correctly predicted 10 of 13 complexes (RMSD <2.2 Å) in a custom dataset (Ben-Shimon and Niv, 2015). ADCP has shown better performance than most other existing docking tools for peptides of 16–20 residues, with an 87% success rate (considering 10 models) when tested on the LEADS-PEP dataset (Zhang and Sanner, 2019). Finally, PFPD, considered one of the state-of-the-art methods, can produce near-native complex structures in 70% of cases for bound test sets and 40% for unbound test sets (Alam et al., 2017). However, due to the nature of the simulations, these methods are significantly slower than the others. The computational resources needed to maintain accuracy increase with peptide length (e.g. simulations running for hours on GPUs or longer on CPUs) (Ciemny et al., 2018). That these expensive simulations are needed to achieve the higher accuracy is shown by the pepATTRACT web server, which removes the refinement step (Vries et al., 2017). On the PeptiDB benchmark set, the web server predicts 14 out of 80 complexes correctly (within 2 Å RMSD of the experimental structure), whereas the full pepATTRACT version correctly predicts 38 complexes (Vries et al., 2017).
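The success rates quoted above all follow the same recipe: a complex counts as a success if its best model lies within an RMSD cutoff of the experimental structure. A minimal sketch of that bookkeeping is given below. The function name, the use of an inclusive cutoff and the per-complex RMSD values are illustrative assumptions; only the counts 14, 38 and 80 and the 2 Å cutoff come from the text.

```python
from typing import Sequence


def success_rate(best_rmsds: Sequence[float], cutoff: float = 2.0) -> float:
    """Fraction of complexes whose best model lies within `cutoff` (in Å) of the native."""
    if not best_rmsds:
        raise ValueError("need at least one complex")
    hits = sum(1 for rmsd in best_rmsds if rmsd <= cutoff)
    return hits / len(best_rmsds)


# Counts quoted in the text for the PeptiDB benchmark (80 complexes, 2 Å cutoff).
N_COMPLEXES = 80
server_rate = 14 / N_COMPLEXES  # web server, refinement step removed
full_rate = 38 / N_COMPLEXES    # full protocol, with refinement

# Hypothetical per-complex best-model RMSDs, for illustration only.
example_rmsds = [0.8, 1.9, 2.4, 5.1]
example_rate = success_rate(example_rmsds)
```

Expressed this way, the reported counts correspond to 17.5% for the web server and 47.5% for the full protocol.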
Recently, the Furman lab introduced patchMAN, a motif search method (using the MASTER algorithm) combined with Rosetta FlexPepDock refinement, which outperforms other methods (Khramushin et al., 2022). In the first step, it searches a non-redundant protein database for the receptor surface motifs; it then finds peptide templates that interact with these motifs, and the target peptide sequence is threaded through the models. Finally, Rosetta FlexPepDock is used to refine all the models. This method allows full flexibility of both the receptor binding site and the peptide. When tested on the custom-made PFPD dataset, it outperforms PFPD and even the recent machine learning-based AlphaFold (Jumper et al., 2021), taking a 2 Å RMSD cutoff from the native structure as the success criterion. However, on a different dataset (LNR), its performance is comparable to AlphaFold (Khramushin et al., 2022). Table 4 summarises the global docking tools, listing their features and suitable application cases.
Table 4.
Summary of highlighted ‘global docking’ tools. Here, acronyms are used as follows: Pstr, protein structure; pseq, peptide sequence; BB, backbone; SC, sidechain
Tool | Input | Link to Server/standalone | Peptide Flexibility | Receptor Flexibility | Specific applications/ Best cases to apply on |
---|---|---|---|---|---|
MDockPeP | Pstr+pseq | Server: https://zougrouptoolkit.missouri.edu/mdockpep/ | Small change in conformation at the refinement stage | Full flexibility at the refinement stage | Performs well on smaller peptides with <15 residues |
MDockPeP2 | Pstr+pseq | Standalone: https://zougrouptoolkit.missouri.edu/mdockpep2/download.html | Fully flexible | Fully flexible at the refinement stage | Can be applied to peptides of up to 29 residues, but the success rate decreases beyond 15 residues |
AnchorDock | Pstr+pconf | Not available | Fully flexible | Fully flexible | Uses expensive molecular simulations. Suitable for large peptides (>15 residues) that show conformational changes |
pepATTRACT | Pstr+pseq | Server: https://bioserv.rpbs.univ-paris-diderot.fr/services/pepATTRACT/ | Fully flexible in the full pepATTRACT version but no flexibility in the web server | Fully flexible in the full pepATTRACT version but the server just uses 3 major peptide conformations to dock | Full version uses expensive molecular simulations. Suitable for large peptides (>15 residues) that show conformational changes. Web version is useful for smaller peptides |
CABS-dock | Pstr+pseq+Bound peptide secondary structure (optional) | Server: http://biocomp.chem.uw.edu.pl/CABSdock | Fully flexible at the peptide conformation generation stage | Fully flexible | Suitable when bound peptide conformation is known |
PIPER-FlexPepDock | Pstr+pseq | Server: http://piperfpd.furmanlab.cs.huji.ac.il | Fully flexible at the refinement stage | Fully flexible at the refinement stage | Uses expensive molecular simulations. Suitable for large peptides (>15 residues) that show conformational changes |
AutoDock CrankPep | Pstr+pseq | Standalone: https://github.com/ccsb-scripps/ADCP | Fully flexible | Fully flexible | Uses expensive molecular simulations. Suitable for large peptides (>15 residues) that show conformational changes |
patchMAN | Pstr+pseq | Server: https://furmanlab.cs.huji.ac.il/patchman/ | Fully flexible at the refinement stage | Fully flexible at the refinement stage | Most successful when tested on the custom-made PFPD dataset, outperforming AlphaFold |
Scoring
At the sampling step, docking methods obtain an ensemble of docked poses: some of them are native-like, while others are far from native. The state of the art in peptide docking is reliable at sampling the correct binding site. For example, when we only consider sampling efficiency, MDockPeP has a success rate of 95% when starting from bound conformations and 93% when starting from the more challenging unbound structures (Yan et al., 2016). The recent method patchMAN can sample within 5 Å RMSD of the native complex in 100% of cases (Khramushin et al., 2022). This implies that the current limitations and overall success of docking tools can largely be attributed to the scoring stage. Thus, the next crucial step is to find the best docked model, representing the native complex, in the ensemble of docked poses (Ciemny et al., 2018; Weng et al., 2019) (see Fig. 1). Scoring methods can be classified into several major groups: knowledge-based scoring functions, energy-based methods, clustering-based methods and integrative or combined approaches. One important feature of a good scoring method is that it should consider the entropic contribution due to conformational change as well as the interaction energy.
A series of scoring functions have been used successfully in the small molecule and protein–protein docking fields. End-point methods like MM/PBSA and MM/GBSA, mentioned in the MD section, are widely used for small molecule binding free energy calculation and scoring (Hou et al., 2011; Pu et al., 2017; E. Wang et al., 2019; Weng et al., 2019). When applied to peptide–protein systems with appropriate parameters, these methods outperform pepATTRACT (which uses the ATTRACT scoring function) and produce results of similar quality to HPEPDOCK-local (which uses an iterative knowledge-based scoring function coming from the protein–protein docking tool MDock) (Weng et al., 2019). However, these methods do not consider entropic contributions due to peptide conformational changes, limiting their success to binding processes without significant changes in the peptide conformation. Ideally, these methods should be modified or combined with others for generalised use (Spiliotopoulos et al., 2016; Tao et al., 2020). As an improvement, BiPPred and HADDOCK use a dampened version of MM/PBSA, named the dMMPBSA algorithm, as a scoring function to calculate the free energy and rank docked poses (Spiliotopoulos et al., 2016). In this approach, the Coulombic interaction and polar solvation terms are reduced by a factor of 5 to compensate for the overestimation of the free energy due to the omission of entropy. Another recent approach by H. Tao et al. combines the MM-GBSA scoring function with the knowledge-based scoring function ITScorePP to account for the conformational entropy (Tao et al., 2020). ITScorePP is derived from atomic distance-based energies parametrised iteratively using statistical mechanics. Their work has shown that rescoring the clustered poses with this combined scoring function gives significantly better results than pepATTRACT, CABS-dock and HPEPDOCK on the LEADS-PEP dataset.
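The dampening idea described above can be illustrated with a short sketch. It assumes a simple additive decomposition into van der Waals, Coulombic, polar solvation and non-polar solvation terms; that decomposition, the class and function names and the numerical values are illustrative assumptions and not the published dMMPBSA implementation. Only the factor of 5 applied to the Coulombic and polar solvation terms is taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EnergyTerms:
    """Per-pose energy components in kcal/mol (field names are illustrative)."""
    vdw: float            # van der Waals interaction energy
    coulomb: float        # Coulombic interaction energy
    polar_solv: float     # polar solvation free energy
    nonpolar_solv: float  # non-polar solvation free energy


def dampened_score(terms: EnergyTerms, damping: float = 5.0) -> float:
    """Sum the terms after dividing the Coulombic and polar solvation parts by `damping`."""
    return terms.vdw + terms.nonpolar_solv + (terms.coulomb + terms.polar_solv) / damping


def rank_poses(poses: Dict[str, EnergyTerms]) -> List[str]:
    """Return pose names ordered from most to least favourable dampened score."""
    return sorted(poses, key=lambda name: dampened_score(poses[name]))


# Hypothetical energy components for two docked poses.
poses = {
    "pose_a": EnergyTerms(vdw=-40.0, coulomb=-100.0, polar_solv=110.0, nonpolar_solv=-5.0),
    "pose_b": EnergyTerms(vdw=-35.0, coulomb=-150.0, polar_solv=120.0, nonpolar_solv=-4.0),
}
ranking = rank_poses(poses)
```

With these made-up numbers the undamped sums differ by 34 kcal/mol, whereas the dampened scores differ by only 2 kcal/mol, which shows how the scaling reduces the weight of the electrostatic terms in the ranking.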
A group of methods apply clustering algorithms, based on structural RMSD, to the ensemble of docked poses (or filtered docked poses), but each has its own way of using the clusters to select structures. Cluspro PeptiDock assigns the highest score to the medoid of the most populated cluster, whereas CABS-dock selects the consensus medoid obtained from different clustering protocols as the best model (Kurcinski et al., 2015, 2019; Porter et al., 2017). Rosetta FlexPepDock and PFPD perform clustering and score the top clusters with a modified Rosetta ab-initio energy function (Alam et al., 2017; Raveh et al., 2010). The modified version of the Rosetta ab-initio energy function has been shown to be successful, as it combines the standard all-atom Rosetta energy with the internal peptide energy and the interaction energy. AnchorDock also applies a clustering algorithm to all snapshots from the molecular simulation trajectories and scores the clusters based on the average potential energy of the best 15 models (in terms of binding energy) in each cluster (Ben-Shimon and Niv, 2015). pepATTRACT’s ATTRACT scoring function, based on a modified Lennard-Jones function, is used to select 1000 models, which are further refined with AMBER, followed by clustering of the simulated trajectory. The clusters are ranked based on the average ATTRACT energy of the four lowest-energy models (Schindler et al., 2015).
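Several of the schemes above rank clusters by the mean energy of their few best members (15 models in one case, four in another). The sketch below shows that ranking step only, assuming the clustering has already been done and each cluster is given as a list of energies; the function name, the cluster labels and the energy values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_clusters(cluster_energies: Dict[str, List[float]], n_best: int) -> List[Tuple[str, float]]:
    """Rank clusters by the mean energy of their `n_best` lowest-energy members."""
    ranked = []
    for name, energies in cluster_energies.items():
        if not energies:
            continue  # an empty cluster cannot be scored
        lowest = sorted(energies)[:n_best]
        ranked.append((name, mean(lowest)))
    # Lower (more negative) average energy ranks first.
    return sorted(ranked, key=lambda item: item[1])


# Hypothetical energies (arbitrary units) for three clusters of docked poses.
clusters = {
    "c1": [-10.0, -9.0, -8.0, -2.0, -1.0],
    "c2": [-12.0, -3.0, -2.0, -1.0],
    "c3": [-9.5, -9.5, -9.0, -9.0, -8.0],
}
ranking = rank_clusters(clusters, n_best=4)
```

In this example the cluster containing the single lowest-energy pose (c2) ranks last, because averaging over several members rewards clusters that are consistently low in energy rather than those with one outlier.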
Integrative scoring methods, which combine external information such as agreement with co-evolutionary or mutagenesis data with energy-based or clustering-based scoring, have performed very well in recent CAPRI competitions (Yu et al., 2017; Lensink et al., 2020). Machine learning has recently become one of the methods of choice to derive scoring functions. InterPepRank (Johansson-Åkhe et al., 2021) is one such example, which uses a deep graph-based neural network to map the protein–peptide complex, and DockQ (Johansson-Åkhe et al., 2020) uses a Random Forest model to score and rank the docked poses. Recently, the docking field has started to combine multiple scoring methods so that they compensate for each other’s limitations. For example, when InterPepRank is added to the PFPD pipeline as a rescoring step, the success rate increases significantly (40% for high-quality predictions). At the same time, an InterPepRank score cutoff filters out some of the non-native docked poses, reducing the number of hits passed to the refinement steps and thereby increasing the computational efficiency (Johansson-Åkhe et al., 2021).
Summary
The wealth and diversity of available docking programmes and servers for peptide–protein complexes pose a barrier to entry for newcomers to the field and would-be users. It is difficult to answer a question like ‘which is the best?’, as the answer depends on the system and the available information. Each method has advantages and disadvantages, such as the ability to work on peptides of different sizes or the ability to explore large conformational changes upon binding. In general, local docking tools outperform global tools, but the latter do not need any information about the binding site. Most of the global docking tools can be used as local docking tools when binding site information is provided, leading to higher success rates (Schindler et al., 2015; P. Zhou et al., 2018). A benchmark study by Weng et al. showed that the performance of local docking methods based on AutoDock drops significantly when the size of the binding site is specified incorrectly, especially for peptides longer than 10 residues (Weng et al., 2020). In summary, when a template with enough homology to the target receptor can be found (TM score > 0.7), template-based methods generally offer the highest success rates. Alternatively, when the TM score is lower than 0.7, template-free methods should be used. For peptides smaller than 5 residues, the docking tools coming from small molecule docking work best (Ciemny et al., 2018). For longer peptides, computationally expensive molecular simulation-based tools such as PFPD, ADCP and patchMAN become more relevant. Short MD simulation-based refinements have shown higher success rates; however, the computational expense also rises. In future, coarse graining of the non-interacting residues might be a way to reduce the computer time required for these molecular simulation-based tools.
Moreover, the docking field has started to apply combined approaches, such as combining predictions from InterPep2-refined and PFPD, which has shown better performance than either method individually (Johansson-Åkhe et al., 2020).
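The rules of thumb in this summary can be collected into a small helper, sketched below. This only paraphrases the guidance in the text and is not a validated selection procedure: the function name, the order in which the rules are applied and the `long_peptide_min` threshold (taken here from the ‘>15 residues’ entries in Table 4) are assumptions.

```python
from typing import List, Optional


def suggest_docking_strategy(
    peptide_length: int,
    template_tm_score: Optional[float] = None,
    binding_site_known: bool = False,
    long_peptide_min: int = 15,  # assumed threshold, borrowed from Table 4
) -> List[str]:
    """Collect the summary's rules of thumb into a list of suggestions."""
    suggestions = []
    if template_tm_score is not None and template_tm_score > 0.7:
        suggestions.append("template-based docking")
    else:
        suggestions.append("template-free docking")
        # Local tools generally outperform global ones when the site is known.
        suggestions.append("local docking" if binding_site_known else "global docking")
    if peptide_length < 5:
        suggestions.append("small-molecule docking tools")
    elif peptide_length > long_peptide_min:
        suggestions.append("simulation-based tools (e.g. PFPD, ADCP, patchMAN)")
    return suggestions


# Hypothetical queries.
short_with_template = suggest_docking_strategy(4, template_tm_score=0.8)
long_without_template = suggest_docking_strategy(20, template_tm_score=0.5, binding_site_known=True)
```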
MD section
In principle, MD simulations provide thermodynamic, kinetic and mechanistic understanding of the peptide–protein binding and unbinding process. As in applications to other biological problems, MD is limited by the accuracy of the physics model used and by the ability to sample the complex energy landscape, which typically requires computational resources beyond our current capacity.
Modelling PPIs with modern force fields
The interactions of molecules can in theory be determined using quantum mechanics, but this remains unaffordable in practical terms for large biomolecules. In practice, an empirical force field is used to model such interactions, along with Newton’s equations of motion to simulate the dynamics. MD simulations have been shown to accurately predict the binding potency of diverse small molecule binders (L. Wang et al., 2015). However, there are key differences between protein binding to small molecules and to peptides that make the latter more computationally challenging. First, while small molecules have a few hotspot interactions that dominate recognition, peptide recognition can stem from many weak interactions. Second, peptides can be highly flexible, requiring finely tuned parameters that are sensitive to the change in conformational preferences between the free and bound forms of the peptide. Thus, peptides have been especially affected by known biases in secondary structure preferences present in some force fields (Perez et al., 2015; Robinson et al., 2016).
Early force field development was an art, guided by great scientific insights (e.g. some parameters originated from ‘guesses’ that have remained part of the force field for decades), and carried out by a few expert groups. One of the challenges is the unexpected consequences of parameter changes: because of the coupling between different terms, modifying one parameter might affect the accuracy of another parameter that was not adjusted. Even long MD simulations on a set of systems, with time series, distributions, behaviour and stability analysis, might not be enough to capture all possible issues. Some issues might arise on timescales beyond those studied during development or in systems not included in the benchmark test. While some groups might have made ad hoc modifications to deal with problems in specific systems, these modifications were often not properly benchmarked and rarely made it back into the main force field branch. This trend has changed dramatically in recent years, as measured by the number of force fields as well as the number of groups contributing these improvements. Despite this, a general problem is the lack of a gold standard benchmark set for parameter development (e.g. for proteins, nucleic acids, lipids). The availability of open sharing resources would make the preparation and dissemination of such a benchmark test an easy endeavour. Community efforts such as OpenFF are already under way for the continuous optimisation of small molecule force fields (Qiu et al., 2021).
Here, we focus on recent efforts to improve the description of peptides and intrinsically disordered proteins (IDPs) while maintaining the stability of folded proteins (Fig. 2); more extensive surveys can be found elsewhere (Rahman et al., 2020; Mu et al., 2021). Force field development follows three main strategies: 1) modifying the dihedral angle parameters against experiments and/or quantum mechanical calculations; 2) adjusting non-bonded protein–water interactions and 3) balancing dihedrals with grid-based energy correction maps (CMAP).
Fig. 2.
Overview of protein force field development after 2000. Each protein force field is classified by the year of publication and the target systems for optimisation (folded, disordered or both), with additional underscores indicating whether it is a modified version of a previous force field, using strategies including dihedral parameter adjustment (blue), CMAP correction (red) or parameter modification for protein–water interaction (gold).
Most effort is concentrated on directly refining the global backbone dihedral parameters, as in AMBER ff99SB* and ff03* (Best and Hummer, 2009), ff14SB (Maier et al., 2015), ff99SB-ILDN (Lindorff-Larsen et al., 2010), CHARMM22* (Piana et al., 2011), OPLS-AA/M (Robertson et al., 2015) and OPLS3 (Harder et al., 2016), while further improvements also involve refined side-chain dihedral terms. The residue-specific force field (RSFF) approach involves residue-specific dihedral parameter refitting to achieve better agreement with experimental data. The RSFF1 force field (Jiang et al., 2014) was developed based on OPLS/AA (Kaminski et al., 2001), while RSFF2 (C.-Y. Zhou et al., 2015) is based on the ff99SB force field (Hornak et al., 2006). Protein–water interactions are actively involved in the peptide–protein binding process, where the ensemble of peptide conformations in the free form is highly sensitive to the solvation model used (e.g., overall compactness as defined by the radius of gyration). Not surprisingly, AMBER ff03ws combined with a refined TIP4P/2005 solvation model (Best et al., 2014) and CHARMM36m with an optimised TIP3P model (Huang et al., 2017) have shown improvement in modelling IDP ensembles. While some solvent models are designed to work with specific protein force fields, others such as TIP4P-D (Piana et al., 2015) aim to correct general deficiencies, such as the underestimation of London dispersion interactions, by using a larger water dispersion coefficient, resulting in improved agreement with experimental observables over a broad range of force fields. The CMAP strategy was first introduced as a grid-based correction to the CHARMM22 force field (MacKerell et al., 1998) to account for the coupling of ψ/φ torsion angles (CHARMM22/CMAP) (Mackerell et al., 2004). The latest iteration (CHARMM36) (Huang and MacKerell, 2013) has been the starting point for new CMAP potentials that achieve a better balance between folded proteins and IDPs (CHARMM36m).
The CMAP approximation has been adopted in other families of force fields such as AMBER (Perez et al., 2015; Tian et al., 2019) or OPLS (Yang et al., 2019). Over recent years, the CMAP strategy has led to amino acid-specific potentials, rather than a few transferable potentials shared by all residues (e.g., non-glycine or proline). Thus, ff99IDPs (W. Wang et al., 2014) and ff14IDPs (Song et al., 2017) develop specific CMAP parameters for eight disorder-promoting amino acids. More recently, ff14IDPSFF (Song et al., 2017) and CHARMM36IDPSFF (Liu et al., 2018) add a different CMAP correction to each of the 20 amino acids.
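To make the idea of a grid-based correction concrete, the sketch below stores a correction energy on a periodic grid over the two backbone torsions and interpolates it at an arbitrary pair of angles. It is a deliberately simplified illustration: it uses bilinear interpolation and a coarse hypothetical 4 × 4 grid, whereas production force fields use finer grids and smoother interpolation. All names and values are assumptions.

```python
import math
from typing import List


def cmap_correction(grid: List[List[float]], phi: float, psi: float) -> float:
    """Bilinearly interpolate a periodic correction grid at (phi, psi), in degrees.

    grid[i][j] holds the correction at phi = -180 + i*step and psi = -180 + j*step.
    """
    n = len(grid)
    step = 360.0 / n
    # Fractional grid coordinates, wrapped onto the periodic torsion range.
    x = ((phi + 180.0) / step) % n
    y = ((psi + 180.0) / step) % n
    fx, fy = x - math.floor(x), y - math.floor(y)
    i0, j0 = int(math.floor(x)) % n, int(math.floor(y)) % n
    i1, j1 = (i0 + 1) % n, (j0 + 1) % n  # periodic neighbours
    return (
        grid[i0][j0] * (1 - fx) * (1 - fy)
        + grid[i1][j0] * fx * (1 - fy)
        + grid[i0][j1] * (1 - fx) * fy
        + grid[i1][j1] * fx * fy
    )


def backbone_energy(dihedral_energy: float, grid: List[List[float]], phi: float, psi: float) -> float:
    """Add the coupled (phi, psi) correction to the usual independent torsion terms."""
    return dihedral_energy + cmap_correction(grid, phi, psi)


# Hypothetical 4 x 4 grid (90 degree spacing) with one non-zero entry at (-90, -90).
grid = [[0.0] * 4 for _ in range(4)]
grid[1][1] = 1.0
```

A residue-specific variant of this scheme would simply hold one such grid per amino acid type instead of sharing a few grids across all residues.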
Although such optimisations can better describe the more extended conformations of disordered peptides, some modified force fields generate unstable structures for folded proteins. Ideally, a force field that allows accurate descriptions of both folded and unfolded ensembles is preferable, because it would better simulate the transitions of peptides between disordered and ordered states. The a99SB-disp force field developed by Robustelli et al. modified the ff99SB-ILDN parameters and adjusted the TIP4P-D water model against experimental measurements; the resulting force field has shown great improvement for modelling disordered ensembles while maintaining the accuracy for folded proteins (Robustelli et al., 2018). Another force field, the environment-specific force field ESFF1, was recently developed based on CMAP corrections for 71 different sequence environments (Song et al., 2020). These force fields have demonstrated an improved balance between modelling IDPs and folded proteins. With the number of choices available, it might be daunting to choose the right force field for a given system. Many MD packages such as AMBER (Case et al., 2005), CHARMM (Brooks et al., 2009), NAMD (Phillips et al., 2005), Gromacs (Abraham et al., 2015), Tinker (Lagardère et al., 2017) or OpenMM (Eastman et al., 2017) provide users with a wide selection of force fields, even originating from different force field families (e.g., the AMBER or CHARMM families of force fields). Whichever force field is selected, it is important to also use the compatible solvent model and ion parameters that were tested in benchmark studies.
Characterising peptide binding poses and affinities by MD simulations
MD is not high-throughput enough on its own to routinely determine the structures of peptide–protein complexes when the bound state is unknown. The three major applications for MD are:
1) Refinement of docking results; 2) estimating binding affinities based on known bound complexes and 3) use of integrative modelling strategies to determine structures of the complexes.
Docking approaches described in the previous section favour speed at the expense of accuracy, while MD approaches are accurate, but inefficient at identifying where and how a peptide binds from conventional MD simulations. Thus, short MD simulations are often the last step of docking calculations to eliminate steric overlap, account for local conformational changes and identify structures based on physico-chemical principles rather than relying on a scoring function. While this application is standard, it does not leverage the full potential of MD such as calculating binding affinities. Thus, recent integration of docking and MD-based techniques such as the combination of ClusPro PeptiDock with GaMD goes beyond refining the structure to provide free energy profiles (J. Wang et al., 2019). In this work, the authors benchmarked their method on three distinct model peptides achieving 0.6–2.7 Å improvement in peptide backbone structures. Moreover, the unbiased free energy profiles help identify key residues involved in significant conformational changes upon binding that can later be used for peptide sequence optimisation and design.
When the experimental structure of the complex is known, end-point methods based on MD are typically used to determine binding affinities. MM/GBSA and MM/PBSA are among the most popular methods in this category. Introduced by the Kollman group two decades ago, they are grounded on robust physico-chemical principles (E. Wang et al., 2019). They have been well received by the community and are still favoured over empirical and semi-empirical scoring functions designed for protein–peptide docking (see discussion in the docking section). Despite the robust theoretical framework, practical implementations involve approximations (in the treatment of flexibility, solvation and entropy) that limit the accuracy of the results. For example, prediction of peptide binding affinities for peptide–MHC complexes is highly desirable for vaccine design, but the flexible nature of the peptides makes routine affinity prediction using bioinformatic pipelines insufficient. The inclusion of structural information is crucial to explore relevant molecular conformations of the peptide–protein complex and is therefore key to understanding its dynamic behaviour. Wan et al. combined MM/PBSA with a conformational entropy method to compute peptide–MHC binding affinities from MD simulations in which both the bound and unbound peptide were simulated (Wan et al., 2015). The method yields binding affinity rankings that are highly correlated with experimental estimates after normalising ΔG_MM/PBSA by the hydrophobicity of the peptides. Ochoa et al. generated conformations of the complexes from MD simulations and then used a scoring function to predict binding affinities in better agreement with experiments than either sequence-based predictions or single docking scoring methods (Ochoa et al., 2019). Pathway-based free energy calculation methods such as free energy perturbation (FEP) have achieved unprecedented accuracy in modelling protein binding to small molecules for a large set of ligands (L. Wang et al., 2015).
However, directly transferring such approaches to estimate protein–peptide binding free energies is challenging due to the flexibility and size of peptides. Kilburg and Gallicchio introduced a single-decoupling alchemical method that successfully calculated the binding free energy of a series of cyclic peptides to HIV1-IN (Kilburg and Gallicchio, 2018). The convergence of the calculation is largely affected by the ladder of parameters used in Hamiltonian and temperature replica exchange; specifically, more densely spaced parameters are required to increase the phase-space overlap between alchemical states in large ligand systems. FEP-based simulations have also been applied to estimate the effect of mutations on the binding specificity of a PDZ–peptide system (Panel et al., 2018) and to help select a potent blocker for the Kv1.3 channel (Rashid et al., 2013).
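The role of the ladder of alchemical states can be illustrated with the textbook exponential-averaging (Zwanzig) estimator, ΔG = −kT ln⟨exp(−ΔU/kT)⟩, applied between neighbouring states and summed along the ladder. The sketch below implements only that generic relation; it is not the single-decoupling method mentioned above, and the function names and sample values are hypothetical.

```python
import math
from typing import Sequence

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def fep_free_energy(delta_u: Sequence[float], temperature: float = 300.0) -> float:
    """Exponential averaging between two neighbouring states.

    `delta_u` holds energy differences U_next - U_current (kcal/mol), evaluated on
    configurations sampled from the current state.
    """
    if not delta_u:
        raise ValueError("need at least one energy difference")
    kt = K_B * temperature
    exponents = [-du / kt for du in delta_u]
    # Log-sum-exp with the largest exponent factored out for numerical stability.
    shift = max(exponents)
    log_mean = shift + math.log(sum(math.exp(e - shift) for e in exponents) / len(exponents))
    return -kt * log_mean


def ladder_free_energy(windows: Sequence[Sequence[float]], temperature: float = 300.0) -> float:
    """Sum the free energy changes over consecutive pairs of alchemical states."""
    return sum(fep_free_energy(window, temperature) for window in windows)


# Hypothetical energy differences for a two-step ladder.
windows = [[1.5, 1.5, 1.5], [0.5, 0.5]]
total = ladder_free_energy(windows)
```

The exponential average is dominated by the rare configurations with the lowest ΔU, which is why poor overlap between neighbouring states slows convergence and why a denser ladder helps for large ligands such as peptides.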
Integrative approaches combine computational modelling with experimental information to determine structures of peptide–protein complexes. Methods such as Rosetta and MODELLER are examples involving different types of modelling strategies. Other methods such as maximum entropy (Pitera and Chodera, 2012) or Bayesian inference (MacCallum et al., 2015) aim to identify distributions that agree with experimental data. Analysing such distributions yields the number of states that best represent the data. Our use of the Modelling Employing Limited Data (MELD) approach for peptide binding has been successful in harnessing chemical shift perturbation NMR data (Mondal et al., 2022) and alanine scanning mutagenesis data to predict conformations of the bound complex (Morrone et al., 2017b). Furthermore, its physico-chemical foundation allows the user to recover relative binding free energies using a competitive binding protocol. These simulations sample peptide conformations during binding while allowing full flexibility, and accurately match experimental results for a series of peptides inhibiting the p53-MDM2 and MDMX interaction (Morrone et al., 2017a,b).
Unveiling peptide binding/unbinding kinetics through enhanced sampling
Accurate prediction of peptide–protein binding/unbinding kinetics from MD simulations requires extensive sampling of bound and unbound states, the transitions between them and possible intermediate states. However, the structural flexibility of many peptides makes it challenging to estimate association and dissociation rates, represented by kon and koff, respectively: complicated binding mechanisms arise, including folding of the peptide upon binding to the receptor, and inherent structural heterogeneity results from weak interactions in the binding interface. Recent simulation studies on PPIs employ various advanced sampling methods with critical thermodynamic and kinetic analysis. Markov state models (MSMs) have been widely applied to estimate kinetic quantities of biomolecular conformational dynamics from a set of short atomistic MD simulations (Chodera and Noé, 2014). Paul et al. used multi-ensemble Markov models, which combine conventional MD with Hamiltonian replica exchange enhanced sampling simulations, to characterise the binding mechanism and kinetics, beyond the seconds timescale, of the nanomolar peptide inhibitor PMI binding to the MDM2 receptor (Paul et al., 2017). Zhou et al. studied p53 binding to MDM2 by running nearly 1 ms of unbiased simulations on a distributed computing platform (G. Zhou et al., 2017). Two key intermediate states were identified from a four-state kinetic model using MSM analysis, and the predicted kon was in good agreement with experimental estimates. Zwier et al. generated hundreds of continuous binding pathways from weighted ensemble simulations and obtained similar on-rate estimates (Zwier et al., 2016). In addition, they identified residue F19 of p53 as a potentially kinetically important residue for binding, as the majority of conformations involve its partial or complete burial.
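The core of an MSM is a transition probability matrix estimated from trajectories that have been discretised into states. The sketch below shows the simplest possible estimator, counting transitions at a fixed lag time and normalising each row, followed by a power iteration for the stationary populations. It is a minimal illustration under stated assumptions: dedicated MSM software typically enforces detailed balance and validates the lag time, and the two-state trajectories used here (0 = unbound, 1 = bound) are hypothetical.

```python
from typing import List, Sequence


def transition_matrix(dtrajs: Sequence[Sequence[int]], n_states: int, lag: int = 1) -> List[List[float]]:
    """Row-normalised transition counts at lag time `lag` (in frames)."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for dtraj in dtrajs:
        for start, end in zip(dtraj[:-lag], dtraj[lag:]):
            counts[start][end] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0.0:
            raise ValueError("a state has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix: List[List[float]], n_iter: int = 10000, tol: float = 1e-12) -> List[float]:
    """Power iteration for the left eigenvector with eigenvalue 1."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical discretised trajectories: 0 = unbound, 1 = bound.
dtrajs = [
    [0, 0, 0, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 1, 1, 1, 1],
]
T = transition_matrix(dtrajs, n_states=2)
populations = stationary_distribution(T)
```

In this toy example the stationary populations are 0.4 (unbound) and 0.6 (bound); rates would follow from the transition probabilities and the physical length of the lag time.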
Metadynamics employs a biasing potential, expressed as a function of collective variables, that allows the system to cross high-energy barriers that are conventionally difficult to sample (Bussi and Laio, 2020). Zou et al. investigated the folding and binding process of p53 to MDM2 using two metadynamics-based methods, yielding reasonable estimates of the on/off-rate constants and the binding free energy profile (Zou et al., 2020). The anchor residues F19 and W23 of p53 were found to follow a stepwise binding pattern. This finding helps explain how certain mutants can be regulated by weak non-native interactions near the bound state due to the disordered nature of p53. The consequence of secondary interactions for the binding mechanism was also addressed by extensive unbiased simulations combined with umbrella sampling to perform MSM analysis for a coronavirus-derived peptide bound to a prevalent MHC receptor in humans (Abella et al., 2020). The model reaffirms the major role of anchor positions in the peptide in establishing stable interactions and reveals the underestimated importance of a non-anchor position. The conclusion was confirmed by simulating the impact of specific peptide mutations, and these predictions were validated through competitive binding assays; stark differences in unbinding pathways were identified by comparing the MSM of the wild-type system with those of the D4A and D4P mutants.
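The biasing potential used in metadynamics can be pictured as a sum of Gaussian ‘hills’ deposited along a collective variable at the positions the system has already visited. The following one-dimensional sketch shows only that bookkeeping; it is not coupled to any dynamics and does not reproduce the specific metadynamics variants used in the studies above. The names, hill heights and widths are illustrative assumptions.

```python
import math
from typing import List, Tuple

Hill = Tuple[float, float, float]  # (centre, height, width) of one Gaussian


def bias_potential(s: float, hills: List[Hill]) -> float:
    """Sum of all deposited Gaussians evaluated at collective-variable value `s`."""
    return sum(h * math.exp(-((s - c) ** 2) / (2.0 * w * w)) for c, h, w in hills)


def deposit_hill(hills: List[Hill], s: float, height: float = 1.0, width: float = 0.5) -> None:
    """Add a Gaussian centred on the current CV value, discouraging revisits."""
    hills.append((s, height, width))


def free_energy_estimate(s: float, hills: List[Hill]) -> float:
    """In plain (non-tempered) metadynamics the converged bias approximates -F(s) up to a constant."""
    return -bias_potential(s, hills)


# Hypothetical sequence of visited CV values (arbitrary units).
hills: List[Hill] = []
for visited in (0.0, 0.0, 1.0):
    deposit_hill(hills, visited)
```

Because the bias grows where the system lingers, free energy minima are gradually filled and barrier crossings become accessible on simulation timescales.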
Machine Learning
The role of machine learning in structural biology was greatly accelerated by the success of AlphaFold (AF) in the 13th instalment of the Critical Assessment of Structure Prediction (CASP) event (Senior et al., 2020). Two years later, after the field had replicated all the previous successes, a complete re-design of AF produced even higher accuracy structure predictions that surpassed any previous expectation (Baek et al., 2021; Jumper et al., 2021). Such ML predictions are sometimes in better agreement with NMR data than the models generated by standard NMR pipelines (Tejero et al., 2022). Not surprisingly, the field was soon ready to test the limits and possibilities of the AF approach. Early on, adding a poly-glycine linker between the two chains successfully tricked AF into predicting the structures of complexes, with the linker itself remaining unstructured. In a recent benchmark test, this strategy produced a level of accuracy for peptide–protein complex structure prediction that surpasses state-of-the-art docking programmes, especially for complexes with binding motifs. A retraining of AF for complexes (AF-multimer) was soon published online, but its weights have not been as extensively refined as those of the original AF (Evans et al., 2021).
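The linker strategy described above amounts to presenting the receptor and the peptide to a single-chain predictor as one sequence. The following minimal Python sketch only builds such a sequence and records where the peptide sits in it; it does not call AF or any other predictor. The function name, the default linker length of 30 residues and the example sequences are hypothetical choices for illustration, not values taken from the studies discussed here.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def build_linked_sequence(receptor, peptide, linker_length=30):
    """Join receptor and peptide with a poly-glycine linker.

    Returns the combined sequence and the (start, end) slice indices of the
    peptide, so that the peptide can be located in the predicted model and
    the linker residues discarded during analysis.
    """
    for name, seq in (("receptor", receptor), ("peptide", peptide)):
        unknown = set(seq) - VALID_RESIDUES
        if unknown:
            raise ValueError(f"{name} contains non-standard residues: {sorted(unknown)}")
    linker = "G" * linker_length
    start = len(receptor) + linker_length
    return receptor + linker + peptide, (start, start + len(peptide))

# Placeholder sequences, chosen only to show the bookkeeping.
sequence, (start, end) = build_linked_sequence("ACDEFGHIK", "LMNPQ")
print(sequence)
print("peptide residues occupy indices", start, "to", end - 1)
```

The returned indices make it straightforward to separate receptor, linker and peptide in the resulting model before computing any interface metric.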
The success of such approaches raises the question of why AF performs well on complexes involving peptides. One observation points to the structural complementarity between peptides and proteins, where many peptides adopt well-defined secondary structures upon binding. Indeed, peptides that bind as coils are not predicted as well, although the binding site is generally still identified. The field of docking uses scoring functions to rank the different poses and compounds, a strategy that is very successful for small molecules but has not reached the same level of maturity for peptides (as described above in the docking section). AF’s pLDDT measure is similarly unable to rank different peptides, as peptides with very different binding affinities might be predicted with similar pLDDT scores. Other ML approaches directly use structural ensembles and a measure of accuracy such as RMSD to assess the quality of the predicted structures (Townshend et al., 2021). This raises the further question of whether AF or other ML algorithms can learn something about the biophysical energy function that governs binding (or folding) and how it can be used towards predicting peptide–protein complex structures.
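Since RMSD is the accuracy measure referred to above, a short sketch of how it might be computed for a peptide–protein model may be useful. The Python example below (NumPy only) superimposes the model onto the reference using the receptor atoms with the Kabsch algorithm and then measures the RMSD of the peptide atoms. Fitting on the receptor is an assumption made here because it is a common convention in docking assessment; the function names and the synthetic coordinates are illustrative, and matching atom order between model and reference is assumed.

```python
import numpy as np

def kabsch_fit(mobile, target):
    """Return rotation R and translation t so that mobile @ R.T + t best fits target."""
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    H = (mobile - mobile_center).T @ (target - target_center)
    U, _, Vt = np.linalg.svd(H)
    # guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, target_center - mobile_center @ R.T

def rmsd(a, b):
    """Root-mean-square deviation between two (N, 3) coordinate arrays."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

def peptide_rmsd_after_receptor_fit(rec_model, pep_model, rec_ref, pep_ref):
    """Superimpose on the receptor, then measure the peptide RMSD."""
    R, t = kabsch_fit(rec_model, rec_ref)
    return rmsd(pep_model @ R.T + t, pep_ref)

# Synthetic coordinates standing in for receptor and peptide atoms.
rng = np.random.default_rng(0)
rec_ref = rng.normal(size=(20, 3)) * 5.0
pep_ref = rng.normal(size=(5, 3)) * 2.0

# Build a "model": peptide displaced by 2 length units, then the whole
# complex rigidly rotated and translated.
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
shift = np.array([3.0, -1.0, 4.0])
rec_model = rec_ref @ Rz.T + shift
pep_model = (pep_ref + np.array([2.0, 0.0, 0.0])) @ Rz.T + shift

print("peptide RMSD after receptor fit:",
      peptide_rmsd_after_receptor_fit(rec_model, pep_model, rec_ref, pep_ref))
```

Because the rigid-body motion of the whole complex is removed by the receptor fit, the reported value reflects only the displacement of the peptide relative to the receptor, which is the quantity of interest when judging a predicted binding pose.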
AlQuraishi’s group first addressed this question using a bespoke hierarchical statistical mechanical modelling (HSM) ML approach to learn the biophysical function that scores multiple peptides binding a receptor motif (Cunningham et al., 2020). Unfortunately, lack of data meant that this approach could only be used for eight protein families. Could AF capture such a biophysical function from its training? Recent work from Ovchinnikov’s group suggests that AF has indeed learnt such a function (Roney and Ovchinnikov, 2022). According to this study, MSAs serve the purpose of global sampling, focusing the search on regions of interest, while the biophysical energy function learned by the network is able to identify the best local structure. This is especially interesting for peptide–protein systems, where the problem can be separated into two parts: (1) a template or MSA for the receptor and (2) a single sequence for the peptide. In this way, the learned function is responsible for finding where the peptide interacts, its conformation and any conformational changes required for the receptor. We put this notion to the test by using competitive binding in AF to determine which peptides had higher affinity for a series of receptors. Surprisingly, the method was very successful in ranking the strongest binders. As the differences in binding affinity become small, the predictions reflect this uncertainty, and the method is not suitable when both peptides are weak binders. A caveat of the above developments is the requirement to fit within the original hypothesis of peptide–protein complementarity: when the structure of the complex is not correctly predicted, the competitive binding will not work either.
Turning this argument on its head, the Baker group uses the idealised picture of proteins that neural networks have learned in order to design new proteins. This process of deep neural network hallucination has already produced several structures that have been confirmed experimentally (Anishchenko et al., 2021). Adding constraints to the hallucination process can direct the design towards areas of complementarity or desired functionality. As such, the complementary nature of peptides and receptor binding motifs can be exploited to design peptides or mini-proteins by constraining the hallucination to a known binding site (J. Wang et al., 2021).
The advances in ML for structural prediction and accurate scoring make it possible to query increasingly large libraries of peptides (Chang and Perez, 2022). Along these lines, PPI predictors are also starting to emerge that predict which peptides will interact with a given protein and give insight into the peptide residues involved in the interaction (Casadio et al., 2022; Lei et al., 2021). Furthermore, ML is also offering ways to identify peptide sequences that are likely to have high biological activity against a particular pathology (Wu et al., 2019), for example anticancer (Chen et al., 2021) or antimicrobial (Dee, 2022; E. Y. Lee et al., 2017; Plisson et al., 2020) peptides. Thus, we expect that combining ML pipelines that act at the sequence level with those that act at the structural level will make it possible to create peptide libraries specific to a type of disease, which can then be screened to predict structures and relative binding affinities. This is a rapidly changing field, where ML is already having a big impact, and where many questions regarding the interpretability and applicability of the current technology remain to be answered.
Conclusions
Our focus in this review has been to identify the different approaches that docking, molecular simulations and machine learning use to study peptide–protein systems. We expect that new synergies between these three types of technologies will lead to more robust methodologies for modelling peptide–protein systems. For example, docking can reliably identify the binding region and might provide good templates for ML to refine, both to predict structures and to screen peptides by binding free energy. Meanwhile, new force fields that emphasise the correct description of intrinsically disordered peptides, together with enhanced sampling, can start from such initial models to determine kinetic constants and binding affinities through an orthogonal approach (on a more limited set of systems to refine). A current limitation of machine learning is its dependence on natural amino acids. MD, on the other hand, has transferable potentials and can be used as an end stage in peptide optimisation to study the effect of controlling flexibility (e.g., through chemical staples) or of replacing some residues with non-natural amino acids such as peptoids.
Acknowledgement
The authors thank XSEDE resources.
Author contributions
Conceived the review: A.P.; performed research and analysed data: A.M., L.C.; wrote manuscript: A.M., L.C. and A.P.
Conflict of interest
The authors declare no conflicts of interest.
Open Peer Review
To view the open peer review materials for this article, please visit http://doi.org/10.1017/qrd.2022.14.
References
- Abella JR, Antunes D, Jackson K, Lizée G, Clementi C and Kavraki LE (2020) Markov state modeling reveals alternative unbinding pathways for peptide–MHC complexes. Proceedings of the National Academy of Sciences 117(48), 30610–30618. 10.1073/pnas.2007246117
- Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B and Lindahl E (2015) GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1, 19–25. 10.1016/j.softx.2015.06.001
- Agrawal P, Singh H, Srivastava HK, Singh S, Kishore G and Raghava GPS (2019) Benchmarking of different molecular docking methods for protein–peptide docking. BMC Bioinformatics 19(Suppl 13), 426. 10.1186/s12859-018-2449-y
- Alam N, Goldstein O, Xia B, Porter KA, Kozakov D and Schueler-Furman O (2017) High-resolution global peptide–protein docking using fragments-based PIPER-FlexPepDock. PLoS Computational Biology 13(12), e1005905. 10.1371/journal.pcbi.1005905
- Anishchenko I, Pellock SJ, Chidyausiku TM, Ramelot TA, Ovchinnikov S, Hao J, Bafna K, Norn C, Kang A, Bera AK, DiMaio F, Carter L, Chow CM, Montelione GT and Baker D (2021) De novo protein design by deep network hallucination. Nature, 1–6. 10.1038/s41586-021-04184-w
- Antes I (2010) DynaDock: a new molecular dynamics-based algorithm for protein–peptide docking including receptor flexibility. Proteins: Structure, Function, and Bioinformatics 78(5), 1084–1104. 10.1002/prot.22629
- Antunes DA, Moll M, Devaurs D, Jackson KR, Lizée G and Kavraki LE (2017) DINC 2.0: a new protein–peptide docking webserver using an incremental approach. Cancer Research 77(21), e55–e57. 10.1158/0008-5472.can-17-0511
- Apostolopoulos V, Bojarska J, Chai T-T, Elnagdy S, Kaczmarek K, Matsoukas J, New R, Parang K, Lopez OP, Parhiz H, Perera CO, Pickholz M, Remko M, Saviano M, Skwarczynski M, Tang Y, Wolf WM, Yoshiya T, Zabrocki J, … Toth I (2021) A global review on short peptides: Frontiers and perspectives. Molecules 26(2), 430. 10.3390/molecules26020430
- Arkin MR, Tang Y and Wells JA (2014) Small-molecule inhibitors of protein-protein interactions: progressing toward the reality. Chemistry & Biology 21(9), 1102–1114. 10.1016/j.chembiol.2014.09.001
- Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, Wang J, Cong Q, Kinch LN, Schaeffer RD, Millán C, Park H, Adams C, Glassman CR, DeGiovanni A, Pereira JH, Rodrigues AV, van Dijk AA, Ebrecht AC, Opperman DJ, Sagmeister T, Buhlheller C, Pavkov-Keller T, Rathinaswamy MK, Dalwadi U, Yip CK, Burke JE, Garcia KC, Grishin NV, Adams PD, Read RJ and Baker D (2021) Accurate prediction of protein structures and interactions using a three-track neural network. Science 373(6557), 871–876. 10.1126/science.abj8754
- Bechtler C and Lamers C (2021) Macrocyclization strategies for cyclic peptides and peptidomimetics. RSC Medicinal Chemistry 12(8), 1325–1351. 10.1039/d1md00083g
- Ben-Shimon A and Eisenstein M (2010) Computational mapping of anchoring spots on protein surfaces. Journal of Molecular Biology 402(1), 259–277. 10.1016/j.jmb.2010.07.021
- Ben-Shimon A and Niv MY (2015) AnchorDock: blind and flexible anchor-driven peptide docking. Structure 23(5), 929–940. 10.1016/j.str.2015.03.010
- Berman HM, Battistuz T, Bhat TN, Bluhm WF, Bourne PE, Burkhardt K, Feng Z, Gilliland GL, Iype L, Jain S, Fagan P, Marvin J, Padilla D, Ravichandran V, Schneider B, Thanki N, Weissig H, Westbrook J and Zardecki C (2002) The Protein Data Bank. Acta Crystallographica Section D: Biological Crystallography 58(6), 899–907. 10.1107/s0907444902003451
- Best RB and Hummer G (2009) Optimized molecular dynamics force fields applied to the helix−coil transition of polypeptides. The Journal of Physical Chemistry B 113(26), 9004–9015. 10.1021/jp901540t
- Best RB, Zheng W and Mittal J (2014) Balanced protein–water interactions improve properties of disordered proteins and non-specific protein association. Journal of Chemical Theory and Computation 10(11), 5113–5124. 10.1021/ct500569b
- Brooks BR, Brooks CL, Mackerell AD, Nilsson L, Petrella RJ, Roux B, Won Y, Archontis G, Bartels C, Boresch S, Caflisch A, Caves L, Cui Q, Dinner AR, Feig M, Fischer S, Gao J, Hodoscek M, Im W, … Karplus M (2009) CHARMM: the biomolecular simulation program. Journal of Computational Chemistry 30(10), 1545–1614. 10.1002/jcc.21287
- Bussi G and Laio A (2020) Using metadynamics to explore complex free-energy landscapes. Nature Reviews Physics 2(4), 200–212. 10.1038/s42254-020-0153-0
- Casadio R, Martelli PL and Savojardo C (2022) Machine learning solutions for predicting protein–protein interactions. Wiley Interdisciplinary Reviews: Computational Molecular Science. 10.1002/wcms.1618
- Case DA, Cheatham TE, Darden T, Gohlke H, Luo R, Merz KM, Onufriev A, Simmerling C, Wang B and Woods RJ (2005) The Amber biomolecular simulation programs. Journal of Computational Chemistry 26(16), 1668–1688. 10.1002/jcc.20290
- Chang L and Perez A (2022) AlphaFold encodes the principles to identify high affinity peptide binders. BioRxiv. 10.1101/2022.03.18.484931
- Chaudhury S, Lyskov S and Gray JJ (2010) PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta. Bioinformatics 26(5), 689–691. 10.1093/bioinformatics/btq007
- Chen J, Cheong HH and Siu SWI (2021) xDeep-AcPEP: deep learning method for anticancer peptide activity prediction based on convolutional neural network and multitask learning. Journal of Chemical Information and Modeling 61(8), 3789–3803. 10.1021/acs.jcim.1c00181
- Chodera JD and Noé F (2014) Markov state models of biomolecular conformational dynamics. Current Opinion in Structural Biology 25, 135–144. 10.1016/j.sbi.2014.04.002
- Ciemny M, Kurcinski M, Kamel K, Kolinski A, Alam N, Schueler-Furman O and Kmiecik S (2018) Protein–peptide docking: opportunities and challenges. Drug Discovery Today 23(8), 1530–1537. 10.1016/j.drudis.2018.05.006
- Cunningham JM, Koytiger G, Sorger PK and AlQuraishi M (2020) Biophysical prediction of protein–peptide interactions and signaling networks using machine learning. Nature Methods 17(2), 175–183. 10.1038/s41592-019-0687-1
- Das AA, Sharma OP, Kumar MS, Krishna R and Mathur PP (2013) PepBind: a comprehensive database and computational tool for analysis of protein–peptide interactions. Genomics, Proteomics & Bioinformatics 11(4), 241–246. 10.1016/j.gpb.2013.03.002
- Dee W (2022) LMPred: predicting antimicrobial peptides using pre-trained language models and deep learning. Bioinformatics Advances 2(1). 10.1093/bioadv/vbac021
- Deplazes E, Davies J, Bonvin AMJJ, King GF and Mark AE (2016) Combination of ambiguous and unambiguous data in the restraint-driven docking of flexible peptides with HADDOCK: the binding of the spider toxin PcTx1 to the acid sensing ion channel (ASIC) 1a. Journal of Chemical Information and Modeling 56(1), 127–138. 10.1021/acs.jcim.5b00529
- Diharce J, Cueto M, Beltramo M, Aucagne V and Bonnet P (2019) In silico peptide ligation: iterative residue docking and linking as a new approach to predict protein-peptide interactions. Molecules 24(7), 1351. 10.3390/molecules24071351
- Donsky E and Wolfson HJ (2011) PepCrawler: a fast RRT-based algorithm for high-resolution refinement and binding affinity estimation of peptide inhibitors. Bioinformatics 27(20), 2836–2842. 10.1093/bioinformatics/btr498
- Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, Wang L-P, Simmonett AC, Harrigan MP, Stern CD, Wiewiora RP, Brooks BR and Pande VS (2017) OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLoS Computational Biology 13(7), e1005659. 10.1371/journal.pcbi.1005659
- Evans R, O’Neill M, Pritzel A, Antropova N, Senior A, Green T, Žídek A, Bates R, Blackwell S, Yim J, Ronneberger O, Bodenstein S, Zielinski M, Bridgland A, Potapenko A, Cowie A, Tunyasuvunakool K, Jain R, Clancy E, … Hassabis D (2021) Protein complex prediction with AlphaFold-Multimer. BioRxiv. 10.1101/2021.10.04.463034
- Fosgerau K and Hoffmann T (2015) Peptide therapeutics: current status and future directions. Drug Discovery Today 20(1), 122–128. 10.1016/j.drudis.2014.10.003
- Frappier V, Duran M and Keating AE (2018) PixelDB: protein–peptide complexes annotated with structural conservation of the peptide binding mode. Protein Science 27(1), 276–285. 10.1002/pro.3320
- Ganesh AN, Heusser C, Garad S and Sánchez-Félix MV (2021) Patient-centric design for peptide delivery: trends in routes of administration and advancement in drug delivery technologies. Medicine in Drug Discovery 9, 100079. 10.1016/j.medidd.2020.100079
- Geng C, Narasimhan S, Rodrigues JPGLM and Bonvin AMJJ (2017) Modeling peptide-protein interactions, methods and protocols. Methods in Molecular Biology 1561, 109–138. 10.1007/978-1-4939-6798-8_8
- Harder E, Damm W, Maple J, Wu C, Reboul M, Xiang JY, Wang L, Lupyan D, Dahlgren MK, Knight JL, Kaus JW, Cerutti DS, Krilov G, Jorgensen WL, Abel R and Friesner RA (2016) OPLS3: a force field providing broad coverage of drug-like small molecules and proteins. Journal of Chemical Theory and Computation 12(1), 281–296. 10.1021/acs.jctc.5b00864
- Hauser AS and Windshügel B (2016) LEADS-PEP: a benchmark data set for assessment of peptide docking performance. Journal of Chemical Information and Modeling 56(1), 188–200. 10.1021/acs.jcim.5b00234
- Hornak V, Abel R, Okur A, Strockbine B, Roitberg A and Simmerling C (2006) Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins: Structure, Function, and Bioinformatics 65(3), 712–725. 10.1002/prot.21123
- Hou T, Wang J, Li Y and Wang W (2011) Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations. Journal of Chemical Information and Modeling 51(1), 69–82. 10.1021/ci100275a
- Huang J and MacKerell AD (2013) CHARMM36 all-atom additive protein force field: validation based on comparison to NMR data. Journal of Computational Chemistry 34(25), 2135–2145. 10.1002/jcc.23354
- Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, Groot BL d, Grubmüller H and MacKerell AD (2017) CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nature Methods 14(1), 71–73. 10.1038/nmeth.4067
- Jiang F, Zhou C-Y and Wu Y-D (2014) Residue-specific force field based on the protein coil library. RSFF1: modification of OPLS-AA/L. The Journal of Physical Chemistry B 118(25), 6983–6998. 10.1021/jp5017449
- Johansson-Åkhe I, Mirabello C and Wallner B (2019) Predicting protein–peptide interaction sites using distant protein complexes as structural templates. Scientific Reports 9(1), 4267. 10.1038/s41598-019-38498-7
- Johansson-Åkhe I, Mirabello C and Wallner B (2020) InterPep2: global peptide–protein docking using interaction surface templates. Bioinformatics 36(8), 2458–2465. 10.1093/bioinformatics/btaa005
- Johansson-Åkhe I, Mirabello C and Wallner B (2021) InterPepRank: assessment of docked peptide conformations by a deep graph network. Frontiers in Bioinformatics 1, 763102. 10.3389/fbinf.2021.763102
- Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, … Hassabis D (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596(7873), 583–589. 10.1038/s41586-021-03819-2
- Kaminski GA, Friesner RA, Tirado-Rives J and Jorgensen WL (2001) Evaluation and reparametrization of the OPLS-AA force field for proteins via comparison with accurate quantum chemical calculations on peptides. The Journal of Physical Chemistry B 105(28), 6474–6487. 10.1021/jp003919d
- Khramushin A, Ben-Aharon Z and Schueler-Furman O (2022) Matching protein surface structural patches for high-resolution blind peptide docking. PNAS.
- Kilburg D and Gallicchio E (2018) Assessment of a single decoupling alchemical approach for the calculation of the absolute binding free energies of protein-peptide complexes. Frontiers in Molecular Biosciences 5, 22. 10.3389/fmolb.2018.00022
- Kozakov D, Brenke R, Comeau SR and Vajda S (2006) PIPER: An FFT-based protein docking program with pairwise potentials. Proteins: Structure, Function, and Bioinformatics 65(2), 392–406. 10.1002/prot.21117
- Kurcinski M, Ciemny MP, Oleniecki T, Kuriata A, Badaczewska-Dawid AE, Kolinski A and Kmiecik S (2019) CABS-dock standalone: a toolbox for flexible protein–peptide docking. Bioinformatics 35(20), 4170–4172. 10.1093/bioinformatics/btz185
- Kurcinski M, Jamroz M, Blaszczyk M, Kolinski A and Kmiecik S (2015) CABS-dock web server for the flexible docking of peptides to proteins without prior knowledge of the binding site. Nucleic Acids Research 43(Web Server issue), W419–W424. 10.1093/nar/gkv456
- Lagardère L, Jolly L-H, Lipparini F, Aviat F, Stamm B, Jing ZF, Harger M, Torabifard H, Cisneros GA, Schnieders MJ, Gresh N, Maday Y, Ren PY, Ponder JW and Piquemal J-P (2017) Tinker-HP: a massively parallel molecular dynamics package for multiscale simulations of large complex systems with advanced point dipole polarizable force fields. Chemical Science 9(4), 956–972. 10.1039/c7sc04531j
- Lau JL and Dunn MK (2018) Therapeutic peptides: historical perspectives, current development trends, and future directions. Bioorganic & Medicinal Chemistry 26(10), 2700–2707. 10.1016/j.bmc.2017.06.052
- Lavi A, Ngan CH, Movshovitz-Attias D, Bohnuud T, Yueh C, Beglov D, Schueler-Furman O and Kozakov D (2013) Detection of peptide-binding sites on protein surfaces: the first step toward the modeling and targeting of peptide-mediated interactions. Proteins: Structure, Function, and Bioinformatics 81(12), 2096–2105. 10.1002/prot.24422
- Lee AC-L, Harris JL, Khanna KK and Hong J-H (2019) A comprehensive review on current advances in peptide drug development and design. International Journal of Molecular Sciences 20(10), 2383. 10.3390/ijms20102383
- Lee EY, Lee MW, Fulan BM, Ferguson AL and Wong GCL (2017) What can machine learning do for antimicrobial peptides, and what can antimicrobial peptides do for machine learning? Interface Focus 7(6), 20160153. 10.1098/rsfs.2016.0153
- Lee H, Heo L, Lee MS and Seok C (2015) GalaxyPepDock: a protein–peptide docking tool based on interaction similarity and energy optimization. Nucleic Acids Research 43(Web Server issue), W431–W435. 10.1093/nar/gkv495
- Lee H and Seok C (2017) Modeling peptide-protein interactions, methods and protocols. Methods in Molecular Biology 1561, 37–47. 10.1007/978-1-4939-6798-8_4
- Lei Y, Li S, Liu Z, Wan F, Tian T, Li S, Zhao D and Zeng J (2021) A deep-learning framework for multi-level peptide–protein interaction prediction. Nature Communications 12(1), 5465. 10.1038/s41467-021-25772-4
- Lensink MF, Nadzirin N, Velankar S and Wodak SJ (2020) Modeling protein–protein, protein–peptide, and protein–oligosaccharide complexes: CAPRI 7th edition. Proteins: Structure, Function, and Bioinformatics 88(8), 916–938. 10.1002/prot.25870
- Lensink MF, Velankar S and Wodak SJ (2017) Modeling protein–protein and protein–peptide complexes: CAPRI 6th edition. Proteins: Structure, Function, and Bioinformatics 85(3), 359–377. 10.1002/prot.25215
- Lindorff-Larsen K, Piana S, Palmo K, Maragakis P, Klepeis JL, Dror RO and Shaw DE (2010) Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins 78(8), 1950–1958. 10.1002/prot.22711
- Liu H, Song D, Lu H, Luo R and Chen H (2018) Intrinsically disordered protein-specific force field CHARMM36IDPSFF. Chemical Biology & Drug Design 92(4), 1722–1735. 10.1111/cbdd.13342
- London N, Movshovitz-Attias D and Schueler-Furman O (2010) The structural basis of peptide-protein binding strategies. Structure 18(2), 188–199. 10.1016/j.str.2009.11.012
- London N, Raveh B and Schueler-Furman O (2013) Druggable protein–protein interactions – from hot spots to hot segments. Current Opinion in Chemical Biology 17(6), 952–959. 10.1016/j.cbpa.2013.10.011
- MacCallum JL, Perez A and Dill KA (2015) Determining protein structures by combining semireliable data with atomistic physical models by Bayesian inference. Proceedings of the National Academy of Sciences 112(22), 6985–6990. 10.1073/pnas.1506788112
- MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, … Karplus M (1998) All-atom empirical potential for molecular modeling and dynamics studies of proteins. The Journal of Physical Chemistry B 102(18), 3586–3616. 10.1021/jp973084f
- Mackerell AD, Feig M and Brooks CL (2004) Extending the treatment of backbone energetics in protein force fields: Limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. Journal of Computational Chemistry 25(11), 1400–1415. 10.1002/jcc.20065
- Maier JA, Martinez C, Kasavajhala K, Wickstrom L, Hauser KE and Simmerling C (2015) ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. Journal of Chemical Theory and Computation 11(8), 3696–3713. 10.1021/acs.jctc.5b00255
- Martins PM, Santos LH, Mariano D, Queiroz FC, Bastos LL, Gomes I d S, Fischer PHC, Rocha REO, Silveira SA, Lima LHF d, Magalhães MTQ d, Oliveira MGA and Melo-Minardi RC d (2021) Propedia: a database for protein–peptide identification based on a hybrid clustering algorithm. BMC Bioinformatics 22(1), 1. 10.1186/s12859-020-03881-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- McGuffin LJ, Bryson K and Jones DT (2000) The PSIPRED protein structure prediction server. Bioinformatics 16, 404–405. 10.1093/bioinformatics/16.4.404 [DOI] [PubMed] [Google Scholar]
- Modeling Peptide–Protein Interactions, Methods and Protocols. (2017). Methods in Molecular Biology. 10.1007/978-1-4939-6798-8 [DOI] [PubMed]
- Mondal A, Swapna GVT, Hao J, Ma L, Roth MJ, Montelione GT and Perez A (2022) Structure determination of protein–peptide complexes from NMR chemical shift data using MELD. BioRxiv. 10.1101/2021.12.31.474671 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Morrone JA, Perez A, Deng Q, Ha SN, Holloway MK, Sawyer TK, Sherborne BS, Brown FK and Dill KA (2017a) Molecular simulations identify binding poses and approximate affinities of stapled α-helical peptides to MDM2 and MDMX. Journal of Chemical Theory and Computation 13(2), 863–869. 10.1021/acs.jctc.6b00978 [DOI] [PubMed] [Google Scholar]
- Morrone JA, Perez A, MacCallum J and Dill KA (2017b) Computed binding of peptides to proteins with MELD-accelerated molecular dynamics. Journal of Chemical Theory and Computation 13(2), 870–876. 10.1021/acs.jctc.6b00977 [DOI] [PubMed] [Google Scholar]
- Mu J, Liu H, Zhang J, Luo R and Chen H-F (2021) Recent force field strategies for intrinsically disordered proteins. Journal of Chemical Information and Modeling 61(3), 1037–1047. 10.1021/acs.jcim.0c01175 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Obarska-Kosinska A, Iacoangeli A, Lepore R and Tramontano A (2016) PepComposer: computational design of peptides binding to a given protein surface. Nucleic Acids Research 44(Web Server issue), W522–W528. 10.1093/nar/gkw366 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ochoa R, Laio A and Cossio P (2019) Predicting the affinity of peptides to major histocompatibility complex class II by scoring molecular dynamics simulations. Journal of Chemical Information and Modeling 59(8), 3464–3473. 10.1021/acs.jcim.9b00403 [DOI] [PubMed] [Google Scholar]
- Panel N, Villa F, Fuentes EJ and Simonson T (2018) Accurate PDZ/peptide binding specificity with additive and polarizable free energy simulations. Biophysical Journal 114(5), 1091–1102. 10.1016/j.bpj.2018.01.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Paul F, Wehmeyer C, Abualrous ET, Wu H, Crabtree MD, Schöneberg J, Clarke J, Freund C, Weikl TR and Noé F (2017) Protein–peptide association kinetics beyond the seconds timescale from atomistic simulations. Nature Communications 8(1), 1095. 10.1038/s41467-017-01163-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Perez A, MacCallum JL, Brini E, Simmerling C and Dill KA (2015) Grid-based backbone correction to the ff12SB protein force Field for implicit-solvent simulations. Journal of Chemical Theory and Computation 11(10), 4770–4779. 10.1021/acs.jctc.5b00662 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kalé L and Schulten K (2005) Scalable molecular dynamics with NAMD. Journal of Computational Chemistry 26(16), 1781–1802. 10.1002/jcc.20289 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Piana S, Donchev AG, Robustelli P and Shaw DE (2015) Water dispersion interactions strongly influence simulated structural properties of disordered protein states. The Journal of Physical Chemistry B 119(16), 5113–5123. 10.1021/jp508971m [DOI] [PubMed] [Google Scholar]
- Piana S, Lindorff-Larsen K and Shaw DE (2011) How robust are protein folding simulations with respect to force Field parameterization? Biophysical Journal 100(9), L47–L49. 10.1016/j.bpj.2011.03.051 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pitera JW and Chodera JD (2012) On the use of experimental observations to bias simulated ensembles. Journal of Chemical Theory and Computation 8(10), 3445–3451. doi:10.1021/ct300112v
- Plisson F, Ramírez-Sánchez O and Martínez-Hernández C (2020) Machine learning-guided discovery and design of non-hemolytic peptides. Scientific Reports 10(1), 16581. doi:10.1038/s41598-020-73644-6
- Porter KA, Xia B, Beglov D, Bohnuud T, Alam N, Schueler-Furman O and Kozakov D (2017) ClusPro PeptiDock: efficient global docking of peptide recognition motifs using FFT. Bioinformatics 33(20), 3299–3301. doi:10.1093/bioinformatics/btx216
- Pu C, Yan G, Shi J and Li R (2017) Assessing the performance of docking scoring function, FEP, MM-GBSA, and QM/MM-GBSA approaches on a series of PLK1 inhibitors. MedChemComm 8, 1452–1458. doi:10.1039/c7md00184c
- Qiu Y, Smith DGA, Boothroyd S, Jang H, Hahn DF, Wagner J, Bannan CC, Gokey T, Lim VT, Stern CD, Rizzi A, Tjanaka B, Tresadern G, Lucas X, Shirts MR, Gilson MK, Chodera JD, Bayly CI, Mobley DL and Wang L-P (2021) Development and benchmarking of Open Force Field v1.0.0—the Parsley small-molecule force field. Journal of Chemical Theory and Computation 17(10), 6262–6280. doi:10.1021/acs.jctc.1c00571
- Rahman MU, Rehman AU, Liu H and Chen H-F (2020) Comparison and evaluation of force fields for intrinsically disordered proteins. Journal of Chemical Information and Modeling 60(10), 4912–4923. doi:10.1021/acs.jcim.0c00762
- Rashid MH, Heinzelmann G, Huq R, Tajhya RB, Chang SC, Chhabra S, Pennington MW, Beeton C, Norton RS and Kuyucak S (2013) A potent and selective peptide blocker of the Kv1.3 channel: prediction from free-energy simulations and experimental confirmation. PLoS One 8(11), e78712. doi:10.1371/journal.pone.0078712
- Raveh B, London N and Schueler-Furman O (2010) Sub-angstrom modeling of complexes between flexible peptides and globular proteins. Proteins: Structure, Function, and Bioinformatics 78(9), 2029–2040. doi:10.1002/prot.22716
- Raveh B, London N, Zimmerman L and Schueler-Furman O (2011) Rosetta FlexPepDock ab-initio: simultaneous folding, docking and refinement of peptides onto their receptors. PLoS One 6(4), e18934. doi:10.1371/journal.pone.0018934
- Rentzsch R and Renard BY (2015) Docking small peptides remains a great challenge: an assessment using AutoDock Vina. Briefings in Bioinformatics 16(6), 1045–1056. doi:10.1093/bib/bbv008
- Robertson MJ, Tirado-Rives J and Jorgensen WL (2015) Improved peptide and protein torsional energetics with the OPLS-AA force field. Journal of Chemical Theory and Computation 11(7), 3499–3509. doi:10.1021/acs.jctc.5b00356
- Robinson MK, Monroe JI and Shell MS (2016) Are AMBER force fields and implicit solvation models additive? A folding study with a balanced peptide test set. Journal of Chemical Theory and Computation 12(11), 5631–5642. doi:10.1021/acs.jctc.6b00788
- Robustelli P, Piana S and Shaw DE (2018) Developing a molecular dynamics force field for both folded and disordered protein states. Proceedings of the National Academy of Sciences of the United States of America 115(21), E4758–E4766. doi:10.1073/pnas.1800690115
- Roney JP and Ovchinnikov S (2022) State-of-the-art estimation of protein model accuracy using AlphaFold. BioRxiv. doi:10.1101/2022.03.11.484043
- Saladin A, Rey J, Thévenet P, Zacharias M, Moroy G and Tufféry P (2014) PEP-SiteFinder: a tool for the blind identification of peptide binding sites on protein surfaces. Nucleic Acids Research 42(W1), W221–W226. doi:10.1093/nar/gku404
- Schindler CEM, de Vries SJ and Zacharias M (2015) Fully blind peptide-protein docking with pepATTRACT. Structure 23(8), 1507–1515. doi:10.1016/j.str.2015.05.021
- Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Žídek A, Nelson AWR, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones DT, Silver D, Kavukcuoglu K and Hassabis D (2020) Improved protein structure prediction using potentials from deep learning. Nature 577(7792), 706–710. doi:10.1038/s41586-019-1923-7
- Song D, Liu H, Luo R and Chen H-F (2020) Environment-specific force field for intrinsically disordered and ordered proteins. Journal of Chemical Information and Modeling 60(4), 2257–2267. doi:10.1021/acs.jcim.0c00059
- Song D, Luo R and Chen H-F (2017) The IDP-specific force field ff14IDPSFF improves the conformer sampling of intrinsically disordered proteins. Journal of Chemical Information and Modeling 57(5), 1166–1178. doi:10.1021/acs.jcim.7b00135
- Spiliotopoulos D, Kastritis PL, Melquiond ASJ, Bonvin AMJJ, Musco G, Rocchia W and Spitaleri A (2016) dMM-PBSA: a new HADDOCK scoring function for protein-peptide docking. Frontiers in Molecular Biosciences 3, 46. doi:10.3389/fmolb.2016.00046
- Spitzer R and Jain AN (2012) Surflex-dock: docking benchmarks and real-world application. Journal of Computer-Aided Molecular Design 26(6), 687–699. doi:10.1007/s10822-011-9533-y
- Sun L, Fu T, Zhao D, Fan H and Zhong S (2021) Divide-and-link peptide docking: a fragment-based peptide docking protocol. Physical Chemistry Chemical Physics 23, 22647–22660. doi:10.1039/d1cp02098f
- Taherzadeh G, Zhou Y, Liew AW-C and Yang Y (2017) Structure-based prediction of protein–peptide binding regions using random forest. Bioinformatics 34(3), 477–484. doi:10.1093/bioinformatics/btx614
- Tao H, Zhang Y and Huang S-Y (2020) Improving protein–peptide docking results via pose-clustering and rescoring with a combined knowledge-based and MM–GBSA scoring function. Journal of Chemical Information and Modeling 60(4), 2377–2387. doi:10.1021/acs.jcim.0c00058
- Taylor RD, Jewsbury PJ and Essex JW (2002) A review of protein-small molecule docking methods. Journal of Computer-Aided Molecular Design 16(3), 151–166. doi:10.1023/a:1020155510718
- Tejero R, Huang YJ, Ramelot TA and Montelione GT (2022) AlphaFold models of small proteins rival the accuracy of solution NMR structures. BioRxiv. doi:10.1101/2022.03.09.483701
- Tian C, Kasavajhala K, Belfon KAA, Raguette L, Huang H, Migues AN, Bickel J, Wang Y, Pincay J, Wu Q and Simmerling C (2019) ff19SB: amino-acid-specific protein backbone parameters trained against quantum mechanics energy surfaces in solution. Journal of Chemical Theory and Computation 16(1), 528–552. doi:10.1021/acs.jctc.9b00591
- Tien MZ, Sydykova DK, Meyer AG and Wilke CO (2013) PeptideBuilder: a simple python library to generate model peptides. PeerJ 1, e80. doi:10.7717/peerj.80
- Townshend RJL, Eismann S, Watkins AM, Rangan R, Karelina M, Das R and Dror RO (2021) Geometric deep learning of RNA structure. Science 373(6558), 1047–1051. doi:10.1126/science.abe5650
- Trabuco LG, Lise S, Petsalaki E and Russell RB (2012) PepSite: prediction of peptide-binding sites from protein surfaces. Nucleic Acids Research 40(W1), W423–W427. doi:10.1093/nar/gks398
- Trellet M, Melquiond ASJ and Bonvin AMJJ (2013) A unified conformational selection and induced fit approach to protein-peptide docking. PLoS One 8(3), e58769. doi:10.1371/journal.pone.0058769
- Trellet M, Melquiond ASJ and Bonvin AMJJ (2014) Computational peptidology. Methods in Molecular Biology 1268, 221–239. doi:10.1007/978-1-4939-2285-7_10
- Usmani SS, Bedi G, Samuel JS, Singh S, Kalra S, Kumar P, Ahuja AA, Sharma M, Gautam A and Raghava GPS (2017) THPdb: database of FDA-approved peptide and protein therapeutics. PLoS One 12(7), e0181748. doi:10.1371/journal.pone.0181748
- Vanhee P, Reumers J, Stricher F, Baeten L, Serrano L, Schymkowitz J and Rousseau F (2010) PepX: a structural database of non-redundant protein–peptide complexes. Nucleic Acids Research 38(Database issue), D545–D551. doi:10.1093/nar/gkp893
- Verdonk ML, Cole JC, Hartshorn MJ, Murray CW and Taylor RD (2003) Improved protein–ligand docking using GOLD. Proteins: Structure, Function, and Bioinformatics 52(4), 609–623. doi:10.1002/prot.10465
- de Vries SJ, Rey J, Schindler CEM, Zacharias M and Tuffery P (2017) The pepATTRACT web server for blind, large-scale peptide–protein docking. Nucleic Acids Research 45(Web Server issue), W361–W364. doi:10.1093/nar/gkx335
- Wan S, Knapp B, Wright DW, Deane CM and Coveney PV (2015) Rapid, precise, and reproducible prediction of peptide–MHC binding affinities from molecular dynamics that correlate well with experiment. Journal of Chemical Theory and Computation 11(7), 3346–3356. doi:10.1021/acs.jctc.5b00179
- Wang E, Sun H, Wang J, Wang Z, Liu H, Zhang JZH and Hou T (2019) End-point binding free energy calculation with MM/PBSA and MM/GBSA: strategies and applications in drug design. Chemical Reviews 119(16), 9478–9508. doi:10.1021/acs.chemrev.9b00055
- Wang J, Alekseenko A, Kozakov D and Miao Y (2019) Improved modeling of peptide-protein binding through global docking and accelerated molecular dynamics simulations. Frontiers in Molecular Biosciences 6, 112. doi:10.3389/fmolb.2019.00112
- Wang J, Lisanza S, Juergens D, Tischer D, Anishchenko I, Baek M, Watson JL, Chun JH, Milles LF, Dauparas J, Expòsit M, Yang W, Saragovi A, Ovchinnikov S and Baker D (2021) Deep learning methods for designing proteins scaffolding functional sites. BioRxiv. doi:10.1101/2021.11.10.468128
- Wang L, Wang N, Zhang W, Cheng X, Yan Z, Shao G, Wang X, Wang R and Fu C (2022) Therapeutic peptides: current applications and future directions. Signal Transduction and Targeted Therapy 7(1), 48. doi:10.1038/s41392-022-00904-4
- Wang L, Wu Y, Deng Y, Kim B, Pierce L, Krilov G, Lupyan D, Robinson S, Dahlgren MK, Greenwood J, Romero DL, Masse C, Knight JL, Steinbrecher T, Beuming T, Damm W, Harder E, Sherman W, Brewer M, … Abel R (2015) Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy calculation protocol and force field. Journal of the American Chemical Society 137(7), 2695–2703. doi:10.1021/ja512751q
- Wang W, Ye W, Jiang C, Luo R and Chen H (2014) New force field on modeling intrinsically disordered proteins. Chemical Biology & Drug Design 84(3), 253–269. doi:10.1111/cbdd.12314
- Wang Z, Sun H, Yao X, Li D, Xu L, Li Y, Tian S and Hou T (2016) Comprehensive evaluation of ten docking programs on a diverse set of protein–ligand complexes: the prediction accuracy of sampling power and scoring power. Physical Chemistry Chemical Physics 18(18), 12964–12975. doi:10.1039/c6cp01555g
- Webb B and Sali A (2014) Comparative protein structure modeling using MODELLER. Current Protocols in Bioinformatics 47(1), 5.6.1–5.6.32. doi:10.1002/0471250953.bi0506s47
- Wen Z, He J, Tao H and Huang S-Y (2018) PepBDB: a comprehensive structural database of biological peptide–protein interactions. Bioinformatics 35(1), 175–177. doi:10.1093/bioinformatics/bty579
- Weng G, Gao J, Wang Z, Wang E, Hu X, Yao X, Cao D and Hou T (2020) Comprehensive evaluation of fourteen docking programs on protein–peptide complexes. Journal of Chemical Theory and Computation 16(6), 3959–3969. doi:10.1021/acs.jctc.9b01208
- Weng G, Wang E, Chen F, Sun H, Wang Z and Hou T (2019) Assessing the performance of MM/PBSA and MM/GBSA methods. 9. Prediction reliability of binding affinities and binding poses for protein–peptide complexes. Physical Chemistry Chemical Physics 21(19), 10135–10145. doi:10.1039/c9cp01674k
- Williamson MP (2013) Using chemical shift perturbation to characterise ligand binding. Progress in Nuclear Magnetic Resonance Spectroscopy 73, 1–16. doi:10.1016/j.pnmrs.2013.02.001
- Wu C, Gao R, Zhang Y and Marinis YD (2019) PTPD: predicting therapeutic peptides by deep learning and word2vec. BMC Bioinformatics 20(1), 456. doi:10.1186/s12859-019-3006-z
- Xu X and Zou X (2020) PepPro: a nonredundant structure data set for benchmarking peptide–protein computational docking. Journal of Computational Chemistry 41(4), 362–369. doi:10.1002/jcc.26114
- Xu X and Zou X (2022) Predicting protein–peptide complex structures by accounting for peptide flexibility and the physicochemical environment. Journal of Chemical Information and Modeling 62(1), 27–39. doi:10.1021/acs.jcim.1c00836
- Yan C, Xu X and Zou X (2016) Fully blind docking at the atomic level for protein-peptide complex structure prediction. Structure 24(10), 1842–1853. doi:10.1016/j.str.2016.07.021
- Yang S, Liu H, Zhang Y, Lu H and Chen H (2019) Residue-specific force field improving the sample of intrinsically disordered proteins and folded proteins. Journal of Chemical Information and Modeling 59(11), 4793–4805. doi:10.1021/acs.jcim.9b00647
- Yu J, Andreani J, Ochsenbein F and Guerois R (2017) Lessons from (co-)evolution in the docking of proteins and peptides for CAPRI rounds 28–35. Proteins: Structure, Function, and Bioinformatics 85(3), 378–390. doi:10.1002/prot.25180
- Zhang Y and Sanner MF (2019) AutoDock CrankPep: combining folding and docking to predict protein–peptide complexes. Bioinformatics 35(24), 5121–5127. doi:10.1093/bioinformatics/btz459
- Zhou C-Y, Jiang F and Wu Y-D (2015) Residue-specific force field based on protein coil library. RSFF2: modification of AMBER ff99SB. The Journal of Physical Chemistry B 119(3), 1035–1047. doi:10.1021/jp5064676
- Zhou G, Pantelopulos GA, Mukherjee S and Voelz VA (2017) Bridging microscopic and macroscopic mechanisms of p53-MDM2 binding with kinetic network models. Biophysical Journal 113(4), 785–793. doi:10.1016/j.bpj.2017.07.009
- Zhou P, Jin B, Li H and Huang S-Y (2018) HPEPDOCK: a web server for blind peptide–protein docking based on a hierarchical algorithm. Nucleic Acids Research 46(Web Server issue), W443–W450. doi:10.1093/nar/gky357
- Zou R, Zhou Y, Wang Y, Kuang G, Ågren H, Wu J and Tu Y (2020) Free energy profile and kinetics of coupled folding and binding of the intrinsically disordered protein p53 with MDM2. Journal of Chemical Information and Modeling 60(3), 1551–1558. doi:10.1021/acs.jcim.9b00920
- Zwier MC, Pratt AJ, Adelman JL, Kaus JW, Zuckerman DM and Chong LT (2016) Efficient atomistic simulation of pathways and calculation of rate constants for a protein–peptide binding process: application to the MDM2 protein and an intrinsically disordered p53 peptide. The Journal of Physical Chemistry Letters 7(17), 3440–3445. doi:10.1021/acs.jpclett.6b01502