2022 Sep 19;3:e17. doi: 10.1017/qrd.2022.14

Table 1.

Summary of popular protein–peptide complex datasets that are widely used for testing and benchmarking docking tools

| Dataset | Number of complexes | Length of peptide | Special features | Specific application | Availability |
|---|---|---|---|---|---|
| LEADS-PEP | 53 | 3–12 residues | Diverse peptide sequences; complexes do not interact with nucleic acids | Because the peptides are small, suitable for testing tools adapted from small-molecule docking tools | www.leads-x.org |
| PeptiDB | 105 | 5–15 residues | Diverse peptide secondary structures, including conformational change upon binding; complexes with diverse biological functions | Suitable for testing tools that tackle peptide flexibility | RCSB codes of the complexes: https://ars.els-cdn.com/content/image/1-s2.0-S096921260900478X-mmc1.pdf |
| PPDbench | 133 | 9–15 residues | Diverse in terms of peptide sequences (<40% sequence similarity) and biological functionalities | Suitable for testing docking tools on complexes categorised by functionality | https://webs.iiitd.edu.in/raghava/ppdbench/ |
| PepPro | 89 | 5–30 residues | Contains 58 unbound receptor structures | Useful for testing whether tools can predict apo-holo conformational change | http://zoulab.dalton.missouri.edu/PepPro_benchmark |
| Propedia | ~20000 | 2–50 residues | Contains subsets of complexes clustered on different features such as sequence, interface structure or binding site | The broader range of peptide lengths allows different types of docking tools to be tested. The different subsets also give users flexibility in testing their tools | https://bioinfo.dcc.ufmg.br/propedia |
| PixelDB | 1966 | NA | Uses machine learning to identify the protein and the peptide, which helps to avoid misidentifying them when the peptide is larger than the receptor | The broader range of peptide lengths allows any docking tool to be tested | https://github.com/KeatingLab/PixelDB |
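
As an illustration of how the peptide-length column of Table 1 might guide the choice of benchmark, the minimal Python sketch below encodes the table's counts and length ranges and lists the datasets whose reported range includes a given peptide length. The names `Dataset`, `DATASETS` and `datasets_covering` are hypothetical helpers introduced here, not part of any of the listed resources. The Propedia count is approximate (~20000 in the table), and PixelDB is left without a range because the table reports NA. A match only means the length falls inside the reported range, not that the dataset necessarily contains a peptide of exactly that length.

```python
from typing import List, NamedTuple, Optional


class Dataset(NamedTuple):
    """One row of Table 1 (hypothetical helper type for this sketch)."""
    name: str
    n_complexes: int          # Propedia's value is approximate (~20000)
    min_len: Optional[int]    # shortest peptide, in residues; None if NA
    max_len: Optional[int]    # longest peptide, in residues; None if NA


# Values copied from Table 1; PixelDB reports no peptide length range (NA).
DATASETS = [
    Dataset("LEADS-PEP", 53, 3, 12),
    Dataset("PeptiDB", 105, 5, 15),
    Dataset("PPDbench", 133, 9, 15),
    Dataset("PepPro", 89, 5, 30),
    Dataset("Propedia", 20000, 2, 50),
    Dataset("PixelDB", 1966, None, None),
]


def datasets_covering(peptide_length: int) -> List[str]:
    """Return names of datasets whose reported length range includes the value.

    Datasets with no reported range (NA) are skipped rather than guessed at.
    """
    return [
        d.name
        for d in DATASETS
        if d.min_len is not None
        and d.max_len is not None
        and d.min_len <= peptide_length <= d.max_len
    ]


if __name__ == "__main__":
    for length in (4, 10, 20):
        print(length, datasets_covering(length))
```

For a 20-residue peptide, for example, only PepPro and Propedia report ranges that include that length, which is consistent with the table's note that shorter-peptide sets such as LEADS-PEP suit tools adapted from small-molecule docking.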