Abstract
Predicting functional binding sites in proteins is crucial for understanding protein–protein interactions (PPIs) and identifying drug targets. While various computational approaches exist, many fail to assess PPI ligandability, which often involves conformational changes. We introduce InDeepNet, a web-based platform integrating InDeep, a deep-learning model for binding site prediction, with InDeepHolo, which evaluates a site’s propensity to adopt a ligand-bound (holo) conformation. InDeepNet provides an intuitive interface for researchers to upload protein structures from in-house data, the Protein Data Bank (PDB), or AlphaFold, predicting potential binding sites for proteins or small molecules. Results are presented as interactive 3D visualizations via Mol*, facilitating structural analysis. With InDeepHolo, the platform helps select conformations optimal for small-molecule binding, improving structure-based drug design. Accessible at https://indeep-net.gpu.pasteur.cloud/, InDeepNet removes the need for specialized coding skills or high-performance computing, making advanced predictive models widely available. By streamlining PPI target assessment and ligandability prediction, it assists research and supports therapeutic development targeting PPIs.
Graphical Abstract
Introduction
Protein–protein interactions (PPIs) play fundamental roles in a wide range of biological processes, mediating signal transduction, enzymatic regulation, and macromolecular assembly. Given their critical involvement in disease pathways, PPIs have become increasingly attractive targets for drug discovery, with numerous successful small-molecule inhibitors emerging in recent years [1, 2]. However, targeting PPIs remains challenging due to their inherently large and often shallow binding interfaces, which lack well-defined cavities typically seen in conventional drug targets [3] such as enzymes or G-protein-coupled receptors. As a result, assessing the ligandability of PPIs and identifying suitable binding sites for small-molecule intervention require specialized computational tools.
Computational methods for PPI target assessment
Numerous computational approaches have been developed to assess the druggability of proteins and identify binding sites. Traditional geometric-based pocket detection methods such as Fpocket [4, 5], VolSite [6], and mkgridXf [7] have demonstrated effectiveness in detecting small-molecule binding sites, particularly for well-structured targets. Fragment-based approaches like FTMap [8] provide insights into potential hot spots by evaluating the interaction of small molecular fragments with protein surfaces. Meanwhile, deep learning-based tools such as DeepSite [9], Kalasanty [10], and DeepSurf [11] have further improved binding site detection by leveraging neural networks trained on protein structural data. While these tools provide valuable predictions, few have been specifically designed to assess the ligandability of PPIs. PPIs often undergo significant conformational rearrangements upon ligand binding, and static structure-based methods alone may fail to identify suitable pockets in their apo or complexed states [12, 13]. To address this, ensemble docking and molecular dynamics (MD) simulations have been used to sample alternative conformations and identify transiently ligandable states [14, 15]. However, selecting holo-like conformations from MD trajectories remains a challenge, as most scoring methods were developed for traditional drug targets and do not account for the unique properties of PPI interfaces [16, 17].
InDeepNet: a web server for PPI binding site analysis
To improve PPI target assessment and facilitate access to predictive tools, we introduce InDeepNet, an interactive web server (Fig. 1) that integrates our tool InDeep [18], a deep-learning-based model for functional binding site prediction, with a newly developed tool, InDeepHolo, designed to evaluate the holo-likeness of binding sites. InDeep was originally developed to predict ligandable and epitope-binding sites within PPIs, leveraging a 3D convolutional neural network (CNN) trained on curated datasets from the inhibitors of protein–protein interactions (iPPIs) database, iPPI-DB [19, 20]. While InDeep successfully identifies binding pockets at or near PPI interfaces even in apo (unbound) conformations, the conformational flexibility of these sites remains a critical factor in their suitability for small-molecule binding.
Figure 1.
InDeepNet website interface, illustrating the following steps: (i) Loading a structure from your computer or a public database; (ii) visualizing and interacting with the structure using Mol*, and selecting the residues for running an InDeep calculation; (iii) visualizing and interacting with InDeep results, and running an InDeepHolo calculation; and (iv) visualizing InDeepHolo results and exporting an archive or a Mol* visualization state.
To address this limitation, we introduce InDeepHolo, a deep-learning model trained to assess whether a given protein conformation is likely to resemble a ligand-bound (holo) state. InDeepHolo builds upon a dataset of known PPI inhibitors and their corresponding bound/unbound structures, leveraging machine learning to predict root mean square deviation (RMSD) from a holo conformation. While a comprehensive study on InDeepHolo is forthcoming, we provide an initial validation of its performance within InDeepNet, demonstrating its ability to prioritize holo-like conformations for molecular docking simulations.
By integrating these predictive tools into a web server, InDeepNet enables researchers to:
Upload and analyze protein structures to predict binding sites relevant to drug discovery.
Evaluate the ligandability of these sites using deep-learning-based predictions.
Assess holo-likeness to prioritize conformations for molecular docking and virtual screening.
With InDeepNet, we provide a publicly accessible platform that enhances PPI-targeted drug discovery by offering both binding site predictions and holo-likeness assessment in a single interface. As targeting PPIs remains a significant challenge in drug development, InDeepNet represents an important step toward making advanced predictive models more accessible and actionable for researchers worldwide.
Materials and methods
InDeep protocol
InDeep is a deep-learning-based tool for predicting functional binding sites in proteins, focusing on iPPIs and epitope binding sites. It employs a 3D fully convolutional U-Net architecture, trained on structural data from iPPI-DB, using a voxel-based protein representation categorized into five functional atom types (α-carbon, hydrogen bond donors/acceptors, hydrophobic/aromatic, positively charged, negatively charged). Through multi-task learning, InDeep simultaneously predicts ligandable binding sites for small molecules (PL, protein–ligand interactions) and interactability patches for PPIs (HD, heterodimer interactions). After processing through shared convolutional layers, the network splits into two branches: one using sigmoid activation for ligandability prediction and the other using softmax activation for interactability prediction. Benchmarking on several datasets demonstrated that InDeep outperforms state-of-the-art tools for ligand binding site prediction and for interactability detection. It proved particularly effective in detecting iPPI-specific binding pockets, performing well on both holo (bound) and apo (unbound) structures, and in epitope binding site prediction and spatial localization of protein partners.
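To make the difference between the two output branches concrete, the following minimal Python sketch contrasts the two activations named above. It is an illustration only and not the InDeep implementation: the logit values and the number of interactability channels are hypothetical.

```python
import math

def sigmoid(x: float) -> float:
    """Map one logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits: list[float]) -> list[float]:
    """Map a vector of logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a single voxel (values chosen for illustration).
pl_logit = 2.0                 # ligandability branch: one value per voxel
hd_logits = [0.5, 1.5, -1.0]   # interactability branch: several competing channels

pl_prob = sigmoid(pl_logit)    # independent probability that the voxel is ligandable
hd_probs = softmax(hd_logits)  # channel probabilities, normalized to sum to 1
```

The sigmoid output can be thresholded on its own, whereas the softmax outputs are mutually exclusive scores for the same voxel.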
InDeepHolo protocol
InDeepHolo predicts holo-likeness by training on structural ensembles derived from the same dataset as InDeep, sourced from iPPI-DB [19] (target-centric mode), and split based on CATH folds to avoid structural overlap. The dataset includes hetero-dimeric complexes and compound-bound protein complexes, where ligands bind at PPI interfaces. Structural ensembles were generated using Rosetta’s backrub protocol, running 1000 000 Monte Carlo steps at 2.5 kT, producing up to 16 models per protein. Binding site residues were identified based on changes in solvent-accessible surface upon ligand binding, measured using the Naccess software. The binding sites were then encoded into cubic 3D grids of 20 Å per side with 1 Å voxel resolution, where atomic coordinates were categorized into five functional types (α-carbon, hydrogen bond donors/acceptors, hydrophobic/aromatic, positively charged, negatively charged) and transformed into probability distributions to create input tensors for deep learning. The model is a CNN consisting of three convolutional blocks (3D convolutions, BatchNorm, ELU activation) alternating with MaxPooling layers, followed by fully connected layers. It was trained for 500 epochs using mean squared error loss, Adam optimization, and 90-degree rotational data augmentation. Model validation relied on computing heavy-atom binding site RMSD against experimental holo structures after α-carbon alignment, with performance assessed through Pearson’s correlation coefficient and mean absolute error calculated with the SciPy and scikit-learn Python libraries (Supplementary Fig. S1).
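The grid encoding and the rotational augmentation described above can be sketched as follows. This is a minimal illustration assuming NumPy, not the InDeepHolo code: the Gaussian form and width of the atomic density, the function names, and the restriction to rotations about a single axis are all assumptions.

```python
import numpy as np

GRID_SIZE = 20   # 20 voxels per axis at 1 Å resolution (from the text)
N_CHANNELS = 5   # five functional atom types (from the text)
SIGMA = 1.0      # Gaussian width in Å: an assumption, not given in the source

def voxelize(coords, channels, center, sigma=SIGMA):
    """Spread each atom as a Gaussian onto a (5, 20, 20, 20) grid centred on `center`."""
    grid = np.zeros((N_CHANNELS, GRID_SIZE, GRID_SIZE, GRID_SIZE))
    axis = np.arange(GRID_SIZE) + 0.5  # voxel-centre positions along one axis
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    origin = np.asarray(center, dtype=float) - GRID_SIZE / 2.0
    for xyz, ch in zip(np.asarray(coords, dtype=float), channels):
        x, y, z = xyz - origin  # atom position in grid coordinates
        d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        grid[ch] += np.exp(-d2 / (2.0 * sigma ** 2))
    return grid

def rotations_90(grid):
    """Yield the four 90-degree rotations of the grid about the z axis."""
    for k in range(4):
        # Axes 1 and 2 are x and y; axis 0 (channels) is left untouched.
        yield np.rot90(grid, k=k, axes=(1, 2))

# Hypothetical two-atom binding site: channel 0 and channel 2.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
channels = [0, 2]
grid = voxelize(coords, channels, center=(0.0, 0.0, 0.0))
augmented = list(rotations_90(grid))
```

Because a 90-degree rotation only permutes voxels, each augmented copy keeps the same total density per channel while presenting the site in a different orientation.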
Server Functionalities and Usage: The InDeepNet server offers a range of functionalities that make InDeep easier to run and its results easier to visualize and analyze. Users can load structures and trajectories from public databases such as RCSB [21] (http://www.rcsb.org/), PDBe [22] (https://www.ebi.ac.uk/pdbe/), or AlphaFold [23] (https://www.alphafold.ebi.ac.uk), or upload them from their local computer. The server accepts PDB structures and dynamic trajectories and provides visualization tools powered by the Mol* plugin [24] (https://molstar.org/), including structure superposition and residue selection.
InDeep Calculation: The InDeepNet server runs InDeep jobs with all the configuration needed for GPU calculation and parameter selection. Users can optionally provide an email address to receive updates on the status of their job, or save the web link they are given to retrieve their results later. When the job is complete, all results are loaded into the frontend application, where users can visualize and manipulate them with the Mol* plugin. Users can also load other structures, for example ligand-bound structures, and superimpose them on their results.
Results Visualization and Conservation: Once a job is complete, users can download an archive of results containing all calculated pocket files, the pocket volume files, and the submitted structure or trajectory. Users can also save a project state file that records the visualization configuration, allowing them to restore the same visualization state for approximately 30 days. To optimize calculation time and conserve resources, each job is kept for a short period (30 days), so users can retrieve their results once the job has completed.
InDeepHolo calculation
When an InDeep job is complete on a system, users can select specific residues to perform an InDeepHolo calculation (see the InDeepHolo protocol).
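As a rough illustration of how InDeepHolo output might be used and checked, the sketch below ranks conformations by predicted RMSD to the holo state and computes the two validation metrics named in the InDeepHolo protocol. The RMSD values and model names are hypothetical, and the metrics are written in plain Python for self-containment rather than with the SciPy and scikit-learn calls used in the original work.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rank_by_predicted_rmsd(predictions):
    """Return conformation names sorted from most to least holo-like (lowest RMSD first)."""
    return sorted(predictions, key=predictions.get)

# Hypothetical predicted and reference binding-site RMSDs (in Å) for three models.
predicted = {"model_01": 1.8, "model_02": 0.9, "model_03": 2.6}
reference = {"model_01": 2.0, "model_02": 1.0, "model_03": 2.5}

names = sorted(predicted)
r = pearson_r([reference[n] for n in names], [predicted[n] for n in names])
mae = mean_absolute_error([reference[n] for n in names], [predicted[n] for n in names])
ranking = rank_by_predicted_rmsd(predicted)
```

In this toy case the conformation with the lowest predicted RMSD, `model_02`, would be the first candidate to carry forward to docking.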
Server implementation
An overview of the InDeepNet server implementation and the technologies employed is shown below. The content displayed on the client-side is built using the ReactJS framework (https://react.dev/) and incorporates the Mol* JavaScript library for protein structure visualization and interaction. The dynamic forms are formatted with the Tailwind CSS framework (https://tailwindcss.com/). For the backend, we developed a complete REST API using Django Rest Framework (https://www.django-rest-framework.org/). Input data sent to the server are securely saved in a PostgreSQL database (https://www.postgresql.org) using the concept of Object Relational Mapping provided by the Django framework. To perform and monitor the execution of InDeep jobs, the Django Rest Framework server communicates with the Celery (https://docs.celeryq.dev) task scheduler, which manages job execution. A web link is provided to the user to consult the status of their job in real-time. Finally, the results are displayed in the frontend application. If requested by the user, an email is sent when the job is completed.
A description of the server functionalities is provided in the online documentation. We also provide a step-by-step web tour tutorial to help users get started with the application. InDeepNet is easy to use and completely free, eliminating the need for a complex installation of InDeep with GPU configuration.
Jobs submitted via InDeepNet are executed on the Institut Pasteur Kubernetes (K8S) (https://kubernetes.io/) GPU cluster, which provides a high-performance computing environment. Typical execution times for InDeep jobs are around 15 min. The tus (https://tus.io/) server is used for transferring large files (typically trajectory files), and the application is served via an Nginx (https://nginx.org) server.
This ecosystem (Fig. 2) ensures a fast and secure user experience.
Figure 2.
Ecosystem of InDeepNet. The client-side uses ReactJS styled with Tailwind CSS and Mol* [1]. The backend features a REST API built with Django Rest Framework [2], utilizing PostgreSQL for data storage [3]. Job execution and monitoring are managed via Celery [4]. The server is powered by the Institut Pasteur K8S GPU cluster [5], with file transfers handled by the tus server [6]. This ecosystem is served via Nginx [7]. The code is available on GitLab.
Case studies
InDeep was applied to assess ligandability and interactability across multiple protein structures, demonstrating its ability to identify drug-binding sites and PPI interfaces with high accuracy [18]. To illustrate the predictive capacity of InDeep within InDeepNet, Fig. 3A depicts the Ligandability channel’s ability to detect structurally characterized ligand-binding sites in four distinct protein targets that fall outside the training set: PHD2 (Prolyl Hydroxylase Domain-containing Protein 2), WDR5 (WD Repeat Domain 5), MDM2 (Mouse Double Minute 2 homolog), and the SARS-CoV-2 main protease. These proteins serve as independent validation examples. In each case, InDeepNet successfully identified the known binding site as the primary ligandable region, represented by red blobs, using a ligandability threshold of 0.95. These results highlight InDeepNet’s ability to pinpoint potential small-molecule binding pockets, aiding drug discovery efforts targeting both enzyme active sites and protein interaction modulators.
Figure 3B focuses on PPI interface identification, leveraging different InDeepNet channels to analyze binding site characteristics in diverse protein complexes. The Interactability channel (purple blobs) was employed to identify key PPI regions in 7CYQ (the SARS-CoV-2 replication complex) and 2MPS (the MDM2-p53 complex), detecting the primary interaction interfaces. Additionally, the α-carbon (CA) channel (blue blobs) and the Hydrophobic channel (green blobs) were used to provide deeper structural insights into protein backbone localization and hydrophobic hot spots, respectively. Specifically, in 1QHR (alpha-thrombin in complex with hirugen), InDeepNet successfully identified the exosite I region, the primary binding site of hirugen. Exosite I serves as a substrate recognition site and plays a major role in thrombin’s regulation of fibrinogen cleavage.
Similarly, for 2MPS (MDM2-p53 complex), hydrophobic hot spots within the p53-binding cleft of MDM2 were detected, reinforcing the importance of hydrophobic interactions in stabilizing PPIs and guiding small-molecule drug design. Overall, these analyses show InDeepNet’s capability to provide high-resolution structural insights into both ligandable binding sites and PPI interfaces, making it a valuable tool for structure-based drug discovery and protein interaction analysis. By enabling the identification of druggable sites, interactability regions, and hydrophobic hot spots, InDeepNet supports the rational design of small-molecule inhibitors and PPI disruptors.
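The effect of the 0.95 ligandability threshold used in these examples can be illustrated with a small NumPy sketch. It is not the server's code: the toy grid, the 1 Å voxel size, the inclusive comparison, and the function name are assumptions made for illustration.

```python
import numpy as np

def threshold_pocket(prob_grid, threshold=0.95, voxel_size=1.0):
    """Keep voxels at or above `threshold`; return the mask, its volume (Å^3) and centroid (Å)."""
    mask = prob_grid >= threshold
    n_voxels = int(mask.sum())
    volume = n_voxels * voxel_size ** 3
    centroid = None
    if n_voxels:
        idx = np.argwhere(mask)  # integer (i, j, k) indices of retained voxels
        centroid = (idx.mean(axis=0) + 0.5) * voxel_size  # voxel centres, not corners
    return mask, volume, centroid

# Toy ligandability map: one confident 2 x 2 x 2 blob and one weaker isolated voxel.
grid = np.zeros((10, 10, 10))
grid[2:4, 2:4, 2:4] = 0.99
grid[7, 7, 7] = 0.80

mask, volume, centroid = threshold_pocket(grid)
```

With a high threshold only the most confident voxels survive, which is why each example in Fig. 3A shows a compact blob rather than a diffuse map.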
Figure 3.
(A) Ligandability assessment by InDeepNet. Utilization of the Ligandability channel within InDeepNet to identify ligandable pockets. In all examples, namely PHD2, WDR5, MDM2, and the SARS-CoV-2 main protease, which are independent of the training set, InDeepNet successfully identified the known binding site of structurally characterized ligands as the primary ligandable region (volume maps), using a ligandability threshold of 0.95. (B) Interactability assessment by InDeepNet. Utilization of the Interactability channel (volume map) to identify regions mediating PPIs in structures 7CYQ and 2MPS. Additionally, the CA channel (main blobs) and Hydrophobic channel (small blobs) were used to identify backbone localization of the partner and hydrophobic hot spots in structures 1QHR and 2MPS.
Conclusion and future work
InDeepNet is a powerful web platform that enhances the prediction of functional binding sites in proteins, particularly for PPIs and ligandability assessments. By integrating InDeep, a deep-learning-based predictor of binding sites, with InDeepHolo, a tool for evaluating holo-like conformations, InDeepNet provides researchers with an advanced and accessible solution for drug discovery and structural biology. Through interactive 3D visualizations and seamless integration with Mol*, the platform facilitates in-depth analysis of binding pockets, making complex computational tools widely available without requiring specialized expertise or high-performance computing resources.
Validation across multiple datasets demonstrates that the implementation of InDeep within InDeepNet is a robust solution for accurately identifying ligandable sites and PPI regions in diverse proteins. Its success in recognizing hydrophobic hot spots, protein backbone localization, and transiently ligandable pockets further reinforces its potential for structure-based drug design. By providing a publicly accessible, high-performance computing environment, InDeepNet accelerates PPI-targeted therapeutic development, supporting the rational design of small-molecule inhibitors, PPI disruptors, and enzyme modulators. Future improvements will focus on running InDeepHolo along MD trajectories to retrieve relevant pocket conformations for structure-based drug design more automatically. Moreover, we plan to make future developments of InDeep available within InDeepNet.
Acknowledgements
This work used the computational and storage services (Kubernetes cluster) provided by the IT department at Institut Pasteur, Paris. We thank Bryan Brancott for this help.
Author contributions: Fabien Mareuil (Conceptualization [equal], Software [equal]), Rachel Torchet (Conceptualization [equal]), Luis Checa Ruano (Validation [equal]), Vincent Mallet (Conceptualization [equal], Software [equal]), Michael Nilges (Resources [equal]), Guillaume Bouvier (Conceptualization [equal], Software [equal]), Olivier Sperandio (Conceptualization [equal], Project administration [equal], Writing—original draft [equal]).
Contributor Information
Fabien Mareuil, Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015 Paris, France.
Rachel Torchet, Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015 Paris, France.
Luis Checa Ruano, Structural Bioinformatics Unit, Department of Structural Biology and Chemistry, Institut Pasteur, Université de Paris, CNRS UMR3528, Paris F-75015, France.
Vincent Mallet, Structural Bioinformatics Unit, Department of Structural Biology and Chemistry, Institut Pasteur, Université de Paris, CNRS UMR3528, Paris F-75015, France.
Michael Nilges, Structural Bioinformatics Unit, Department of Structural Biology and Chemistry, Institut Pasteur, Université de Paris, CNRS UMR3528, Paris F-75015, France.
Guillaume Bouvier, Structural Bioinformatics Unit, Department of Structural Biology and Chemistry, Institut Pasteur, Université de Paris, CNRS UMR3528, Paris F-75015, France.
Olivier Sperandio, Structural Bioinformatics Unit, Department of Structural Biology and Chemistry, Institut Pasteur, Université de Paris, CNRS UMR3528, Paris F-75015, France.
Supplementary data
Supplementary data is available at NAR online.
Conflict of interest
None declared.
Funding
We gratefully acknowledge the financial support of Dassault Systèmes La Fondation, whose sponsorship has been instrumental in the development of InDeepNet. Their support has enabled advancements in deep-learning-driven protein binding site prediction, facilitating research in PPI-targeted drug discovery and computational structural biology. Funding to pay the Open Access publication charges for this article was provided by Fondation de France (PFR7).
Data availability
InDeepNet is freely accessible at https://indeep-net.gpu.pasteur.cloud/.
References
- 1. Arkin MR, Tang Y, Wells JA. Small-molecule inhibitors of protein–protein interactions: progressing toward the reality. Chem Biol. 2014; 21:1102–14. 10.1016/j.chembiol.2014.09.001.
- 2. Mareuil F, Moine-Franel A, Kar A et al. Protein interaction explorer (PIE): a comprehensive platform for navigating protein–protein interactions and ligand binding pockets. Bioinformatics. 2024; 40:btae414. 10.1093/bioinformatics/btae414.
- 3. Sperandio O, Reynès CH, Camproux AC et al. Rationalizing the chemical space of protein–protein interaction inhibitors. Drug Discov Today. 2010; 15:220–9. 10.1016/j.drudis.2009.11.007.
- 4. Schmidtke P, Le Guilloux V, Maupetit J et al. fpocket: online tools for protein ensemble pocket detection and tracking. Nucleic Acids Res. 2010; 38:W582–9. 10.1093/nar/gkq383.
- 5. Le Guilloux V, Schmidtke P, Tuffery P. Fpocket: an open source platform for ligand pocket detection. BMC Bioinformatics. 2009; 10:168. 10.1186/1471-2105-10-168.
- 6. Da Silva F, Desaphy J, Rognan D. IChem: a versatile toolkit for detecting, comparing, and predicting protein-ligand interactions. ChemMedChem. 2018; 13:507–10. 10.1002/cmdc.201700505.
- 7. Monet D, Desdouits N, Nilges M et al. MkgridXf: consistent identification of plausible binding sites despite the elusive nature of cavities and grooves in protein dynamics. J Chem Inf Model. 2019; 59:3506–18. 10.1021/acs.jcim.9b00103.
- 8. Kozakov D, Grove LE, Hall DR et al. The FTMap family of web servers for determining and characterizing ligand-binding hot spots of proteins. Nat Protoc. 2015; 10:733–55. 10.1038/nprot.2015.043.
- 9. Jiménez J, Doerr S, Martínez-Rosell G et al. DeepSite: protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics. 2017; 33:3036–42. 10.1093/bioinformatics/btx350.
- 10. Stepniewska-Dziubinska MM, Zielenkiewicz P, Siedlecki P. Improving detection of protein-ligand binding sites with 3D segmentation. Sci Rep. 2020; 10:5035. 10.1038/s41598-020-61860-z.
- 11. Mylonas SK, Axenopoulos A, Daras P. DeepSurf: a surface-based deep learning approach for the prediction of ligand binding sites on proteins. Bioinformatics. 2021; 37:1681–90. 10.1093/bioinformatics/btab009.
- 12. Johnson DK, Karanicolas J. Druggable protein interaction sites are more predisposed to surface pocket formation than the rest of the protein surface. PLoS Comput Biol. 2013; 9:e1002951. 10.1371/journal.pcbi.1002951.
- 13. Mohseni Behbahani Y, Laine E, Carbone A. Deep Local Analysis deconstructs protein–protein interfaces and accurately estimates binding affinity changes upon mutation. Bioinformatics. 2023; 39:i544–52. 10.1093/bioinformatics/btad231.
- 14. Amaro RE, Baudry J, Chodera J et al. Ensemble docking in drug discovery. Biophys J. 2018; 114:2271–8. 10.1016/j.bpj.2018.02.038.
- 15. Basciu A, Malloci G, Pietrucci F et al. Holo-like and druggable protein conformations from enhanced sampling of binding pocket volume and shape. J Chem Inf Model. 2019; 59:1515–28. 10.1021/acs.jcim.8b00730.
- 16. Eyrisch S, Helms V. Transient pockets on protein surfaces involved in protein–protein interaction. J Med Chem. 2007; 50:3457–64. 10.1021/jm070095g.
- 17. Eyrisch S, Helms V. What induces pocket openings on protein surface patches involved in protein–protein interactions? J Comput Aided Mol Des. 2009; 23:73–86. 10.1007/s10822-008-9239-y.
- 18. Mallet V, Checa Ruano L, Moine Franel A et al. InDeep: 3D fully convolutional neural networks to assist in silico drug design on protein–protein interactions. Bioinformatics. 2022; 38:1261–8. 10.1093/bioinformatics/btab849.
- 19. Moine-Franel A, Mareuil F, Nilges M et al. A comprehensive dataset of protein–protein interactions and ligand binding pockets for advancing drug discovery. Sci Data. 2024; 11:402. 10.1038/s41597-024-03233-z.
- 20. Torchet R, Druart K, Ruano LC et al. The iPPI-DB initiative: a community-centered database of protein–protein interaction modulators. Bioinformatics. 2021; 37:89–96. 10.1093/bioinformatics/btaa1091.
- 21. Berman HM, Westbrook J, Feng Z et al. The Protein Data Bank. Nucleic Acids Res. 2000; 28:235–42. 10.1093/nar/28.1.235.
- 22. Armstrong DR, Berrisford JM, Conroy MJ et al. PDBe: improved findability of macromolecular structure data in the PDB. Nucleic Acids Res. 2020; 48:D335–43.
- 23. Jumper J, Evans R, Pritzel A et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021; 596:583–9. 10.1038/s41586-021-03819-2.
- 24. Sehnal D, Bittrich S, Deshpande M et al. Mol* viewer: modern web app for 3D visualization and analysis of large biomolecular structures. Nucleic Acids Res. 2021; 49:W431–7. 10.1093/nar/gkab314.