Table 1.
Method | Approach | Features | # Features | P centroid | P residues | P score | P ranking | R score | R threshold | Cluster | Algorithm | Threshold (Å) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
VN-EGNN | EGNN + VN | ESM-2 embeddings | 1280 | ✓ | ✕ | ✓ | ✓ | ✕ | – | – | – | – |
IF-SitePred | LightGBM | ESM-IF1 embeddings | 512 | ✓ | ✕ | ✓ | ✓ | ✕ | 0.5 (ALL 40) | Cloud points | DBSCAN | 1.7 |
GrASP | GAT-GNN | Atom, residue, bond… | 17 | ✓ | ✕ | ✓ | ✓ | ✓ | 0.3 | Atoms | Average | 15 |
PUResNet | DRN + 3D-CNN | Atom + one-hot encoding | 18 | ✕ | ✓ | ✕ | ✕ | ✕ | 0.34 | Atoms | DBSCAN | 5.5 |
DeepPocket | fpocket + 3D-CNN | Atom | 14 | ✓ | ✓ | ✓ | ✓ | ✕ | – | – | – | – |
P2RankCONS | Random Forest | Atom and residue | 36 | ✓ | ✓ | ✓ | ✓ | ✓ | 0.35 | SAS points | Single | 3 |
P2Rank | Random Forest | Atom and residue | 35 | ✓ | ✓ | ✓ | ✓ | ✓ | 0.35 | SAS points | Single | 3 |
fpocketPRANK | fpocket + Random Forest | Atom and residue | 34 | ✓ | ✓ | ✓ | ✓ | ✓ | – | – | – | – |
fpocket | α-spheres | – | – | ✕ | ✓ | ✓ | ✓ | ✕ | – | α-spheres | Multiple | 1.7 |
PocketFinder+ | LJ potential | – | – | ✕ | ✕ | ✕ | ✕ | ✓ | – | – | – | – |
Ligsite+ | Cubic grid | – | – | ✕ | ✕ | ✕ | ✕ | ✓ | – | – | – | – |
Surfnet+ | Gap regions | – | – | ✕ | ✕ | ✕ | ✕ | ✓ | – | – | – | – |
All methods were run with their default settings. Check marks (✓) indicate that a method provides a given output; crosses (✕) indicate that it does not. Dashes (–) indicate that a field is not applicable to a method, e.g., features for methods that are not machine learning-based. Approach: the techniques applied by the method; Features/# Features: the features used and their number, if the method is machine learning-based; P centroid/P residues/P score/P ranking/R score: whether the method reports the pocket centroid, pocket residues, pocket score, pocket ranking and residue ligandability score. The remaining columns describe each method's clustering strategy: the residue ligandability threshold, if the method uses one (R threshold); the instances that are clustered to define distinct pockets (Cluster); the clustering algorithm (Algorithm); and the threshold employed (Threshold, in Å).

For example, P2Rank applies a random forest classifier to SAS points, each represented by 35 atom and residue features. Points with a score > 0.35 are then clustered into binding sites using single linkage and a threshold of 3 Å. DeepPocket and fpocketPRANK take fpocket predictions as a starting point and then apply different techniques to re-score or re-define the pockets.

Abbreviations: EGNN + VN: equivariant graph neural network + virtual nodes; LightGBM: light gradient boosting machine; GAT: graph attention network; GNN: graph neural network; DRN: deep residual network; 3D-CNN: three-dimensional convolutional neural network; SAS: solvent-accessible surface; LJ potential: Lennard–Jones potential.
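
The P2Rank example in the caption (score cutoff of 0.35, single linkage, 3 Å threshold) can be illustrated with a minimal sketch. This is not P2Rank's implementation: the function name `cluster_points`, the toy coordinates and scores, and the choice to treat the 3 Å cutoff as inclusive are all assumptions made for illustration. Single-linkage clustering cut at a fixed distance is equivalent to finding connected components among points that lie within that distance of each other, which is what the union-find loop below does.

```python
import math


def cluster_points(points, scores, score_cutoff=0.35, distance_cutoff=3.0):
    """Group high-scoring points by single linkage at a fixed distance cutoff.

    Hypothetical helper for illustration only; not part of P2Rank.
    Returns a sorted list of clusters, each a list of point indices.
    """
    # Step 1: keep only points whose score exceeds the cutoff.
    kept = [i for i, s in enumerate(scores) if s > score_cutoff]

    # Step 2: union-find over the kept points.
    parent = {i: i for i in kept}

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of kept points closer than the cutoff
    # (inclusive comparison is an assumption).
    for pos, a in enumerate(kept):
        for b in kept[pos + 1:]:
            if math.dist(points[a], points[b]) <= distance_cutoff:
                parent[find(a)] = find(b)

    # Step 3: collect indices by root to obtain the clusters.
    clusters = {}
    for i in kept:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy data (made up): coordinates in Å and per-point scores.
points = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (20, 0, 0), (21, 0, 0), (2, 1, 0)]
scores = [0.9, 0.6, 0.5, 0.8, 0.4, 0.1]

clusters = cluster_points(points, scores)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

In the toy data, points 0 and 2 are 4 Å apart yet end up in the same cluster because each lies within 3 Å of point 1; this chaining is characteristic of single linkage. Point 5 is discarded before clustering because its score is below the cutoff.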