Table 3.
Method | Coverage (%) | # Total pockets | # Pockets per protein | Rg (Å) | MCD (Å) | MRO | Redundancy (%) |
---|---|---|---|---|---|---|---|
LIGYSIS (reference) | 2775 | 6882 | 1, 1, 27 | 5.9 | 14.1 | 0 | 2.3 |
(d) VN-EGNN | 2764 (99.6) | 13,582 (×2.0) | 1, 5, 7 | 5.9 | 1.1* | 0.85* | 66.8* |
(d) IF-SitePred | 2075 (74.8*) | 44,948 (×6.5) | 1, 20*, 129 | 5.9 | 3.4 | 0.55 | 49.5 |
(d) GrASP | 2771 (99.9) | 4694 (×0.7) | 1, 1, 12 | 7.9 | 21.4 | 0 | 0.0 |
(d) PUResNet | 2360 (85.1) | 2621 (×0.4) | 1, 1, 4 | 8.1 | 27.0 | 0 | 0.0 |
(d) DeepPocketSEG | 2349 (84.7) | 21,718 (×3.2) | 1, 6, 196 | 7.7 | 4.6 | 0.4 | 31.1 |
(d) P2RankCONS | 2759 (99.4) | 12,412 (×1.8) | 1, 3, 57 | 7.1 | 13.9 | 0.05 | 0.7 |
(d) P2Rank | 2402 (86.6) | 10,180 (×1.5) | 1, 3, 85 | 7.1 | 13.8 | 0.05 | 0.6 |
(d) fpocket | 2759 (99.4) | 57,859 (×8.4*) | 1, 17, 349* | 6.3 | 9.7 | 0.15 | 0.7 |
(d) PocketFinder+ | 2775 (100) | 8913 (×1.3) | 1, 3, 23 | 8.6 | 18.7 | 0.05 | 0.0 |
(d) Ligsite+ | 2775 (100) | 6903 (×1.0) | 1, 2, 12 | 9.1* | 16.7 | 0.09 | 0.0 |
(d) Surfnet+ | 2775 (100) | 9043 (×1.3) | 1, 3, 40 | 8.4 | 17.2 | 0.07 | 0.0 |
LIGYSIS is not a ligand binding site predictor but a reference dataset curated from experimentally determined structures of biologically relevant protein–ligand complexes. All results come from running each method with its default settings, indicated by (d) preceding the method name. Coverage: number of protein chains for which a method returns at least one prediction, with the percentage relative to the 2775 LIGYSIS protein chains in parentheses. VN-EGNN failed with an error for PDB: 6BCU chain: A [86]; the rest of the methods ran successfully for all protein chains. # Total pockets: total number of predicted pockets, with the ratio of predicted to reference pockets in parentheses, e.g., fpocket predicts on average 8.4 pockets for each LIGYSIS site. # Pockets per protein: minimum, median and maximum number of pockets per protein. Rg (Å): median pocket radius of gyration. MCD (Å): median minimum centroid distance over all pockets. For proteins where multiple pockets are predicted, the MCD of a pocket is the distance from its centroid to the closest centroid among the other pockets predicted in the same protein, so it measures how close predicted pockets are to each other. MRO: maximum residue overlap. For a given pocket, MRO is the maximum residue overlap with any other pocket in the same protein, so it measures how similar predicted pockets are in terms of shared residues (see “Methods” for a detailed explanation). For example, the median overlap between VN-EGNN predicted pockets is 85%. Redundancy (%): percentage of predicted pockets that are redundant, i.e., whose closest pocket centroid is within 5 Å or whose residue overlap with another pocket is at least 3/4 (≥ 75%). This is the case for 67% of VN-EGNN pockets and 0% of GrASP pockets. Bold font and “*” indicate the most extreme values within each column.
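
The MCD, MRO and redundancy definitions in the legend can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Pocket` class and all function names are hypothetical, pocket centroids are taken as given, and residue overlap is assumed here to be the fraction of a pocket's own residues shared with another pocket (the exact definition is in “Methods” and may differ, for example a Jaccard index).

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pocket:
    """Hypothetical container for one predicted pocket."""
    centroid: tuple[float, float, float]  # coordinates in Å
    residues: frozenset[int]              # residue identifiers


def min_centroid_distance(pocket: Pocket, others: list[Pocket]) -> float | None:
    """Distance (Å) to the closest other centroid; None if there is no other pocket."""
    if not others:
        return None
    return min(math.dist(pocket.centroid, o.centroid) for o in others)


def max_residue_overlap(pocket: Pocket, others: list[Pocket]) -> float:
    """Largest overlap with any other pocket.

    ASSUMPTION: overlap = |shared residues| / |residues of this pocket|.
    """
    if not others or not pocket.residues:
        return 0.0
    return max(len(pocket.residues & o.residues) / len(pocket.residues)
               for o in others)


def redundancy_flags(pockets: list[Pocket],
                     dist_cutoff: float = 5.0,
                     overlap_cutoff: float = 0.75) -> list[bool]:
    """Flag a pocket as redundant if MCD <= 5 Å or MRO >= 0.75 (legend thresholds)."""
    flags = []
    for i, p in enumerate(pockets):
        others = pockets[:i] + pockets[i + 1:]
        mcd = min_centroid_distance(p, others)
        mro = max_residue_overlap(p, others)
        flags.append((mcd is not None and mcd <= dist_cutoff)
                     or mro >= overlap_cutoff)
    return flags


# Toy protein with three predicted pockets (made-up coordinates and residues).
pockets = [
    Pocket((0.0, 0.0, 0.0), frozenset({1, 2, 3, 4})),
    Pocket((3.0, 0.0, 0.0), frozenset({3, 4, 5, 6})),
    Pocket((20.0, 0.0, 0.0), frozenset({10, 11, 12})),
]
flags = redundancy_flags(pockets)
redundancy_pct = 100 * sum(flags) / len(flags)
```

In this toy case the first two pockets have centroids 3 Å apart, so both are flagged as redundant on the distance criterion alone, while the third pocket is 17 Å from its nearest neighbour and shares no residues with it. The table values would correspond to medians (MCD, MRO) or percentages (redundancy) of such per-pocket quantities pooled over all proteins.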