[Preprint]. 2025 Jan 30:2024.11.09.622800. Originally published 2024 Nov 11. [Version 3] doi: 10.1101/2024.11.09.622800

Table 2:

Performance metrics of KaML-CBtree for acid and base pKa and protonation state predictions in comparison to the baseline models [a]

            KaML-CBtree               KaML-GAT     PROPKA3                   Null
            acid         base         acid + base  acid         base         acid         base
PCC         0.88 ± 0.03  0.92 ± 0.03  0.93 ± 0.02  0.74 ± 0.06  0.90 ± 0.04  0.55 ± 0.06  0.86 ± 0.04
RMSE        0.76 ± 0.13  0.79 ± 0.10  0.90 ± 0.08  1.28 ± 0.15  0.96 ± 0.18  1.36 ± 0.11  1.04 ± 0.10
MAXE        3.17 ± 0.61  2.60 ± 0.70  3.74 ± 0.49  3.72 ± 0.29  5.04 ± 0.46  5.55 ± 1.24  2.80 ± 0.67

Classification of protonation states at pH 7 [b]
Pre (prot)  0.91         0.99         0.92         0.66         0.97         0.63         0.97
Rec (prot)  0.82         0.97         0.92         0.78         0.88         0.39         0.78
Pre (dep)   0.99         0.95         0.98         0.98         0.97         0.95         0.77
Rec (dep)   0.99         0.99         0.98         0.97         0.85         0.98         0.97
CER [c]     34/2099      12/536       70/2822      90/2055      53/618       141/2106     101/716
[a] Baseline models include PROPKA3 (ref. 15) and a null model that returns the model pKa's (ref. 50): 3.7 for Asp, 4.2 for Glu, 6.5 for His, 8.5 for Cys, 9.5 for Tyr, and 10.4 for Lys.

[b] Prediction is based on the probability of protonation given a predicted pKa (see main text).

[c] Critical error rate (CER) is the fraction of residues misclassified as deprotonated when protonated, or vice versa, given in the table as a ratio of counts. All classification metrics were calculated after accumulating the predictions from all 20 holdout test sets.
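
As an illustration of footnotes a to c, the following Python sketch looks up the null-model pKa for a residue, converts a pKa into a protonation probability at a given pH, and expresses a CER count ratio from the table as a percentage. The Henderson-Hasselbalch form of the probability and the 0.5 decision threshold are assumptions made here for illustration only; the main text defines the procedure actually used. All function names are hypothetical.

```python
# Model pKa values quoted in footnote a (null model).
MODEL_PKA = {
    "ASP": 3.7,
    "GLU": 4.2,
    "HIS": 6.5,
    "CYS": 8.5,
    "TYR": 9.5,
    "LYS": 10.4,
}


def null_model_pka(residue: str) -> float:
    """Return the model pKa for a three-letter residue name (case-insensitive)."""
    return MODEL_PKA[residue.upper()]


def protonation_probability(pka: float, ph: float = 7.0) -> float:
    """Fraction protonated at the given pH.

    Assumption: a single-site Henderson-Hasselbalch titration,
    P(prot) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def is_protonated(pka: float, ph: float = 7.0, threshold: float = 0.5) -> bool:
    """Classify a site as protonated.

    Assumption: the 0.5 threshold is illustrative, not taken from the source.
    """
    return protonation_probability(pka, ph) >= threshold


def cer_percent(errors: int, total: int) -> float:
    """Convert a CER count ratio (errors/total) into a percentage."""
    return 100.0 * errors / total


if __name__ == "__main__":
    for name in MODEL_PKA:
        pka = null_model_pka(name)
        print(name, pka, round(protonation_probability(pka), 3), is_protonated(pka))
    # KaML-CBtree acid entry from the table: 34/2099
    print(round(cer_percent(34, 2099), 2))
```

Under these assumptions, a site whose pKa equals the pH has a protonation probability of exactly 0.5, and the null model classifies Asp, Glu, and His as deprotonated and Cys, Tyr, and Lys as protonated at pH 7.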