Table 1.
Template / Method | Unique and novel / % | Novelty score ≥ 0.65 / % | RAScore ≥ 0.5 / % | QSAR score ≤ 1 μM / % | All criteria / % |
---|---|---|---|---|---|
PPARγ | | | | | |
RNN-SMILES | 75.4 (± 2.7) | 28.7 (± 1.1) | 67.9 (± 2.3) | 29.6 (± 2.4) | 5.1 (± 0.2) |
DRAGONFLY-SMILES | 91.8 (± 0.3) | 47.9 (± 1.4) | 86.0 (± 0.3) | 34.7 (± 0.3) | 9.4 (± 0.0) |
DRAGONFLY-SELFIES | 99.8 (± 0.1) | 77.4 (± 0.1) | 82.2 (± 0.2) | 31.9 (± 0.1) | 13.3 (± 0.0) |
LXRβ | | | | | |
RNN-SMILES | 92.4 (± 2.5) | 65.9 (± 2.6) | 87.9 (± 2.8) | 28.6 (± 0.9) | 11.3 (± 0.4) |
DRAGONFLY-SMILES | 94.3 (± 0.5) | 80.2 (± 1.2) | 89.1 (± 0.5) | 26.2 (± 0.2) | 11.8 (± 0.1) |
DRAGONFLY-SELFIES | 100 (± 0.0) | 91.3 (± 0.5) | 84.2 (± 0.3) | 27.9 (± 0.2) | 11.1 (± 0.1) |
RARα | | | | | |
RNN-SMILES | 69.7 (± 5.9) | 41.9 (± 3.3) | 57.2 (± 4.3) | 30.1 (± 1.8) | 11.1 (± 0.7) |
DRAGONFLY-SMILES | 92.2 (± 0.4) | 62.4 (± 0.7) | 75.6 (± 0.5) | 32.4 (± 0.7) | 12.7 (± 0.2) |
DRAGONFLY-SELFIES | 99.8 (± 0.0) | 87.5 (± 0.3) | 77.1 (± 0.2) | 29.6 (± 0.3) | 14.0 (± 0.1) |
BRAF | | | | | |
RNN-SMILES | 89.2 (± 3.5) | 35.1 (± 3.1) | 85.9 (± 3.0) | 35.0 (± 1.3) | 6.7 (± 0.3) |
DRAGONFLY-SMILES | 87.9 (± 0.6) | 46.0 (± 0.8) | 80.9 (± 0.5) | 42.9 (± 0.5) | 10.7 (± 0.1) |
DRAGONFLY-SELFIES | 99.7 (± 0.1) | 81.1 (± 0.6) | 77.3 (± 0.4) | 34.3 (± 0.1) | 12.4 (± 0.0) |
BTK | | | | | |
RNN-SMILES | 82.0 (± 4.4) | 64.5 (± 4.1) | 61.9 (± 4.7) | 20.7 (± 1.8) | 4.5 (± 0.2) |
DRAGONFLY-SMILES | 88.9 (± 0.7) | 53.2 (± 0.4) | 69.6 (± 0.9) | 36.3 (± 0.7) | 8.8 (± 0.1) |
DRAGONFLY-SELFIES | 100 (± 0.0) | 85.8 (± 0.7) | 68.2 (± 1.0) | 25.8 (± 0.1) | 5.8 (± 0.0) |
JAK2 | | | | | |
RNN-SMILES | 88.8 (± 3.9) | 60.2 (± 4.2) | 79.9 (± 3.4) | 35.0 (± 2.2) | 14.5 (± 0.8) |
DRAGONFLY-SMILES | 84.8 (± 1.0) | 39.4 (± 0.9) | 69.0 (± 1.0) | 55.9 (± 1.5) | 14.8 (± 0.2) |
DRAGONFLY-SELFIES | 99.2 (± 0.0) | 73.3 (± 0.8) | 70.5 (± 0.5) | 50.5 (± 1.0) | 18.3 (± 0.2) |
Bold indicates which of the SELFIES- or SMILES-based models achieves the higher value for the investigated property in both structure- and ligand-based models. Values are presented as mean (± standard deviation) over three runs (N = 3), each sampling 2000 SMILES strings. The complete list of 20 investigated targets can be found in Tables S2–S6. JAK Janus kinase, PPAR Peroxisome proliferator-activated receptor, BRAF Serine/threonine-protein kinase B-Raf (rapidly accelerated fibrosarcoma), BTK Bruton’s tyrosine kinase, RAR Retinoic acid receptor, LXR Liver X receptor.
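
As an illustration of how an entry in the "All criteria / %" column could be derived, the following minimal sketch applies the thresholds from the column headers to each sampled molecule and then summarizes the per-run percentages as mean and standard deviation. It rests on several assumptions not stated in the table: that "All criteria" is the conjunction of the four individual criteria, that percentages are taken relative to all molecules sampled in a run, and that the sample standard deviation is used. The `Molecule` class and all function names are hypothetical and not taken from the original work.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Molecule:
    """Hypothetical container for the per-molecule scores used in Table 1."""
    unique_and_novel: bool        # unique within the run and absent from training data
    novelty_score: float          # threshold from the header: >= 0.65
    ra_score: float               # RAScore, threshold from the header: >= 0.5
    predicted_activity_um: float  # QSAR-predicted activity in μM, threshold: <= 1


def passes_all(m: Molecule) -> bool:
    # Assumption: "All criteria" means every individual criterion holds at once.
    return (
        m.unique_and_novel
        and m.novelty_score >= 0.65
        and m.ra_score >= 0.5
        and m.predicted_activity_um <= 1.0
    )


def percent_passing(run: list[Molecule]) -> float:
    # Percentage relative to all molecules sampled in this run (assumption).
    return 100.0 * sum(passes_all(m) for m in run) / len(run)


def summarize(runs: list[list[Molecule]]) -> tuple[float, float]:
    # Mean and sample standard deviation over the per-run percentages.
    percentages = [percent_passing(run) for run in runs]
    return mean(percentages), stdev(percentages)
```

Calling `summarize` on three lists of 2000 scored molecules each would yield one "mean (± standard deviation)" pair of the kind reported above. If the population standard deviation is intended instead, `statistics.pstdev` can be substituted for `stdev`.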