Table 3.
Dataset | Arch. | Valid (%) | Unique (%) | Novel (%) | Active (%) | Recovered actives/total actives (%) | Recovered neighbors |
---|---|---|---|---|---|---|---|
EGFR | GAN | 86 | 56 | 97 | 71 | 5.26 | 196 |
EGFR | RNN | 96 | 46 | 95 | 65 | 7.74 | 238 |
HTR1A | GAN | 86 | 66 | 95 | 71 | 5.05 | 284 |
HTR1A | RNN | 96 | 50 | 90 | 81 | 7.28 | 384 |
S1PR1 | GAN | 89 | 31 | 98 | 44 | 0.93 | 24 |
S1PR1 | RNN | 97 | 35 | 97 | 65 | 3.72 | 43 |
Dataset: dataset used. Arch.: architecture used. Valid: percentage of valid molecules in the sampled set. Unique: percentage of valid unique compounds. Novel: percentage of unique novel compounds (not present in the training set). Active: percentage of unique active compounds. Recovered actives/total actives: actives recovered from the test set as a percentage of all actives in the test set. Recovered neighbors: neighbors of active compounds recovered, identified using FCFP6 fingerprints with 2048 bits and a Tanimoto similarity threshold of 0.7.
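The metrics in the legend can be illustrated with a minimal Python sketch. It is not the code used to produce Table 3, and it rests on several assumptions: fingerprints are represented as plain sets of on-bit indices (generating actual FCFP6 fingerprints requires a cheminformatics toolkit and is not shown), `is_valid` is a hypothetical validity check supplied by the caller, each percentage is taken relative to the preceding subset (valid over sampled, unique over valid, novel over unique), and a generated compound counts as a neighbor when its Tanimoto similarity to at least one active is greater than or equal to the threshold. The function names are illustrative only.

```python
from typing import Callable, Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def sample_metrics(
    sampled: List[str],
    training: List[str],
    is_valid: Callable[[str], bool],  # hypothetical validity check
) -> Dict[str, float]:
    """Valid, unique, and novel percentages under the assumed denominators."""

    def pct(part: int, whole: int) -> float:
        return 100.0 * part / whole if whole else 0.0

    valid = [s for s in sampled if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training)
    return {
        "valid": pct(len(valid), len(sampled)),
        "unique": pct(len(unique), len(valid)),
        "novel": pct(len(novel), len(unique)),
    }


def count_neighbors(
    generated: List[Set[int]],
    actives: List[Set[int]],
    threshold: float = 0.7,
) -> int:
    """Count generated fingerprints within the threshold of at least one active."""
    return sum(
        1
        for g in generated
        if any(tanimoto(g, a) >= threshold for a in actives)
    )
```

In practice the strings passed to `sample_metrics` would need to be canonicalized first, so that two different notations for the same molecule are not counted as distinct compounds.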