Table 3.
Distributional results on QM9. CharacterVAE49, GrammarVAE50, GraphVAE23 and MolGAN51 results are taken from Cao and Kipf51.
Representation | Model | Valid | Uniq | Novel | KL Div | Fréchet Dist |
---|---|---|---|---|---|---|
SMILES | CharacterVAE | 0.103 | 0.675 | 0.900 | N/A | N/A |
SMILES | GrammarVAE | 0.602 | 0.093 | 0.809 | N/A | N/A |
SMILES | LSTM (ours) | 0.980 | 0.962 | 0.138 | 0.998 | 0.984 |
SMILES | Transformer Sml (ours) | 0.947 | 0.963 | 0.203 | 0.987 | 0.927 |
SMILES | Transformer Reg (ours) | 0.965 | 0.957 | 0.183 | 0.994 | 0.958 |
Graph | GraphVAE | 0.557 | 0.760 | 0.616 | N/A | N/A |
Graph | MolGAN | 0.981 | 0.104 | 0.942 | N/A | N/A |
Graph | NAT GraphVAE (ours) | 0.875 | 0.317 | 0.895 | 0.843 | 0.509 |
Graph | MGM (ours proposed) | 0.886 | 0.978 | 0.518 | 0.966 | 0.842 |
NAT GraphVAE25 stands for non-autoregressive graph VAE. Models labelled as ‘ours’ were trained by us and subsequently used to carry out generation. Our masked graph model (MGM) results correspond to a 10% masking rate and training graph initialization, the configuration with the highest geometric mean of the five benchmark metrics. (See the Supplementary Discussion section of the Supplementary Information for details.) The validity (↑), uniqueness (↑), novelty (↑), KL Div (↑) and Fréchet Dist (↑) metrics all take values between 0 and 1; ↑ indicates that higher is better.
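The geometric mean mentioned in the note can be illustrated with a minimal Python sketch. It is applied here to the rows of Table 3 that report all five metrics, purely to show the calculation; in the source, the geometric mean is used to compare MGM configurations (masking rate and initialization), not to rank the models against each other. The dictionary `rows` and the helper `geometric_mean` are hypothetical names introduced for this example.

```python
import math

# Rows of Table 3 that report all five metrics, in the order
# (validity, uniqueness, novelty, KL Div, Fréchet Dist).
# Values are copied from the table; the dictionary name is illustrative.
rows = {
    "LSTM": (0.980, 0.962, 0.138, 0.998, 0.984),
    "Transformer Sml": (0.947, 0.963, 0.203, 0.987, 0.927),
    "Transformer Reg": (0.965, 0.957, 0.183, 0.994, 0.958),
    "NAT GraphVAE": (0.875, 0.317, 0.895, 0.843, 0.509),
    "MGM": (0.886, 0.978, 0.518, 0.966, 0.842),
}


def geometric_mean(values):
    """Return the geometric mean of strictly positive values, computed via logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# One aggregate score per model; a single low metric pulls the score down
# more strongly than it would with an arithmetic mean.
scores = {name: geometric_mean(vals) for name, vals in rows.items()}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Because the geometric mean multiplies the metrics, a model that scores near zero on any one of them (for example, low novelty) receives a low aggregate score even if the remaining metrics are close to 1.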