2021 May 26;12:3156. doi: 10.1038/s41467-021-23415-2

Table 4.

Distributional results on ChEMBL. LSTM, Graph MCTS⁵², AAE⁶⁷, ORGAN⁶² and VAE⁴⁹ (with a bidirectional GRU⁵³ as encoder and autoregressive GRU⁵³ as decoder) results are taken from Brown et al.⁴⁶.

Representation	Model	Validity	Uniqueness	Novelty	KL Div	Fréchet Dist
SMILES	AAE	0.822	1.000	0.998	0.886	0.529
	ORGAN	0.379	0.841	0.687	0.267	0.000
	VAE	0.870	0.999	0.974	0.982	0.863
	LSTM	0.959	1.000	0.912	0.991	0.913
	Transformer Sml (ours)	0.920	0.999	0.939	0.968	0.859
	Transformer Reg (ours)	0.961	1.000	0.846	0.977	0.883
Graph	Graph MCTS	1.000	1.000	0.994	0.522	0.015
	NAT GraphVAE	0.830	0.944	1.000	0.554	0.016
	MGM (ours proposed)	0.849	1.000	0.722	0.987	0.845

NAT GraphVAE²⁵ stands for non-autoregressive graph VAE. Models labelled ‘ours’ were trained by us and then used to carry out generation. The masked graph model (MGM) results correspond to a 1% masking rate and training graph initialization, the configuration with the highest geometric mean of the five benchmark metrics. (See the Supplementary Discussion section of the Supplementary Information for details.) All five metrics, validity (↑), uniqueness (↑), novelty (↑), KL Div (↑) and Fréchet Dist (↑), take values between 0 and 1, and higher is better.
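The geometric mean mentioned in the caption can be illustrated with a minimal sketch. The scores below are copied from the rows of Table 4; the dictionary `TABLE_4` and the helper `geometric_mean` are hypothetical names introduced here for illustration. The authors applied this criterion to choose among masked graph model configurations, which are reported in the Supplementary Information and not reproduced here, so applying it across the models in the table is only an illustration of the arithmetic, not the authors' selection procedure.

```python
import math

# Scores copied from Table 4, in the order:
# (validity, uniqueness, novelty, KL Div, Fréchet Dist)
TABLE_4 = {
    "AAE": (0.822, 1.000, 0.998, 0.886, 0.529),
    "ORGAN": (0.379, 0.841, 0.687, 0.267, 0.000),
    "VAE": (0.870, 0.999, 0.974, 0.982, 0.863),
    "LSTM": (0.959, 1.000, 0.912, 0.991, 0.913),
    "Transformer Sml (ours)": (0.920, 0.999, 0.939, 0.968, 0.859),
    "Transformer Reg (ours)": (0.961, 1.000, 0.846, 0.977, 0.883),
    "Graph MCTS": (1.000, 1.000, 0.994, 0.522, 0.015),
    "NAT GraphVAE": (0.830, 0.944, 1.000, 0.554, 0.016),
    "MGM (ours proposed)": (0.849, 1.000, 0.722, 0.987, 0.845),
}


def geometric_mean(values):
    """Geometric mean of non-negative values: (v1 * ... * vn) ** (1 / n)."""
    return math.prod(values) ** (1.0 / len(values))


# Print one summary score per model (illustrative only).
for model, scores in TABLE_4.items():
    print(f"{model:24s} {geometric_mean(scores):.3f}")
```

Because the geometric mean is a product, a single metric near zero pulls the summary score down sharply: ORGAN's Fréchet Dist of 0.000 gives it a geometric mean of exactly zero, regardless of its other scores.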