. 2021 May 26;12:3156. doi: 10.1038/s41467-021-23415-2

Table 5.

Conditional generation results on QM9. Results for MGM are chosen from a range of sampling iterations and both initialization strategies.

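| Target Condition | Model | G-mean | Unique Count | Property Value | KLD Score |
|---|---|---|---|---|---|
| MolWt = 120 | NAT GraphVAE | 0.623 | 3048 | 124.47 ± 7.58 | 0.843 |
| MolWt = 120 | MGM | 0.522 | 8800 | 120.02 ± 7.66 | 0.811 |
| MolWt = 120 | MGM - Final Step | 0.404 | 8509 | 119.42 ± 7.67 | 0.761 |
| MolWt = 120 | Dataset | | | | 0.679 |
| MolWt = 125 | NAT GraphVAE | 0.565 | 2326 | 127.21 ± 7.05 | 0.827 |
| MolWt = 125 | MGM | 0.561 | 9983 | 125.00 ± 8.48 | 0.850 |
| MolWt = 125 | MGM - Final Step | 0.354 | 9293 | 122.48 ± 7.20 | 0.936 |
| MolWt = 125 | Dataset | | | | 0.835 |
| MolWt = 130 | NAT GraphVAE | 0.454 | 1204 | 129.12 ± 6.79 | 0.614 |
| MolWt = 130 | MGM | 0.501 | 9465 | 128.85 ± 8.85 | 0.705 |
| MolWt = 130 | MGM - Final Step | 0.369 | 8892 | 126.85 ± 7.43 | 0.789 |
| MolWt = 130 | Dataset | | | | 0.695 |
| LogP = -0.4 | NAT GraphVAE | 0.601 | 2551 | −0.409 ± 0.775 | 0.739 |
| LogP = -0.4 | MGM | 0.424 | 9506 | −0.349 ± 0.503 | 0.803 |
| LogP = -0.4 | MGM - Final Step | 0.300 | 9495 | −0.337 ± 0.523 | 0.876 |
| LogP = -0.4 | Dataset | | | | 0.811 |
| LogP = 0.2 | NAT GraphVAE | 0.562 | 2188 | 0.051 ± 0.746 | 0.803 |
| LogP = 0.2 | MGM | 0.378 | 9524 | 0.200 ± 0.468 | 0.846 |
| LogP = 0.2 | MGM - Final Step | 0.376 | 9487 | 0.202 ± 0.462 | 0.895 |
| LogP = 0.2 | Dataset | | | | 0.816 |
| LogP = 0.8 | NAT GraphVAE | 0.515 | 1837 | 0.588 ± 0.759 | 0.807 |
| LogP = 0.8 | MGM | 0.418 | 9360 | 0.769 ± 0.473 | 0.826 |
| LogP = 0.8 | MGM - Final Step | 0.300 | 9294 | 0.745 ± 0.442 | 0.857 |
| LogP = 0.8 | Dataset | | | | 0.797 |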

The ‘MGM’ rows report the sampling iteration with the best mean property value. The ‘MGM - Final Step’ rows report the final sampling iteration, with the initialization strategy chosen according to the better geometric mean of the five GuacaMol metrics. Results for the NAT GraphVAE baseline model (ref. 25), which we trained ourselves, are also shown. ‘Dataset’ rows refer to molecules sampled from the dataset with MolWt within ± 1 of the target for the MolWt conditions and LogP within ± 0.1 of the target for the LogP conditions. G-mean refers to the geometric mean of validity, uniqueness and novelty.
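
As a rough illustration of the quantities described in this note, the sketch below computes a geometric mean of three scores, selects property values inside a window around a target (as for the ‘Dataset’ rows), and summarizes values as mean ± standard deviation. The function names and the input numbers are hypothetical and are not taken from the paper. The use of the population standard deviation is also an assumption, since the note does not say which estimator was used.

```python
import statistics


def g_mean(validity, uniqueness, novelty):
    """Geometric mean of three scores, each assumed to lie in [0, 1]."""
    return (validity * uniqueness * novelty) ** (1.0 / 3.0)


def within_window(values, target, tol):
    """Keep property values with |value - target| <= tol (inclusive bounds assumed)."""
    return [v for v in values if abs(v - target) <= tol]


def mean_std(values):
    """Mean and population standard deviation (the estimator choice is an assumption)."""
    return statistics.mean(values), statistics.pstdev(values)


# Hypothetical molecular weights, for illustration only.
weights = [118.9, 119.0, 120.5, 121.0, 121.2]
subset = within_window(weights, target=120.0, tol=1.0)  # MolWt within ± 1
mean, std = mean_std(subset)
print(subset, round(mean, 2), round(std, 2))
print(round(g_mean(0.9, 0.8, 0.7), 3))
```

Because the geometric mean is a product under a root, a single score of zero drives the whole G-mean to zero, so it rewards models that do reasonably well on all three metrics at once.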