2020 Apr 10;12:22. doi: 10.1186/s13321-020-00425-8

Table 4.

Augmentation effect on architecture C biLSTM–biLSTM with layer sizes 64/64 and 4 concatenated encoding layers

SMILES     Augm.  Best model epoch #  Validity %  Uniqueness %  Training %  Length match % (a)  HAC match % (b)
Canonical  1      9, 9, 7             96.6 ± 0.5  99.9 ± 0.1    16.2 ± 1.5  93.3 ± 0.3          92.0 ± 0.5
Random     1      10, 14, 16          97.0 ± 0.3  99.9 ± 0.0    11.9 ± 0.6  98.5 ± 0.3          97.4 ± 0.5
Random     2      5, 5, 5             97.3 ± 0.1  99.9 ± 0.0    13.9 ± 0.5  97.7 ± 0.4          94.5 ± 0.8
Random     3      4, 6, 4             97.9 ± 0.3  99.9 ± 0.0    13.6 ± 0.5  98.8 ± 0.1          96.5 ± 0.2
Random     4      4, 3, 4             98.2 ± 0.4  99.9 ± 0.0    11.6 ± 0.5  98.8 ± 0.3          97.1 ± 0.2
Random     5      4, 4, 4             98.3 ± 0.3  99.9 ± 0.0    11.2 ± 0.5  97.3 ± 0.7          96.6 ± 0.3
Random     10     4, 4, 4             98.3 ± 0.3  99.9 ± 0.0    14.2 ± 0.5  98.4 ± 0.4          98.2 ± 0.5

(a) Length match for the SMILES length distributions of the training set and generated set (see “Methods”)

(b) HAC match for the atom count distributions of the generated set and training set (see “Methods”)