. 2017 Jun 15;7:3582. doi: 10.1038/s41598-017-02303-0

Figure 5.

Classification based on chemical-linguistic descriptors. (a) An example showing two organic molecules and their maximal common substructure. Such substructures, computed over millions of molecule-molecule pairs, can be used as chemical-linguistic descriptors (CLDs). (b) Examples of some smaller and larger CLDs used as descriptors to predict reaction yields and times. Dashed lines denote aromatic bonds. (c) Performance of a random forest classifier based on various numbers of CLDs. Even with 5,000 descriptors, the misclassification error is still ca. 40%.
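
As a rough illustration of how CLDs could serve as classifier inputs, the sketch below encodes each molecule as a binary vector recording which CLDs it contains, and computes the misclassification error reported in panel (c). It is not the authors' pipeline. The substructure matching is assumed to have been done beforehand by a cheminformatics tool, the random forest itself is omitted, and all names and data (`cld_feature_matrix`, `mol_A`, `cld_1`, the "high"/"low" labels) are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def cld_feature_matrix(
    molecule_clds: Dict[str, Set[str]], vocabulary: Sequence[str]
) -> List[List[int]]:
    """Build one row per molecule with a 0/1 column for each CLD in `vocabulary`.

    `molecule_clds` maps a molecule name to the set of CLDs found in it
    (assumed to be precomputed by a substructure matcher).
    """
    return [
        [1 if cld in present else 0 for cld in vocabulary]
        for present in molecule_clds.values()
    ]


def misclassification_error(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted class differs from the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


# Hypothetical toy data: three molecules and a three-CLD vocabulary.
molecule_clds = {
    "mol_A": {"cld_1", "cld_3"},
    "mol_B": {"cld_2"},
    "mol_C": {"cld_1", "cld_2", "cld_3"},
}
vocabulary = ["cld_1", "cld_2", "cld_3"]
X = cld_feature_matrix(molecule_clds, vocabulary)

# Hypothetical true and predicted yield classes for five reactions.
y_true = ["high", "low", "high", "low", "low"]
y_pred = ["high", "high", "high", "low", "high"]
error = misclassification_error(y_true, y_pred)  # 2 of 5 wrong
```

Rows of `X` like these would be passed to a classifier, and an error of ca. 40% means that roughly two in five held-out samples are assigned to the wrong class.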