Author manuscript; available in PMC: 2019 Mar 14.
Published in final edited form as: Science. 2019 Jan 18;363(6424):eaau5631. doi: 10.1126/science.aau5631

Fig. 1. Summary of chemoinformatics-guided workflow.

(A) An in silico library of synthetically accessible catalysts is defined. For each member in the library, descriptors are calculated. (B) A representative subset is algorithmically selected on the basis of intrinsic chemical properties. (C) The representative subset is synthesized and experimentally tested. (D) The probability of identifying a highly selective catalyst in the first round of screening should be greater than with random sampling alone. (E) The data from the training set are used to train statistical learning methods. (F) The models predict selectivity values for every member of the greater in silico library. (G) If successful, the model will predict the optimal catalyst for the reaction. If unsuccessful, the new data can be used as training data to make a stronger prediction in successive rounds of modeling. R, any group; X, O or S; Y, OH, SH, or NHTf; i-Pr, isopropyl; t-Bu, tert-butyl; Cy, cyclohexyl.
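
The loop in panels (A) to (G) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the descriptors are random numbers rather than calculated chemical descriptors, the subset is chosen with a greedy max-min (farthest-point) rule because the caption does not name the selection algorithm, a k-nearest-neighbor average stands in for the "statistical learning methods", and `mock_assay` is a hypothetical placeholder for synthesis and experimental testing.

```python
import math
import random


def select_representative(descriptors, k):
    """Greedy max-min (farthest-point) selection of k library members.

    Illustrative choice only; the caption does not specify the algorithm.
    """
    n = len(descriptors)
    dim = len(descriptors[0])
    # Start from the member closest to the library centroid.
    centroid = [sum(d[j] for d in descriptors) / n for j in range(dim)]
    first = min(range(n), key=lambda i: math.dist(descriptors[i], centroid))
    chosen = [first]
    # Distance from each member to its nearest already-chosen member.
    nearest = [math.dist(d, descriptors[first]) for d in descriptors]
    while len(chosen) < k:
        # Add the member that is farthest from everything chosen so far.
        nxt = max(range(n), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i in range(n):
            nearest[i] = min(nearest[i], math.dist(descriptors[i], descriptors[nxt]))
    return chosen


def knn_predict(train_x, train_y, query, k=3):
    """Predict a value as the mean over the k nearest training points."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    neighbors = ranked[:k]
    return sum(train_y[i] for i in neighbors) / len(neighbors)


def mock_assay(descriptor):
    """Hypothetical stand-in for synthesis and testing; returns made-up data."""
    return -sum((x - 0.7) ** 2 for x in descriptor)


rng = random.Random(0)
# (A) In silico library: 200 members, 4 placeholder descriptors each.
library = [[rng.random() for _ in range(4)] for _ in range(200)]
# (B) Representative subset chosen from descriptor space alone.
subset = select_representative(library, 20)
# (C) "Test" the subset to obtain training data.
train_x = [library[i] for i in subset]
train_y = [mock_assay(x) for x in train_x]
# (E), (F) Train on the subset and predict for every library member.
predicted = [knn_predict(train_x, train_y, x) for x in library]
# (G) Propose the member with the highest predicted selectivity.
best = max(range(len(library)), key=lambda i: predicted[i])
```

If the proposed member performs poorly when tested, its measured value would be appended to `train_x` and `train_y` and the prediction step repeated, which corresponds to the successive rounds of modeling described in (G).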