Abstract

The long-overdue development of a computational method for the ab initio prediction of chemical reactants that provide a target compound has been hampered by the combinatorial explosion that occurs when reactions consist of multiple elementary reaction processes. To address this challenge, we have developed a quantum chemical calculation method that enumerates reactant candidates for a given target compound by combining an exhaustive automated reaction path search method with a kinetics method for narrowing down the possibilities. Two conventional name reactions were then assessed by tracing back the reaction paths using this new method to determine whether the known reactants could be identified. Our method is expected to be a powerful tool for the prediction of reactants and the discovery of new reactions.
Keywords: quantum chemical calculation, reaction path, reaction path network, inverse search, reaction discovery, potential energy surface
Introduction
Since Schrödinger first reported his now fundamental equation describing wave mechanics nearly a century ago, the development of quantum chemical calculation methods has progressively revealed the behavior of atoms during chemical reactions.1−4 Such calculations provide a more detailed understanding of the mechanisms involved in known chemical reactions and establish mechanistic insights that often lead to the rational design of new chemical reactions.5−9 They have typically been performed by solving the electronic Schrödinger equation at a series of nuclear configurations along a reaction path.10,11 To this end, one generally needs to first postulate the motion of atoms along the path. The postulated motion is then assessed for energetic feasibility by solving the electronic Schrödinger equation along this path. By performing such calculations for various paths and comparing their kinetic and thermodynamic preferences, one can identify the actual motion of atoms during a chemical reaction. The full automation of this procedure to systematically predict chemical reactions has been intensively investigated for many years.12−19 In several cases, such methods have been successful in predicting the reaction path from a given initial state (e.g., initial mixture of reactants, additives, and catalysts) to a product, without relying on any knowledge concerning the path or the product.
The next challenge involves predicting the initial state of a chemical reaction by tracing back reaction paths using quantum chemical calculations, which is equivalent to solving the inverse problem of a chemical reaction. Such a method starts from a target product and explores possible reactant candidates within a given atomic composition. Although the question of how to decide the atomic composition remains, such a method has never been established previously and would significantly contribute to future reaction design and discovery. Once a systematic method is established, one could predict a route for synthesizing a target compound without relying on any previous synthetic knowledge or database. We introduced such a design concept in 2013 as quantum chemistry-aided retrosynthetic analysis (QCaRA),14 which has been utilized for predicting a very simple reaction that involves only one elementary step.20 However, a combinatorial explosion of the number of possible paths has been a major obstacle to generalizing the concept to multistep cases and applying it to various chemical reactions. The number of paths to possible reactant candidates increases dramatically with the number of elementary steps from the product state. Moreover, the reactant candidates are usually less stable than the product state; therefore, a comprehensive search that reaches highly unstable compounds must be performed to find a feasible path. Consequently, QCaRA for multistep reactions has hitherto been impossible owing to the lack of an effective method for exploring kinetically feasible multistep paths in the reverse direction.
This study provides a solution to the aforementioned problem by combining a kinetic analysis method with an automated reaction path search method. The artificial force-induced reaction (AFIR) method,21 which is employed herein as an automated reaction path search method, is capable of exhaustive searches that include paths to highly unstable compounds.22 The AFIR method eliminates energy barriers by applying a virtual force between fragments X and Y in a system to induce the chemical transformation (Figure 1a), where single atoms or small groups of atoms can be chosen as X and Y. Depending on the choice of fragments X and Y, different chemical transformations can be produced; two possibilities are illustrated in Figure 1 for forces applied between X and Y, viz cyanide anion dissociation upon the application of a negative force (Figure 1b) and water addition by applying a positive force (Figure 1c). By applying either a negative or a positive force to a variety of fragment pairs, various stable structures are constructed. After a systematic exploration in this manner, a network of stable structures and their transformation pathways is produced to establish a reaction path network.
Figure 1.

Schematic illustrations of how the AFIR method induces chemical transformations. (a) AFIR function for the bond formation reaction between X and Y. (b) Bond dissociation reaction induced by the AFIR method. (c) Bond formation reaction induced by the AFIR method. Atoms shown in (b,c) are H (light blue), C (gray), N (dark blue), and O (red).
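The idea that an added force term can remove a barrier can be illustrated with a one-dimensional toy model. The sketch below is not the AFIR function itself: the double-well energy, the linear term alpha * r, the step size, and the function names are all assumptions chosen for illustration. A positive alpha pushes the coordinate from the "separated" minimum to the "bonded" one, and a negative alpha does the opposite, mirroring the two cases in Figure 1b,c.

```python
def toy_energy_gradient(r):
    """Gradient of a toy double well E(r) = ((r - 2)**2 - 1)**2.

    Minima sit at r = 1 ("bonded") and r = 3 ("separated"),
    with a barrier at r = 2. This is an illustrative model only.
    """
    u = r - 2.0
    return 4.0 * u * (u * u - 1.0)


def relax(r, alpha, step=0.01, n_steps=20000):
    """Steepest descent on the force-modified function F(r) = E(r) + alpha * r."""
    for _ in range(n_steps):
        r -= step * (toy_energy_gradient(r) + alpha)
    return r


def induce(r_start, alpha):
    """Minimize with the artificial force, then remove it and relax on the bare E(r)."""
    pushed = relax(r_start, alpha)
    return relax(pushed, 0.0)


if __name__ == "__main__":
    print(induce(3.0, 0.0))   # no force: the system stays in the separated minimum
    print(induce(3.0, 2.0))   # positive force: the system ends in the bonded minimum
    print(induce(1.0, -2.0))  # negative force: the system ends in the separated minimum
```

In this toy, the force has to exceed a threshold before the starting minimum disappears on the modified function; below that threshold the descent simply stays in the original well.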
Without imposing any restrictions, the AFIR method tries to find all possible stable structures. Since the number of stable structures increases exponentially with the number of atoms in the system,12,19 the computational cost of a search targeting all possible stable structures is high, even for systems composed of fewer than ∼20 atoms. Hence, exploration in this study is guided by a kinetic analysis method called rate constant matrix contraction (RCMC).23 The RCMC method performs a coarse graining of a reaction path network by integrating stable structures that reach thermal equilibrium within a given timescale (tMAX). Figure 2a shows an energy profile that consists of seven minima, each corresponding to a stable structure; the minimum indicated as “Start” corresponds to the product state, and exploration by the AFIR method in this study starts from there. The RCMC method is then used to integrate the shallowest minimum into its adjacent minima with a certain ratio to yield the profile shown in Figure 2b. This procedure, called “contraction,” is applied to all shallow minima with lifetimes shorter than tMAX. Figure 2c displays the final coarse-grained profile, in which all the shallow minima have been contracted into the three deep minima. Each of the three deep minima, called a “superstate” (indicated as “SS”), is expressed as the sum of the main minimum, to which the contraction has not been applied, and all the minima contracted into it. Note that the energy level of a superstate is lower than that of the original minimum because of the conformational entropy among all contracted minima, as illustrated in Figure 2c. Since the RCMC method determines the contribution ratio of each minimum to each superstate so as to reproduce the thermal equilibrium within tMAX,23 the ratio directly corresponds to the reaction yield obtained after the time tMAX.
In other words, using the RCMC method, the yield of the target product obtained by every reaction starting from a minimum can be computed as its contribution ratio to the superstate, in which the product state is the main component.
Figure 2.

Schematic illustrations of how the RCMC method evaluates the kinetic importance of each local minimum. (a) Energy profile consisting of seven local minima obtained by the AFIR method, starting from the minimum corresponding to the product state that is indicated as “Start.” (b) Energy profile after applying a contraction to the shallowest local minimum and integrating it into the adjacent minima with certain ratios. (c) Energy profile obtained after integrating all minima with lifetimes shorter than tMAX into the three deep minima by the RCMC method.
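The contraction idea can be sketched in a few lines of code. The following is a simplified steady-state elimination, not the published RCMC algorithm: the shortest-lived state is removed repeatedly, its population is handed to its neighbours according to kinetic branching ratios, and paths through it are rerouted. The function name contract, the dictionary layout, and the rate constants in the example are hypothetical.

```python
def contract(rates, t_max):
    """Contract states with lifetimes shorter than t_max (simplified sketch).

    rates[i][j] is the rate constant (1/s) for the step i -> j.
    Returns {superstate: {original_minimum: contribution_ratio}}.
    """
    states = sorted(set(rates) | {j for out in rates.values() for j in out})
    k = {i: dict(rates.get(i, {})) for i in states}
    contrib = {i: {i: 1.0} for i in states}

    while len(k) > 1:
        escape = {i: sum(k[i].values()) for i in k}
        fastest = max(k, key=lambda i: escape[i])  # shortest-lived state
        if escape[fastest] == 0.0 or 1.0 / escape[fastest] >= t_max:
            break  # every remaining state outlives t_max
        branch = {j: r / escape[fastest] for j, r in k[fastest].items()}

        # Hand the contracted state's population to its neighbours.
        for j, b in branch.items():
            for origin, w in contrib[fastest].items():
                contrib[j][origin] = contrib[j].get(origin, 0.0) + b * w

        # Reroute paths that used to pass through the contracted state.
        for a in k:
            if a == fastest:
                continue
            r_in = k[a].pop(fastest, 0.0)
            for j, b in branch.items():
                if j != a and r_in > 0.0:
                    k[a][j] = k[a].get(j, 0.0) + r_in * b

        del k[fastest], contrib[fastest]
    return contrib


if __name__ == "__main__":
    # Hypothetical network: a shallow intermediate I between product P and byproduct B.
    rates = {
        "I": {"P": 9.0e3, "B": 1.0e3},
        "P": {"I": 1.0e-12},
        "B": {"I": 1.0e-12},
    }
    superstates = contract(rates, t_max=86400.0)  # t_max of 1 day, in seconds
    print(superstates["P"]["I"], superstates["B"]["I"])
```

In this example the intermediate I is contracted into the two deep minima with ratios of about 0.9 and 0.1, which play the role of the yields of P and B for a reaction started from I.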
By combining the AFIR method with the RCMC method, paths to attain the product state can be selectively explored. First, the AFIR method searches for reaction paths starting from the product state to find the minima adjacent to the minimum of the product state. Next, among these minima, the RCMC method identifies the minima contributing to the superstate of the product state greater than a predefined threshold. Then, the AFIR method is once again applied to continue the reaction path exploration from the minima that satisfy this threshold. Through alternately searching by AFIR and screening for minima by RCMC and repeating this procedure until a termination condition is met, the paths leading to the target product can be selectively searched. In this study, the threshold of the contribution ratio (reaction yield) was set to 0.1%, and the search was terminated after computing 10,000 paths.
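The alternating procedure described above can be sketched as a breadth-first search with a yield filter. This is a schematic outline only: find_neighbors stands in for an AFIR search from one minimum and yield_of stands in for an RCMC evaluation on the current network; both are hypothetical callables, and the toy network and yields below are invented for illustration. The default threshold (0.1%) and path limit (10,000) follow the values stated in the text.

```python
from collections import deque


def backward_search(product, find_neighbors, yield_of,
                    threshold=1.0e-3, max_paths=10_000):
    """Explore backward from the product, expanding only promising minima.

    find_neighbors(m) -> iterable of minima adjacent to m (placeholder for AFIR).
    yield_of(m)       -> predicted yield of the product from m (placeholder for RCMC).
    Returns the set of expanded minima and the list of computed paths.
    """
    expanded = {product}
    frontier = deque([product])
    paths = []
    while frontier and len(paths) < max_paths:
        current = frontier.popleft()
        for neighbor in find_neighbors(current):
            paths.append((neighbor, current))
            if neighbor not in expanded and yield_of(neighbor) >= threshold:
                expanded.add(neighbor)
                frontier.append(neighbor)
            if len(paths) >= max_paths:
                break
    return expanded, paths


if __name__ == "__main__":
    # Invented toy network: P is the product, R a reactant candidate.
    toy_neighbors = {"P": ["I1", "X"], "I1": ["R", "P"], "X": ["Y"], "R": []}
    toy_yields = {"P": 1.0, "I1": 0.9, "R": 0.8, "X": 1.0e-4, "Y": 0.5}
    expanded, paths = backward_search("P", toy_neighbors.__getitem__,
                                      toy_yields.__getitem__)
    print(sorted(expanded))
    print(len(paths))
```

Because X falls below the threshold, the search never continues beyond it, so Y is not visited at all. In the actual workflow the yields would be re-evaluated as the network grows; the fixed lookup table here is a simplification.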
This study applied the abovementioned combined AFIR/RCMC method to two known reactions,24−26 that is, the Strecker reaction (R1) and the Passerini reaction (R2), as shown in Figure 3, using each product as an input structure. Although the second HCOOH in R2 is not stoichiometrically required, it was added because of its positive impact on the reaction as a catalyst.27,28 The reaction temperatures were set at 250, 300, and 350 K, and tMAX was set to 1 day. The Gibbs energy and rate constant for each elementary step were estimated at these temperatures by harmonic vibrational analysis and standard transition-state theory, respectively. When a stable structure gave the target product with a reaction yield of ≥0.1% at one or more of these temperatures, the path search continued beyond that stable structure. During the search, electronic structure calculations were carried out in vacuum using the ωB97X-D functional with the SV (for H, C, N, and O) and Def2-SVP (for Na and Cl) basis sets. Finally, at all discrete points along all paths, single-point energy calculations were performed using the ωB97X-D functional with the Def2-SVP basis set for all elements, where the solvent effects of water in R1 and tetrahydrofuran in R2 were taken into account using an implicit solvation model. Further details of the search conditions are discussed in the Supporting Information. The obtained stable structures were classified according to chemical species using the SMILES notation.29 It should be emphasized that information related to the reactants of R1 and R2 was not included in these procedures. In other words, the focus of this study is to determine whether the AFIR/RCMC method is capable of finding the known reactants. In the following, it is demonstrated that the AFIR/RCMC method has been successful in identifying the reactants for both R1 and R2 by tracing the multistep paths backward, starting from the corresponding products.
Figure 3.

Target reactions explored in this study: the Strecker reaction (R1) and the Passerini reaction (R2).
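To give a sense of how a Gibbs energy barrier translates into a timescale comparable with tMAX, the following sketch evaluates the standard transition-state theory (Eyring) expression k = (kB T / h) exp(−ΔG‡ / RT) with a transmission coefficient of 1. The barrier heights are arbitrary illustrative values, not results from this study, and the function names are hypothetical; 300 K is one of the temperatures used in the text.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # gas constant, J/(mol*K)
T_MAX = 24 * 3600.0   # tMAX of 1 day, in seconds


def eyring_rate(barrier_kj_mol, temperature_k):
    """Rate constant (1/s) of a unimolecular step from its Gibbs energy barrier."""
    exponent = -barrier_kj_mol * 1000.0 / (R * temperature_k)
    return (K_B * temperature_k / H) * math.exp(exponent)


def lifetime_s(barrier_kj_mol, temperature_k):
    """Lifetime (s) of a minimum with a single escape route over the given barrier."""
    return 1.0 / eyring_rate(barrier_kj_mol, temperature_k)


if __name__ == "__main__":
    for barrier in (60.0, 100.0, 140.0):  # illustrative barriers in kJ/mol
        tau = lifetime_s(barrier, 300.0)
        print(f"{barrier:6.1f} kJ/mol: lifetime {tau:.3e} s, shorter than tMAX: {tau < T_MAX}")
```

With these illustrative numbers, a minimum behind a 140 kJ/mol barrier at 300 K outlives a 1 day tMAX by many orders of magnitude, whereas one behind a 60 kJ/mol barrier equilibrates almost instantly on that timescale.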
Results and Discussion
Starting from the products of R1 and R2, the searches found 9208 and 12,219 stable structures, respectively, which were then classified into 1679 and 2407 chemical species, respectively, using their SMILES representations. Figure 4a,b shows the reaction path networks for R1 and R2, consisting of 1679 and 2407 nodes, respectively. Each node represents a different chemical species. These nodes are linked by 2934 and 4018 edges that represent reaction paths. The nodes in Figure 4a,b indicate the contribution ratio to the superstate of the product for each chemical species, according to the color scheme in the legend. In other words, reactions initiated from red nodes are predicted to provide the target product in 100% yield, while those from dark blue nodes do not generate the target product at all. As seen from these networks, a variety of species can act as reactants to produce the target products.
Figure 4.
Reaction path networks for the Strecker reaction (R1) and the Passerini reaction (R2). Different nodes represent different chemical species. (a) Network for R1 with each node showing the contribution ratio to the product state (reaction yield of the product state). (b) Network for R2 with each node showing the contribution ratio to the product state (reaction yield of the product state). (c) Reaction path of R1, as identified by its reaction path network highlighted with white arrows in (a). (d) Reaction path of R2, as identified by its reaction path network highlighted with white arrows in (b).
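The reduction from stable structures to a species-level network can be sketched as a simple grouping step. In the sketch below, structures that share a SMILES string collapse into one node, and paths between structures of different species become edges. It assumes that the SMILES strings are already canonical (in practice a cheminformatics toolkit would be needed for that), and the structure indices, SMILES strings, and function name are hypothetical.

```python
def build_species_network(structure_smiles, structure_paths):
    """Collapse a structure-level network into a species-level one.

    structure_smiles: {structure_id: canonical SMILES string}
    structure_paths:  iterable of (structure_id, structure_id) pairs
    Returns (nodes, edges), where each edge is a sorted pair of SMILES strings.
    """
    nodes = set(structure_smiles.values())
    edges = set()
    for a, b in structure_paths:
        species_a, species_b = structure_smiles[a], structure_smiles[b]
        if species_a != species_b:  # conformational changes add no edge
            edges.add(tuple(sorted((species_a, species_b))))
    return nodes, edges


if __name__ == "__main__":
    # Hypothetical example: two conformers of one species and one other species.
    smiles = {0: "CC(N)C#N", 1: "CC(N)C#N", 2: "CC=[NH2+].[C-]#N"}
    paths = [(0, 1), (1, 2), (0, 2)]
    nodes, edges = build_species_network(smiles, paths)
    print(len(nodes), len(edges))
```

This is why the number of nodes in a species-level network is much smaller than the number of stable structures found by the search: conformers and other structures with identical connectivity are merged.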
These reaction path networks include nodes representing the known reactants for R1 and R2. In other words, the searches tracing back quantum chemical reaction paths were successful for both cases. In these networks, paths generating the products through well-known reaction mechanisms were identified (Figure 4c,d), with each two-dimensional structure representing that of each node; the corresponding paths are highlighted with white arrows in the networks, as displayed in Figure 4a,b, respectively. As seen in Figure 4a,b, these paths follow red nodes, thus indicating that these paths are kinetically feasible. For the path of R1 in Figure 4c, the reaction steps occur as follows: (1) a proton transfers from an ammonium cation to a cyanide anion to generate ammonia, (2) nucleophilic addition of ammonia to a ketone takes place, (3) proton transfer from hydrogen cyanide to a carbonyl oxygen atom occurs, (4) proton transfer from an ammonium cation to a cyanide anion generates a hemiaminal intermediate, (5) a proton transfers from hydrogen cyanide, and subsequent dissociation of water generates an iminium cation, and finally (6) nucleophilic addition of a cyanide anion generates the target product, aminonitrile. For the path of R2, as shown in Figure 4d, the reaction steps are as follows: (1) the nucleophilic addition of isocyanide to an aldehyde and the subsequent proton transfer from a carboxylic acid to carbonyl oxygen atom generate a polar intermediate, (2) coupling between the charged moieties generates an imine intermediate, (3) the transfer of a proton promotes the ring closure and generates a five-membered ring intermediate, (4) the transfer of a proton from a carboxylic acid to the imine nitrogen atom generates a polar intermediate, and finally (5) a proton transfer promotes the ring opening and generates the target product, that is, α-acyloxyamide. 
These paths are consistent with those suggested in the literature.26−28 Therefore, the present calculations successfully predicted these synthetic paths based on a reverse path search from the corresponding products.
Here, we further assess and discuss the species that were predicted from the reaction path network of R2. Figure 5a shows the 21 most stable species with a contribution ratio >50% at 300 K, and Figure 5b lists the seven most stable species consisting of four molecules and having a contribution ratio >50% at 300 K. Below each species are listed the contribution ratios calculated at 250, 300, and 350 K with tMAX = 1 day, that is, the yields of the product when that species is used as the reactant. These include some familiar reactions. First, S23 is the reactant of the Passerini reaction, whereas S15 and S20 are intermediates of the Passerini reaction that are also included in Figure 4d. It is therefore unsurprising that they give the product of the Passerini reaction in high yields. Likewise, the conversion of S4 to α-acyloxyamide is a simple enol–keto tautomerism, and the conversion of S6 to α-acyloxyamide is a simple condensation of an acid anhydride and an alcohol. The formic anhydride and H2O in S28 are readily converted into two molecules of formic acid, which is equivalent to reactant S23 in the Passerini reaction. The correct prediction of such indisputable chemical transformations confirms that the method is able to systematically predict possible chemical transformations.
Figure 5.
Chemical species with a contribution ratio to the product state (reaction yield of the product state) >50% at 300 K with tMAX = 1 day for R2. (a) 21 most stable species. (b) Seven most stable species consisting of four molecules. The Gibbs energy at 300 K relative to the product state and the contribution ratios to the product state at 250, 300, and 350 K (marked with superscripts a, b, and c, respectively) are shown below each species. The yellow background indicates intermediates involved in the path of Figure 4d.
On the other hand, there are many cases in which the corresponding experiments are difficult in terms of the stability and solubility of the reactants. For example, it is difficult to prepare single-molecule species such as S3, S5, S8, S9, S10, S16, S18, and S21 as reactants because they spontaneously change into α-acyloxyamide. Molecules in species S1, S2, S11, S14, and S19, which contain formic acid or HCOO–, may also spontaneously convert to α-acyloxyamide because they have the same chemical composition as α-acyloxyamide or its protonated form. Species such as S12, S13, and S22 contain acid–base pairs; therefore, they easily crystallize and precipitate. Furthermore, S24, S25, S26, and S27 contain molecules that are difficult to isolate, such as acetolactone in S24. Among the listed species, S7 and S17 seem to be experimentally feasible as new reactions, but the formyl 2-hydroxyacetate and formyl 2-formyloxyacetate contained in them were found to be neither commercially available nor previously synthesized. Many of the chemical reactions predicted by this method are difficult to demonstrate experimentally; however, it is an important finding that the experimentally established Strecker and Passerini reactions were included among them. It is also important to note that a new reaction was actually found in the previous application limited to one step;20 this reaction, which produces an R–CF2–COO– skeleton in one pot from CO2, enabled the syntheses of a series of compounds that had been difficult to synthesize.30 In the future, we would like to systematically apply this method to various compounds to discover unknown chemical reactions.
Concluding Remarks
Finally, we would like to summarize the advantages and limitations of the backward search for synthetic paths consisting of multiple elementary reaction processes achieved by this study. First of all, this method allows us to predict the chemical reactions starting from products and going backward to reactant candidates for the first time by tracing back quantum chemical reaction paths. The most important benefit of this method is that it does not require any previous knowledge or data, which makes it a powerful and innovative computational method that breaks away from the conventional norms of chemistry.
Unfortunately, the search based on quantum chemical calculations involves relatively high computational costs. For the two reactions calculated in this study, it took approximately four days using 1344 cores (Xeon Platinum 2.3 GHz). Since the computational cost of quantum chemical explorations increases exponentially with the number of atoms in the system, the current targets are limited to only simple chemical reactions such as those described here. Another limitation is that only the prediction of reaction paths that can be described by the set of atoms in the microsystem is currently possible. In other words, if a reaction path involves a catalyst, then the catalyst molecule must be explicitly included in the calculation. Since byproducts such as H2O in the Strecker reaction must also be explicitly included in the calculation, reactions achieving 100% atom economy and those containing a typical leaving group such as H2O in hydrolysis, N2 in azide decomposition, or CO2 in decarboxylation could be the first and second choices, respectively, in practical applications. Furthermore, macroscopic phenomena such as the precipitation of acid–amine salts are ignored. These limitations must be overcome or mitigated in the future in order to expand the possibilities of this method. However, even with the current limitations, it is anticipated that this method will be used to discover new reactions. In the future, we plan to work on applying this method to the discovery of new reactions, while working toward mitigation of the application limits in parallel. With these efforts, we continue to develop QCaRA into an essential chemical reaction design concept.
We finally discuss the relation between QCaRA and computer-aided reaction prediction tools that rely on experimental data. Tools of the latter type have been developed extensively, including Corey’s pioneering work on conventional retrosynthetic analysis.31−49 QCaRA is an ab initio approach and is essentially different from these empirical data-based and/or rule-based approaches. A clear advantage of QCaRA is that it can identify the free-energy profiles of predicted synthetic pathways and support chemists in designing new chemical reactions. On the other hand, combining QCaRA with these computer-aided reaction prediction tools would provide a new direction: developing a reaction prediction tool that relies on a chemical reaction database created by QCaRA. The extensive set of chemical reactions predicted by QCaRA can help supplement experimental chemical reaction databases or serve as an alternative to them. Although such a computational database has a drawback in terms of accuracy, it can cover unexplored chemical reactions that are not easily studied experimentally for reasons such as limited accessibility of the reactants. The use of QCaRA in this direction would be a fascinating future subject.
Acknowledgments
We thank Takako Homma for her help in editing the draft. This work was supported by the JST via ERATO grant JPMJER1903. Support was also provided by the Institute for Chemical Reaction Design and Discovery (ICReDD), which was established by the World Premier International Research Initiative (WPI), MEXT, Japan.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jacsau.2c00157.
Computational methods, reaction path networks with different representations, and paths for reactions starting from the reactant candidates in Figure 5 (PDF)
Author Present Address
⊥ Y.S.: Institute for Materials Chemistry and Engineering and Integrated Research Consortium on Chemical Science, Kyushu University, Fukuoka 819-0395, Japan
Author Contributions
S.M. designed the computational framework and conducted the calculations. Y.S. and S.M. developed the method. Y.H. and Y.N. analyzed the reaction path networks. All authors prepared, reviewed, and approved the manuscript.
The authors declare no competing financial interest.
Notes
All raw data used to draw the reaction path networks in Figure 4 and required to reproduce the present results are available at https://afir.sci.hokudai.ac.jp/reference.html.
References
- Houk K. N.; Cheong P. H.-Y. Computational prediction of small-molecule catalysts. Nature 2008, 455, 309–313. DOI: 10.1038/nature07368.
- Thiel W. Computational catalysis – Past, present, and future. Angew. Chem., Int. Ed. 2014, 53, 8605–8613. DOI: 10.1002/anie.201402118.
- Sameera W. M. C.; Maeda S.; Morokuma K. Computational catalysis using the artificial force induced reaction method. Acc. Chem. Res. 2016, 49, 763–773. DOI: 10.1021/acs.accounts.6b00023.
- Houk K. N.; Liu F. Holy grails for computational organic chemistry and biochemistry. Acc. Chem. Res. 2017, 50, 539–543. DOI: 10.1021/acs.accounts.6b00532.
- Ahn S.; Hong M.; Sundararajan M.; Ess D. H.; Baik M.-H. Design and optimization of catalysts based on mechanistic insights derived from quantum chemical reaction modeling. Chem. Rev. 2019, 119, 6509–6560. DOI: 10.1021/acs.chemrev.9b00073.
- Rosales A. R.; Wahlers J.; Limé E.; Meadows R. E.; Leslie K. W.; Savin R.; Bell F.; Hansen E.; Helquist P.; Munday R. H.; Wiest O.; Norrby P.-O. Rapid virtual screening of enantioselective catalysts using CatVS. Nat. Catal. 2019, 2, 41–45. DOI: 10.1038/s41929-018-0193-3.
- Falivene L.; Cao Z.; Petta A.; Serra L.; Poater A.; Oliva R.; Scarano V.; Cavallo L. Towards the online computer-aided design of catalytic pockets. Nat. Chem. 2019, 11, 872–879. DOI: 10.1038/s41557-019-0319-5.
- Foscato M.; Jensen V. R. Automated in silico design of homogeneous catalysts. ACS Catal. 2020, 10, 2354–2377. DOI: 10.1021/acscatal.9b04952.
- Athavale S. V.; Simon A.; Houk K. N.; Denmark S. E. Demystifying the asymmetry-amplifying, autocatalytic behaviour of the Soai reaction through structural, mechanistic and computational studies. Nat. Chem. 2020, 12, 412–423. DOI: 10.1038/s41557-020-0421-8.
- Schlegel H. B. Geometry optimization. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2011, 1, 790–809. DOI: 10.1002/wcms.34.
- Maeda S.; Harabuchi Y.; Ono Y.; Taketsugu T.; Morokuma K. Intrinsic reaction coordinate: Calculation, bifurcation, and automated search. Int. J. Quantum Chem. 2015, 115, 258–269. DOI: 10.1002/qua.24757.
- Wales D. J. Energy Landscapes: with Applications to Clusters, Biomolecules and Glasses; Cambridge University: Cambridge, England, 2003.
- Schlegel H. B. Exploring potential energy surfaces for chemical reactions: An overview of some practical methods. J. Comput. Chem. 2003, 24, 1514–1527. DOI: 10.1002/jcc.10231.
- Maeda S.; Ohno K.; Morokuma K. Systematic exploration of the mechanism of chemical reactions: The global reaction route mapping (GRRM) strategy using the ADDF and AFIR methods. Phys. Chem. Chem. Phys. 2013, 15, 3683–3701. DOI: 10.1039/c3cp44063j.
- Wang L.-P.; Titov A.; McGibbon R.; Liu F.; Pande V. S.; Martínez T. J. Discovering chemistry with an ab initio nanoreactor. Nat. Chem. 2014, 6, 1044–1048. DOI: 10.1038/nchem.2099.
- Dewyer A. L.; Argüelles A. J.; Zimmerman P. M. Methods for exploring reaction space in molecular systems. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2018, 8, e1354. DOI: 10.1002/wcms.1354.
- Grambow C. A.; Jamal A.; Li Y.-P.; Green W. H.; Zádor J.; Suleimanov Y. V. Unimolecular reaction pathways of a γ-ketohydroperoxide from combined application of automated reaction discovery methods. J. Am. Chem. Soc. 2018, 140, 1035–1048. DOI: 10.1021/jacs.7b11009.
- Simm G. N.; Vaucher A. C.; Reiher M. Exploration of reaction pathways and chemical transformation networks. J. Phys. Chem. A 2019, 123, 385–399. DOI: 10.1021/acs.jpca.8b10007.
- Sumiya Y.; Maeda S. “Paths of chemical reactions and their networks: From geometry optimization to automated search and systematic analysis”. In Chemical Modelling; Springborg M., Joswig J.-O., Eds.; Royal Society of Chemistry: London, 2019; Vol. 15, pp 28–69.
- Mita T.; Harabuchi Y.; Maeda S. Discovery of a synthesis method for a difluoroglycine derivative based on a path generated by quantum chemical calculations. Chem. Sci. 2020, 11, 7569–7577. DOI: 10.1039/d0sc02089c.
- Maeda S.; Morokuma K. A systematic method for locating transition structures of A + B → X type reactions. J. Chem. Phys. 2010, 132, 241102. DOI: 10.1063/1.3457903.
- Maeda S.; Harabuchi Y. Exploring paths of chemical transformations in molecular and periodic systems: An approach utilizing force. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2021, 11, e1538. DOI: 10.1002/wcms.1538.
- Sumiya Y.; Maeda S. Rate constant matrix contraction method for systematic analysis of reaction path networks. Chem. Lett. 2020, 49, 553–564. DOI: 10.1246/cl.200092.
- Strecker A. Ueber die künstliche Bildung der Milchsäure und einen neuen, dem Glycocoll homologen Körper. Liebigs Ann. Chem. 1850, 75, 27–45. DOI: 10.1002/jlac.18500750103.
- Passerini M. Sopra gli isonitrili (I). Composto del p-isonitril-azobenzolo con acetone ed acido acetico. Gazz. Chim. Ital. 1921, 51, 126–129.
- Solomons T. W. G. Organic Chemistry; Wiley: New York, 1996; p 6.
- Maeda S.; Komagawa S.; Uchiyama M.; Morokuma K. Finding reaction pathways for multicomponent reactions: The Passerini reaction is a four-component reaction. Angew. Chem., Int. Ed. 2011, 50, 644–649. DOI: 10.1002/anie.201005336.
- Ramozzi R.; Morokuma K. Revisiting the Passerini reaction mechanism: Existence of the nitrilium, organocatalysis of its formation, and solvent effect. J. Org. Chem. 2015, 80, 5652–5657. DOI: 10.1021/acs.joc.5b00594.
- Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31–36. DOI: 10.1021/ci00057a005.
- Hayashi H.; Takano H.; Katsuyama H.; Harabuchi Y.; Maeda S.; Mita T. Synthesis of difluoroglycine derivatives from amines, difluorocarbene, and CO2: Computational design, scope, and application. Chem. Eur. J. 2021, 27, 10040–10047. DOI: 10.1002/chem.202100812.
- Corey E. J.; Wipke W. T. Computer-Assisted Design of Complex Organic Syntheses: Pathways for molecular synthesis can be devised with a computer and equipment for graphical communication. Science 1969, 166, 178. DOI: 10.1126/science.166.3902.178.
- Corey E. J. The Logic of Chemical Synthesis: Multistep Synthesis of Complex Carbogenic Molecules. Angew. Chem., Int. Ed. Engl. 1991, 30, 455–465. DOI: 10.1002/anie.199104553.
- Ott M. A.; Noordik J. H. Computer tools for reaction retrieval and synthesis planning in organic chemistry. A brief review of their history, methods, and programs. Recl. Trav. Chim. Pays-Bas 1992, 111, 239–246. DOI: 10.1002/recl.19921110601.
- Todd M. H. Computer-aided organic synthesis. Chem. Soc. Rev. 2005, 34, 247–266. DOI: 10.1039/b104620a.
- Cook A.; Johnson A. P.; Law J.; Mirzazadeh M.; Ravitz O.; Simon A. Computer-aided synthesis design: 40 years on. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2012, 2, 79–107. DOI: 10.1002/wcms.61.
- Warr W. A. Short Review of Chemical Reaction Database Systems, Computer-Aided Synthesis Design, Reaction Prediction and Synthetic Feasibility. Mol. Inf. 2014, 33, 469–476. DOI: 10.1002/minf.201400052.
- Milo A.; Neel A. J.; Toste F. D.; Sigman M. S. A data-intensive approach to mechanistic elucidation applied to chiral anion catalysis. Science 2015, 347, 737–743. DOI: 10.1126/science.1261043.
- Szymkuć S.; Gajewska E. P.; Klucznik T.; Molga K.; Dittwald P.; Startek M.; Bajczyk M.; Grzybowski B. A. Computer-Assisted Synthetic Planning: The End of the Beginning. Angew. Chem., Int. Ed. 2016, 55, 5904–5937. DOI: 10.1002/anie.201506101.
- Segler M. H. S.; Preuss M.; Waller M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 2018, 555, 604–610. DOI: 10.1038/nature25978.
- Coley C. W.; Green W. H.; Jensen K. F. Machine learning in computer-aided synthesis planning. Acc. Chem. Res. 2018, 51, 1281–1289. DOI: 10.1021/acs.accounts.8b00087.
- Sanchez-Lengeling B.; Aspuru-Guzik A. Inverse molecular design using machine learning: Generative models for matter engineering. Science 2018, 361, 360–365. DOI: 10.1126/science.aat2663.
- Feng F.; Lai L.; Pei J. Computational Chemical Synthesis Analysis and Pathway Design. J. Front. Chem. 2018, 6, 199. DOI: 10.3389/fchem.2018.00199.
- Molga K.; Dittwald P.; Grzybowski B. A. Navigating around Patented Routes by Preserving Specific Motifs along Computer-Planned Retrosynthetic Pathways. Chem 2019, 5, 460–473. DOI: 10.1016/j.chempr.2018.12.004.
- Zahrt A. F.; Henle J. J.; Rose B. T.; Wang Y.; Darrow W. T.; Denmark S. E. Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning. Science 2019, 363, eaau5631. DOI: 10.1126/science.aau5631.
- Mikulak-Klucznik B.; Gołębiowska P.; Bayly A. A.; Popik O.; Klucznik T.; Szymkuć S.; Gajewska E. P.; Dittwald P.; Staszewska-Krajewska O.; Beker W.; Badowski T.; Scheidt K. A.; Molga K.; Mlynarski J.; Mrksich M.; Grzybowski B. A. Computational planning of the synthesis of complex natural products. Nature 2020, 588, 83–88. DOI: 10.1038/s41586-020-2855-y.
- Genheden S.; Thakkar A.; Chadimová V.; Reymond J.-L.; Engkvist O.; Bjerrum E. AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning. J. Cheminf. 2020, 12, 70. DOI: 10.1186/s13321-020-00472-1.
- Lin K.; Xu Y.; Pei J.; Lai L. Automatic retrosynthetic route planning using template-free models. Chem. Sci. 2020, 11, 3355–3364. DOI: 10.1039/c9sc03666k.
- Shields B. J.; Stevens J.; Li J.; Parasram M.; Damani F.; Alvarado J. I. M.; Janey J. M.; Adams R. P.; Doyle A. G. Bayesian reaction optimization as a tool for chemical synthesis. Nature 2021, 590, 89–96. DOI: 10.1038/s41586-021-03213-y.
- Shen Y.; Borowski J. E.; Hardy M. A.; Sarpong R.; Doyle A. G.; Cernak T. Automation and computer-assisted planning for chemical synthesis. Nat. Rev. Methods Primers 2021, 1, 23. DOI: 10.1038/s43586-021-00022-5.