Abstract
Hydrogen bond-based organocatalysts rely on networks of attractive noncovalent interactions (NCIs) to impart enantioselectivity. As a specific example, aryl pyrrolidine substituted urea, thiourea, and squaramide organocatalysts function cooperatively through hydrogen bonding and difficult-to-predict NCIs as a function of the reaction partners. To uncover the synergistic effect of the structural components of this catalyst class, we applied data science tools to study various model reactions using a derivatized, aryl pyrrolidine-based, hydrogen-bond donor (HBD) catalyst library. Through a combination of experimentally collected data and data mined from previous reports, statistical models were constructed, illuminating the general features necessary for high enantioselectivity. A distinct dependence on the identity of the electrophilic reaction partner and HBD catalyst is observed, suggesting that a general interaction is conserved throughout the reactions analyzed. The resulting models also demonstrate predictive capability by the successful improvement of a previously reported reaction using out-of-sample reaction components. Overall, this study highlights the power of data science in exploring mechanistic hypotheses in asymmetric HBD catalysis and provides a prediction platform applicable in future reaction optimization.
Keywords: asymmetric catalysis, hydrogen bond donors, organocatalysts, data science, multivariate linear regression modeling
INTRODUCTION
Chiral hydrogen-bond donor (HBD) catalysts have seen extensive development and application in synthetic chemistry over the past quarter century.1 These catalysts have been shown to induce enantioselectivity in a wide assortment of reactions through networks of noncovalent interactions (NCIs) operating across disparate reaction mechanisms.2 Anion-binding catalysis has emerged as a particularly powerful reaction manifold promoted by chiral HBD derivatives, enabling a rich assortment of enantioselective electrophile-nucleophile reactions.3 Particularly effective and broadly useful structural variants in this class are the aryl pyrrolidine-tert-leucine HBDs (Figure 1A).4 Several of these systems have been subjected to detailed mechanistic analyses, revealing the crucial role of attractive NCIs in promoting high enantioselectivity.5 The modular nature of the catalysts permits systematic variation of their components, and three structural features of the catalyst have proven most useful to alter in the context of optimization studies for new applications: the nature of the aryl group on the pyrrolidine, the identity of the HBD functional group (urea/thiourea/squaramide), and Me/H substitution at the α-position of the pyrrolidine (Figure 1B).
Figure 1. (A) Generalized depiction of an ion-pair intermediate in aryl pyrrolidine HBD-catalyzed anion abstraction mechanism. (B) Specific catalyst components altered in this study. (C) Optimal catalysts for reactions of the corresponding electrophiles from literature-reported transformations.4,5,8
Despite the broad utility of this catalyst class, it remains difficult if not impossible to predict which HBD motif and which aryl pyrrolidine features will prove optimal in a given transformation. In some cases, squaramides, thioureas, and ureas perform similarly,5f whereas in others only one of those HBD groups is effective.5c As an example, squaramide derivatives bearing highly extended aryl pyrrolidine substituents catalyze enantioselective Mukaiyama aldol reactions, whereas analogous thiourea derivatives are ineffective.4 In contrast, addition of silyl ketene acetals to oxocarbenium ion intermediates is hindered by extended arene systems on the pyrrolidine, and optimal enantioselectivity is achieved with smaller arenes with either thiourea or squaramide catalysts (Figure 1C).5 Gaining an improved understanding of catalyst structural effects across different reaction manifolds could advance mechanistic insight while also streamlining the selection of HBD catalysts in future efforts.
Herein, we report the development of a workflow aimed at exploring these goals. This work was inspired by our recent efforts to interrogate the essential features of chiral phosphoric acids and bifunctional HBD catalysts via published data.6,7 In the current study, we supplemented literature-reported examples of aryl pyrrolidine substituted HBD-catalyzed reactions with a designed training set of newly collected experimental data. We then constructed multivariate regression models6–8 relating computationally derived structural features of aryl pyrrolidine catalysts to enantioselectivity values for a selected subset of literature reactions. The resulting statistical models provided insights into the essential catalyst features required for effective asymmetric catalysis with this class of chiral HBDs. Additionally, the model can be applied in a prospective manner by generating improved catalyst predictions for out-of-sample electrophile classes.
CATALYST SELECTION
A catalyst library was constructed that could be used to test and analyze several reported asymmetric, HBD-catalyzed reactions. Three design elements were incorporated into the catalyst library (Figure 2). The aryl component of the aryl pyrrolidine group has been demonstrated to be a crucial variable in a diverse array of reported reactions.4b Arenes with extended π-systems have proven most broadly effective, and successful correlations between enantioselectivity and arene expanse have been established for discrete data sets.5a–f However, this trend is not universal,4a and we selected an assortment of aryl groups for the catalyst library to ensure a spread of enantioselectivity data as required for effective statistical modeling.6–8
Figure 2. Summary of HBD catalyst library used in this study. R = Me or H.
The presence or absence of methyl substitution at the α-position of the aryl pyrrolidine was introduced as a second design element of the library. Methyl substitution on the pyrrolidine imposes conformational restrictions on these structures and has been shown to enhance their reactivity and enantioselectivity in certain contexts.9 Finally, both thiourea and squaramide HBD cores were examined to probe the origin of enantioselectivity effects imposed by the identity of the HBD component. The library was restricted to catalysts bearing a bis-CF3 phenyl group capping the HBD as this motif imparts enhanced acidity and has proven optimal in a wide range of reactions.10
REACTION SELECTION
Four different reactions previously reported to be catalyzed with high enantioselectivity by aryl pyrrolidine substituted HBDs were selected for this study. By combining published data with newly performed experiments, diverse data sets could be generated with the goal of systematically probing the role of catalyst structural features in inducing enantioselectivity.
The first reaction examined is the nucleophilic substitution by a silyl ketene acetal on a chloro-isochroman (Figure 3A),4a a transformation that was demonstrated to proceed via a discrete oxocarbenium ion intermediate.11 This transformation has been subjected to extensive mechanistic analysis and served as the model reaction in studies on the effects of methyl substitution on the aryl pyrrolidine.9,11,12 It is atypical, however, in that the 4-fluorophenylpyrrolidine derivative was identified as optimal, with extended aromatic rings exhibiting decreased enantioselectivity.
Figure 3. Previously reported HBD-catalyzed transformations for use in statistical modeling: (A) Addition to oxocarbenium ions.4a (B) Ring opening of episulfonium ions.5b (C) Oxyallyl ion [4 + 3] cycloaddition.5c (D) Mukaiyama aldol reaction.5c
The enantioselective ring opening of episulfonium ions was selected as the second system for study (Figure 3B).5b This reaction is representative of those where enantioselectivity is highly sensitive to the expanse and polarizability of the aromatic ring on the aryl pyrrolidine. Extended π-conjugation was correlated with higher reactivity and enantioselectivity5b although a maximum beneficial effect was reached with the tricyclic 9-phenanthryl substituent.
The last two reactions included in the training set were the HBD/silyl triflate promoted [4 + 3] cycloaddition reactions and Mukaiyama aldol additions (Figure 3C,D), which were both demonstrated to proceed effectively only with squaramide-based catalysts.5c Enantioselectivity in these transformations was again demonstrated to depend strongly on the identity of the aryl substitution on the pyrrolidine ring.
EXPERIMENTAL DATA COLLECTION
With the four model reactions chosen, we proceeded to build a training set of experimental data by performing each reaction under the reported optimized conditions with all 20 HBD catalysts in the library. Given the broad catalyst diversity, we envisioned that this approach would generate a well-distributed data set spanning a wide range of enantioselectivities. Indeed, for the 80 data points collected for statistical modeling, the enantioselectivity window spanned from 0 to 98% e.e. (ΔΔG‡ = 0.0–1.8 kcal/mol at −78 °C). Figure 4 depicts the full set of experimental data as plots of e.e. (%) vs yield (%) for each selected reaction along with a simplified representation of the putative ion-pair intermediate for each transformation featuring the optimal catalyst in each previously published report. A correlation between e.e. and yield was observed in some reactions but not in others.
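The e.e.-to-ΔΔG‡ window quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the standard relationship ΔΔG‡ = RT ln(e.r.) with e.r. = (100 + e.e.)/(100 − e.e.) and reporting the magnitude only; the function name `ee_to_ddg` and the choice of units are illustrative and not taken from the authors' workflow.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ee_to_ddg(ee_percent: float, temp_celsius: float) -> float:
    """Return the magnitude of ddG (kcal/mol) for an e.e. (%) at a temperature (deg C)."""
    if not 0.0 <= ee_percent < 100.0:
        raise ValueError("e.e. must lie in [0, 100)")
    temp_kelvin = temp_celsius + 273.15
    # Enantiomeric ratio (major:minor) expressed from the e.e.
    er = (100.0 + ee_percent) / (100.0 - ee_percent)
    return R_KCAL * temp_kelvin * math.log(er)

# Lower bound, an intermediate value, and upper bound of the window at -78 deg C
for ee in (0.0, 56.0, 98.0):
    print(f"{ee:5.1f}% e.e. at -78 C -> {ee_to_ddg(ee, -78.0):.2f} kcal/mol")
```

Because the temperature enters the conversion explicitly, the same e.e. measured at a different temperature maps to a different ΔΔG‡, which is what makes the energy scale convenient for comparing reactions run under different conditions.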
Figure 4. Simplified depictions of the proposed intermediates for each reaction featuring the optimal result from the literature adjacent to the best-performing catalyst from the library screening. All yield and e.e. data from the catalyst library screening are depicted and coded according to catalyst class and Me/H substitution of the pyrrolidine ring.
The nucleophilic addition of silyl ketene acetal nucleophiles13 to oxocarbenium intermediates was the first reaction we focused on for this study (Figure 3A). It was the only reaction among the four examined to exhibit little difference between squaramide and thiourea catalysts with respect to both yield and enantioselectivity. Enantioselectivity is proposed to be driven mainly by the arene π-system, with which it shows a distinct correlation. The best enantioselectivity with this substrate and silyl ketene acetal nucleophiles (93% e.e.) was obtained using the methyl-substituted 4-fluorophenyl thiourea catalyst. This result matches the optimal catalyst reported for this reaction by the Jacobsen lab in 2016.9 The worst selectivity (56% e.e.) was observed using the hydrogen-substituted 1-pyrene squaramide catalyst. In general, enantioselectivity in this reaction improved significantly when the methyl-substituted thiourea catalysts in the training set were used, in line with previous observations. This trend also holds for the squaramide pyrene catalysts, with the methyl-substituted derivative affording 85% e.e. and the unsubstituted variant affording only 56% e.e.
We continued our data collection with the episulfonium ring opening using a class of indole nucleophiles (Figure 3B). The trichloroacetimidate leaving group of the starting material was previously found to be critical in promoting the formation of the episulfonium ion. To ionize the substrate, a strong Brønsted acid additive, 4-nitrobenzenesulfonic acid (NBSA), was employed.5b As depicted in Figure 4, the episulfonium ring opening reaction proceeded with a majority of the catalyst library tested. The optimal catalyst in the 20-catalyst library was the 2-naphthyl thiourea catalyst (85% e.e.), in agreement with the previously published report,5b with the Me- and H-substituted variants performing comparably. The poorest performing catalysts, the 4-fluorophenyl and phenyl squaramide derivatives, gave nearly racemic product at 3 and 6% e.e., respectively. Thiourea-type catalysts generally provided higher selectivity than the squaramide catalysts for the episulfonium ring opening. The reported stereochemical dependence on the unique acid additive may explain the divergence in selectivity, perhaps through a mismatch with squaramide-based catalysts.8
The subsequent reaction evaluated was the [4 + 3] cycloaddition reaction between an oxyallyl cation and furan as the nucleophile (Figure 3C). The activity of the catalysts in this reaction is enhanced by the presence of a silyl triflate Lewis acid, which is proposed to associate with the chiral hydrogen bond-donating thiourea or squaramide catalyst.5c This is proposed to generate a charge-separated complex with enhanced Lewis acidity relative to the silyl triflate alone, which can activate more challenging electrophiles than had previously been possible with HBD catalysts. The results for this cycloaddition reaction indicate a strong bias toward higher e.e. and yield for squaramide-based catalysts compared to thioureas.
The best-performing catalyst was the 1-naphthyl, methyl-substituted squaramide catalyst (98% e.e.). This is higher than the previously published result utilizing hydrogen-substituted squaramide catalysts.5c Interestingly, the methyl-substituted variant performed slightly better than the hydrogen-substituted 1-naphthyl squaramide catalyst (94% e.e.). This trend remains consistent for the 1-pyrene derived squaramide catalysts where the methyl-substituted catalyst performed marginally better (97% e.e.) than the hydrogen-substituted counterpart (95% e.e.). Overall, thiourea-based catalysts performed poorly resulting in low yield and selectivity. Substrates with both methyl and benzyl acetals were tested, and similar results were observed. The distinct gap in enantioselectivity and yield between the squaramide and thiourea catalysts could be due to the differences in catalyst acidity. Experimental measurements of pKa values in DMSO have revealed order-of-magnitude differences between analogous squaramide and thiourea-based catalysts with the former being more acidic.10a,b Furthermore, DFT analysis suggests that a direct silylium interaction with the squaramide carbonyl is likely occurring and an analogous interaction may be prohibitive for thiourea catalysts.
The final reaction evaluated was the Mukaiyama aldol addition (Figure 3D). The performance of the catalyst classes was once again segregated based on the nature of the HBD. Activation of these electrophilic species could only be achieved using squaramide-derived pyrrolidine catalysts, similar to the [4 + 3] cycloaddition. This is hypothesized to be due to the Lewis acid complex formed exclusively with the squaramide catalysts using triflate promoters.5c The best-performing catalyst was identified as the methyl-substituted, 1-pyrenyl squaramide, which gave 91% e.e. The worst-performing catalyst was the 4-fluorophenyl squaramide, producing the aldol product in only 30% e.e. Similar to the cycloaddition reaction, extended aromaticity is required for high selectivity, and this reaction type also follows the same trend regarding Me/H pyrrolidine substitution. For the aldol addition, methyl substitution improves the enantioselectivity of the 1-pyrenyl catalyst from 87 to 91% e.e.
Throughout the data collection process, several trends were observed with regard to both yield and enantioselectivity. The presence or absence of extended π-conjugation certainly played a critical role in most of the reactions, presumably participating in various NCIs. Additionally, an acidic threshold of the HBD was required for some reactions to proceed, as has been noted previously.5c Finally, the Me/H substitution of the aryl pyrrolidine had varying levels of influence on the enantioselectivity, depending on the reaction type. This trend suggests that a conformational effect is at play for some reaction types wherein the presence of the methyl group limits flexibility and improves selectivity. The synergistic effects of these catalyst components and how they relate to the various reaction components were the focus of our statistical analysis.
COMPUTATIONAL WORKFLOW
Once the experimental database was collected, our focus turned to in silico treatment of the reaction components and subsequent analysis. A general overview of this data science workflow is depicted in Figure 5. One challenge in the early stages of model development was creating a unifying set of parameters for the various substrates as there was limited structural overlap among them. Considering that all the reactions were proposed to operate through the addition of a nucleophilic reagent to an electrophilic substrate, reaction components were classified and parameterized as either electrophilic or nucleophilic components. A major advantage of using previously published data was the inclusion of substrate scope tables, allowing for increased substrate diversity in the training set. All the variations of the nucleophile and electrophile available from literature sources were incorporated1a,4a,5b,c to probe how variation in these critical reaction components affects enantioselectivity.
Figure 5. (A) Categorization of general HBD reaction components including catalysts, nucleophiles, and electrophiles. (B) Workflow for computational featurization of HBD catalysis components. (C) Highlight of key catalyst parameters. (D) Highlight of key substrate parameters.
As a primary consideration in the parameterization workflow, we focused on strategies to broaden the chemical space of descriptors to build our computational models. Due to the lack of structural commonality among nucleophile and electrophile reaction components beyond classifying bond-forming electrophilic and nucleophilic atoms, we deployed two-dimensional (2D) parameters typically used in quantitative structure–activity relationship (QSAR)-based analyses to supplement the parameter library.14 Each of these four HBD-catalyzed reactions involves the conversion of the electrophile into a cationic intermediate.4b,5b,c To capture the electropositive nature of the substrate in the proposed transition state, all computational treatment of the electrophiles was performed on the charged state.15–37
After a satisfactory ensemble of conformers was collected (SI) for all reaction components, DFT optimization (B3LYP-D3BJ/6-31G(d,p)) and single-point energy calculations (M06-2X/def2TZVP) were performed.
The DFT parameters utilized and gathered for this study include Sterimol (steric descriptor),16 highest occupied molecular orbital (HOMO) energy, lowest unoccupied molecular orbital (LUMO) energy (Figure 5D), natural bond orbital (NBO) charge (Figure 5C,D),17 and dihedral angles (Figure 5C).
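Descriptors such as those listed above are computed per conformer, and the model development that follows uses Boltzmann-averaged values alongside lowest-energy-conformer values. The following is a minimal sketch of that averaging, assuming relative conformer energies in kcal/mol and a temperature of 298.15 K (the source does not state the temperature used); the function names and the three-conformer Sterimol B5 values are hypothetical illustrations, not data from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def boltzmann_weights(energies_kcal, temp_kelvin=298.15):
    """Normalized Boltzmann populations from conformer energies (kcal/mol)."""
    e_min = min(energies_kcal)  # shift so the lowest conformer has zero energy
    factors = [math.exp(-(e - e_min) / (R_KCAL * temp_kelvin)) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]

def boltzmann_average(values, energies_kcal, temp_kelvin=298.15):
    """Population-weighted average of a per-conformer descriptor."""
    weights = boltzmann_weights(energies_kcal, temp_kelvin)
    return sum(w * v for w, v in zip(weights, values))

# Hypothetical ensemble: relative energies (kcal/mol) and a Sterimol B5 value per conformer
energies = [0.0, 0.5, 1.5]
b5_values = [6.1, 6.8, 7.4]

print("populations:", [round(w, 3) for w in boltzmann_weights(energies)])
print("Boltzmann-averaged B5:", round(boltzmann_average(b5_values, energies), 3))
```

A plain weighted mean like this suits length-type descriptors and charges; periodic quantities such as dihedral angles that wrap around ±180° would need a circular average instead.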
GENERAL STATISTICAL MODEL DEVELOPMENT
Multivariate linear regression analysis was performed on the experimental data collected using the calculated DFT and QSAR parameters to identify potential correlations (Figure 5B). In addition to parameters derived from the lowest energy conformer, all DFT and QSAR related parameters were Boltzmann-averaged based on the relative energies of all DFT-computed conformers. The measured enantioselectivity was converted into ΔΔG‡ using the Gibbs free energy equation (ΔΔG‡ = −RTln(e.r.)), where T represents the temperature for each reaction, R represents the gas constant, and e.r. is the enantiomeric ratio of each data point collected. Converting experimental enantioselectivity to ΔΔG‡ not only allowed us to model the differential energy between major and minor competing transition state ensembles but also permitted us to model reactions together that take place at different temperatures. A forward stepwise linear regression algorithm was used to iteratively generate model candidates. As this process can produce many potential models, common statistical metrics, such as R², leave-one-out (LOO), and k-fold cross-validation scores, were used for model comparison, alongside consideration of cross terms and the overall number of parameters. Initial efforts to analyze all four reactions in a single regression model did not prove fruitful. We suspect that this outcome is perhaps due to poorly measured reactivity between certain combinations of catalyst and reaction classes. This prompted us to explore potential correlations within more discrete reaction subsets.
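To illustrate the general idea of forward stepwise selection, the sketch below greedily adds whichever descriptor most improves R² of an ordinary least-squares fit. It is an assumption-laden toy, not the authors' implementation: the descriptor names, the normalized descriptor values, the synthetic ΔΔG‡ response, and the use of R² as the sole selection criterion are all hypothetical, and the solver is written in plain Python only to keep the example dependency-free.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def r_squared(rows, y, coef):
    """Coefficient of determination of the fitted model on (rows, y)."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(features, y, max_terms=2):
    """Greedily add the descriptor that gives the largest R^2 at each step."""
    chosen, best_r2 = [], 0.0
    while len(chosen) < max_terms:
        trials = []
        for name in features:
            if name in chosen:
                continue
            cols = chosen + [name]
            rows = list(zip(*(features[c] for c in cols)))
            trials.append((r_squared(rows, y, fit_ols(rows, y)), name))
        if not trials:
            break
        r2, name = max(trials)
        if r2 <= best_r2:  # stop when no candidate improves the fit
            break
        chosen.append(name)
        best_r2 = r2
    return chosen, best_r2

# Hypothetical normalized descriptors for eight catalyst/substrate combinations
features = {
    "B5": [-1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5],
    "dihedral": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "unrelated": [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0],
}
noise = [0.03, -0.02, 0.01, -0.03, 0.02, -0.01, 0.02, -0.02]
# Synthetic response (kcal/mol) built from two descriptors plus small noise
y = [1.0 + 0.33 * a - 0.23 * b + e
     for a, b, e in zip(features["B5"], features["dihedral"], noise)]

selected, score = forward_stepwise(features, y, max_terms=2)
print("selected descriptors:", selected, "R^2 =", round(score, 4))
```

In practice R² alone rewards overfitting, which is why candidate models are compared with cross-validation metrics and parameter counts rather than goodness of fit on the training data.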
Considering that the oxyallyl cation [4 + 3] cycloaddition and Mukaiyama aldol reaction were only effectively catalyzed using the squaramide-based catalysts, we partitioned the data based upon this observation. This “focused” catalyst data set was created using only squaramide-catalyzed reaction data. The oxocarbenium addition and episulfonium ring opening reactions displayed more comparable reactivity with both classes of HBDs and were separated into a “comprehensive” catalyst data set. Specific details regarding the combination of experimentally collected data and results taken from the literature are provided in the Supporting Information.
FOCUSED MODEL DEVELOPMENT
Upon partitioning of the full data set, a satisfactory correlation could be identified for the squaramide-catalyzed transformations shown in Figure 6A. The optimal model for the squaramide-focused data set was created using four parameters, three of which are chemical descriptors related to the catalyst. The fourth parameter was a global descriptor (LUMO energy) used to broadly classify the electrophile. A total of 39 data points were used for the model and split into a 70:30 partition of training set:validation set. Cross-validation and external validation techniques indicated a statistically robust model (LOO = 0.85, 5-fold = 0.84, and predR² = 0.80). The parameters used for this model include some initial features that were deemed critical in the early model development stage. The Sterimol (B5) term describing the general catalytic pocket of the catalysts along the highlighted N–C bond carried the largest coefficient (0.33) among all catalyst parameters (Figure 6C). This positive correlation strongly suggests that increasing the size of the aryl substituent of the aryl pyrrolidine catalyst has a beneficial effect on selectivity. This term serves to differentiate the arene components of the catalyst library via a torsional effect observed from differing aryl substituents. Attractive NCIs certainly play an important role in the transition states of these transformations.2 While correlations could be found using NCI energies derived from symmetry-adapted perturbation theory (SAPT) analysis in place of steric parameters,38,39 these data sets were found to be better modeled using simpler steric and electronic descriptors. The other Sterimol (B1) term focuses solely on the aryl substitution. This parameter most likely serves as a correction term as it has a small contribution to the model and a modest inverse correlation with enantioselectivity.
One other parameter that had a strong positive correlation with enantioselectivity was the LUMO energy of the electrophile with a coefficient of 0.26. The LUMO for these reactions is used as a classification of the different electrophilic reaction components. In contrast, parameters describing the nucleophilic components in the data sets were not needed in the models and were overshadowed by highly correlative electrophilic-based parameters, suggesting that enantioselectivity may be largely driven by catalyst and electrophile interactions for this subset of the data.
Figure 6. (A) [4 + 3] Cycloaddition and Mukaiyama aldol reactions with squaramide-based HBD catalysts. (B) Oxocarbenium ion addition and episulfonium ring opening reactions with thiourea and squaramide HBD catalysts. (C) Focused linear regression model correlating squaramide-catalyzed reactions and visual representation of associated parameters. (D) Comprehensive linear regression model correlating squaramide- and thiourea-catalyzed reactions and visual representation of associated parameters.
COMPREHENSIVE MODEL DEVELOPMENT
Having generated good multiparameter correlations with enantioselectivity in the focused model study, we turned our efforts to the development of a comprehensive model utilizing reactions that functioned with both squaramide and thiourea HBD catalysts. As noted above, data from the episulfonium ring opening reaction as well as the oxocarbenium ion addition were used in the comprehensive model (Figure 6B). With the application of the forward stepwise algorithm, the model highlighted in Figure 6D was identified. The data set used for the comprehensive model comprised 41 data points with a 50:50 split of thiourea- and squaramide-catalyzed reactions. The data library was split into a 70:30 ratio of training set:validation set, and the comprehensive model gave R² = 0.90, LOO = 0.79, 5-fold = 0.84, and predR² = 0.86. These values indicate a statistically robust model with potential for predictive application.
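The validation metrics quoted for both models can be illustrated with a short sketch. It assumes one common set of definitions: a leave-one-out score computed as Q² = 1 − PRESS/SS_tot, and an external score computed as 1 − SS_res/SS_tot on the held-out points. The source does not spell out its formulas, so these definitions, the single hypothetical descriptor, and the ten synthetic data points are assumptions chosen to keep the example short; a k-fold score follows the same pattern with larger held-out groups.

```python
def fit_line(x, y):
    """Closed-form least-squares slope and intercept for one descriptor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(x, y):
    """Leave-one-out Q^2: refit without each point, then predict that point."""
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = sum(y) / len(y)
    return 1.0 - press / sum((yi - mean_y) ** 2 for yi in y)

def external_r2(x_train, y_train, x_val, y_val):
    """R^2 of hold-out predictions, referenced to the validation-set mean."""
    slope, intercept = fit_line(x_train, y_train)
    mean_val = sum(y_val) / len(y_val)
    ss_res = sum((yv - (slope * xv + intercept)) ** 2 for xv, yv in zip(x_val, y_val))
    ss_tot = sum((yv - mean_val) ** 2 for yv in y_val)
    return 1.0 - ss_res / ss_tot

# Hypothetical descriptor values and ddG responses (kcal/mol)
x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8]
y = [0.10, 0.25, 0.32, 0.50, 0.55, 0.72, 0.80, 0.88, 1.05, 1.10]

# Fixed 70:30 partition into training and validation indices
train_idx = [0, 1, 3, 4, 6, 7, 9]
val_idx = [2, 5, 8]
x_train, y_train = [x[i] for i in train_idx], [y[i] for i in train_idx]
x_val, y_val = [x[i] for i in val_idx], [y[i] for i in val_idx]

q2 = loo_q2(x_train, y_train)
r2_ext = external_r2(x_train, y_train, x_val, y_val)
print(f"LOO Q^2 = {q2:.3f}, external R^2 = {r2_ext:.3f}")
```

The leave-one-out score uses only the training partition, whereas the external score tests points the model never saw, so agreement between the two is a useful check against overfitting.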
Remarkably, there was a significant overlap of parameters comprising both the comprehensive and focused models. As in the focused model, the dihedral angle (Figure 5C) was negatively correlated with enantioselectivity, with a coefficient of 0.23. In general, the parameter appears to highlight distinctions between thiourea vs squaramide catalysts as well as methyl vs hydrogen substitution on the pyrrolidine ring. For example, the hydrogen-substituted thiourea phenyl pyrrolidine catalyst features a dihedral angle of 162.3° in comparison with 177.4° exhibited by the squaramide analog. In the context of the episulfonium ring opening reaction, these two catalysts provide dramatically different enantioselectivities (ΔΔG‡ = 0.83 and 0.05 kcal/mol, respectively). In the same reaction, the methyl-substituted thiourea phenyl pyrrolidine catalyst (152.6°) induces enhanced enantioselectivity at 0.97 kcal/mol.
The second common term between the models was the Sterimol (B1) value of the arene. Intriguingly, the magnitude of the B1 value is inversely correlated with enantioselectivity for both the comprehensive and focused model types. The appearance of this term highlights the important steric contribution of the arene among all the reactions but also sheds light onto the difficulty in correlating all reaction types into a single regression model. The new parameter in the comprehensive model was the NBO partial charge of the N–H hydrogen of the HBD functional group adjacent to the aryl group. This highly weighted (coefficient = 0.30) parameter is negatively correlated, indicating that a lower NBO partial charge of the HBD hydrogen results in increased enantioselectivity. Structurally, the NBO parameter is a classification term explicitly differentiating between thiourea and squaramide scaffolds. Similar to the focused model, a single topological charge parameter was used to describe the electrophile in the comprehensive model.
Overall, the two models highlight the shared features that are critical for enantioselectivity among thiourea and squaramide catalysis. The appearance of common parameters indicates that we have identified the overarching operating components of this chiral catalyst class through our workflow. The lack of correlation with nucleophile-based parameters coupled with strong correlations with electrophile descriptors lends support to many of the previously proposed transition states where a tightly bound ion pair between catalyst and electrophile is required for achieving high enantioselectivity.4a,5b,c
VIRTUAL SCREENING FOR RETROSPECTIVE OPTIMIZATION
With the computational models developed, we sought to test them via application toward predictions of a reaction class not included in our model training. Given the common steric and electronic parameters between the models and strong dependence on the electrophile, a new electrophilic reaction component was selected as the out-of-sample scenario. Nucleophilic additions to imines were among the earliest reactions reported using these catalyst classes, with the first enantioselective Strecker reaction using a thiourea catalyst featuring a Schiff base.1a This catalyst class was also applied to an asymmetric Mannich reaction.40 This reaction was of particular interest as the silyl ketene acetal nucleophiles were already represented in our original training set. Therefore, we hypothesized that this reaction would provide a good opportunity to assess the model’s capability to predict the performance of an out-of-sample electrophile. The reaction shown in Figure 7 produced the desired Mannich adduct at 54% e.e. at room temperature (the previously optimized reaction was performed at lower temperature with a different nucleophile). Using this reaction platform, we performed a virtual screen of our entire catalyst library. Interestingly, both the focused and comprehensive models predicted the 1-pyrene substituted thiourea catalyst as optimal at 1.69 and 1.96 kcal/mol, respectively. This result suggests that both models encompass the nuances required to differentiate between both catalyst types and agree on out-of-sample reaction scenarios. If these predictions are valid, enantioselectivity in this reaction could be improved by >1.0 kcal/mol compared to the previously reported results.
Figure 7. Out-of-sample predictions for enantioselective Mannich reactions using both focused and comprehensive models. Average prediction error was used as the statistical metric and accompanied by specific catalyst examples demonstrating predictive capability.
These predictions were tested experimentally with 17 thiourea- and squaramide-based catalysts, and the results are compiled in Figure 7. Overall, the models predicted the performance of the 17 catalysts with an average prediction error of 0.37 kcal/mol for the comprehensive model and 0.31 kcal/mol for the focused squaramide model. Furthermore, 11 examples were predicted within 0.10 kcal/mol of the experimental results. With regard to the 1-pyrenyl thiourea catalyst, the comprehensive model accurately predicted the optimal result (predicted: 89% e.e. and observed: 87% e.e.). This translates into an approximate 1.0 kcal/mol improvement at room temperature relative to the previously reported optimal catalyst, obviating the need to perform this reaction at cryogenic temperatures. Furthermore, the models are adept at predicting not only which catalysts will perform well but also which catalysts will perform poorly (useful knowledge to conserve synthetic resources). In the case of the 1-pyrene squaramide catalyst, the comprehensive model succeeded in predicting that the squaramide functional group has a detrimental effect on enantioselectivity (predicted: 42% e.e. and observed: 44% e.e.). It bears mention that squaramide catalysts had not previously been studied for this reaction. These examples showcase how the predictive capabilities of the model can be used to further optimize new reaction types and greatly reduce the experimental optimization effort.
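Model outputs are in kcal/mol, so comparing them with measured selectivities requires converting back to e.e. and summarizing the deviations. The sketch below assumes the inverse of the ΔΔG‡ relationship, e.e. = 100 × tanh(ΔΔG‡/2RT), and takes room temperature to be 298.15 K, which the source does not specify. The function names and the three predicted/observed pairs in the error example are hypothetical placeholders rather than values from Figure 7.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_to_ee(ddg_kcal: float, temp_kelvin: float = 298.15) -> float:
    """Convert a ddG magnitude (kcal/mol) to e.e. (%).

    With e.r. = exp(ddG / RT), e.e. = (e.r. - 1) / (e.r. + 1) = tanh(ddG / 2RT).
    """
    return 100.0 * math.tanh(ddg_kcal / (2.0 * R_KCAL * temp_kelvin))

def mean_abs_error(predicted, observed):
    """Average absolute deviation between predicted and observed values."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

# Energy-to-selectivity conversion at the assumed room temperature
for ddg in (0.5, 1.0, 1.69):
    print(f"{ddg:.2f} kcal/mol -> {ddg_to_ee(ddg):.1f}% e.e.")

# Hypothetical predicted vs observed ddG values (kcal/mol) for three catalysts
predicted = [1.69, 0.53, 1.10]
observed = [1.58, 0.56, 1.35]
print(f"average prediction error = {mean_abs_error(predicted, observed):.2f} kcal/mol")
```

Because e.e. saturates as ΔΔG‡ grows, a fixed error in kcal/mol corresponds to a smaller e.e. error for highly selective catalysts than for poorly selective ones, which is one reason to report prediction error on the energy scale.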
CONCLUSIONS
In summary, a data-driven approach has been established to model the unique character of aryl pyrrolidine-based HBD catalysts and predict their performance. This statistical workflow enabled the correlation of diverse reaction types catalyzed by these privileged HBD structures. Featurization of disparate reaction components has led to the development of parallel models highlighting key steric and electronic characteristics of catalysts and substrates. Furthermore, the statistical models were used in tandem to virtually screen the catalyst library for an out-of-sample Mannich reaction, resulting in a re-optimization of catalyst structure that improved the e.e. from a reported 54 to 87%. This predictive capability validates the model development and showcases the potential for further application in new reaction design. Future work will include leveraging this data science workflow for novel catalyst design and illuminating specific NCIs for mechanistic elucidation of newly developed methods.
ACKNOWLEDGMENTS
This research was supported by the NIH (R35 GM136271 to M.S.S. and R01 GM43214 to E.N.J.). J.A.R. and J.L.H.W. gratefully acknowledge the NIH for financial support in the form of postdoctoral fellowships (F32GM128351 and F32GM134614, respectively).
Footnotes
The authors declare no competing financial interest.
ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acscatal.2c04824.
Computational methods and experimental procedures (PDF)
Reaction component parameter tables (XLSX)
Complete contact information is available at: https://pubs.acs.org/10.1021/acscatal.2c04824
Contributor Information
Mohammad H. Samha, Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
Julie L. H. Wahlman, Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
Jacquelyne A. Read, Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States; Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
Jacob Werth, Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
Eric N. Jacobsen, Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
Matthew S. Sigman, Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
REFERENCES
- (1).(a) Sigman MS; Jacobsen EN Schiff Base Catalysts for the Asymmetric Strecker Reaction Identified and Optimized from Parallel Synthetic Libraries. J. Am. Chem. Soc. 1998, 120, 4901–4902; (b) For reviews: Doyle AG; Jacobsen EN Small-Molecule H-Bond Donors in Asymmetric Catalysis. Chem. Rev. 2007, 107, 5713–5743; (c) Taylor MS; Jacobsen EN Asymmetric Catalysis by Chiral Hydrogen-Bond Donors. Angew. Chem., Int. Ed. 2006, 45, 1520–1543.
- (2).Knowles RR; Jacobsen EN Attractive Noncovalent Interactions in Asymmetric Catalysis: Links between Enzymes and Small Molecule Catalysts. Proc. Natl. Acad. Sci. U. S. A. 2010, 107, 20678–20685.
- (3).(a) Raheem IT; Thiara PV; Peterson EA; Jacobsen EN Enantioselective Pictet–Spengler-Type Cyclizations of Hydroxylactams: H-Bond Donor Catalysis by Anion Binding. J. Am. Chem. Soc. 2007, 129, 13405–13406; (b) For reviews: Brak K; Jacobsen EN Asymmetric Ion-Pairing Catalysis. Angew. Chem., Int. Ed. 2013, 52, 534–561; (c) Zhang Z; Schreiner PR (Thio)Urea Organocatalysis—What Can Be Learnt from Anion Recognition? Chem. Soc. Rev. 2009, 38, 1187–1198; (d) Entgelmeier L-M; García Mancheño O Activation Modes in Asymmetric Anion-Binding Catalysis. Synthesis 2022, 54, 3907–3927.
- (4).(a) Reisman SE; Doyle AG; Jacobsen EN Enantioselective Thiourea-Catalyzed Additions to Oxocarbenium Ions. J. Am. Chem. Soc. 2008, 130, 7198–7199; (b) Review: Strassfeld DA; Jacobsen EN The Aryl-Pyrrolidine-tert-Leucine Motif as a New Privileged Chiral Scaffold: The Role of Noncovalent Stabilizing Interactions. In Supramolecular Catalysis: New Directions and Developments; van Leeuwen PWNM; Raynal M, Eds.; Wiley, 2022; Chapter 25, pp 361–385.
- (5).(a) Knowles RR; Lin S; Jacobsen EN Enantioselective Thiourea-Catalyzed Cationic Polycyclizations. J. Am. Chem. Soc. 2010, 132, 5030–5032; (b) Lin S; Jacobsen EN Thiourea-catalysed ring opening of episulfonium ions with indole derivatives by means of stabilizing, non-covalent interactions. Nat. Chem. 2012, 4, 817–824; (c) Banik SM; Levina A; Hyde AM; Jacobsen EN Lewis Acid Enhancement by Hydrogen-Bond Donors for Asymmetric Catalysis. Science 2017, 358, 761–764; (d) Bendelsmith AJ; Kim SC; Wasa M; Roche SP; Jacobsen EN Enantioselective Synthesis of α-Allyl Amino Esters via Hydrogen-Bond Donor Catalysis. J. Am. Chem. Soc. 2019, 141, 11414–11419; (e) Kutateladze DA; Strassfeld DA; Jacobsen EN Enantioselective Tail-to-Head Cyclizations Catalyzed by Dual-Hydrogen-Bond Donors. J. Am. Chem. Soc. 2020, 142, 6951–6956; (f) Ronchi E; Paradine SM; Jacobsen EN Enantioselective, Catalytic Multicomponent Synthesis of Homoallylic Amines Enabled by Hydrogen-Bonding and Dispersive Interactions. J. Am. Chem. Soc. 2021, 143, 7272–7278; (g) Strassfeld DA; Algera RF; Wickens ZK; Jacobsen EN A Case Study in Catalyst Generality: Simultaneous Highly-Enantioselective Brønsted- and Lewis-Acid Mechanisms in Hydrogen-Bond-Donor Catalyzed Oxetane Openings. J. Am. Chem. Soc. 2021, 143, 9585–9594.
- (6).Reid JP; Sigman MS Holistic Prediction of Enantioselectivity in Asymmetric Catalysis. Nature 2019, 571, 343–348.
- (7).Werth J; Sigman MS Connecting and Analyzing Enantioselective Bifunctional Hydrogen Bond Donor Catalysis Using Data Science Tools. J. Am. Chem. Soc. 2020, 142, 16382–16391.
- (8).Sigman MS; Harper KC; Bess EN; Milo A The Development of Multidimensional Analysis Tools for Asymmetric Catalysis and Beyond. Acc. Chem. Res. 2016, 49, 1292–1301.
- (9).Lehnherr D; Ford DD; Bendelsmith AJ; Rose Kennedy C; Jacobsen EN Conformational Control of Chiral Amido-Thiourea Catalysts Enables Improved Activity and Enantioselectivity. Org. Lett. 2016, 18, 3214–3217.
- (10).(a) Jakab G; Tancon C; Zhang Z; Lippert KM; Schreiner PR (Thio)Urea Organocatalyst Equilibrium Acidities in DMSO. Org. Lett. 2012, 14, 1724–1727; (b) Ni X; Li X; Wang Z; Cheng JP Squaramide Equilibrium Acidities in DMSO. Org. Lett. 2014, 16, 1786–1789.
- (11).Ford DD; Lehnherr D; Kennedy CR; Jacobsen EN Anion-Abstraction Catalysis: The Cooperative Mechanism of α-Chloroether Activation by Dual H-Bond Donors. ACS Catal. 2016, 6, 4616–4620.
- (12).(a) Ford DD; Lehnherr D; Kennedy CR; Jacobsen EN On- and Off-Cycle Catalyst Cooperativity in Anion-Binding Catalysis. J. Am. Chem. Soc. 2016, 138, 7860–7863; (b) Kennedy CR; Lehnherr D; Rajapaksa NS; Ford DD; Park Y; Jacobsen EN Mechanism-Guided Development of a Highly Active, Dimeric Thiourea Catalyst for Anion-Abstraction Catalysis. J. Am. Chem. Soc. 2016, 138, 13525–13528.
- (13).Taylor MS; Tokunaga N; Jacobsen EN Enantioselective Thiourea-Catalyzed Acyl-Mannich Reactions of Isoquinolines. Angew. Chem., Int. Ed. 2005, 44, 6700–6704.
- (14).Todeschini R; Consonni V Molecular Descriptors for Chemoinformatics. In Molecular Descriptors for Chemoinformatics; 2010; Vol. 2; pp 1–252.
- (15).Wendlandt AE; Vangal P; Jacobsen EN Quaternary Stereocentres via an Enantioconvergent Catalytic SN1 Reaction. Nature 2018, 556, 447–451.
- (16).Brethomé AV; Fletcher SP; Paton RS Conformational Effects on Physical-Organic Descriptors: The Case of Sterimol Steric Parameters. ACS Catal. 2019, 9, 2313–2323.
- (17).Weinhold F; Landis CR Natural Bond Orbitals and Extensions of Localized Bonding Concepts. Chem. Educ. Res. Pract. 2001, 2, 91–104.
- (18).Yeung CS; Ziegler RE; Porco JA; Jacobsen EN Thiourea-Catalyzed Enantioselective Addition of Indoles to Pyrones: Alkaloid Cores with Quaternary Carbons. J. Am. Chem. Soc. 2014, 136, 13614–13617.
- (19).Klausen RS; Kennedy CR; Hyde AM; Jacobsen EN Chiral Thioureas Promote Enantioselective Pictet-Spengler Cyclization by Stabilizing Every Intermediate and Transition State in the Carboxylic Acid-Catalyzed Reaction. J. Am. Chem. Soc. 2017, 139, 12299–12309.
- (20).Mayfield AB; Metternich JB; Trotta AH; Jacobsen EN Stereospecific Furanosylations Catalyzed by Bis-Thiourea Hydrogen-Bond Donors. J. Am. Chem. Soc. 2020, 142, 4061–4069.
- (21).Trotta A; Jacobsen EN Chiral Ureas, Thioureas, and Squaramides in Anion-Binding Catalysis with Co-Catalytic Brønsted/Lewis Acids. In Anion-Binding Catalysis; Wiley, 2022; pp 141–159.
- (22).Brown AR; Uyeda C; Brotherton CA; Jacobsen EN Enantioselective Thiourea-Catalyzed Intramolecular Cope-Type Hydroamination. J. Am. Chem. Soc. 2013, 135, 6747–6749.
- (23).Uyeda C; Jacobsen EN Transition-State Charge Stabilization through Multiple Non-Covalent Interactions in the Guanidinium-Catalyzed Enantioselective Claisen Rearrangement. J. Am. Chem. Soc. 2011, 133, 5062–5075.
- (24).Zuend SJ; Jacobsen EN Mechanism of Amido-Thiourea Catalyzed Enantioselective Imine Hydrocyanation: Transition State Stabilization via Multiple Non-Covalent Interactions. J. Am. Chem. Soc. 2009, 131, 15358–15374.
- (25).Klauber EG; De CK; Shah TK; Seidel D Merging Nucleophilic and Hydrogen Bonding Catalysis: An Anion Binding Approach to the Kinetic Resolution of Propargylic Amines. J. Am. Chem. Soc. 2010, 132, 13624–13626.
- (26).Brown AR; Kuo WH; Jacobsen EN Enantioselective Catalytic α-Alkylation of Aldehydes via an SN1 Pathway. J. Am. Chem. Soc. 2010, 132, 9286–9288.
- (27).Santiago CB; Guo J-Y; Sigman MS Predictive and mechanistic multivariate linear regression models for reaction development. Chem. Sci. 2018, 9, 2398–2412.
- (28).Levin MD; Ovian JM; Read JA; Sigman MS; Jacobsen EN Catalytic Enantioselective Synthesis of Difluorinated Alkyl Bromides. J. Am. Chem. Soc. 2020, 142, 14831–14837.
- (29).Peterson EA; Jacobsen EN Enantioselective, Thiourea-Catalyzed Intermolecular Addition of Indoles to Cyclic N-Acyl Iminium Ions. Angew. Chem., Int. Ed. 2009, 48, 6328–6331.
- (30).Sharma HA; Essman JZ; Jacobsen EN Enantioselective Catalytic 1,2-Boronate Rearrangements. Science 2021, 374, 752–757.
- (31).Strassfeld DA; Wickens ZK; Picazo E; Jacobsen EN Highly Enantioselective, Hydrogen-Bond-Donor Catalyzed Additions to Oxetanes. J. Am. Chem. Soc. 2020, 142, 9175–9180.
- (32).Kutateladze DA; Jacobsen EN Cooperative Hydrogen-Bond-Donor Catalysis with Hydrogen Chloride Enables Highly Enantioselective Prins Cyclization Reactions. J. Am. Chem. Soc. 2021, 143, 20077–20083.
- (33).Schifferer L; Stinglhamer M; Kaur K; García Mancheño O Halides as Versatile Anions in Asymmetric Anion-Binding Organocatalysis. Beilstein J. Org. Chem. 2021, 17, 2270–2286.
- (34).Ovian JM; Jacobsen EN A Catalytic One-Two Punch. Science 2019, 366, 948–949.
- (35).Kennedy CR; Choi BY; Reeves MGR; Jacobsen EN Enantioselective Catalysis of an Anionic Oxy-Cope Rearrangement Enabled by Synergistic Ion Binding. Isr. J. Chem. 2020, 60, 461–474.
- (36).Kutateladze DA; Wagen CC; Jacobsen EN Chloride-Mediated Alkene Activation Drives Enantioselective Thiourea and Hydrogen Chloride Co-Catalyzed Prins Cyclizations. J. Am. Chem. Soc. 2022, 144, 15812–15824.
- (37).Metternich JB; Reiterer M; Jacobsen EN Asymmetric Nazarov Cyclizations of Unactivated Dienones by Hydrogen-Bond-Donor/Lewis Acid Co-Catalyzed, Enantioselective Proton-Transfer. Adv. Synth. Catal. 2020, 362, 4092–4097.
- (38).Stone AJ Computation of charge-transfer energies by perturbation theory. Chem. Phys. Lett. 1993, 211, 101–109.
- (39).Stone AJ; Misquitta AJ Charge-transfer in symmetry-adapted perturbation theory. Chem. Phys. Lett. 2009, 473, 201–205.
- (40).Wenzel AG; Jacobsen EN Asymmetric Catalytic Mannich Reactions Catalyzed by Urea Derivatives: Enantioselective Synthesis of β-Aryl-β-Amino Acids. J. Am. Chem. Soc. 2002, 124, 12964–12965.
