Abstract
The generalization of related asymmetric processes in organocatalyzed reactions is an ongoing challenge because subtle non-covalent interactions drive selectivity. This lack of transferability is often met with a largely empirical approach to optimizing catalyst structure and reaction conditions, which has led to the development of diverse structural catalyst motifs and inspired unique design principles in this field. Bifunctional hydrogen bond donor (HBD) catalysis exemplifies this situation: a broad collection of enantioselective transformations has been successfully developed. Herein, we describe the use of data science methods to connect catalyst and substrate structural features across an array of reported enantioselective bifunctional HBD-catalyzed reactions through an iterative statistical modeling process. The computational parameters used to build the correlations are mechanism-specific and based on the proposed transition states, which allows for analysis of the non-covalent interactions responsible for asymmetric induction. The resulting statistical models also allow for extrapolation to out-of-sample examples, providing a prediction platform that can be used for future applications of bifunctional hydrogen bond donor catalysis. Finally, this multi-reaction workflow presents an opportunity to build statistical models unifying various modes of activation relevant to asymmetric organocatalysis.
INTRODUCTION
Asymmetric organocatalysis has represented, and continues to represent, a broad landscape of reactivity for enantioselective bond construction.1 Iterative design has led to unique eras of privileged catalysts, with subtle changes continually expanding the reach of asymmetric induction to new reactions.2 Although most of the catalysts in this genre contain readily tunable features to pursue opportunities in unrealized reaction space, the non-covalent interactions (NCIs) responsible for selectivity, from hydrogen bonding to π-interactions, have been difficult to define and ultimately translate into novel catalyst design.3 For this reason, optimization efforts have traditionally been empirical, evaluating substrate and catalyst structural features as well as other reaction parameters through trial and error.4
Considering this limitation and the significant research activity in the field over the last several decades, we have initiated a program to analyze asymmetric organocatalysis retrospectively by combining statistical tools with physical organic descriptions of reaction features.5 This approach allows us to quantify descriptive NCIs between unique reaction combinations. By data mining a particular reaction class, we can better determine structural aspects pertinent for enantioselectivity and apply the statistical models to prediction for future applications.6 In this context, our group recently reported this strategy to analyze asymmetric nucleophilic additions to imines enabled by binol-derived chiral phosphoric acid organocatalysis.7 This privileged class of catalysts has been defined by C2-symmetry, rigidity imparted by the binaphthyl backbone, and relatively simple diversification strategies (functionalization at the 3,3′ positions of the binol structure).8
As a significant step forward, we evaluated this general workflow with another important class of organocatalysts, namely bifunctional hydrogen bond donor (HBD) catalysts. Compared to binol phosphoric acid catalysts, the diversity of structures reported in this area of organocatalysis is immense. Additionally, most catalysts are C1 symmetric and span a wide spectrum of structural flexibility.9 In considering this type of catalysis, several questions are apparent: 1) is the mechanism of bifunctional HBD asymmetric catalysis generalizable across many different catalyst classes, especially those from various generations? 2) can these catalysts be connected in a similar manner as chiral phosphoric acids, and can lessons be garnered about the underlying subtleties required for effective asymmetric catalysis? 3) what NCIs dominate reactions where no covalent interactions exist between the substrate and catalyst? Herein, we begin to probe these questions by data mining a subclass of HBD catalysis, featurizing the molecules involved in the reactions using computationally derived physical organic parameters, and analyzing the resulting correlations. This multi-reaction parameterization workflow allows the unification of bifunctional HBD catalysts, while also providing a prediction platform for reaction outcomes of novel catalyst and substrate combinations (Figure 1).
Figure 1.

Workflow for multi-reaction parameterization using linear regression analysis and further application in predicting new reaction components. Overview of the study’s goals to correlate diverse catalyst classes and elucidate NCIs contributing to enantioselectivity.
General approach.
To initiate the study, we determined bifunctional activation, originally defined by what has been termed the “Takemoto” catalyst, as an area of HBD catalysis that exhibits an intriguing body of literature due to the catalyst diversity and reported empirical enantioselectivity data.10 Bifunctional HBD catalysts are proposed to simultaneously activate both the nucleophile and electrophile partners through stabilizing non-covalent interactions, typically by a tertiary amine base and urea/thiourea/squaramide-based HBD, respectively.11 Enantioselective variants of the Michael addition, transfer hydrogenation and Mannich reaction have been realized by their development.12 The relatively simple nature of these catalysts in terms of synthesis has enabled nearly two decades of catalyst derivatization efforts, producing a remarkably diverse library.13 This presents an ideal setting to apply data science techniques to correlate multiple reaction outputs and pursue answers to the overarching questions posed above. Additionally, although the general steps of these mechanisms are assumed to be fundamentally similar for a wide variety of reactions, the particular reaction conditions often do not effectively translate to a similar transformation.14 This lack of reaction transfer knowledge presents an opportunity to gain a fundamental understanding of catalytic bifunctional activation through multivariate linear regression analysis. By taking advantage of readily available data science based statistical tools, prediction of reaction outcomes can be accomplished through mechanistically driven probes.15
RESULTS AND DISCUSSION
Reaction Selection.
To focus model development towards a deeper understanding of the disparate structural features between the catalyst types required for optimal reactivity, one electrophile class was initially selected to remain constant. In this regard, the enantioselective addition of nucleophiles to nitroalkenes has served as a proving ground for new catalyst development.16 This is in part due to products of these reactions serving as chiral, primary amine precursors, which have multifaceted use in synthetic organic chemistry.17 Consequently, bifunctional activation of nitroalkenes is abundantly represented in the literature across an eclectic group of catalysts.
A total dataset of 150 unique reactions from seven literature reports was curated for parameter collection and analysis.12a, 18 The spread of enantioselectivity measurements is broad, comprising a ΔΔG‡ window of 0.0–3.0 kcal/mol. The training set included combinations of 39 catalysts, 51 electrophiles, 21 nucleophiles, and 11 solvents. The statistical modeling approach requires structural/molecular descriptors; here, a combination of density functional theory (DFT), quantitative structure-activity relationships (QSAR) and molecular mechanics (MM) was used to create a database of parameters to enable the development of linear regression models.19 Thus, the most significant challenge in the early stages of our workflow was developing suitable parameters to appropriately capture subtle differences between reaction components with limited structural overlap. Guided in part by the proposed transition state of bifunctional HBD catalysis, the various contact points between reaction components were used to generate common parameters featured across all reaction types (Figure 2). By independently parameterizing the three key reaction components of bifunctional HBD activation (catalyst, electrophile and nucleophile), we hypothesized that various types of reactions could be correlated. Additionally, the model’s capability of extrapolating to new reaction types via “out-of-sample” predictions would be most beneficial if applicable to diverse systems. As exemplified in Figure 2, this approach allows for inclusion of structurally distinct components into the training set.
Figure 2.

Overview of the computational workflow used in this study. Categorization of reaction components and molecular descriptors used to characterize key elements of chiral induction. Each category displays specific examples included in the training set used for multivariate linear regression analysis.
The multi-reaction workflow was initiated with a conformational search (mixed torsional/low-mode sampling) for all substrates, catalysts and solvents used in the study. DFT-level optimization (M06-2X/def2-TZVP//B3LYP-D3(BJ)/6-31G(d,p)) of the conformers in the gas phase was used to identify the lowest-energy conformer, from which parameters were collected. The derived parameters, which focused specifically on the bond-forming steps of the reactions and on exploring the effects of shared reaction component NCIs on enantioselectivity, included Sterimol (a multidimensional steric descriptor), NMR chemical shifts, natural bond orbital (NBO) charges, bond lengths and IR frequencies/vibrations.20 Sterimol and NBO descriptors have been shown to correlate discrete datasets of chiral thiourea/squaramide catalysis.18a, 21 Global parameters including highest occupied molecular orbital (HOMO)/lowest unoccupied molecular orbital (LUMO) energies and polarizability were also collected for each reaction component. An iterative modeling process facilitates the refinement of descriptors and additional parameter collection if deemed necessary.
Solvent effects were treated in a similar fashion. For each solvent, parameters were collected from DFT-optimized structures. In addition to the parameters listed above, topological, two-dimensional descriptors from various indices were used as well including surface area and molecular shape.22
Model Development
Multivariate linear regression analysis was performed on the curated enantioselectivity dataset using the collected parameters to identify potential correlations (see SI for details).6 Measured enantioselectivity was calculated using the Gibbs free energy equation (ΔΔG‡ = −RTln|er|), where T represents the temperature of the reaction, R is the gas constant and er is the enantiomeric ratio. This allows for different reaction temperatures to be directly compared. Absolute enantiomeric excesses were applied, which means the statistical model lacks the ability to predict the absolute configuration. This requirement was imperative in producing a well-distributed dataset and eliminating bias in the model. Furthermore, this strategy obviates reliance on the reported absolute configuration assignments in the dataset design. Prospective models generated through forward stepwise linear regression were assessed by general statistical metrics (R2 and internal validation methods to avoid overfitting), number of parameters and presence of cross-terms. As mechanistic understanding of the model is an overarching goal of this workflow, ease of interpretation is typically considered in the model search, and interpretation can be further simplified by removing cross-terms of parameters.
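The conversion from a reported ee and reaction temperature to an energy can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `ddg_from_ee` is hypothetical, the value is returned as a magnitude (the sign convention is dropped because absolute ee values are used), and the gas constant is taken in kcal/(mol·K).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_ee(ee_percent, temp_kelvin):
    """Return the magnitude of ddG (kcal/mol) for an absolute ee (%) at temperature T."""
    ee = abs(ee_percent) / 100.0
    if ee >= 1.0:
        raise ValueError("an ee of 100% corresponds to an infinite enantiomeric ratio")
    er = (1.0 + ee) / (1.0 - ee)  # major:minor enantiomer ratio
    return R_KCAL * temp_kelvin * math.log(er)

# The same 90% ee maps to different energies at different temperatures.
print(round(ddg_from_ee(90.0, 298.15), 2))  # 1.74
print(round(ddg_from_ee(90.0, 253.15), 2))  # 1.48
```

Working in energy units is what lets reactions run at different temperatures share one regression target.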
A good correlation was found (R2 = 0.82) and the optimal model revealed a strong dependence on the structural composition of the catalyst indicated by the number of catalyst-based parameters (Figure 3A). All reaction components including catalyst (red), electrophile (green), nucleophile (blue) and solvent (black) are represented in the model across eight parameters. Cross-validation analysis and external validation were used to indicate a robust model (LOO = 0.76, 5-fold = 0.75). External validation was performed by pseudorandom partitioning of the entire data set into 50:50, training set: validation set (predR2 = 0.81). To further validate our model and test its general predictive nature, leave-one-reaction-out (LORO) analysis was used to demonstrate prediction of catalyst and nucleophile types not included in the training set. In this process, one reaction (by specific publication) is removed from the data set and held as the validation set. The model is retrained on the remaining data set and is used to predict the left-out reaction. Notably, a LORO average of predR2 = 0.72 ± 0.22 was achieved suggesting that the model is not dependent on a singular reported reaction.
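The leave-one-reaction-out procedure described above can be sketched in plain Python. Everything here is an assumption made for illustration: the helper names, the ordinary least-squares fit via the normal equations, the definition of predR2 as 1 − SS_res/SS_tot on the held-out group (the authors' definition may differ), and the noise-free synthetic data, which is not drawn from the paper's dataset.

```python
import statistics

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def leave_one_reaction_out(X, y, groups):
    """Hold out each group (e.g. one publication) in turn; return {group: predR2}."""
    scores = {}
    for g in sorted(set(groups)):
        train = [i for i, gi in enumerate(groups) if gi != g]
        test = [i for i, gi in enumerate(groups) if gi == g]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        pred = predict(beta, [X[i] for i in test])
        scores[g] = r_squared([y[i] for i in test], pred)
    return scores

# Synthetic, noise-free placeholder data: two descriptors, three "publications".
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8], [6, 4], [7, 9], [8, 6], [9, 2]]
groups = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
y = [0.5 + 1.0 * a - 0.3 * b for a, b in X]

scores = leave_one_reaction_out(X, y, groups)
vals = list(scores.values())
print(statistics.mean(vals), statistics.pstdev(vals))
```

Grouping the hold-out by publication, rather than by individual data point, is what tests whether the model depends on any single reported reaction.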
Figure 3.

Multivariate regression analysis of bifunctional HBD catalyzed reactions of nitroalkenes (150 reactions). A. Regression model of external validation (predR2 = 0.81) by pseudorandom 50:50 partitioning of data into training set: validation set. The leave-one-reaction-out (LORO) average score is 0.72 ± 0.22; the leave-one-out (LOO) cross-validation score is 0.76; the 5-fold cross-validation score is 0.75. B. Visual representation of parameters used for model. C. Representative examples of catalyst diversity incorporated into the model.
A representation of the eight parameters is depicted in Figure 3B. Two unique parameters were used to describe the nucleophile of each reaction: B5 (avg) and NBO charge of the atom undergoing deprotonation/activation by the catalyst. The Sterimol values were calculated to capture the steric profile of the nucleophile along the H‒X bond being activated. An average was used to accommodate cases where multiple protons were present. The NBO charge of the nucleophile carries the largest value among all parameters, most likely due to the classification of carbon atoms vs heteroatoms (N, S, P), where the distribution of values is significant. These nucleophile parameters demonstrate the capability of quantitative descriptors to directly relate to the relevant bond activation or transition state of a reaction of interest. Due to designed structural overlap of the electrophiles, only one parameter (polarizability) was required in the model to describe the general character/size of substitution and potential NCIs at play, which is discussed further below.
The model emphasized the importance of catalyst structure through four unique parameters: NBO(H) and NBO(N) charge values were used to recognize subtle steric and electronic differences of the chiral portion of the catalyst as well as describe the type of HBD catalyst (urea/thiourea/squaramide). Additionally, an IR stretching intensity and B1 Sterimol term further defined the achiral portion of the HBD catalyst. Detailed analysis of these parameters will be described below. Finally, a topological descriptor (PEOE1) was used to categorize the changes in van der Waals surface area of the various solvents. Surprisingly, parameters for reaction additives such as molecular sieves or heterogeneous bases were not required to achieve this correlation.
Model Extrapolation to New Catalyst Classes
The next step in validating our model was to test its ability to extrapolate to out-of-sample predictions. As a first step towards this evaluation, all cinchona alkaloid catalyst structural types were removed from the training set and used as a validation set. The model was retrained using the same parameters and provided an accurate prediction of this catalyst subclass with a predR2 of 0.76 (Figure 4). This successful result prompted us to continue challenging the predictive ability of the model by expanding to unique catalyst structures not used in the development stage of the initial model. Each of the following scenarios features a new catalyst type in addition to a novel reaction component, either electrophile or nucleophile.
Figure 4.

Prediction of cinchona alkaloid catalyzed reactions. Reactions catalyzed by cinchona alkaloid-based catalysts were removed from the training set and held as the validation set.
Average prediction error was used as the statistical metric to evaluate the test cases as it is unbiased by sample size or data distribution.23 Considering the model was developed to encompass many catalyst types, the goal of this prediction analysis is to identify general trends of new reactions rather than distinguish between subtleties of substrate-dependent enantioselectivity. To put this metric into perspective, a 0.30 kcal/mol average error would predict a 90% ee reaction at room temperature within the range of 84 – 94% ee.
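The ee window quoted above can be reproduced with a short calculation. This is a sketch under stated assumptions: room temperature is taken as 298.15 K, and the function names are illustrative rather than part of the authors' workflow.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ee_to_ddg(ee_percent, temp_kelvin=298.15):
    ee = ee_percent / 100.0
    return R_KCAL * temp_kelvin * math.log((1.0 + ee) / (1.0 - ee))

def ddg_to_ee(ddg, temp_kelvin=298.15):
    # er = exp(ddG/RT), and ee = (er - 1)/(er + 1) = tanh(ddG / 2RT)
    return 100.0 * math.tanh(ddg / (2.0 * R_KCAL * temp_kelvin))

def ee_window(ee_percent, error_kcal, temp_kelvin=298.15):
    """Return the (low, high) ee values spanned by +/- error_kcal around ee_percent."""
    center = ee_to_ddg(ee_percent, temp_kelvin)
    return (ddg_to_ee(center - error_kcal, temp_kelvin),
            ddg_to_ee(center + error_kcal, temp_kelvin))

low, high = ee_window(90.0, 0.30)
print(f"{low:.0f}-{high:.0f}% ee")  # 84-94% ee
```

The window is asymmetric in ee because the energy scale is logarithmic in the enantiomeric ratio.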
The first scenario (Figure 5A) used a traditional, Jacobsen-type HBD-catalyzed asymmetric hydrogenation of trifluoromethyl-substituted nitroalkenes.24 The nucleophile is captured in the training set, as is the reaction type. However, the trifluoromethyl-substituted nitroalkene class is a new reaction component. Extension to the novel catalyst scaffold and type of nitroalkene was successful and resulted in an average enantioselectivity prediction error of 0.33 ± 0.19 kcal/mol. The best predictions included non-styrenyl-derived nitroalkenes such as those with nonyl and benzyl substitution (0.03 – 0.45 kcal/mol). Styrene-based substrates produced greater variance in enantioselectivity, leading to larger prediction error (0.25 – 0.82 kcal/mol).
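A summary of the form "average error ± spread" can be computed as below. This is a hedged sketch: it assumes the error is the mean absolute difference between observed and predicted ΔΔG‡ and that the spread is a sample standard deviation, neither of which is confirmed by the text, and the numbers are made up for illustration.

```python
import statistics

def prediction_error_summary(observed, predicted):
    """Mean absolute error and sample standard deviation of |obs - pred| (kcal/mol)."""
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    return statistics.mean(errors), statistics.stdev(errors)

# Made-up values for illustration only; not taken from the paper's dataset.
obs = [1.20, 1.85, 0.95, 2.10]
pred = [1.40, 1.60, 1.00, 1.70]
mean_err, sd_err = prediction_error_summary(obs, pred)
print(f"{mean_err:.2f} ± {sd_err:.2f} kcal/mol")
```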
Figure 5.

A–C. Out-of-sample predictions for new reaction types. Each example contains a novel catalyst type and an additional reaction component absent from the training set. Selectivity examples for each new reaction type highlight the strength and limitation of the model’s predictive capability (obs. = observed, pred. = predicted).
The second catalyst test case used an acyclic analogue of Takemoto’s catalyst in the presence of a novel nucleophile class (2-coumaranone derivatives) to predict 20 examples with an average error of 0.25 ± 0.19 kcal/mol (Figure 5B).25 The electrophile class was highly represented in the training data. In this case, the prediction error reflects the modest range of enantioselectivity reported for this reaction with the poorest predictions found containing styrene derived nitroalkenes that include 2,6-ortho substitution on the arene, which are not represented in the training set.
The final prediction test used a unique catalyst, designed by Ricci et al. to achieve enantioselective addition of indole derivatives into nitroalkenes. The catalyst featured a hydroxyl group in place of the tertiary amine typically employed (Figure 5C).26 Neither this catalyst nor indoles were included in the training set. With a novel catalyst/nucleophile pairing, an average error of 0.21 ± 0.12 kcal/mol over 9 examples was found. Although general prediction error was low, the poorest prediction (0.32 kcal/mol) resulted from a unique chloro-substitution on the indole nucleophile, which is distal from the point of activation. Overall, the modest prediction errors associated with the out-of-sample analysis suggest a robust model describing bifunctional HBD catalysis. As highlighted in Figure 5, the selectivity examples illustrate the model’s deficiency in accurately capturing significant enantioselectivity changes due to minor substrate modifications.
Model Extrapolation to New Electrophile Classes
To further establish the mechanistic transferability of the model, we explored electrophiles beyond nitroalkenes. Given the number of parameters describing the catalyst and nucleophile, we hypothesized that similar mechanistic pathways activating a unique electrophile class could be described. Imine transformations were found to be highly represented in the bifunctional catalysis literature and allowed us to identify suitable extrapolation tests for our model. Electrophiles for these reactions were computationally featurized as the published starting material, either imines or amines proposed to proceed through an iminium intermediate.
As a first case, an asymmetric Mannich reaction, which features a catalyst and nucleophile represented in the training set was used to probe our hypothesis (Figure 6A).12b Interestingly, a reasonable prediction was achieved with an average prediction error of 0.39 ± 0.17 kcal/mol across 29 examples. The error and standard deviation indicate that the nitroalkene model in Figure 3A can predict the distribution of observed enantioselectivities (observed: 25 – 97% ee, predicted: 62 – 94% ee) with a single electrophile-based parameter (polarizability) for this new class of electrophiles. This result suggests the catalyst and nucleophile parameters can describe broad reactivity and a relatively similar binding mode is active between the two electrophile classes.
Figure 6.

A–C. Extrapolation of nitroalkene model to various imine transformations. Average prediction error used as the statistical metric accompanied by specific examples demonstrating predictive capability and limitation of the model (obs. = observed, pred. = predicted).
The next scenario featured a nitroalkane as the nucleophile in an aza-Henry reaction reported by Takemoto et al. (Figure 6B).27 The model predicted the reaction with an average prediction error of 0.19 ± 0.18 kcal/mol using 10 examples. Arene and styrene derived imines were predicted with the lowest error (0.01 – 0.13 kcal/mol) while heteroaromatic substitution produced relatively higher error (0.15 – 0.48 kcal/mol) due to their lower polarizability values. Furthermore, the presence of a tosyl protecting group had the most detrimental effect on selectivity (4% ee) but could not be accurately captured by the model.
As a final test, a more challenging scenario was probed wherein the nucleophile is not captured in the training set and the squaramide catalyst is only featured once (Figure 6C).28 An average prediction error and standard deviation of 0.31 ± 0.23 kcal/mol over 26 examples was found, indicating the general trend can be accurately described (observed: 38 – 94% ee, predicted: 79 – 89% ee). Collectively, these out of sample electrophile tests suggest a related mechanism of stereoinduction is functioning for both nitroalkene and imine transformations.
Given the predictive capability of our model, we sought to further challenge it by testing more complex, cascade-type reactions involving nitroalkenes and imines catalyzed by bifunctional hydrogen bond donors (Figure 7). The reaction partners of these randomly chosen transformations were structurally similar to substrates contained in the training set. The examples included a Mannich/aza-Michael reaction (Figure 7A),29 Michael reaction/spirocyclization (Figure 7B)30 and Michael/aza-Henry reaction (Figure 7C).31 Overall, low average prediction error was observed in each scenario (<0.30 kcal/mol). In the case of the Michael/aza-Henry, an additive (2,2,2-trifluoroethanol) was required for cyclization. However, as discussed previously in model development, additive parameters were found to have no influence on correlation/prediction. Overall, this result is consistent with the enantiodetermining step being influenced by the same factors as the simpler model systems.
Figure 7.

A–D. Extrapolation of nitroalkene model to various cascade-type reactions. Average prediction error used as the statistical metric.
Several reaction examples were less successful and resulted in overall higher average prediction error (~0.90 kcal/mol). These reactions generally had higher observed enantioselectivity ranges wherein the majority of data points were reported between 98 – 99% ee. The difference between 98% and 99% ee is 0.40 kcal/mol. One instance of this is depicted in Figure 7D for a squaramide catalyzed [2+2] annulation of nitroalkenes.32 Although the overall error is higher, the predicted enantioselectivity is typically 4–9% ee from the reported value. Additionally, it is possible that the interactions required between catalyst and substrates for this type of sequential reactivity are not adequately captured in the training set.
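The compression of the energy scale at very high selectivity can be checked with a short worked calculation. The temperature is not stated for this comparison, so 298 K is assumed here (RT ≈ 0.592 kcal/mol); 98% ee corresponds to er = 99:1 and 99% ee to er = 99.5:0.5.

```latex
\Delta\Delta G^{\ddagger}_{99\%} - \Delta\Delta G^{\ddagger}_{98\%}
  = RT\left[\ln\frac{99.5}{0.5} - \ln\frac{99}{1}\right]
  = RT\,\ln\frac{199}{99}
  \approx (0.592\ \text{kcal/mol})(0.698)
  \approx 0.41\ \text{kcal/mol}
```

Under this assumption, a single percentage point of ee near the top of the range accounts for roughly 0.4 kcal/mol, in line with the value quoted above, which is why small deviations in ee translate into large energy errors for these reactions.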
An example of an electrophile class that could not be predicted as accurately with the general model was acrylate derivatives. One explanation for this lack of transferability could be the stark steric/electronic distinction of a terminal alkene in comparison with the training set electrophile data. A specific example of asymmetric nitrophosphonate addition yielded a larger average prediction error of 0.57 ± 0.35 kcal/mol across 35 examples using the nitroalkene model (Figure 8).33 Given the data distribution (0.1 – 1.9 kcal/mol), we were interested in whether a suitable model could be found by applying linear regression algorithms using the full parameter set. Successful development of a robust, correlative model (R2 = 0.79, LOO = 0.72, 5-fold = 0.70) indicated that the reaction type could be described; however, different parameters were necessary. Interestingly, the same Sterimol dimension (B5) used in the nitroalkene model was an effective nucleophile descriptor in this reaction, suggesting activation is similar across these reaction types.
Figure 8.

Regression analysis of nitrophosphonate addition to terminal enones. The LOO cross-validation score is 0.72 and the 5-fold cross-validation score is 0.70.
Mechanistic Discussion
With the predictive nature of the statistical model evaluated, we next turned our attention to the interpretation of the parameters found in the model to elucidate the mechanistic features described. Of the four catalyst parameters incorporated into the nitroalkene model, the NBO(H) was of particular note. Considering the level of variation incorporated into the nucleophile-activating fragment of the HBD catalysts, it was intriguing only one parameter was necessary to describe this critical catalyst component. Additionally, it carries the highest coefficient value among catalyst-based parameters. Univariate trends were explored to rationalize the contribution of the NBO(H) charge. We postulated the descriptor was in part related to the acidity (pKa) value of the HBD, which has been demonstrated to correlate with enantioselectivity in some instances.34 Although no direct correlation was observed between NBO(H) and experimentally determined pKa values (in DMSO) curated from literature reports,18a, 34–35 an excellent correlation was observed between LUMO energies and experimental pKa values (R2 = 0.96) of 15 catalysts included in the training set (Figure 9A).
Figure 9.

A. Linear correlation of experimentally determined pKa (in DMSO) vs. LUMO energies. B. Multivariate linear regression analysis of LUMO energies using catalyst parameters: NBO(H), molecular surface area (MSA) of the full catalyst and NMR(δ) shift of the HBD carbon. C. Visual representation of LUMO orbitals depicting a delocalization trend with corresponding pKa and NBO(H).
Assessment of catalysts in the training set revealed a qualitative correlation between LUMO energies and NBO(H). However, there appeared to be a steric-based classification present, preventing direct comparison. Using the LUMO energy as an output value (in lieu of ΔΔG‡), linear regression analysis of catalyst parameters revealed a correlation (R2 = 0.84) using NBO(H), molecular surface area (MSA) and NMR shift (δ) of the HBD carbon (Figure 9B). This correlation suggests the NBO(H) value is not only providing a partial steric descriptor but also contributing an indirect connection to acidity of the HBD catalysts. A qualitative comparison of the LUMOs (Figure 9C) shows that delocalization across the catalyst changes with acidity. Focusing on the weakest acid catalyst, where no aryl group is present, the LUMO is entirely confined to the HBD portion of the catalyst. As the pKa value decreases, the LUMO becomes delocalized throughout the aryl portion of the catalyst. Furthermore, a reduction in lobe density can be observed about the N–H bond from which the NBO(H) value is derived, resulting in a stronger acid and higher NBO(H) charge value. Of particular note was the squaramide-based catalyst, in which even orbital distribution across the 3,5-bis(trifluoromethyl)phenyl group and HBD is observed but almost no lobe density on the N–H bond is displayed. This distinguishing feature, in comparison to thiourea catalysts, may explain the presence of the NMR parameter in the LUMO model.
Upon further deconstruction of the model and evaluation of the specific roles of substrate-based parameters, a clear trend indicating the beneficial influence of a larger nucleophile on enantioselectivity was observed (Figure 10). Setting an approximate intermediate B5(avg) value of 6.0 (range: 4.5 – 8.6), a divergence is observed in which values >6 represent higher overall ΔΔG‡ and values <6 display lower selectivity. Envisioning the trend conceptually in terms of the transition state of bifunctional activation, this intuitively makes sense as the presence of a larger nucleophile would limit the rotation of the electrophile and only allow access to one face of the prochiral electrophile.
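The size-based divergence described above amounts to splitting the data at a B5(avg) threshold and comparing the two groups. The sketch below assumes a cutoff of 6.0 with a value of exactly 6.0 assigned to the smaller group (the text does not specify this), and the records are made-up placeholders rather than values from the dataset.

```python
import statistics

def split_by_b5(records, threshold=6.0):
    """Group ddG values by whether the nucleophile B5(avg) exceeds the threshold."""
    large = [ddg for b5, ddg in records if b5 > threshold]
    small = [ddg for b5, ddg in records if b5 <= threshold]
    return large, small

# (B5(avg), ddG in kcal/mol) pairs; made-up placeholders, not values from the dataset.
records = [(4.5, 0.8), (5.2, 1.1), (6.4, 1.9), (7.3, 2.2), (8.6, 2.6)]
large, small = split_by_b5(records)
print(statistics.mean(large), statistics.mean(small))
```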
Figure 10.

General nitroalkene model using full dataset (150 reactions) depicting nucleophile size classification (B5(avg)) and respective enantioselectivity.
Finally, to gain a better understanding of the polarizability parameter representing the electrophile in the model, linear regression analysis was used to correlate with other electrophile-based descriptors. A high correlation (R2 = 0.86) was found with the HOMO and two Sterimol terms in reference to the substitution of the alkene (Figure 11). The cooperative effect of the HOMO(electrophile)/LUMO(catalyst) interaction encapsulated in the model can be rationalized by the proposed transition state of bifunctional catalysis, wherein the nitro group acts as an electron donor and the HBD participates as an electron acceptor. Although this mechanistic principle is conceptualized through bifunctional activation in our study, the fundamental interaction can be extended to a broader framework of asymmetric organocatalysis.
Figure 11.

Multivariate linear regression analysis - polarizability parameter of nitroalkene electrophiles (51 substrates) correlated with HOMO energy and Sterimol terms (L, B5).
CONCLUSION
In summary, multivariate linear regression analysis provides a mechanistically driven correlation of diverse bifunctional HBD catalysis using a workflow analogous to that originally developed for chiral phosphoric acid catalysts. This analysis reveals that the complex interactions involved in chiral induction can be accurately quantified by sophisticated parameters, indicating there is a unified mode of activation among general reaction types. This study reinforces the hypothesis that mechanism-specific parameterization of a transition state can be used to integrate varied structural components into a single model, allowing for prediction of new reaction types and novel catalyst scaffolds. Furthermore, the approach enables elucidation of the reaction component contact points required for enantioselectivity through interpretable parameters and provides insight into the synergistic NCIs of bifunctional HBD catalysis. As we learn more of these general connections between catalytic systems, a comprehensive model of asymmetric catalysis may be attainable through utilization of evolving statistical toolsets, computational chemistry and advanced analysis.
ACKNOWLEDGMENT
This research was supported by the NIH (NIGMS R01 GM121383 & R35 GM136271). We are grateful to Dr. Jolene Reid for insightful discussions.
Footnotes
The Supporting Information is available free of charge on the ACS Publications website.
Computational methods and details (PDF)
The authors declare no competing financial interest.
