Author manuscript; available in PMC: 2025 Jul 11.
Published in final edited form as: Chem. 2024 Jun 12;10(7):2074–2088. doi: 10.1016/j.chempr.2024.05.009

Data-science-guided calibration curve prediction of an MLCT-based ee determination assay for chiral amines

James R Howard 1, Julia R Shuluk 1, Arya Bhakare 1, Eric V Anslyn 1,2,*
PMCID: PMC11243635  NIHMSID: NIHMS2004877  PMID: 39006239

SUMMARY

Circular dichroism (CD) based enantiomeric excess (ee) determination assays are optical alternatives to chromatographic ee determination in high-throughput screening (HTS) applications. However, the implementation of these assays requires calibration experiments using enantioenriched materials. We present a data-driven approach that circumvents the need for chiral resolution and calibration experiments for an octahedral Fe(II) complex (1) used for the ee determination of α-chiral primary amines. By computationally parameterizing the imine ligands formed in the assay conditions, a model of the circular dichroism (CD) response of the Fe(II) assembly was developed. Using this model, calibration curves were generated for four analytes and compared to experimentally generated curves. In a single-blind ee determination study, the ee values of unknown samples were determined within 9% mean absolute error, which rivals the error using experimentally generated calibration curves.

Keywords: ee determination, data driven, cheminformatics, circular dichroism, chemosensor

Graphical Abstract


Circular dichroism ee determination assays are HTE-compatible alternatives to traditional chiral resolution techniques. The time-saving potential of these assays is hindered by the creation of calibration curves for each analyte of interest. This work presents a method for predicting these calibration curves based on DFT features of an ee determination assay for chiral amines.

INTRODUCTION

Metal-to-ligand charge-transfer (MLCT) bands are a unique feature of inorganic complexes in which an electron is transferred from the metal center to low lying orbitals of the ligand.1,2 These electronic transitions typically have intense molar absorptivities (ε), which in turn give large CD signals in chiral environments.3 Additionally, the d–ligand orbital transitions can produce long-wavelength and near-infrared absorption bands.4,5 Longer wavelength absorption bands are advantageous for chirality sensing, as the chiral ligands typically present in the HTS of asymmetric reactions can interfere with analysis in the UV region and are typically removed in parallel with chromatography.6 Given the large CD signals present in the visible region, it is no surprise that transition metal complexes are frequently employed in ee determination assays of chiral functional groups such as alcohols,7–9 amines,10,11 and amides.12 A large number of optical ee determination assays are now available, and one can readily identify a sensor that accommodates a desired functional group.6,12–16

Several years ago, our group introduced an octahedral Fe(II) complex (1) that exhibits all of the aforementioned properties (Figure 1A).10 Condensation of a chiral amine (2) with 3-hydroxypyridine-2-carboxaldehyde (3) forms complex 1, which exhibits a pronounced CD couplet at approximately 520 nm. This absorption band is well beyond the absorbance spectra of common optical interference compounds. The assay features a simple mix-and-measure procedure conducive to HTS. Although complex 1 exists in equilibrium with the mono-ligated and di-ligated complexes, super-stoichiometric amounts of chiral amine and aldehyde 3 favor the tri-ligated complex.17 When the incorporated chiral amine is enantiopure, complex 1 is a mixture of two configurational isomers (fac and mer), each of which exists in two helical isomeric forms (Δ and Λ). If the chiral amine is not enantiopure, the number of accessible stereoisomers increases to 24.10 Although the stereoisomerism of 1 is complex, the CD spectrum remains consistent for a large number of amines. Thus, the complex can be readily used for ee determination of the amine.

Figure 1.


Recent work in chiroptical sensing for primary amines. (A) Formation of octahedral Fe(II) complex 1. (B) Reaction of stereodynamic diol with 3-methoxy-2-formylbenzeneboronic acid to form an ortho-iminoboronic acid assembly 4. (C) Data-driven approach to predicting calibration curves using DFT descriptors. (D) Example ee calibration experiment of complex 1 with 1-cyclohexylethan-1-amine.

The intensity of the CD signal of complex 1 depends on the identity of the incorporated amine. Thus, calibration curves must be constructed for each amine before conducting the assay. Chemists often resort to chiral high-performance liquid chromatography (HPLC) to access the enantioenriched material required to create calibration standards of known ee. Because this calibration procedure must be performed for each analyte, the chromatographic conditions for sufficient resolution must also be optimized. Thus, HPLC-dependent calibration curve generation presents a significant bottleneck in optical ee determination workflow.

Ideally, calibration curves could be predicted a priori to circumvent the need for chiral resolution. We recently reported the prediction of calibration curves for a three-component ortho-aminomethylphenylboronic acid assembly for the ee determination of α-chiral primary amines, which was originally introduced by Wolf and co-workers (Figure 1B).18,19 In assembly 4, the twist of the binaphthol subunit gives rise to exciton-coupled CD (ECCD). In our study, quantum chemical descriptors (e.g., Sterimol parameters, partial charges, natural bond orbital populations) were used to describe each analyte, and a multiple linear regression (MLR) model was developed to predict the maximum CD signal observed for enantiopure amines (Figure 1C).20 These computational descriptors can be generated for virtually any chemical structure using density functional theory (DFT). In recent years, the computational chemistry community has delivered increasingly accurate and computationally inexpensive methods, which can be used to generate chemical descriptors for model development. Thus, predictive modeling is becoming increasingly popular for asymmetric catalyst development.21–24

To explore whether computational prediction of calibration curves could be applied more generally, we turned to complex 1. Our prior modeling of assembly 4 addressed ECCD, but complex 1 relies on MLCT bands. We wondered if parameters that describe the electron transfer, along with the structural aspects that influence the distribution of fac/mer and Δ/Λ stereoisomers, could be generated and correlated to the structures of the chiral amines.18 If so, the calibration curve for chiral amines that successfully incorporate into the assembly could be predicted without performing calibration experiments (Figure 1D).

However, as just implied, the compatibility of a particular amine with complex 1 is not guaranteed. Such incompatibility (i.e., the CD-active assembly does not form) would be a problem if 1 were used in an HTS workflow for that amine, because the assay would fail owing to unforeseen reactivity. For this reason, we also explored parameters that preclude some analytes from being analyzed with complex 1, in the hope of better generalizing the assay scope.

Importantly, the exact protocol used for the model building approach to predict the analyte compatibility and the CD response as a function of amine structural parameters is not specific to the present application. As now described, many of the same routines used to model asymmetric catalytic transformations, such as diverse analyte selection, DFT parameterization, and multiple linear regression, are applied to complex 1. In both analytical and synthetic applications, the weights and variables derived from machine learning protocols are specific to the exact system, while the general strategies are similar from analysis to analysis, and from reaction to reaction.

RESULTS AND DISCUSSION

Training Set Construction

Asymmetric catalysis is routinely used in the production and discovery of pharmaceuticals.25,26 In reaction development, a transformation is often conducted on an array of substrates to understand the reaction breadth and limitations, i.e., the “substrate scope”.27 Diverse substrate scopes elucidate how the electronic and steric factors of a substrate influence the reaction yield or enantioselectivity.28 Similarly, the utility and compatibility of ee determination assays should be demonstrated on a myriad of analytes in an analogous “analyte scope”. For ee determination assays, diverse analyte scopes reveal spectroscopic details such as inversion of Cotton effects or incompatibility with off-target functional groups.13

In some instances, reaction substrates that perform well (i.e., give a high ee or yield) with a given method are reported over those that represent the scope and limitations of the reaction.29 In response, a growing theme in reaction development is to seek generality during reaction optimization.30,31 Instead of optimizing conditions for a single model substrate, reactions are optimized against an array of substrates simultaneously.32 This approach increases the robustness of a method.33

Several strategies can be employed to aid in substrate selection for a broad optimization routine. For instance, Hammett substituent constants (σ) can guide substrate inclusion by selecting a range of substituents with either low or high values. This can be effective, but the dependence on an experimentally measured parameter limits its utility as Hammett substituent constants have been measured for only a small number of substituents.34 Similarly, Charton or Taft parameters can be used to guide selection of sterically diverse substrates. One computational alternative is to use molecular fingerprints, which are 2D representations of molecules that indicate the presence of functional groups.29,35,36 By calculating these fingerprints for all possible substrates, a diverse training set can be obtained by clustering molecules based on fingerprint similarity and selecting molecules from representative clusters.

To create a broad analyte set for testing the analyte compatibility of complex 1, we adopted a similar approach, but instead of using molecular fingerprints, we used a large number of low-level descriptors (e.g., number of heteroatoms, calculated partition coefficients, topological descriptors, etc.) commonly used in quantitative structure-activity relationship development.37 The low-level descriptors are informative enough to map the chemical space of our amine library.

To begin, the SMILES strings of 4,370 commercially available α-chiral amines were curated from the eMolecules database. This initial set was then filtered for free-base, primary amines (i.e., no salt derivatives were considered) with molecular weights ≤ 500 Da. Our selection also excluded diamines, because complex 1 would respond to both amino groups and thereby not report an accurate ee value for either. Even if one of the amino groups were not on a stereogenic center, the response for the amine on the stereogenic center would be diminished and would therefore suffer from a limited dynamic range. Lastly, known confounding functional groups (e.g., pyridines) were removed, as we recognized they would interfere with the formation of complex 1.10 After all filtering was complete, the 3D structure of each amine was optimized at the GFN2-xTB level.38 Using Mordred, a molecular descriptor calculation package, we calculated 1,826 low-level descriptors for each structure.37 Any single descriptor could be used to select a substrate scope, akin to the use of Hammett σ parameters. However, a more prudent approach is to select substrates based on all descriptors simultaneously.

To accomplish this, all descriptors with covariance ≥ 0.95 were pruned from the descriptor list by pair-wise regression. The descriptors were then projected into two dimensions by using Uniform Manifold Approximation and Projection (UMAP). UMAP is a data transformation technique that projects high-dimensional data (e.g., the 1,826 dimensions of the Mordred descriptors) into a lower dimensional space for visualization. This method is similar to principal component analysis (PCA), which maximizes the variance of the projected data in the principal components. Either PCA or UMAP could be used for data transformation. However, UMAP allows the use of nonlinear functions to perform the projection whereas PCA is strictly linear.39
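The pruning step described above can be illustrated with a minimal sketch. It assumes that "covariance ≥ 0.95" refers to the absolute Pearson correlation between two descriptor columns, and that the first descriptor of each correlated pair is the one retained; both are assumptions, as are the descriptor names and values, which are hypothetical and not taken from the Mordred output.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / math.sqrt(sxx * syy)

def prune_correlated(descriptors, cutoff=0.95):
    """Keep a descriptor only if |r| with every already-kept one is below cutoff."""
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson_r(values, descriptors[k])) < cutoff for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor table: one list of values (one per amine) per descriptor.
descriptors = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with "a", so it is dropped
    "c": [1.0, -1.0, 1.0, -1.0],  # weakly correlated with "a", so it is kept
}
print(prune_correlated(descriptors))
```

Because the outcome depends on the order in which descriptors are visited, a real workflow would need a documented rule for which member of a correlated pair to keep.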

We first attempted to use PCA for data reduction, but we observed poor separation on the first few principal components. We found UMAP better clustered our data and separated the amines into different “islands” of chemical space. The resulting islands can be clustered by either k-means clustering or through density-based spatial clustering of applications with noise (DBSCAN). When applied to our UMAP-transformed dataset, DBSCAN appropriately identified the boundaries of the clusters (Figures 2A and 2B). For a detailed discussion on analyte selection, see SI section 5.
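To make the clustering step concrete, the following is a minimal, pure-Python sketch of the DBSCAN idea applied to 2D points such as UMAP coordinates. It is not the implementation used in this work; the `eps` and `min_samples` values and the coordinates are hypothetical. Points that belong to no dense region receive the label −1, analogous to the unassigned points in Figure 2.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_samples:
                queue.extend(j_nbrs)  # j is a core point, so keep expanding
    return labels

# Hypothetical 2D embedding: two dense "islands" and one isolated point.
points = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
          (5.0, 5.0), (5.0, 5.1), (5.1, 5.0), (5.1, 5.1),
          (10.0, 0.0)]
print(dbscan(points, eps=0.3, min_samples=3))
```

Unlike k-means, this approach does not require the number of clusters in advance, which is convenient when the number of islands in the projection is not known beforehand.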

Figure 2.


Design of chiral amine training set using multidimensional projections. (A) 2-dimensional UMAP plot of the curated amine dataset. Red points indicate amines in the training set while the remaining colors indicate cluster membership. The points colored black were not assigned to a cluster. (B) 3-dimensional representation of the same UMAP projection.

UMAP analysis produced five distinct clusters of primary amines. Amines were selected from each cluster based on price and the presence of only one stereogenic center, and if a corresponding enantiomer was found, it was also added to the dataset. These newly selected amines were purchased and combined with our own inventory of chiral amines to produce a set of analytes that contained 54 amines (red circles, Figure 2). The amines in our training set are biased toward cluster 3 as it contains many of our in-house amines. Additionally, cluster 2 primarily contains amines with multiple stereocenters that could confound ee determination and are not appropriate for the assay. As shown later, the selected amines span a large dynamic range of CD intensities with few gaps along the range. This demonstrates that the UMAP-guided selection produced a diverse set of amines. Of the 54-amine dataset, 28 were enantiomer pairs (i.e., 14 different structures with both enantiomers, Figure 3A), and the remaining 26 amines were used without their respective enantiomer (Figure 3B).

Figure 3.


Curated training set for model construction. (A) Amines of which both enantiomers were analyzed. (B) Amines of which only a single enantiomer was analyzed.

When we consider the time for data collection, there are two different time periods: incubation and CD measurement. The first is a 4 hour incubation time required to form complex 1, which can be done in parallel for 96 or 384 samples (depending on plate size). This is the time it takes for the chiral amine, aldehyde 3, and Fe(II) complexes to reach equilibrium and exhibit a reproducible CD signal. The second period is for acquisition of the CD spectrum. Ee determination with complex 1 typically utilizes a single wavelength (520 nm). However, at this stage in model development, it was unclear whether this wavelength would be optimal for training. Thus, we recorded the CD spectrum from 400 to 700 nm for each complex, which dramatically increased acquisition time. In practice, once an optimal wavelength is identified, CD measurements at a single wavelength further accelerate ee determination. When using a single wavelength to determine unknown ee values, an entire plate is read in under 5 minutes.

Computational Parameter Generation

The particular representation of the system is critical for accurate parameterization. In the case of metal complexes, it is standard practice to model the species of interest entirely with both the metal center and ligands present.16,21 This provides the most accurate description because intramolecular interactions (e.g., agostic interactions) are apparent when ligands adopt geometries that accurately represent binding modes.40 However, modeling metal complexes (and large systems in general) is computationally expensive. For DFT, the computational complexity scales as O(N³), where N is the number of atoms.41 While advancements in linearly scaling O(N) methods show promise, most current DFT implementations suffer from poor scaling.42 Moreover, transition metals are notoriously difficult to model computationally.43

It is well-known that computational descriptors are conformation dependent.44 For instance, a Sterimol parameter that describes the length (L) of a butyl substituent varies with the number of methylene units in anti or gauche conformations.18 Therefore, computational parameters used for LFERs or MLR models are often calculated for a set of representative conformers and are then weighted according to the Boltzmann distribution.45 The weighted parameters more accurately reflect the “true” value one would measure experimentally, which of course would represent a weighted average of all accessible conformations.
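The Boltzmann weighting described above can be written out in a few lines. This is a minimal sketch, assuming relative conformer energies in kcal/mol and a temperature of 298.15 K; the conformer energies and Sterimol L values are hypothetical and serve only to show the arithmetic.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1

def boltzmann_weights(energies_kcal, temperature=298.15):
    """Normalized Boltzmann populations from conformer energies (kcal/mol)."""
    e_min = min(energies_kcal)  # shift so the lowest conformer has energy zero
    factors = [math.exp(-(e - e_min) / (R_KCAL * temperature))
               for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]

def boltzmann_average(values, energies_kcal, temperature=298.15):
    """Population-weighted average of a descriptor over all conformers."""
    weights = boltzmann_weights(energies_kcal, temperature)
    return sum(w * v for w, v in zip(weights, values))

# Hypothetical conformer ensemble for one imine.
energies = [0.0, 0.6, 1.5]     # relative energies, kcal/mol
sterimol_L = [6.2, 5.4, 4.9]   # Sterimol L of each conformer, in angstroms
print(round(boltzmann_average(sterimol_L, energies), 2))
```

The weighted value is dominated by low-energy conformers, so high-energy geometries with extreme descriptor values contribute little to the final parameter.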

The poor scaling of DFT is magnified for our data-driven approach given the conformation dependence of accurate parameters. For a realistic conformer-rotamer ensemble (CRE), all four stereoisomers of complex 1 (Δ-fac, Δ-mer, Λ-fac, Λ-mer) could be modeled along with their respective conformations. We reasoned that for our HTS applications, expeditious prediction of CD with MLR could be prioritized. Therefore, high accuracy parametrization would only slow the prediction of calibration curves with only a modest increase in accuracy. The logical extreme of this reasoning is to parameterize the free amines alone (Figure 3). However, the difference between complex 1 and the amine is significant, and the relationship between amine descriptors and CD is difficult to predict. Ultimately, we decided to computationally model the imine ligands that would result from the imine condensation of 2a-2an and 3. This intermediate approach would allow calculation of ligand descriptors associated with the imine (e.g., bite angle, imine IR frequencies) without exceedingly time-intensive calculations.

To begin parameterization, each amine from our training set was modeled as the corresponding imine and optimized at the GFN2-xTB level.38 Each of these structures was then subjected to a conformer search with the Conformer–Rotamer Ensemble Sampling Tool (CREST).46 The resulting conformers were optimized further at the B3LYP/6-31G(d,p) level. Subsequent frequency calculations were performed to ensure the optimized geometries were minima (i.e., possessed no imaginary vibrational frequencies). Finally, single point energy (SPE) calculations were performed at the M06-2X/Def2TZVP level.47 Using these energies, a large number of weighted geometric, electronic, and steric descriptors were calculated. These descriptors include Sterimol parameters, which have been employed in MLR models for asymmetric catalysis and ee determination assays, and natural bond orbital (NBO) charges.16 For a full list of the descriptors calculated, see SI section 6.

Model Development

In recent years, complex machine learning (ML) models using artificial neural nets have seen wide success in both theoretical and experimental chemistry.48,49 While powerful, the sharp increase in ML for chemistry applications has led to a large number of “black-box” models, which are difficult to interpret despite their high accuracy.50 Moreover, state-of-the-art ML models built on experimental data show large biases toward frequently used reaction conditions.51 In contrast to ML, multiple linear regression (MLR) is frequently used to develop explanatory models for asymmetric catalysis.52–54 MLR methods benefit from the use of chemical descriptors that are familiar to physical organic chemists (e.g., steric size, polarizability, or electronegativity). This often results in an explanatory and sensible model for chemical insight.51

We opted to use a forward stepwise MLR algorithm to relate as few chemical descriptors as possible to the CD data of our training set. By doing so, model overfitting was avoided.55 Using pair-wise regression, the correlation between all descriptors was calculated. For descriptors that covaried with an R² > 0.5, one descriptor was removed to reduce the number of descriptors in the model development pipeline. The MLR algorithm produced several models relating the descriptors to CD. However, low cross-validation metrics, such as leave-one-out (LOO) and stratified k-fold scores, revealed high model instability. These values are low when the model coefficients vary depending on how the training data is split. Additionally, the CD intensity of many amines was overpredicted (Figure 4). Notably, the CD intensity varies between 10 and nearly 140 mdeg abs⁻¹. The only response range with scarce data points is between 40 and 70 mdeg abs⁻¹, meaning that our training set spans the range of 10 to 140 mdeg abs⁻¹ well, which indicates that the amine dataset is sufficiently diverse.
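As an illustration of forward stepwise MLR with leave-one-out validation, the sketch below fits ordinary least squares models through the normal equations and greedily adds the descriptor that gives the highest LOO Q². Several choices here are assumptions rather than details from this work: Q² is taken as 1 − PRESS/TSS, the selection criterion is LOO Q², the number of terms is fixed in advance, and the descriptor names and data are synthetic.

```python
def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_mlr(X, y):
    """Ordinary least squares with intercept; returns [b0, b1, ..., bp]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coefs, x):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))

def loo_q2(X, y):
    """Leave-one-out Q^2 = 1 - PRESS / total sum of squares."""
    press = 0.0
    for i in range(len(y)):
        coefs = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coefs, X[i])) ** 2
    mean_y = sum(y) / len(y)
    return 1.0 - press / sum((t - mean_y) ** 2 for t in y)

def forward_stepwise(columns, y, max_terms=2):
    """Greedily add the descriptor that maximizes LOO Q^2 at each step."""
    chosen = []
    while len(chosen) < max_terms:
        def score(name):
            names = chosen + [name]
            X = [[columns[n][i] for n in names] for i in range(len(y))]
            return loo_q2(X, y)
        chosen.append(max((n for n in columns if n not in chosen), key=score))
    return chosen

# Synthetic data: y depends on x1 and x2 only; "noise" is an irrelevant descriptor.
columns = {
    "x1": [0.0, 1.0, 0.0, 1.0, 2.0, 1.0, 3.0, 2.0],
    "x2": [0.0, 0.0, 1.0, 1.0, 1.0, 3.0, 0.0, 2.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
y = [2.0 + 3.0 * a - b for a, b in zip(columns["x1"], columns["x2"])]
print(forward_stepwise(columns, y))
```

A Q² that drops sharply relative to the training R² signals the kind of instability described above, where the fitted coefficients depend strongly on which samples are held out.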

Figure 4.


Exemplary model of predicted vs. measured CD featuring a 60:40 train-test split and low regression metrics. Amines at the bottom right (2an, 2d, 2h, etc.) were consistently overpredicted across all models.

Upon consideration of the data, we hypothesized that the CD signal of the amines that were overpredicted resulted from incomplete formation of complex 1. This hypothesis was supported by a profound difference in color between complexes that gave large CD signals (purple) and those that gave low CD signals (orange-red). Additionally, many of the absorbance spectra of the low CD complexes that were overpredicted by the model have vastly different λmax values.

Many factors influence coordination of ligands to metals, including metal identity, oxidation state, ionic radius, and the steric nature of the ligand(s).56–59 Because the only difference between the CD-active complexes is the identity of the incorporated amine, ligand descriptors could provide insight into the poor initial model performance. Recently, Doyle and co-workers observed diminished reactivity when investigating Buchwald-type phosphines in nickel-catalyzed cross coupling reactions.21 The authors found that the reduced activity of some phosphine ligands is due to the lack of formation of a key phosphine-ligated Ni(0) complex. When investigating the relationship between the Boltzmann-weighted percent buried volume (Vbur) and complex formation, they identified a threshold above which complex formation did not occur. We hypothesized that a similar effect may influence the formation of complex 1.

Ideally, the Vbur of the imine ligands for complex 1 could be derived from modeling the ortho-iminopyridine-ligated Fe(II) species. Because this is a geometric parameter, computationally inexpensive DFT functionals could be used to expedite the calculation of this descriptor.60 To retain our established workflow, we opted to measure Vbur of the imines used to form 1a-1an, which may adequately describe the steric nature of the ligand without modeling the ligated Fe(II) center. To accomplish this, our workflow for the initial dataset was augmented by constraining the NCCN torsion angle between the imine and pyridine nitrogen atoms to 0°. When performed for all steps (CREST, optimization, SPE), this procedure generated conformers that better represent the geometry of the imine bound to the Fe(II) center. We then modified Paton’s DBStep, a program used to calculate steric descriptors, to enable Vbur measurements at arbitrary points between atoms (see SI section 6 for additional details). Finally, by measuring Vbur between the two chelating nitrogen atoms, we generated Vbur-bind, a proxy value for Vbur measured at the metal center. Additionally, by changing the size of the sphere of measurement for Vbur, the proximal and distal steric influences could be separated.
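The idea of a buried volume measured at a point between two atoms can be sketched with a simple grid count. The code below is not DBStep and does not reproduce its algorithm or settings; it is a minimal illustration that places a sphere at the midpoint of two nitrogen positions and reports the percentage of grid points inside that sphere that also fall within any atomic sphere. The coordinates and atomic radii are hypothetical.

```python
import math

def midpoint(a, b):
    """Midpoint between two 3D points, e.g. the two chelating nitrogen atoms."""
    return tuple((u + v) / 2 for u, v in zip(a, b))

def percent_buried_volume(center, atoms, sphere_radius=3.5, step=0.25):
    """Percent of a sphere around `center` occupied by atomic spheres.

    atoms: list of ((x, y, z), radius) tuples, all distances in angstroms.
    A cubic grid is used; smaller `step` values give a better estimate.
    """
    n = int(math.ceil(sphere_radius / step))
    inside = buried = 0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                p = (center[0] + i * step,
                     center[1] + j * step,
                     center[2] + k * step)
                if math.dist(p, center) > sphere_radius:
                    continue  # grid point lies outside the measurement sphere
                inside += 1
                if any(math.dist(p, xyz) <= r for xyz, r in atoms):
                    buried += 1
    return 100.0 * buried / inside

# Hypothetical fragment: two nitrogen atoms and one nearby carbon atom.
n_imine = (0.0, 0.0, 0.0)
n_pyridine = (2.6, 0.0, 0.0)
atoms = [(n_imine, 1.55), (n_pyridine, 1.55), ((1.3, 1.2, 0.0), 1.70)]
center = midpoint(n_imine, n_pyridine)
print(round(percent_buried_volume(center, atoms, sphere_radius=5.0), 1))
```

Changing `sphere_radius` alters how much of the distal ligand structure is counted, which is the handle used above to separate proximal from distal steric effects.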

When Vbur-bind is plotted against CD, a distinct break between high and low CD complexes is observed (Figure 5A), akin to the reactivity cliffs seen in catalytic reaction development.21 Threshold analysis, a technique based on decision tree classifiers, was performed on Vbur-bind values using different sphere radii.61 When measured at 5 Å, amines with Vbur-bind values greater than 31.8% produced low CD signals. Interestingly, amines above this threshold still produced a moderate CD signal, but the spectral features (e.g., λmax and couplet zero-crossing) are significantly different from those below the threshold value.
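Threshold analysis of this kind amounts to fitting a depth-one decision tree. The sketch below scans candidate cut points on a single descriptor and keeps the one that best separates two classes. It is an assumption-laden illustration: the class labels presume that a "low response" cutoff has already been chosen, and the descriptor values are hypothetical rather than measured Vbur-bind data.

```python
def best_threshold(x, is_low):
    """Find the cut on x that best separates two classes (depth-1 decision tree).

    x: descriptor values; is_low: True where the measured response is low.
    Returns (threshold, accuracy) for the rule "x > threshold -> low response".
    """
    ordered = sorted(set(x))
    # Candidate cuts are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best = (None, 0.0)
    for t in candidates:
        correct = sum((xi > t) == low for xi, low in zip(x, is_low))
        accuracy = correct / len(x)
        if accuracy > best[1]:
            best = (t, accuracy)
    return best

# Hypothetical descriptor values (%) and response classes.
vbur = [26.0, 28.0, 30.0, 34.0, 36.0, 38.0]
low_cd = [False, False, False, True, True, True]
print(best_threshold(vbur, low_cd))
```

Once a threshold is fixed, the same comparison can serve as a quick compatibility check for a new analyte before any regression model is applied.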

Figure 5.


Descriptor-based refinement of training sets. (A) Visualization of buried volume centered between the two nitrogen atoms of an imine. (B) Threshold analysis of CD vs. Vbur-bind measured with a 5 Å sphere radius.

Thus, amines with Vbur-bind values above the threshold were not expected to form the CD-active complex 1 to an appreciable extent. To address this in model construction, we attempted to add a buried volume classifier parameter to the model that would indicate whether a given imine was above or below the threshold (i.e., those above the threshold are assigned a value of 1; those below the threshold are assigned a value of −1). This approach improved model evaluation metrics. However, the coefficient of the %Vbur classifier varied substantially depending on the train-test split. In practice, this threshold value represents an upper bound above which complex 1 is not compatible with the target analyte. In an HTS campaign, Vbur-bind could be used to determine whether complex 1 is suitable for the amine of interest. Therefore, we opted to exclude amines exceeding the Vbur-bind threshold of 31.8% from model training, as it was clear that these amines could not be included in a successful model.

After threshold analysis, we used the same forward stepwise linear regression algorithm to generate a second model of the CD response of the remaining 25 amines (Figure 6). The instability of the previous model (Figure 4) was greatly improved. The high LOO and 4-fold cross-validation metrics demonstrated the model’s robustness, and the high R² values for both the training and test splits show excellent correlation between predicted and measured CD. The two-parameter model features ΔL, which is the difference in substituent (R1 and R2, Figure 6) length represented by the Sterimol L parameter. Although ΔL gave the best correlation, other steric parameters, such as the difference in Vbur measured at the first atom of the stereogenic center substituents, also yielded a robust model. In other ee determination assays, larger differences in steric size between the two substituents give rise to larger ellipticities.15,18,62 Thus, amines with disparately sized substituents (e.g., 2b, 2k, and 2f) exhibited some of the largest CD intensities of our amine dataset. Conversely, amines with similarly sized substituents such as 2ac produced weaker signals. The second model parameter is the C–N stretch frequency of the imine (vCN). IR vibrational frequencies have been employed in several MLR models. For instance, the C–O stretch frequencies of acetophenones correlate strongly with Hammett σ parameters,63 and several other applications of IR frequencies in MLR have been reported.64 Despite this strong correlation, it is difficult to interpret the effect this parameter represents because vibrational frequencies depend on both electronic and steric characteristics. We attempted to regress the IR frequencies against other descriptors in our dataset to substitute vCN with more explanatory chemical descriptors. However, even after enumerating all possible combinations of descriptors, only weak correlations (R² ≈ 0.6) were found.
MLR models with explanatory parameters are desirable as they allow one to tune substrates for a desired experimental outcome (i.e., modifying ligand sterics to affect reaction outcome). Although it would be useful to gain chemical insight from vCN, this is not necessary for predicting calibration curves because we are not attempting to maximize CD intensity, but only to predict its magnitude.

Figure 6.


Second generation MLR model relating descriptors to CD featuring a 60:40 train-test split, and a graphical representation of the vCN and ΔL parameters used in the model.

Having a method in hand to alert researchers of amines that would be predicted to be incompatible with the Fe(II) assembly (i.e., the Vbur-bind parameter), and having a multivariate model that predicts CD values for chiral amines that function well in the assembly, we set out to test several aspects of our prediction techniques. First, given that the model is agnostic to the Vbur-bind parameter, we can predict the CD value of amines even if they exceed the threshold value. This would give us insight into whether these amines would hypothetically fall in the dynamic range of calculated CD values for the amines we used in the model. Second, we sought to explore if the amines used in the model properly spanned the dynamic range of CD values obtainable with the Fe(II) assembly (i.e., evaluate whether our training set of amines was appropriate for model creation).

To explore how such amines perform in our model, we chose 24 additional amines from the UMAP analysis, including a set that exceeded the Vbur-bind threshold and several that met the threshold, and overlaid these amines in the predicted/measured CD plot with those amines used to generate the model (see Figure S82). While some of the amines that exceed the Vbur-bind threshold gave predicted CD values within the group of training set amines, most of these amines were predicted to have CD values significantly higher (between 170 and 220) than the amines in the initial training and test sets. Thus, while the substituents on the point stereocenter of these amines are too large to be accepted into the Fe(II) assembly, they would otherwise have led to large experimental CD values. Consequently, the dynamic range of the amines that work in the assay is effectively capped on the high end by those that do not function in the assembly. This suggests that we have adequately sampled the dynamic range of amines that the Fe(II) assembly can accommodate. Hence, we are confident that the training set properly captures the expected dynamic range of amines that would function in our assay.

It is important to address the computational workload that MLR model development requires. Because the ultimate terms of the model are unknown at the beginning of model development, a wide array of descriptors are calculated to evaluate many possible models. Once a suitable model is identified, predictions of out-of-sample amines would be accelerated by only calculating descriptors that are found in the model. For instance, the average time for optimization and frequency calculation for the training set (Figure 3) was 13 minutes.

Single-blind ee Determination Study

To demonstrate the error associated with the model in a hypothetical HTS campaign, we conducted a single-blind ee determination study. One researcher (JRS) prepared solutions of various ee of four different enantiomer pairs (2a, 2e, 2f, and 2k). These values were unknown to the second researcher (JRH) who constructed calibration curves from both the predicted CDmax values (Figure 6) and a typical calibration experiment. The unknown solutions were incorporated into complex 1 and the CD was measured and compared to both calibration curves (Figure 7).

Figure 7.


Experimental (blue) and predicted (red) calibration curves for four enantiomer pairs 2a, 2e, 2f, and 2k. Ee values of the blinded samples are plotted in yellow and purple for the experimental and predicted calibration curves, respectively.

The experimental calibration curves gave a mean absolute error (MAE) of 4.5%, whereas the predicted calibration curves gave an MAE of 8.8%. The magnitudes of both errors are on the order of those reported for other chiroptical methods.17 Because the two calibration curves cross near 0 on both axes (i.e., they differ primarily by slope), the error is expected to be larger at high ee values, where the two lines diverge (Table 1). Thus, the additional error from using a predicted calibration curve depends on the ee of the sample measured.
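The slope argument can be illustrated with a minimal sketch. It assumes an idealized linear calibration that passes exactly through the origin, CD = CDmax × ee / 100, which simplifies the observation that the curves cross near 0. The two CDmax values and all function names are hypothetical and are not taken from the study.

```python
def ee_from_cd(cd_signal, cd_max):
    """Invert an assumed zero-intercept linear calibration: CD = cd_max * ee / 100."""
    return 100.0 * cd_signal / cd_max


# Hypothetical slopes (CD at 100% ee); illustrative only, not values from the study.
CD_MAX_EXPERIMENTAL = 100.0
CD_MAX_PREDICTED = 80.0


def added_error(true_ee):
    """Extra error (in % ee) from reading a sample off the predicted curve."""
    # Signal the sample would give if the experimental curve were exact.
    cd_signal = CD_MAX_EXPERIMENTAL * true_ee / 100.0
    return abs(ee_from_cd(cd_signal, CD_MAX_PREDICTED) - true_ee)


for ee in (0.0, 20.0, 50.0, 100.0):
    print(f"true ee = {ee:5.1f}%  added error = {added_error(ee):.1f}% ee")
```

Under these assumptions the added error is proportional to |ee|: it vanishes for a racemic sample and is largest for enantiopure samples, which matches the trend described above.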

Table 1.

Results from the single-blind ee determination study

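                 Trial 1                 Trial 2                 Trial 3
Analyte   actual    exp.   pred.  actual    exp.   pred.  actual    exp.   pred.
2a          51.0    39.5    68.6   −29.0   −37.0   −41.1   −83.0   −84.8  −109.8
2e          70.0    55.3    70.2    15.0     5.6     9.8  −100.0   −99.6  −118.2
2f          36.0    33.8    49.7     9.0     7.3    13.5   −21.0   −21.9   −26.5
2k          20.0    21.0    20.6    −2.0    −2.6    −2.8    45.0    46.8    45.9

All values are ee in %. actual, ee of the prepared sample; exp., ee read from the experimental calibration curve; pred., ee read from the predicted calibration curve.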
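As a check on the reported errors, the MAE values can be recomputed from the entries in Table 1. The sketch below assumes the MAE is the unweighted mean of the absolute errors over all twelve blinded samples; the text does not state the averaging scheme, and the variable and function names are illustrative.

```python
# Table 1 entries in % ee: (actual, experimental-curve, predicted-curve) per trial.
TABLE_1 = {
    "2a": [(51.0, 39.5, 68.6), (-29.0, -37.0, -41.1), (-83.0, -84.8, -109.8)],
    "2e": [(70.0, 55.3, 70.2), (15.0, 5.6, 9.8), (-100.0, -99.6, -118.2)],
    "2f": [(36.0, 33.8, 49.7), (9.0, 7.3, 13.5), (-21.0, -21.9, -26.5)],
    "2k": [(20.0, 21.0, 20.6), (-2.0, -2.6, -2.8), (45.0, 46.8, 45.9)],
}


def mean_absolute_error(pairs):
    """Unweighted mean of |measured - actual| over (actual, measured) pairs."""
    pairs = list(pairs)
    return sum(abs(measured - actual) for actual, measured in pairs) / len(pairs)


# Pool all analytes and trials (assumption: every sample is weighted equally).
rows = [row for trials in TABLE_1.values() for row in trials]
mae_experimental = mean_absolute_error((actual, exp) for actual, exp, _ in rows)
mae_predicted = mean_absolute_error((actual, pred) for actual, _, pred in rows)

print(f"MAE, experimental curves: {mae_experimental:.1f}% ee")  # 4.5
print(f"MAE, predicted curves:    {mae_predicted:.1f}% ee")     # 8.8
```

With equal weighting, the tabulated values reproduce the MAEs of 4.5% and 8.8% quoted in the text.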

Considering the differences between experimental and predicted calibration curves, we reflected on the minimal computational approach we used for generating the regression variables. Although parameterization of the ligated Fe(II) center may have led to less scatter, it is not likely to have significantly changed the weights applied to ΔL or vCN. Most importantly, the predicted and experimental calibration curves give similar errors in the final analysis of ee values of unknown mixtures of 2a, 2e, 2f, and 2k.

CONCLUSION

Theoretical calibration curves for an octahedral Fe(II) assembly-based ee determination assay were generated. By using a data-driven approach, a diverse analyte scope was designed to explore the limits of compatible analytes. We identified a buried volume (Vbur-bind) threshold above which an amine is not successfully incorporated into complex 1. Further computational parameterization of each analyte allowed us to develop a two-parameter model to predict the CD of complex 1 at 100% ee. Our interpretation of the model suggests that the difference in steric size of the two substituents at the point stereogenic center is the dominant factor contributing to the CD of the sensor. We then demonstrated through a single-blind ee determination study that the experimentally generated and predicted calibration curves give similar MAEs. Most importantly, these findings demonstrate the utility of MLR modelling for analytical applications. The substrate selection and regression workflow, which has had major success in synthetic method optimization, can be adapted for sensor development. Additionally, we have demonstrated these techniques for MLCT-based ee determination assays, which are well suited to an HTE workflow. While the two-parameter model described here is specific to complex 1 and chiral amines, we believe the general strategy of using DFT parameterization and multiple linear regression is broadly applicable to analytical chemistry endeavors, as it has been found to be for synthetic chemistry endeavors.

EXPERIMENTAL PROCEDURES

Resource availability

Lead contact

Further information and requests for resources should be directed to and will be fulfilled by the lead contact, Professor Eric V. Anslyn (anslyn@austin.utexas.edu).

Materials availability

This study did not generate new materials.

Data and code availability

The code associated with this work is available in the supplemental information.

Supplementary Material

Data S1. Spreadsheet containing DFT descriptors and SMILES strings for the original amine dataset.

Document S1. Experimental procedures for ee determination assays, circular dichroism spectra, and calculation details.

Highlights.

Data science-guided analyte selection

Identification of incompatible analytes with threshold analysis

Bigger Picture.

Rapid ee determination is critical for the development of asymmetric reactions. Although chiral HPLC is ubiquitous throughout both academia and industry for determining ee, the serial nature of chromatography presents a major bottleneck in the reaction development pipeline. Optical ee determination assays like the one presented in this work offer a well-plate-compatible alternative to chiral HPLC. Using these methods, thousands of reactions can be screened in parallel.

By predicting calibration curves of ee determination assays, the implementation of these chiroptical techniques is further accelerated. Typically, chiroptical assays require enantioenriched analytical standards and thus chiral HPLC. Using this data-driven approach, no chiral chromatography is required to create calibration curves for new analytes. The reported advance in data-driven chemistry for analytical assays complements the data-driven approaches currently used in synthetic method development.

ACKNOWLEDGMENTS

The authors thank the National Institutes of Health for financial support for this work (5R01GM077437-12 and 1 R35 GM149308-01). The authors also acknowledge NIH Grant 1 S10 OD021508-1 (2015) for funding the 500 MHz NMR spectrometer upon which the NMR spectra were collected. The authors acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing high performance computing resources that have contributed to the research results reported within this paper. E.V.A. gratefully acknowledges support from the Welch Regents Chair (F-0046).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

DECLARATION OF INTERESTS

The authors declare no competing interests.

REFERENCES

  • 1. Solomon EI, and Lever ABP (2006). Inorganic Electronic Structure and Spectroscopy (Wiley-Interscience: Hoboken, NJ).
  • 2. Turro NJ, Ramamurthy V, and Scaiano JC (2009). Principles of Molecular Photochemistry: An Introduction (University Science Books: Sausalito, CA).
  • 3. Housecroft CE, and Sharpe AG (2012). Inorganic Chemistry, 4th ed. (Pearson: Harlow, England; New York).
  • 4. Mews NM, Berkefeld A, Hörner G, and Schubert H (2017). Controlling Near-Infrared Chromophore Electronic Properties through Metal–Ligand Orbital Alignment. J. Am. Chem. Soc, 139 (7), 2808–2815. 10.1021/jacs.6b13085.
  • 5. Kaim W (2011). Concepts for Metal Complex Chromophores Absorbing in the near Infrared. Coordination Chemistry Reviews, 255 (21–22), 2503–2513. 10.1016/j.ccr.2011.01.014.
  • 6. Thanzeel FY, Balaraman K, and Wolf C (2018). Click Chemistry Enables Quantitative Chiroptical Sensing of Chiral Compounds in Protic Media and Complex Mixtures. Nat Commun, 9 (1), 5323. 10.1038/s41467-018-07695-9.
  • 7. Dillon J, and Nakanishi K (1975). Absolute Configurational Studies of Vicinal Glycols and Amino Alcohols. I. With Bis(Acetylacetonato)Nickel. J. Am. Chem. Soc, 97 (19), 5409–5417. 10.1021/ja00852a015.
  • 8. Dillon J, and Nakanishi K (1975). Absolute Configurational Studies of Vicinal Glycols and Amino Alcohols. II. With Tris(Dipivalomethanato)Praseodymium. J. Am. Chem. Soc, 97 (19), 5417–5422. 10.1021/ja00852a016.
  • 9. Dillon J, and Nakanishi K (1974). Use of Copper Hexafluoroacetylacetonate for the Determination of the Absolute Configuration of Alcohols. J. Am. Chem. Soc, 96 (12), 4055–4057. 10.1021/ja00819a076.
  • 10. Dragna JM, Pescitelli G, Tran L, Lynch VM, Anslyn EV, and Di Bari L (2012). In Situ Assembly of Octahedral Fe(II) Complexes for the Enantiomeric Excess Determination of Chiral Amines Using Circular Dichroism Spectroscopy. J. Am. Chem. Soc, 134 (9), 4398–4407. 10.1021/ja211768v.
  • 11. Nieto S, Dragna J, and Anslyn EA (2010). Facile Circular Dichroism Protocol for Rapid Determination of Enantiomeric Excess and Concentration of Chiral Primary Amines. Chem. Eur. J, 16 (1), 227–232. 10.1002/chem.200902650.
  • 12. Zardi P, Wurst K, Licini G, and Zonta C (2017). Concentration-Independent Stereodynamic g-Probe for Chiroptical Enantiomeric Excess Determination. J. Am. Chem. Soc, 139 (44), 15616–15619. 10.1021/jacs.7b09469.
  • 13. Superchi S, Bisaccia R, Casarini D, Laurita A, and Rosini C (2006). Flexible Biphenyl Chromophore as a Circular Dichroism Probe for Assignment of the Absolute Configuration of Carboxylic Acids. J. Am. Chem. Soc, 128 (21), 6893–6902. 10.1021/ja058552a.
  • 14. Minus MB, Featherston AL, Choi S, King SC, Miller SJ, and Anslyn EV (2019). Reengineering a Reversible Covalent-Bonding Assembly to Optically Detect Ee in β-Chiral Primary Alcohols. Chem, 5 (12), 3196–3206. 10.1016/j.chempr.2019.10.003.
  • 15. You L, Pescitelli G, Anslyn EV, and Di Bari L (2012). An Exciton-Coupled Circular Dichroism Protocol for the Determination of Identity, Chirality, and Enantiomeric Excess of Chiral Secondary Alcohols. J. Am. Chem. Soc, 134 (16), 7117–7125. 10.1021/ja301252h.
  • 16. Dotson JJ, van Dijk L, Timmerman JC, Grosslight S, Walroth RC, Gosselin F, Püntener K, Mack KA, and Sigman MS (2023). Data-Driven Multi-Objective Optimization Tactics for Catalytic Asymmetric Reactions Using Bisphosphine Ligands. J. Am. Chem. Soc, 145 (1), 110–121. 10.1021/jacs.2c08513.
  • 17. Moor SR, Howard JR, Herrera BT, and Anslyn EV (2021). High-Throughput Screening of α-Chiral-Primary Amines to Determine Yield and Enantiomeric Excess. Tetrahedron, 94, 132315. 10.1016/j.tet.2021.132315.
  • 18. Howard JR, Bhakare A, Akhtar Z, Wolf C, and Anslyn EV (2022). Data-Driven Prediction of Circular Dichroism-Based Calibration Curves for the Rapid Screening of Chiral Primary Amine Enantiomeric Excess Values. J. Am. Chem. Soc, 144 (37), 17269–17276. 10.1021/jacs.2c08127.
  • 19. Bentley KW, Nam YG, Murphy JM, and Wolf C (2013). Chirality Sensing of Amines, Diamines, Amino Acids, Amino Alcohols, and α-Hydroxy Acids with a Single Probe. J. Am. Chem. Soc, 135 (48), 18052–18055. 10.1021/ja410428b.
  • 20. Karelson M, Lobanov VS, and Katritzky AR (1996). Quantum-Chemical Descriptors in QSAR/QSPR Studies. Chem. Rev, 96 (3), 1027–1044. 10.1021/cr950202r.
  • 21. Newman-Stonebraker SH, Wang JY, Jeffrey PD, and Doyle AG (2022). Structure–Reactivity Relationships of Buchwald-Type Phosphines in Nickel-Catalyzed Cross-Couplings. J. Am. Chem. Soc, 144 (42), 19635–19648. 10.1021/jacs.2c09840.
  • 22. Ahneman DT, Estrada JG, Lin S, Dreher SD, and Doyle AG (2018). Predicting Reaction Performance in C–N Cross-Coupling Using Machine Learning. Science, 360 (6385), 186–190. 10.1126/science.aar5169.
  • 23. Shields BJ, Stevens J, Li J, Parasram M, Damani F, Alvarado JIM, Janey JM, Adams RP, and Doyle AG (2021). Bayesian Reaction Optimization as a Tool for Chemical Synthesis. Nature, 590 (7844), 89–96. 10.1038/s41586-021-03213-y.
  • 24. Sigman MS, Harper KC, Bess EN, and Milo A (2016). The Development of Multidimensional Analysis Tools for Asymmetric Catalysis and Beyond. Acc. Chem. Res, 49 (6), 1292–1301. 10.1021/acs.accounts.6b00194.
  • 25. Krska SW, DiRocco DA, Dreher SD, and Shevlin M (2017). The Evolution of Chemical High-Throughput Experimentation To Address Challenging Problems in Pharmaceutical Synthesis. Acc. Chem. Res, 50 (12), 2976–2985. 10.1021/acs.accounts.7b00428.
  • 26. Farina V, Reeves JT, Senanayake CH, and Song JJ (2006). Asymmetric Synthesis of Active Pharmaceutical Ingredients. Chem. Rev, 106 (7), 2734–2793. 10.1021/cr040700c.
  • 27. Das S, Zhu C, Demirbas D, Bill E, De CK, and List B (2023). Asymmetric Counteranion-Directed Photoredox Catalysis. Science, 379 (6631), 494–499. 10.1126/science.ade8190.
  • 28. Xiang M, Pfaffinger DE, Ortiz E, Brito GA, and Krische MJ (2021). Enantioselective Ruthenium-BINAP-Catalyzed Carbonyl Reductive Coupling of Alkoxyallenes: Convergent Construction of Syn-Sec,Tert-Diols via (Z)-σ-Allylmetal Intermediates. J. Am. Chem. Soc, 143 (23), 8849–8854. 10.1021/jacs.1c03480.
  • 29. Gensch T, and Glorius F (2016). The Straight Dope on the Scope of Chemical Reactions. Science, 352 (6283), 294–295. 10.1126/science.aaf3539.
  • 30. Wagen CC, McMinn SE, Kwan EE, and Jacobsen EN (2022). Screening for Generality in Asymmetric Catalysis. Nature, 610 (7933), 680–686. 10.1038/s41586-022-05263-2.
  • 31. Schmidt-Dannert C, and Arnold FH (1999). Directed Evolution of Industrial Enzymes. Trends in Biotechnology, 17 (4), 135–136. 10.1016/S0167-7799(98)01283-9.
  • 32. Satyanarayana T, and Kagan HB (2005). The Multi-Substrate Screening of Asymmetric Catalysts. Advanced Synthesis & Catalysis, 347 (6), 737–748. 10.1002/adsc.200505057.
  • 33. Collins KD, and Glorius F (2013). A Robustness Screen for the Rapid Assessment of Chemical Reactions. Nature Chem, 5 (7), 597–601. 10.1038/nchem.1669.
  • 34. Brown HC, and Okamoto Y (1958). Electrophilic Substituent Constants. J. Am. Chem. Soc, 80 (18), 4979–4987. 10.1021/ja01551a055.
  • 35. Scior T, Bender A, Tresadern G, Medina-Franco JL, Martínez-Mayorga K, Langer T, Cuanalo-Contreras K, and Agrafiotis DK (2012). Recognizing Pitfalls in Virtual Screening: A Critical Review. J. Chem. Inf. Model, 52 (4), 867–881. 10.1021/ci200528d.
  • 36. Willett P (2006). Similarity-Based Virtual Screening Using 2D Fingerprints. Drug Discovery Today, 11 (23–24), 1046–1053. 10.1016/j.drudis.2006.10.005.
  • 37. Moriwaki H, Tian Y-S, Kawashita N, and Takagi T (2018). Mordred: A Molecular Descriptor Calculator. J. Cheminform, 10 (1), 4. 10.1186/s13321-018-0258-y.
  • 38. Bannwarth C, Ehlert S, and Grimme S (2019). GFN2-XTB—An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions. J. Chem. Theory Comput, 15 (3), 1652–1671. 10.1021/acs.jctc.8b01176.
  • 39. Jolliffe IT, and Cadima J (2016). Principal Component Analysis: A Review and Recent Developments. Phil. Trans. R. Soc. A, 374. 10.1098/rsta.2015.0202.
  • 40. Spicher S, Plett C, Pracht P, Hansen A, and Grimme S (2022). Automated Molecular Cluster Growing for Explicit Solvation by Efficient Force Field and Tight Binding Methods. J. Chem. Theory Comput, 18 (5), 3174–3189. 10.1021/acs.jctc.2c00239.
  • 41. Müller T, Sharma S, Gross EKU, and Dewhurst JK (2020). Extending Solid-State Calculations to Ultra-Long-Range Length Scales. Phys. Rev. Lett, 125 (25), 256402. 10.1103/PhysRevLett.125.256402.
  • 42. Goedecker S (1999). Linear Scaling Electronic Structure Methods. Rev. Mod. Phys, 71 (4), 1085–1123. 10.1103/RevModPhys.71.1085.
  • 43. Yang Y, Weaver MN, and Merz KM (2009). Assessment of the “6–31+G** + LANL2DZ” Mixed Basis Set Coupled with Density Functional Theory Methods and the Effective Core Potential: Prediction of Heats of Formation and Ionization Potentials for First-Row-Transition-Metal Complexes. J. Phys. Chem. A, 113 (36), 9843–9851. 10.1021/jp807643p.
  • 44. Brethomé AV, Fletcher SP, and Paton RS (2019). Conformational Effects on Physical-Organic Descriptors: The Case of Sterimol Steric Parameters. ACS Catal, 9 (3), 2313–2323. 10.1021/acscatal.8b04043.
  • 45. Gensch T, dos Passos Gomes G, Friederich P, Peters E, Gaudin T, Pollice R, Jorner K, Nigam A, Lindner-D’Addario M, Sigman MS, and Aspuru-Guzik A (2022). A Comprehensive Discovery Platform for Organophosphorus Ligands for Catalysis. J. Am. Chem. Soc, 144 (3), 1205–1217. 10.1021/jacs.1c09718.
  • 46. Grimme S (2019). Exploration of Chemical Compound, Conformer, and Reaction Space with Meta-Dynamics Simulations Based on Tight-Binding Quantum Chemical Calculations. J. Chem. Theory Comput, 15 (5), 2847–2862. 10.1021/acs.jctc.9b00143.
  • 47. Zhao Y, and Truhlar DG (2008). The M06 Suite of Density Functionals for Main Group Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States, and Transition Elements: Two New Functionals and Systematic Testing of Four M06-Class Functionals and 12 Other Functionals. Theor Chem Account, 120 (1–3), 215–241. 10.1007/s00214-007-0310-x.
  • 48. Kulichenko M, Smith JS, Nebgen B, Li YW, Fedik N, Boldyrev AI, Lubbers N, Barros K, and Tretiak S (2021). The Rise of Neural Networks for Materials and Chemical Dynamics. J. Phys. Chem. Lett, 12 (26), 6227–6243. 10.1021/acs.jpclett.1c01357.
  • 49. Hung T-H, Xu Z-X, Kang D-Y, and Lin L-C (2022). Chemistry-Encoded Convolutional Neural Networks for Predicting Gaseous Adsorption in Porous Materials. J. Phys. Chem. C, 126 (5), 2813–2822. 10.1021/acs.jpcc.1c09649.
  • 50. Goh GB, Hodas NO, and Vishnu A (2017). Deep Learning for Computational Chemistry. J. Comput. Chem, 38 (16), 1291–1307. 10.1002/jcc.24764.
  • 51. Beker W, Roszak R, Wołos A, Angello NH, Rathore V, Burke MD, and Grzybowski BA (2022). Machine Learning May Sometimes Simply Capture Literature Popularity Trends: A Case Study of Heterocyclic Suzuki–Miyaura Coupling. J. Am. Chem. Soc, 144 (11), 4819–4827. 10.1021/jacs.1c12005.
  • 52. Reid JP, Proctor RSJ, Sigman MS, and Phipps RJ (2019). Predictive Multivariate Linear Regression Analysis Guides Successful Catalytic Enantioselective Minisci Reactions of Diazines. J. Am. Chem. Soc, 141 (48), 19178–19185. 10.1021/jacs.9b11658.
  • 53. Santiago CB, Guo J-Y, and Sigman MS (2018). Predictive and Mechanistic Multivariate Linear Regression Models for Reaction Development. Chem. Sci, 9 (9), 2398–2412. 10.1039/C7SC04679K.
  • 54. Williams WL, Zeng L, Gensch T, Sigman MS, Doyle AG, and Anslyn EV (2021). The Evolution of Data-Driven Modeling in Organic Chemistry. ACS Cent. Sci, 7 (10), 1622–1637. 10.1021/acscentsci.1c00535.
  • 55. Gowen AA, Downey G, Esquerre C, and O’Donnell CP (2011). Preventing Over-Fitting in PLS Calibration Models of near-Infrared (NIR) Spectroscopy Data Using Regression Coefficients. J. Chemometrics, 25 (7), 375–381. 10.1002/cem.1349.
  • 56. Tolman CA (1970). Phosphorus Ligand Exchange Equilibriums on Zerovalent Nickel. Dominant Role for Steric Effects. J. Am. Chem. Soc, 92 (10), 2956–2965. 10.1021/ja00713a007.
  • 57. Tolman CA, Seidel WC, and Gosser LW (1974). Formation of Three-Coordinate Nickel(0) Complexes by Phosphorus Ligand Dissociation from NiL4. J. Am. Chem. Soc, 96 (1), 53–60. 10.1021/ja00808a009.
  • 58. Tolman CA (1977). Steric Effects of Phosphorus Ligands in Organometallic Chemistry and Homogeneous Catalysis. Chem. Rev, 77 (3), 313–348. 10.1021/cr60307a002.
  • 59. Haas KL, and Franz KJ (2009). Application of Metal Coordination Chemistry To Explore and Manipulate Cell Biology. Chem. Rev, 109 (10), 4921–4960. 10.1021/cr900134a.
  • 60. Weymuth T, Couzijn EPA, Chen P, and Reiher M (2014). New Benchmark Set of Transition-Metal Coordination Reactions for the Assessment of Density Functionals. J. Chem. Theory Comput, 10 (8), 3092–3103. 10.1021/ct500248h.
  • 61. DeLisle RK, and Dixon SL (2004). Induction of Decision Trees via Evolutionary Programming. J. Chem. Inf. Comput. Sci, 44 (3), 862–870. 10.1021/ci034188s.
  • 62. Lin C-Y, Lim S, and Anslyn EV (2016). Model Building Using Linear Free Energy Relationship Parameters–Eliminating Calibration Curves for Optical Analysis of Enantiomeric Excess. J. Am. Chem. Soc, 138 (26), 8045–8047. 10.1021/jacs.6b03928.
  • 63. Seth-Paul WA, and Van Duyse A (1972). A New Substituent Constant Derived from Carbonyl Stretching Frequencies of Simple R’R”CO Molecules. Spectrochimica Acta Part A: Molecular Spectroscopy, 28 (2), 211–234. 10.1016/0584-8539(72)80248-4.
  • 64. Wang L, Cao C, Qu J, and Cao C (2023). Substituent Effects on the Stretching Vibration Frequencies of C=C Bridge Bond in Aryl Ethylene with Furyl or Thienyl Group. J of Physical Organic Chem, 36 (2). 10.1002/poc.4433.
