Author manuscript; available in PMC: 2024 Mar 8.
Published in final edited form as: Nat Synth. 2023 Jun 1;2(10):998–1008. doi: 10.1038/s44160-023-00332-4

Accessing three-dimensional molecular diversity through benzylic C–H cross-coupling

Si-Jie Chen 1,2, Cyndi Qixin He 3, May Kong 2, Jun Wang 2, Shishi Lin 3, Shane W Krska 3, Shannon S Stahl 1,*
PMCID: PMC10923599  NIHMSID: NIHMS1921944  PMID: 38463240

Abstract

Pharmaceutical and agrochemical discovery efforts rely on robust methods for chemical synthesis that rapidly access diverse molecules1,2. Cross-coupling reactions are the most widely used synthetic methods3, but these methods typically form bonds to C(sp2)-hybridized carbon atoms (e.g., amide coupling, biaryl coupling) and lead to a prevalence of “flat” molecular structures with suboptimal physicochemical and topological properties4. Benzylic C(sp3)–H cross-coupling methods offer an appealing strategy to address this limitation by directly forming bonds to C(sp3)-hybridized carbon atoms, and emerging methods exhibit synthetic versatility that rivals conventional cross-coupling methods to access products with drug-like properties. Here, we use a virtual library of >350,000 benzylic ethers and ureas derived from benzylic C–H cross-coupling to test the widely held view that coupling at C(sp3)-hybridized carbon atoms affords products with improved three-dimensionality. The results show that the conformational rigidity of the benzylic scaffold strongly influences the product dimensionality. Products derived from flexible scaffolds often exhibit little or no improvement in three-dimensionality, unless they adopt higher energy conformations. This outcome introduces an important consideration when designing routes to topologically diverse molecular libraries. The concepts elaborated herein are validated experimentally through an informatics-guided synthesis of selected targets and the use of high-throughput experimentation to prepare a library of three-dimensional products that are broadly distributed across drug-like chemical space.


High quality medicinal chemistry libraries play a crucial role in the early stages of the drug discovery and development pipeline5. Since its debut, Lipinski’s “Rule of 5”6 has provided a framework for the design of such libraries by drawing attention to important molecular properties—the number of hydrogen bond donors/acceptors, molecular weight, and lipophilicity—typically encountered in successful orally administered drug molecules. Recent medicinal chemistry efforts have expanded on these concepts, with a particular emphasis on the three-dimensional (3D) character of drug candidates4,7.

One strategy to assess the “dimensionality” of organic molecules involves calculating the principal moments of inertia (PMI) of a molecule along three orthogonal axes, I1, I2, and I3. Normalization of these values, achieved by dividing the two smaller values by the largest value [PMI(Y) = I1/I3 and PMI(X) = I2/I3], permits structural comparison of different molecules. To assess the current state of drug-like chemical matter, we performed a PMI analysis of over 9,400 clinically investigated compounds (CIC), including drugs approved in the United States and other countries (see Methods for details). The PMI(X) and PMI(Y) values for these molecules may be plotted to assess their distribution within a triangular array, which has vertices corresponding to idealized one-, two-, or three-dimensional structures with rod-, disc-, or sphere-like symmetry7 (grey dots, Fig. 1a).
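The normalization described above can be illustrated with a minimal sketch. It assumes atoms are treated as point masses and uses NumPy; the function name `normalized_pmi` and the idealized test shapes are illustrative and are not part of the workflow used in this study.

```python
import numpy as np

def normalized_pmi(coords, masses):
    """Return (I1/I3, I2/I3) for point masses, with I1 <= I2 <= I3."""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    # Shift to the center of mass so the inertia tensor is taken about it.
    centered = coords - np.average(coords, axis=0, weights=masses)
    # Inertia tensor: sum_i m_i * (|r_i|^2 * Identity - r_i r_i^T)
    r2 = np.einsum("ij,ij->i", centered, centered)
    tensor = np.einsum("i,i->", masses, r2) * np.eye(3) - np.einsum(
        "i,ij,ik->jk", masses, centered, centered
    )
    i1, i2, i3 = np.linalg.eigvalsh(tensor)  # eigenvalues in ascending order
    return i1 / i3, i2 / i3

# Idealized shapes built from unit masses (hypothetical test geometries).
rod = [[-1, 0, 0], [0, 0, 0], [1, 0, 0]]
disc = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]]
sphere = disc + [[0, 0, 1], [0, 0, -1]]

for name, shape in [("rod", rod), ("disc", disc), ("sphere", sphere)]:
    ratios = normalized_pmi(shape, [1.0] * len(shape))
    print(name, np.round(ratios, 3))
```

The three test geometries land on the vertices of the triangular plot: the rod gives ratios near (0, 1), the disc (0.5, 0.5), and the sphere-like octahedron (1, 1).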

Fig. 1. Benzylic C–H cross coupling reactions provide access to 3D chemical space.


a, Known bioactive compounds are topologically diverse, including planar structures like flurbiprofen (1) and 3D structures like JHW-007 (2). b, PMI of 1 and 2 with reference to selected bioactive molecules. c, Benzylic C(sp3)–H cross coupling reactions offer opportunities to introduce more three-dimensionality in contrast to C(sp2)-focused functionalization methods. Compounds 3 and 4 were visualized with force-field minimized conformations (see Methods for details).

The plot in Fig. 1a shows that the CIC structures are especially concentrated toward the axis that connects vertices corresponding to the rod and disc shapes in the PMI plot. Such structural features correlate with suboptimal physicochemical properties like low water solubility, which often lead to high attrition rates during development4, and they partly reflect the prevalence of C(sp2)-coupling methods, such as amide and palladium-catalyzed cross-coupling methods2,3, in chemical library synthesis. This phenomenon is illustrated by two representative molecules taken from different regions of the PMI plot in Fig. 1a: the anti-inflammatory drug flurbiprofen (1)8 and the dopamine transporter (DAT) inhibitor JHW-007 (2)9 (Fig. 1b). The biaryl structure 1 can be readily accessed by Suzuki-Miyaura coupling10 and appears directly along the vector that connects the vertices corresponding to rod and disc shapes in the PMI plot in Fig. 1a. In contrast, the C(sp3)-rich, tropane-derived structure 2 is positioned substantially away from the rod-disc axis, toward the spherical region of the plot.

We hypothesized that a C(sp3)–H cross-coupling strategy would provide efficient access to more diverse 3D molecular structures.11 This concept is illustrated by structures 3 and 4 in Fig. 1c, which are derived from C(sp2) versus C(sp3) coupling with an alcohol reaction partner involving the same building block. The products appear in very different regions of the PMI plot in Fig. 1a. On the other hand, C(sp3)–H functionalization reactions need to meet several synthetic criteria to offer a credible complement to conventional cross-coupling methods. For example, the methods should (a) use the C–H substrate as the limiting reagent, (b) exhibit predictable and high C–H site-selectivity within complex building blocks, and (c) undergo coupling with diverse reaction partners, ideally derived from large pools of readily available reagents. Benzylic C(sp3)–H coupling methods that have emerged in recent years appear to meet these criteria12,13. These methods leverage the prevalence of (hetero)aromatic rings present in medicinally relevant building blocks, and many are compatible with limiting C–H substrate14. They support direct C(sp3)–H coupling with a variety of partners, including arylboronic acids15,16, alcohols17, and azoles18, in addition to sequential C(sp3)–H functionalization/diversification via halogenation/substitution19–22 and isocyanation/amine addition23, and create new opportunities to modify existing complex molecules via late-stage C–H functionalization11,24. The present study uses cheminformatics tools, paired with synthetic demonstrations, to test whether these C(sp3)–H coupling reactions are equipped to expand the 3D diversity of drug-like molecules. The results elaborated herein highlight both the opportunities and the constraints of these synthetic methods in expanding molecular dimensionality.

Enumeration of a benzylic cross-coupling product library.

Benzylic ethers and ureas were selected as representative target structures in this study,17,23 motivated by the diversity of commercially available alcohol and amine coupling partners, in addition to the compatibility of these reactions with high-throughput experimentation platforms commonly used in drug discovery. Twenty (hetero)benzylic scaffolds (A01-A20, Fig. 2a) were selected as core structures for C(sp3)–H coupling. This array features diversity in the benzylic C(sp3)–H sites via inclusion of differently substituted benzenes and aromatic heterocycles bearing acyclic and cyclic substituents. A complementary set of coupling partners, consisting of 3,249 alcohols, 6,108 anilines and 11,316 alkylamines, was then selected for combinatorial enumeration of benzylic ethers and ureas (see Methods for additional details) (Fig. 2b). Substrates of this type have found broad utility in C(sp2) coupling reactions, such as Ullmann25, Buchwald-Hartwig26 and amide coupling27 reactions, and they are widely used in drug discovery. The collection of C–H and alcohol/amine coupling partners affords 368,948 unique products. This virtual benzylic cross-coupling product library (P01-P20) is suitable for cheminformatic analysis and comparison to the bioactive molecule collection, CIC, noted in Fig. 1a.

Fig. 2. Enumeration of P01-P20 and topological comparison between P01-P20 and CIC.


a, Display of selected benzylic C(sp3)–H scaffolds for cross coupling product enumeration. b, Enumeration of P01-P20 via etherification with alcohols and isocyanation/urea formation sequence with amines. c, Comparison of 3D scores between CIC and P01-P20 compounds by parent scaffold. See Methods for details of box-whisker plots in panel c. 3DAvg, average 3D scores; 3DScaffold, the 3D scores of the benzylic C–H scaffolds A01–A20.

Topological analyses.

Structural analysis of the enumerated compounds began with assessment of their topological features using the PMI methodology noted above. The sum of normalized PMI values, PMI(X) + PMI(Y), defines a numerical “3D score”28 that simplifies comparison of molecules. 3D scores were obtained for force-field-minimized conformations of the CIC library and the full collection of coupling products P01-P20. The scores for the CIC library and for the products derived from each individual (hetero)benzylic core structure are depicted as box-whisker plots in Fig. 2c.

The CIC structures exhibit average 3D scores (3DAvg) of 1.11, corresponding to linear/planar geometries (Fig. 2c). The data for P01-P20 reveal that the benzylic C(sp3)–H scaffold has a strong influence on the topology of the cross-coupling products. Acyclic alkylarenes A01-A05 mostly afford products with low 3D scores (3DAvg = 1.03–1.07), reflecting no clear improvement in three-dimensionality relative to the CIC structures. Di- and triarylmethane derivatives exhibit somewhat increased three-dimensionality in the coupling products, with the substitution pattern on arenes showing a minor effect (A06, A08; 3DAvg = 1.11 and 1.12, respectively), while a rigid backbone significantly increases the spherical shape (A07, A09; 3DAvg = 1.21 and 1.31, respectively). Higher 3D scores are also observed with six-membered (A10-A15; 3DAvg = 1.14–1.22) and five-membered (A16-A20; 3DAvg = 1.14–1.27) cyclic alkylarene scaffolds. The 3D scores of the di/triarylmethanes and cyclic benzylic scaffolds also consistently span a wider distribution of 3D scores than the acyclic alkylarenes, reflecting broader diversity in the topology of the benzylic cross coupling products among these structures.

Scaffold A11 features a C(sp2)–Cl functional group adjacent to one of the benzylic C(sp3)–H sites, providing the basis for comparison of topological changes that arise from functionalization at sp2 versus sp3 sites (Fig. 3a). In silico combinatorial cross couplings at the C(sp2)–Cl moiety of A11 were performed with the same set of alcohols and amines used for benzylic C(sp3)–H cross coupling, affording substituted heteroarene analogs (P11-sp2) of the benzylic ethers and ureas (P11-sp3). The scaffold A11 is structurally flat (3DScaffold = 1.02) and located along the rod-disc axis in a PMI plot. The C(sp2)–Cl functionalization products P11-sp2 exhibit minimal three-dimensionality, with average 3D score of only 1.04 (Fig. 3a). In contrast, the P11-sp3 product library exhibits both a higher and broader range of 3D scores: 3DAvg = 1.22, with the box encompassing the first-to-third quartile of values ranging from 1.16–1.28. Replacement of the chloride substituent with a hydrogen atom at the C(sp2) site in the P11-sp3 structures generates a complementary family of structures P11-sp3 C–H with similar 3D scores (3DAvg = 1.20, Fig. 3a), indicating that the chloride substituent does not have a major influence on the dimensionality of the P11-sp3 structures. The same type of analysis was conducted with A10, comparing the product structures derived from coupling at the proximal C(sp3)–H or C(sp2)–Br sites in this scaffold. The same trends were observed, with significantly higher 3D scores observed for products derived from coupling at the C(sp3) rather than the C(sp2) site: 3DAvg = 1.05 (sp2) versus 1.19 (sp3) (see Supplementary Fig. 3 for details). The results here draw attention to the merits of PMI analysis and associated 3D scores for assessing three dimensionality, relative to other metrics, such as the fraction of sp3 carbon atoms, f(sp3). 
The structures P11-sp2 and P11-sp3 C–H are isomers and have the same f(sp3) values, yet they exhibit significantly different 3D scores from the PMI analysis.

Fig. 3.


Analysis of functionalization products and structural derivatives of A11. a, Comparison of PMI and 3D scores between sp2 and sp3 functionalization products derived from A11. b, Conformational study of selected sp2, sp3 and sp3-acyclic ethers and ureas derived from A11. See Methods for details of box-whisker plots in panels a and b. 3DAvg, average 3D scores; 3DMin, 3D scores of minimum-energy conformer of a given compound.

The influence of the conformational constraints of the fused cyclohexane ring in P11-sp3 structures was probed by performing PMI analysis of P11-sp3 acyclic structures, in which the benzylic C–H bond is present on an n-butyl group rather than a fused cyclohexane. The lowest energy conformers of the acyclic products have notably lower 3D scores (3DAvg = 1.10) relative to the P11-sp3 structures (Fig. 3a). A broader perspective was achieved, however, by considering P11 structures derived from one alcohol and one urea coupling partner and evaluating the available conformers within 10 kcal/mol of the most stable structure (Fig. 3b). For the P11-sp2 ether, only 5 conformations were identified and all exhibited low 3D scores (1.06–1.10). More conformations with a wider range of 3D scores were identified for the C(sp3)-derived ethers P11-sp3 ether (21 conformations, 3D score = 1.07–1.67) and P11-sp3 ether acyclic (117 conformations, 3D score = 1.08–1.65). Similar results were obtained from the urea coupling partners. P11-sp2 urea and P11-sp3 urea exhibited 48 unique conformations, while the P11-sp3 acyclic urea revealed significantly higher flexibility (371 conformers). The C(sp2)-based urea conformers exhibited the lowest 3D scores among the products (3D scores: P11-sp2 urea, 1.04–1.35; P11-sp3 urea, 1.08–1.64; P11-sp3 acyclic urea, 1.10–1.57). The flexibility of the less constrained acyclic structures allows them to access 3D conformations that match 3D scores of the cyclic derivatives, albeit at higher energies. The most stable conformation of the acyclic ether and urea showed lower 3D scores than their cyclic counterparts (P11-sp3 ether: 3DMin = 1.66, acyclic: 3DMin = 1.33; P11-sp3 urea: 3DMin = 1.57, acyclic: 3DMin = 1.17).

The collection of data in Fig. 3 demonstrates that benzylic C(sp3)–H coupling reactions employing cyclic or otherwise constrained benzylic scaffolds reliably access structures with improved three-dimensionality relative to C(sp2)-based coupling products. These results show that the identity of the C(sp3)–H coupling partner plays an important role in the spatial distribution of the coupled fragments, and more rigid scaffolds should offer privileged access to coupling products with enhanced three-dimensionality.

Principal component analysis.

Drug discovery involves multiparameter optimization, and successful drugs typically must balance numerous physicochemical and biological properties29. Principal component analysis (PCA) is a technique that allows representation of multidimensional data in lower dimensions, often as a two-dimensional plot, while maximizing variance of the projected data30. Entities with similar properties appear in similar regions of such plots, and this approach is commonly used in medicinal chemistry to assess the physicochemical relationships among molecules within a compound library31,32. Thus, PCA methodology provides an ideal approach to compare the properties of the benzylic C(sp3)–H coupling products (P01-P20, >368,000 molecules) with the existing collection of >9,400 clinically investigated compounds (CIC).

Twenty-one molecular descriptors were included in the PCA, examples of which include molecular weight, number of charged atoms, number of stereogenic centers, number of hydrogen bond donors/acceptors, AlogP (which describes the lipophilicity), fractional molecular polar surface area, quantitative estimate of drug-likeness (QED)33, 3D scores, f(sp3), and plane of best fit (PBF) (see Supplementary Table 1 for a full listing)34. The P01-P20 and CIC compounds were combined for PCA, and four principal components, PC1–PC4, were found to account for 72% of the cumulative variance. Separate two-dimensional plots of PC1/PC2 and PC3/PC4 were then prepared to visualize the distribution of physicochemical properties of the CIC and P01-P20 compounds, with darker colors used to highlight more densely populated regions of the plots (Figs. 4a–4d). The PC1/PC2 plots in Figs. 4a and 4b cluster both sets of compounds similarly, with primary contributions associated with molecule size, polar surface area, and the number of hydrogen bond donors/acceptors (see Supplementary Table 1 for additional data). The PC3/PC4 plots in Figs. 4c and 4d also show a similar distribution among the CIC and P01-P20 compounds, with distribution influenced most strongly by the number of positive/negative charges and stereogenic centers in the molecules and the f(sp3) value.
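The idea of projecting a descriptor table onto principal components and reading off the cumulative variance can be sketched as follows. This is a generic NumPy illustration on random stand-in data, assuming each descriptor is standardized before decomposition; the software and preprocessing actually used for the PCA are not specified here, and the function name `pca_explained_variance` is hypothetical.

```python
import numpy as np

def pca_explained_variance(descriptors):
    """Standardize columns, then return (scores, explained_variance_ratio)."""
    x = np.asarray(descriptors, dtype=float)
    # Put descriptors with different units on a common scale (assumed step).
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # SVD of the standardized matrix; rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u * s                   # coordinates of each row on PC1, PC2, ...
    ratio = s**2 / np.sum(s**2)      # fraction of total variance per component
    return scores, ratio

# Hypothetical stand-in table: 500 "molecules" x 6 correlated descriptors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
table = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))

scores, ratio = pca_explained_variance(table)
cumulative = np.cumsum(ratio)        # analogous to the cumulative variance of PC1-PC4
print(np.round(cumulative, 3))
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives a PC1/PC2 map of the kind described above; a density heatmap could then be obtained by binning those coordinates, for example with `np.histogram2d`.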

Fig. 4. Principal components analysis (PCA) comparing the physicochemical features of CIC and P01-P20 compounds.


a,b - Physicochemical features reflecting PC1 and PC2. c,d - Physicochemical features reflecting PC3 and PC4. Libraries are visualized in density heatmaps, where CIC (n = 9,425) and P01-P20 (n = 368,948) compounds were analyzed by 100 × 100 bins, and the color of each bin was determined by the number of compounds contained therein. Selected pairs of CIC and P01-P20 compounds with similar coordinates are showcased to highlight the higher 3D scores accessible with the P01-P20 compounds.

The overlap in chemical space occupied by the CIC and P01-P20 libraries evident in Fig. 4 shows that benzylic C(sp3)–H cross-coupling offers a direct route to drug-like molecules from readily available building blocks. This synthetic efficiency is complemented by the improved three-dimensionality of many of these molecules, illustrated by comparing six prototypical molecules present in the CIC library with direct counterparts within the benzylic cross-coupling library that appear in the same region of the PCA plots (cf. structures in Figs. 4a and 4c versus those in Figs. 4b and 4d). In each case, the benzylic coupling product has similar physicochemical properties to the reference CIC compound, but a substantially higher 3D score.

Synthetic implementation.

The informatics analysis shows that benzylic C–H cross coupling provides an efficient strategy to access diverse drug-like molecules with improved three-dimensionality. Products derived from the substrates enumerated here are amenable to synthesis using high-throughput experimentation methods;23 however, the PMI and PCA plots of the type in Figs. 2 and 4 also provide a means to guide targeted selection and synthesis of products with enhanced three-dimensionality and tailored physicochemical properties. To validate this concept, sixteen benzylic ethers and ureas comprising structures with high 3D scores and distributed throughout the PCA plots (Fig. 5) were selected for synthesis. Reaction conditions directly adapted from previously reported protocols17,23 led to moderate-to-good yields of the desired cross coupling products. The resulting compounds incorporate prominent structural motifs in medicinal chemistry, including sugar fragments (A08–1, A15–1)35, a phosphonate (A18–1)36, spirocyclic and heteroaryl ureas (A08–2, A12–2, A12–3)37,38, piperidine and pyrrolidine (A06–1, A16–2)39, amino acid fragments (A06–2, A08–3), and a polyamine derivative (A13–1)40. As evident from the PCA plots in Fig. 5b, these benzylic cross-coupling products exhibit diverse physicochemical features that resemble the properties of known drug candidates and bioactive compounds. Moreover, the benzylic cross-coupling products have significantly higher three-dimensionality relative to the CIC collection. The 3D scores determined for the CIC molecules and the benzylic ethers, N-aryl ureas, and N-alkyl ureas result in 3DAvg values of 1.11, 1.29, 1.26, and 1.38, respectively. The N-alkyl ureas show particularly broad and high 3D scores, with three of the seven structures exhibiting a 3D score ≥ 1.5.

Fig. 5. Synthesis of medicinally relevant benzylic C–H cross coupling products and comparison with bioactive molecules.


a, Assessment of benzylic ethers and ureas that sample the chemical space generated from virtual enumeration. b, Comparison of selected P01-P20 products (including all isolated regio- and stereoisomers) with CIC space in PCA, PMI and Box-Whisker (3D scores) plots. ᵃReaction run with 15 mol % CuCl/biox. ᵇReaction run at room temperature. ᶜReaction run with 20 mol % CuCl/biox. ᵈReaction run at 30°C. ᵉReaction run in DCM. ᶠIsocyanation reaction run at 40°C. ᵍIsocyanation reaction run at room temperature.

High-throughput experimentation (HTE) is a powerful tool in medicinal chemistry, miniaturizing and parallelizing reactions to enable resource-efficient evaluation of synthetic methods and rapid delivery of diverse molecules on a scale suitable for biological screening41. To validate the HTE compatibility of the methods described herein, arrays were designed for synthesis of libraries of diverse benzylic ethers and ureas (Fig. 6). The arrays included four benzylic C–H scaffolds with eight alcohols to access 32 different benzylic ethers (see Section 8 in Supplementary Information for details of reaction setups). The same four benzylic C–H scaffolds were subjected to isocyanation, and the resulting four benzylic isocyanates were subjected to reactions with 16 different amines to access 64 unique benzylic ureas. Desired products were formed in 23 of the 32 ether coupling reactions and in 47 of the 64 urea coupling reactions, an overall success rate that is comparable with traditional library synthesis methods42. Products derived from 21 of the alcohol (O2-O8) and amine building blocks (N1-N5, N7-N12 and N14-N16) were purified and characterized. PMI analysis showed that these products exhibited significantly improved 3D scores relative to established drug candidates from the group of CIC defined above (CIC 3DAvg = 1.11 vs. benzylic coupling products 3DAvg = 1.23; see Supplementary Fig. 4).

Fig. 6. High-throughput synthesis of medicinally relevant benzylic C–H cross coupling products.


a, Parallel synthesis of 32 benzylic ethers and 64 benzylic ureas via cross coupling of 4 benzylic C–H scaffolds with 8 alcohols and 16 amines, respectively. Reaction conditions were the same as described in Fig. 5a. b, Display of reaction outcomes of the 96 benzylic C–H cross coupling reactions monitored by HPLC-MS. Products were found in 70 of the 96 reactions, 21 of which were selected for purification. Benzylic regioisomers for some products were also observed (see Supplementary Information for details).

Conclusions

Robust cross-coupling methods, such as amide-bond formation and Suzuki-Miyaura and Buchwald-Hartwig reactions, have demonstrated tremendous value in drug discovery, providing rapid access to vast chemical space by enabling reliable assembly of diversely functionalized molecules derived from broad pools of reaction partners. Along with their advantages, however, these methods bias the resulting products toward two-dimensional structures with suboptimal physicochemical properties. The results described herein highlight the ability to use ubiquitous (hetero)benzylic C–H bonds as a functional group for direct cross-coupling with abundant reaction partners, without requiring pre-installation of a functional group to promote site-selectivity. These C(sp3)–H coupling reactions may be performed with ease and efficacy similar to that of conventional cross-coupling methods, while accessing products with improved three-dimensionality. Cheminformatics data reveal that this family of reactions affords products distributed throughout drug-like chemical space that feature higher three-dimensionality than existing pharmaceuticals and clinical candidates. The conformational flexibility of the benzylic C–H building block influences the spatial distribution of substituents in the resulting product. More rigid scaffolds reliably afford products with improved three-dimensionality. Flexible scaffolds often generate products with modest three-dimensionality in their lowest energy conformations, but they can access higher three-dimensionality by adopting higher energy conformations. Collectively, benzylic C–H cross-coupling reactions offer an important addition to the synthetic toolbox for early-stage drug discovery and development.

Methods.

General Procedure for Cross Coupling of Benzylic C–H Substrates and Alcohols.

Copper(I) chloride (2.0 mg, 0.020 mmol, 10 mol%), 2,2’-bisoxazoline (2.8 mg, 0.020 mmol, 10 mol%), NFSI (126.1 mg, 0.40 mmol, 2.0 equiv.), alcohols (0.60 mmol, 3.0 equiv.) and benzylic substrate (if solid, 0.20 mmol, 1.0 equiv.) were added under air to a 4 mL borosilicate glass vial containing a magnetic stir bar. The vial was then capped with a pierceable Teflon cap. A needle was pierced through the cap to facilitate exchange of the vial headspace with an inert atmosphere. The vial was moved into a glove box through three vacuum-nitrogen-backfill cycles. The needle was removed, and the vial was taken out of the glove box (now sealed under nitrogen). Solvent (1.0 mL), benzylic substrate (if liquid, 0.20 mmol, 1.0 equiv.), and diisopropyl phosphite (16.3 μL, 0.10 mmol, 0.5 equiv.) were added into the vial by injection through the cap. The sealed vial was heated at 40 °C and stirred for 18 h. When the reaction finished, the mixture was cooled to room temperature. The mixture was then evaporated under vacuum and the crude mixture was purified by column chromatography (silica gel, eluted with pentane:ethyl acetate = 20:1 to 4:1).
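The reagent quantities in this procedure follow directly from the 0.20 mmol substrate scale and the stated equivalents. The arithmetic can be checked with a short sketch; the helper `reagent_amount` is hypothetical, and the molecular weights (315.34 g/mol for NFSI, 99.00 g/mol for CuCl) are assumed values that are not given in the procedure.

```python
def reagent_amount(scale_mmol, equiv, mol_weight):
    """Return (mmol, mg) of a reagent for a given limiting-substrate scale."""
    mmol = scale_mmol * equiv
    return mmol, mmol * mol_weight  # mmol x g/mol = mg

SCALE_MMOL = 0.20  # benzylic C-H substrate, the limiting reagent

# Molecular weights below are assumptions, not taken from the procedure.
nfsi_mmol, nfsi_mg = reagent_amount(SCALE_MMOL, 2.0, 315.34)   # 2.0 equiv.
cucl_mmol, cucl_mg = reagent_amount(SCALE_MMOL, 0.10, 99.00)   # 10 mol%

print(f"NFSI: {nfsi_mmol:.2f} mmol, {nfsi_mg:.1f} mg")
print(f"CuCl: {cucl_mmol:.3f} mmol, {cucl_mg:.1f} mg")
```

The computed masses agree with the 126.1 mg of NFSI and 2.0 mg of copper(I) chloride listed above.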

General Procedure for Benzylic C–H Isocyanation/Urea Synthesis.

Copper(I) acetate (2.9 mg, 0.024 mmol, 6.0 mol%), 2,2’-bisoxazoline (3.4 mg, 0.024 mmol, 6.0 mol%), NFSI (315 mg, 1.0 mmol, 2.5 equiv.) and benzylic substrate (if solid, 0.40 mmol, 1.0 equiv.) were added under air to an 8 mL borosilicate glass vial containing a magnetic stir bar. The vial was then capped with a pierceable Teflon cap. A needle was pierced through the cap to facilitate exchange of the vial headspace with an inert atmosphere. The vial was moved into a glove box through three vacuum-nitrogen-backfill cycles. The needle was removed in the glove box. Acetonitrile (3.3 mL), benzylic substrate (if liquid, 0.40 mmol, 1.0 equiv.), diisopropyl phosphite (33 μL, 0.20 mmol, 0.5 equiv.) and trimethylsilyl isocyanate (160 μL, 1.2 mmol, 3.0 equiv.) were added into the vial. The vial was capped, removed from the glove box, heated at 30 °C and stirred for 2 h. When the reaction finished, the vial was uncapped, and an amine (1.4 mmol, 3.5 equiv.) was added into the reaction mixture. The mixture was then heated at 50 °C and stirred for 20 h. When finished, the reaction mixture was purified by reverse phase column chromatography (H2O/MeCN = 80:20 to 40:60) without further workup.

Procedure for Micro-Scale Cross Coupling of Benzylic C–H Substrates and Alcohols.

A stock solution/suspension of benzylic C–H substrates (0.8 mmol, 1.0 equiv.), NFSI (1.6 mmol, 2.0 equiv.), alcohols (2.4 mmol, 3.0 equiv.) was prepared in acetone (4 mL). Each benzylic C–H substrate stock solution (200 μL, containing 0.04 mmol of the benzylic C–H substrate) was dispensed into sixteen 1 mL glass vials, each charged with a magnetic stir bar. Vials containing A06 and A11 were placed into one 96-well aluminum reaction block, whereas vials containing A16 and A18 were placed into a separate 96-well aluminum reaction block. Vials in these two reaction blocks were then dried under vacuum. Stock solutions of CuCl (0.432 mmol, 0.008 mmol for each reaction, 20 mol %), 2,2’-bisoxazoline (0.432 mmol, 20 mol %) and diisopropyl phosphite (1.08 mmol, 0.5 equiv.) in both dichloroethane (DCE) and dichloroethane:hexafluoroisopropanol = 4:1 (DCE:HFIP, 0.2 M, 24.3 mL) were prepared. The heating blocks and the two stock solutions were moved into a glovebox under N2 atmosphere, where 200 μL of stock solution was dispensed into 8 glass vials with different alcohol building blocks, resulting in 16 reactions (8 in DCE and 8 in DCE:HFIP) for each benzylic C–H substrate. The 96-well plates were then sealed and removed from the glovebox. The 96-well reaction block containing A06 and A11 was heated to 40°C and stirred for 16 h. The 96-well reaction block containing A16 and A18 was heated to 30°C and stirred for 16 h. After that, the 96-well reaction blocks were removed from the heating blocks and cooled to room temperature, and the reaction mixtures were transferred to a 96-well filter. An aliquot of each of the resulting reaction mixtures was diluted with MeCN:DMSO = 3:1 and the diluted reactions were submitted for LC-MS analysis for product detection. Selected reactions were purified by normal-phase and/or reverse-phase column chromatography.

Procedure for High-Throughput Micro-Scale Benzylic C–H Isocyanation/Urea Synthesis.

Stock solutions/suspensions of 16 different amines in MeOH (0.14 mmol, 3.5 equiv. for each reaction) were dispensed into 1 mL glass vials in a 96-well aluminum reaction block. Vials in the reaction block were then dried under vacuum. Benzylic isocyanates (0.8 mmol scale with reference to benzylic C–H substrates, 1.0 equiv. in MeCN, 0.12 M, 6.7 mL) were prepared following the general procedure described above. Upon completion of the benzylic isocyanation reactions, the reaction mixtures (333 μL, 0.04 mmol with respect to the benzylic C–H substrates) were directly dispensed into the 1 mL vials charged with amines. The 96-well reaction block was then sealed, and the reaction mixtures were heated at 50°C for 20 h. After that, the 96-well reaction block was removed from the heating block and cooled to room temperature, and the reaction mixtures were transferred to a 96-well filter. The reaction mixtures were then filtered under vacuum and rinsed with 400 μL DMSO. An aliquot of each of the resulting reaction mixtures was diluted with MeCN and submitted for LC-MS analysis for product detection. Selected reaction mixtures were purified by reverse-phase column chromatography and preparative scale HPLC purification.

Data set of clinically investigated compounds (CIC).

To obtain a collection of relevant drug or drug-like molecules, 9,800 compounds that have been investigated in clinical trials (including approved drugs) were downloaded from the ZINC database (accessed July 6th, 2022, https://zinc.docking.org/substances/subsets/in-trials/). These compounds were filtered (molecular weight ≥ 136 (molecular weight of the coupling product of ethylbenzene and methanol), carbon count ≥ 5, −12 ≤ AlogP ≤ 12, polar surface area ≤ 500 and metal atom count = 0). Finally, 9,425 compounds were compiled in the collection for comparison.
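The filter criteria listed above can be written as a single predicate. The following is a minimal sketch that assumes the descriptors have already been computed for each compound; the `Descriptors` container, its field names, and the example values are hypothetical and do not correspond to any particular software export.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    """Precomputed properties of one compound (hypothetical field names)."""
    mol_weight: float
    carbon_count: int
    alogp: float
    polar_surface_area: float
    metal_atom_count: int

def passes_cic_filter(d):
    """Apply the thresholds stated in the text to one compound."""
    return (
        d.mol_weight >= 136            # MW of the ethylbenzene/methanol coupling product
        and d.carbon_count >= 5
        and -12 <= d.alogp <= 12
        and d.polar_surface_area <= 500
        and d.metal_atom_count == 0
    )

# Illustrative, made-up descriptor values.
drug_like = Descriptors(244.3, 15, 3.7, 37.3, 0)
too_small = Descriptors(94.1, 6, 1.5, 20.2, 0)
metal_complex = Descriptors(300.1, 6, 0.2, 52.0, 1)

kept = [d for d in (drug_like, too_small, metal_complex) if passes_cic_filter(d)]
print(len(kept))
```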

Enumeration of benzylic C–H cross coupling products.

Benzylic C–H scaffolds and monomers were selected and ordered from commercial sources. Certain monomers were intentionally excluded due to interfering functional groups (e.g. carboxylic acids) or no reactivity (e.g. tertiary amines and anilines, which do not react with isocyanates to afford ureas). The “Enumerate Combinatorial Reaction” component in Pipeline Pilot (version 18.1.100.11) was used to enumerate 2D structures of benzylic C–H cross coupling products. All the output products were hydrolyzed to obtain final products with more drug-relevant physicochemical properties (for example, alkyl carbamates to amines, alkyl esters to carboxylic acids and alkyl phosphates to phosphoric acids; alkyl defined as Me, Et, iPr or tBu). These hydrolysis steps led to some duplicated molecules. These duplicates were removed prior to performing informatics analysis.

3D structures of the benzylic C–H cross-coupling products were generated as force-field-minimized conformations in Pipeline Pilot. These 3D structures were used to compute the topology-dependent properties, including PMI, 3D scores and PBF scores.

Conformational search for selected benzylic C–H cross-coupling products.

MacroModel43 was used to perform conformational searches with the OPLS3e force field in implicit water solvation with a 10 kcal/mol energy cutoff. Redundant conformations were eliminated with an RMSD cutoff of 0.75 Å44.
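The two pruning criteria (energy window and RMSD-based redundancy removal) can be illustrated with a short sketch. This is not the MacroModel implementation. It assumes that the 10 kcal/mol cutoff is measured relative to the lowest-energy conformer, that coordinates are already superimposed with atoms in the same order, and that redundancy is resolved greedily in favor of the lower-energy conformer. The function names and the input layout are hypothetical.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """RMSD between two pre-aligned coordinate lists with identical atom order."""
    total = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return sqrt(total / len(coords_a))


def prune_conformers(conformers, energy_window=10.0, rmsd_cutoff=0.75):
    """Keep low-energy, non-redundant conformers.

    conformers: list of (energy in kcal/mol, list of (x, y, z) tuples).
    """
    if not conformers:
        return []
    ranked = sorted(conformers, key=lambda c: c[0])
    lowest_energy = ranked[0][0]
    kept = []
    for energy, coords in ranked:
        if energy - lowest_energy > energy_window:
            break  # all remaining conformers lie outside the energy window
        # Retain only if sufficiently different from every conformer kept so far.
        if all(rmsd(coords, other) >= rmsd_cutoff for _, other in kept):
            kept.append((energy, coords))
    return kept


# Toy two-atom "conformers" with made-up energies and coordinates.
toy = [
    (0.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (2.0, [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)]),   # near-duplicate of the first
    (5.0, [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)]),   # distinct geometry
    (12.0, [(0.0, 0.0, 0.0), (0.0, 0.0, 3.0)]),  # outside the energy window
]
print([energy for energy, _ in prune_conformers(toy)])
```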

Molecular properties.

Twenty-one common physicochemical and topological properties were used as molecular descriptors to probe the medicinal relevance of the enumerated compounds by comparison with bioactive compounds: molecular weight, number of positive atoms, number of negative atoms, number of rings, number of ring assemblies, number of aromatic rings, number of stereo atoms, number of N atoms, number of O atoms, number of hydrogen bond donors, number of hydrogen bond acceptors, AlogP, molecular volume, molecular polar surface area, fractional molecular polar surface area, dipole magnitude, quantitative estimate of drug-likeness (QED)33, fraction of sp3 carbon atoms, f(sp3), 3D scores and plane of best fit (PBF) scores. All properties except the 3D scores and PBF scores were calculated directly with Pipeline Pilot (version 18.1.100.11) from Dassault Systèmes (Vélizy-Villacoublay, France). 3D scores were calculated as 3D score = npr1 + npr2, where npr1 = I1/I3 and npr2 = I2/I3, and I1, I2 and I3 are the three principal moments of inertia in ascending order (I1 ≤ I2 ≤ I3). I1, I2 and I3 were computed with Pipeline Pilot and then used to calculate the 3D scores.
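The 3D score definition can be checked with a short worked example. The sketch below assumes that the three principal moments of inertia have already been computed (in the study, by Pipeline Pilot); the function name is hypothetical. The limiting shapes illustrate the range of the score: an ideal rod or disc gives 1, and an ideal sphere gives 2.

```python
def three_d_score(moments):
    """Return (npr1, npr2, 3D score) from three principal moments of inertia."""
    i1, i2, i3 = sorted(moments)  # enforce I1 <= I2 <= I3
    if i3 <= 0:
        raise ValueError("the largest principal moment must be positive")
    npr1 = i1 / i3
    npr2 = i2 / i3
    return npr1, npr2, npr1 + npr2


# Idealized limiting shapes (moments in arbitrary units).
rod = three_d_score((0.0, 1.0, 1.0))     # npr1 = 0,   npr2 = 1
disc = three_d_score((1.0, 1.0, 2.0))    # npr1 = 0.5, npr2 = 0.5
sphere = three_d_score((1.0, 1.0, 1.0))  # npr1 = 1,   npr2 = 1
print(rod[2], disc[2], sphere[2])  # 1.0 1.0 2.0
```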

PBF scores were calculated with the RDKit package for Python. A 3D conformation of each enumerated compound and each bioactive compound was computed with Pipeline Pilot and exported as an SD file. The SD files were then loaded into a Jupyter Notebook with RDKit, and a PBF value was calculated for each conformer.

Principal Component Analysis.

The principal components were calculated with the R programming language, following a procedure adopted from the literature45. Prior to the PCA calculation, all calculated physicochemical and topological properties were centered (subtract mean) and scaled (divide by variance). The cumulative proportions of total variance explained by the principal components were as follows: 35% (PC1), 51% (PC2), 63% (PC3) and 72% (PC4). Density plots were created with Python because the large number of compounds would otherwise lead to significant overlap of individual markers. The loadings of all the properties were tabulated to assist the interpretation of the PCA (see Section 6 in Supplementary Information for details and further explanation).
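Because the percentages above are cumulative, the share of variance attributable to each individual component is the difference between consecutive values. The short sketch below performs only this arithmetic; the function name is hypothetical and no PCA is carried out.

```python
def individual_variance(cumulative_pct):
    """Convert cumulative explained variance (%) into per-component values."""
    per_component = {}
    previous = 0
    for component, cumulative in cumulative_pct.items():
        per_component[component] = cumulative - previous
        previous = cumulative
    return per_component


# Cumulative values as reported in the text.
reported = {"PC1": 35, "PC2": 51, "PC3": 63, "PC4": 72}
print(individual_variance(reported))
# {'PC1': 35, 'PC2': 16, 'PC3': 12, 'PC4': 9}
```

On this reading, the first two components together account for about half of the total variance, and the third and fourth add 12 and 9 percentage points, respectively.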

Box-Whisker Plots.

The box-whisker plots were generated in Microsoft Excel 2016. The boxes were defined by the 1st and 3rd quartiles (Q1 and Q3) of a given dataset, from which the interquartile range (IQR) was calculated as IQR = Q3 − Q1. Based on Tukey’s fences41, the upper whisker boundary was defined as Q3 + 1.5 × IQR (or the maximum value of the dataset, if that is smaller than Q3 + 1.5 × IQR) and the lower whisker boundary was defined as Q1 − 1.5 × IQR (or 1, the lowest possible 3D score, or the minimum value of the dataset, whichever is larger). Maximum and minimum 3D scores were the largest and smallest data points found within the boundaries. Data points with 3D scores > Q3 + 1.5 × IQR or < Q1 − 1.5 × IQR were considered outliers and are not shown in the plots.
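The whisker and outlier rules can be sketched in a few lines. This is an illustration rather than a reproduction of the Excel calculation: the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption, since the interpolation used by Excel is not specified here, and the function name and example scores are hypothetical.

```python
from statistics import quantiles


def tukey_whiskers(scores, floor=1.0):
    """Quartiles, whisker ends and outliers for a list of 3D scores."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    # The lower fence cannot fall below the lowest possible 3D score (1).
    lower_fence = max(q1 - 1.5 * iqr, floor)
    upper_fence = q3 + 1.5 * iqr
    inside = [s for s in scores if lower_fence <= s <= upper_fence]
    outliers = sorted(s for s in scores if s < lower_fence or s > upper_fence)
    return {
        "q1": q1,
        "q3": q3,
        "lower_whisker": min(inside),  # smallest point within the fences
        "upper_whisker": max(inside),  # largest point within the fences
        "outliers": outliers,
    }


# Made-up 3D scores for illustration only.
example_scores = [1.05, 1.10, 1.12, 1.14, 1.16, 1.18, 1.20, 1.22, 1.90]
summary = tukey_whiskers(example_scores)
print(summary["lower_whisker"], summary["upper_whisker"], summary["outliers"])
```

In this toy dataset the value 1.90 lies above the upper fence, so it is reported as an outlier and the upper whisker ends at 1.22.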

Acknowledgements.

We thank Sung-Eun Suh for helpful discussions about applications of benzylic C–H isocyanation/urea synthesis, Thomas G. Greshock for suggestions on PCA and HTE, Ilia A. Guzei and Kyana M. Sanders for assistance with X-ray crystallographic characterization of A16–1 and Bita A. Parvizian for LC-MS support. This work was supported by the NIH (R35 GM134929, to S.S.S.) and Merck & Co., Inc., Kenilworth, NJ, USA (travel funds to S.-J.C.). Spectroscopic instrumentation was supported by a gift from Paul J. Bender, the NSF (CHE-1048642), and the NIH (S10 OD020022).

Footnotes

Competing Interests.

The authors declare no competing interests.

Author Information. Reprints and permissions information is available at www.nature.com/reprints.

Supplementary Information is available in the online version of the paper.

Code Availability. Enumeration sequences and molecular property calculations described in this work are available for use as components in Pipeline Pilot. Please see Supplementary Section 10 for the Python code used for PBF calculations and Supplementary Section 11 for the R code used for PCA.

Data Availability.

The authors declare that all of the data supporting the findings of this study are available within the paper and its supplementary information file.

References

  • 1.Blakemore DC et al. Organic Synthesis Provides Opportunities to Transform Drug Discovery. Nat. Chem. 10, 383–394 (2018).
  • 2.Boström J, Brown DG, Young RJ & Keserü GM Expanding the Medicinal Chemistry Synthetic Toolbox. Nat. Rev. Drug Discov. 17, 709–727 (2018).
  • 3.Brown DG & Boström J Analysis of Past and Present Synthetic Methodologies on Medicinal Chemistry: Where Have All the New Reactions Gone?: Miniperspective. J. Med. Chem. 59, 4443–4458 (2016).
  • 4.Lovering F, Bikker J & Humblet C Escape from Flatland: Increasing Saturation as an Approach to Improving Clinical Success. J. Med. Chem. 52, 6752–6756 (2009).
  • 5.Dolle RE Historical Overview of Chemical Library Design. in Chemical Library Design (ed. Zhou JZ) 3–25 (Humana Press, 2011).
  • 6.Lipinski CA, Lombardo F, Dominy BW & Feeney PJ Experimental and Computational Approaches to Estimate Solubility and Permeability in Drug Discovery and Development Settings. Adv. Drug Deliv. Rev. 23, 3–25 (1997).
  • 7.Sauer WHB & Schwarz MK Molecular Shape Diversity of Combinatorial Libraries: A Prerequisite for Broad Bioactivity. J. Chem. Inf. Comput. Sci. 43, 987–1003 (2003).
  • 8.Brogden RN, Heel RC, Speight TM & Avery GS Flurbiprofen: A Review of its Pharmacological Properties and Therapeutic Use in Rheumatic Diseases. Drugs 18, 417–438 (1979).
  • 9.Agoston GE et al. Novel N-Substituted 3α-[Bis(4’-fluorophenyl)methoxy]tropane Analogues: Selective Ligands for the Dopamine Transporter. J. Med. Chem. 40, 4329–4339 (1997).
  • 10.Quasdorf KW, Riener M, Petrova KV & Garg NK Suzuki–Miyaura Coupling of Aryl Carbamates, Carbonates, and Sulfamates. J. Am. Chem. Soc. 131, 17748–17749 (2009).
  • 11.Cernak T, Dykstra KD, Tyagarajan S, Vachal P & Krska SW The Medicinal Chemist’s Toolbox for Late Stage Functionalization of Drug-like Molecules. Chem. Soc. Rev. 45, 546–576 (2016).
  • 12.Zhang Z, Chen P & Liu G Copper-Catalyzed Radical Relay in C(sp3)–H Functionalization. Chem. Soc. Rev. (2022). DOI: 10.1039/d1cs00727k.
  • 13.Golden DL, Suh S-E & Stahl SS Radical Advances in C(sp3)–H Functionalization: Foundations for C–H Cross-Coupling. Nat. Rev. Chem., in press. Preprint: ChemRxiv (Cambridge Open Engage, 2022).
  • 14.Taylor RD, MacCoss M & Lawson ADG Rings in Drugs: Miniperspective. J. Med. Chem. 57, 5845–5859 (2014).
  • 15.Zhang W, Chen P & Liu G Copper-Catalyzed Arylation of Benzylic C–H bonds with Alkylarenes as the Limiting Reagents. J. Am. Chem. Soc. 139, 7709–7712 (2017).
  • 16.Zhang W, Wu L, Chen P & Liu G Enantioselective Arylation of Benzylic C–H Bonds by Copper-Catalyzed Radical Relay. Angew. Chem. Int. Ed. 58, 6425–6429 (2019).
  • 17.Hu H et al. Copper-catalysed Benzylic C–H Coupling with Alcohols via Radical Relay Enabled by Redox Buffering. Nat. Catal. 3, 358–367 (2020).
  • 18.Chen S-J, Golden DL, Krska SW & Stahl SS Copper-Catalyzed Cross-Coupling of Benzylic C–H Bonds and Azoles with Controlled N-Site Selectivity. J. Am. Chem. Soc. 143, 14438–14444 (2021).
  • 19.Buss JA, Vasilopoulos A, Golden DL & Stahl SS Copper-Catalyzed Functionalization of Benzylic C–H Bonds with N-Fluorobenzenesulfonimide: Switch from C–N to C–F Bond Formation Promoted by a Redox Buffer and Brønsted Base. Org. Lett. 22, 5749–5752 (2020).
  • 20.Vasilopoulos A, Golden DL, Buss JA & Stahl SS Copper-Catalyzed C–H Fluorination/Functionalization Sequence Enabling Benzylic C–H Cross Coupling with Diverse Nucleophiles. Org. Lett. 22, 5753–5757 (2020).
  • 21.Zhang Y et al. A Photoredox-catalyzed Approach for Formal Hydride Abstraction to Enable Csp3–H Functionalization with Nucleophilic Partners (F, C, O, N, and Br/Cl). Chem Catal. 2, 292–308 (2022).
  • 22.Lopez MA, Buss JA & Stahl SS Cu-Catalyzed Site-Selective Benzylic Chlorination Enabling Net C–H Coupling with Oxidatively Sensitive Nucleophiles. Org. Lett. 24, 597–601 (2022).
  • 23.Suh S-E, Nkulu LE, Lin S, Krska S & Stahl SS Benzylic C–H Isocyanation/Amine Coupling Sequence Enabling High-Throughput Synthesis of Pharmaceutically Relevant Ureas. Chem. Sci. 12, 10380–10387 (2021).
  • 24.Guillemard L, Kaplaneris N, Ackermann L & Johansson MJ Late-stage C–H Functionalization Offers New Opportunities in Drug Discovery. Nat. Rev. Chem. 5, 522–545 (2021).
  • 25.Sambiagio C, Marsden SP, Blacker AJ & McGowan PC Copper Catalysed Ullmann Type Chemistry: from Mechanistic Aspects to Modern Development. Chem. Soc. Rev. 43, 3525–3550 (2014).
  • 26.Dorel R, Grugel CP & Haydl AM The Buchwald–Hartwig Amination After 25 Years. Angew. Chem. Int. Ed. 58, 17118–17129 (2019).
  • 27.Dunetz JR, Magano J & Weisenburger GA Large-Scale Applications of Amide Coupling Reagents for the Synthesis of Pharmaceuticals. Org. Process Res. Dev. 20, 140–177 (2016).
  • 28.Prosser KE, Stokes RW & Cohen SM Evaluation of 3-Dimensionality in Approved and Experimental Drug Space. ACS Med. Chem. Lett. 11, 1292–1298 (2020).
  • 29.Wu G et al. Overview of Recent Strategic Advances in Medicinal Chemistry. J. Med. Chem. 62, 9375–9414 (2019).
  • 30.Medina-Franco J, Martinez-Mayorga K, Giulianotti M, Houghten R & Pinilla C Visualization of the Chemical Space in Drug Discovery. Curr. Comput. Aided Drug Des. 4, 322–333 (2008).
  • 31.Kutchukian PS et al. Chemistry Informer Libraries: a Chemoinformatics Enabled Approach to Evaluate and Advance Synthetic Methods. Chem. Sci. 7, 2604–2613 (2016).
  • 32.Greshock TJ et al. Synthesis of Complex Druglike Molecules by the Use of Highly Functionalized Bench-Stable Organozinc Reagents. Angew. Chem. Int. Ed. 55, 13714–13718 (2016).
  • 33.Bickerton GR, Paolini GV, Besnard J, Muresan S & Hopkins AL Quantifying the Chemical Beauty of Drugs. Nat. Chem. 4, 90–98 (2012).
  • 34.Firth NC, Brown N & Blagg J Plane of Best Fit: A Novel Method to Characterize the Three-Dimensionality of Molecules. J. Chem. Inf. Model. 52, 2516–2525 (2012).
  • 35.Chen F, Huang G & Huang H Sugar Ligand-mediated Drug Delivery. Future Med. Chem. 12, 161–171 (2020).
  • 36.Heidel KM & Dowd CS Phosphonate Prodrugs: an Overview and Recent Advances. Future Med. Chem. 11, 1625–1643 (2019).
  • 37.Tice CM et al. Spirocyclic Ureas: Orally Bioavailable 11β-HSD1 Inhibitors Identified by Computer-Aided Drug Design. Bioorg. Med. Chem. Lett. 20, 881–886 (2010).
  • 38.Tichenor MS et al. Heteroaryl Urea Inhibitors of Fatty Acid Amide Hydrolase: Structure–Mutagenicity Relationships for Arylamine Metabolites. Bioorg. Med. Chem. Lett. 22, 7357–7362 (2012).
  • 39.Vardanyan R 2-Substituted and 1,2-Disubstituted Piperidines. in Piperidine-Based Drug Discovery 103–125 (Elsevier, 2017). doi: 10.1016/B978-0-12-805157-3.00003-X.
  • 40.Wallace HM, Fraser AV & Hughes A A Perspective of Polyamine Metabolism. Biochem. J. 376, 1–14 (2003).
  • 41.Krska SW, DiRocco DA, Dreher SD & Shevlin M The Evolution of Chemical High-Throughput Experimentation To Address Challenging Problems in Pharmaceutical Synthesis. Acc. Chem. Res. 50, 2976–2985 (2017).
  • 42.Dombrowski AW, Aguirre AL, Shrestha A, Sarris KA & Wang Y The Chosen Few: Parallel Library Reaction Methodologies for Drug Discovery. J. Org. Chem. 87, 1880–1897 (2022).
  • 43.Schrödinger, Inc. MacroModel Release 2019–3 (Schrödinger, LLC, 2019).
  • 44.Roos K, Wu C, Damm W, Reboul M, Stevenson JM, Lu C, Dahlgren MK, Mondal S, Chen W, Wang L, Abel R, Friesner RA & Harder ED OPLS3e: Extending Force Field Coverage for Drug-Like Small Molecules. J. Chem. Theory Comput. 15, 1863–1874 (2019).
  • 45.Wenderski TA, Stratton CF, Bauer RA, Kopp F & Tan DS Principal Component Analysis as a Tool for Library Design: A Case Study Investigating Natural Products, Brand-Name Drugs, Natural Product-Like Libraries, and Drug-Like Libraries. in Chemical Biology: Methods and Protocols (eds. Hempel JE, Williams CH & Hong CC) 225–242 (Springer, New York, 2015). doi: 10.1007/978-1-4939-2269-7_18.
  • 41.Tukey JW Exploratory Data Analysis. (Reading, Mass: Addison-Wesley Pub. Co., 1977).
