Abstract
Five synthetic combinatorial libraries of 2,080 components each were screened as mixtures for inhibition of DNA binding to two transcription factors. Rapid, solution-phase synthesis coupled to a gel-shift assay led to the identification of two compounds active at a 5- to 10-μM concentration level. The likely mode of inhibition is intercalation between DNA base pairs. The efficient deconvolution through sublibrary synthesis augurs well for the use of large mixtures of small, nonpeptide molecules in biological screens.
Keywords: diversity, combinatorial chemistry, ureas, gel-shift assay, DNA binding
One of the most rapidly developing areas in the chemical sciences is concerned with molecular diversity. In organic and bioorganic chemistry the activity takes the form of synthetic combinatorial libraries, and current issues deal with solid phase vs. solution methods; massive parallel synthesis of single compounds vs. synthesis of mixtures; the use of rigid core structures vs. flexible linear sequences; devising deconvolution strategies vs. tagging techniques; and generating unbiased molecular landscapes for lead discovery vs. biased structures for lead development. These and other topics are addressed in a commendable review (1), but the issues that appear to be resolved are the need for automation and utility. Mere synthesis of molecular libraries is not enough; the synthesis must be connected to a selection process. We describe here our recent efforts in the latter context.
Although the majority of chemical diversity studies employ insoluble supports, recent innovations make solution-phase approaches more attractive (2–6). In addition, methodology to synthesize (7) and analyze (8) tetraurea-based libraries was recently introduced and permits replacement of the secondary amide (peptide) bond with a more bioavailable functionality (9, 10). Elaborated herein is a simplified and more general solution-phase route to tetraurea libraries derived from the isolated tetraisocyanate of xanthene. Derivatives that are differentially protected have also been prepared, allowing for synthetic access to individual tetraureas.
The general method for library synthesis is outlined in Fig. 1. An activated core molecule is condensed with a number of building blocks (11), resulting in a combinatorial library of covalently linked, core–building block ensembles. The shape and rigidity of the core determine the orientation of the building blocks in shape space. The libraries can be biased by changing the core, linkage, or building blocks to target a characterized biological structure (“focused libraries”) or synthesized with less structural bias using flexible cores. The latter was the case in the situation described herein, because small molecules that inhibit transcription factor–DNA binding are rare (12). Active components in these mixtures were identified through an iterative synthesis/screening protocol known as deconvolution.
Figure 1.
A schematic depiction of the activated core approach to solution-phase chemical diversity.
Rationale.
A number of considerations, including synthetic access, versatility (13), lack of apparent toxicity (14), and attractive physical properties make the tetra-substituted xanthene (Fig. 2) a desirable core molecule. In the latter regard, xanthenes have a strong UV chromophore for easy detection and usually are soluble in chlorinated organic solvents but nearly insoluble in hexanes/ether mixtures. This differential solubility allows for crude purifications following library synthesis using liquid–liquid extractions and, after deprotection, by precipitation from ether/hexane. Finally, the four sites for reaction on the xanthene core allow a large number of compounds to be formed using only a limited number of building blocks (8 building blocks give 2,080 compounds, whereas 16 building blocks give 32,896 compounds).
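The library sizes quoted above can be reproduced with a short counting sketch. It assumes, although the text does not state this explicitly, that the only degeneracy comes from the mirror symmetry of the xanthene core, which exchanges positions 2 and 7 and positions 4 and 5 at the same time. Under that assumption Burnside's lemma gives (n^4 + n^2)/2 distinct tetraureas for n building blocks. The function names are illustrative only.

```python
from itertools import product


def library_size(n_blocks: int) -> int:
    """Closed-form count of distinct tetraureas for n building blocks.

    Assumption: two substitution patterns describe the same compound when
    one is the mirror image of the other (2<->7 and 4<->5 swapped together).
    Burnside's lemma then gives (n^4 + n^2) / 2.
    """
    return (n_blocks**4 + n_blocks**2) // 2


def library_size_by_enumeration(n_blocks: int) -> int:
    """Brute-force check: enumerate every (pos2, pos4, pos5, pos7) assignment."""
    seen = set()
    for pos2, pos4, pos5, pos7 in product(range(n_blocks), repeat=4):
        mirrored = (pos7, pos5, pos4, pos2)
        # Keep one canonical representative per mirror-related pair.
        seen.add(min((pos2, pos4, pos5, pos7), mirrored))
    return len(seen)


for n in (8, 16):
    print(n, library_size(n), library_size_by_enumeration(n))
```

For 8 and 16 building blocks both functions return 2,080 and 32,896, the values given in the text, which supports (but does not prove) the symmetry assumption.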
Figure 2.
A polytube rendering of an MM2* minimized tetraalanine (free acid) tetraurea. Carbon atoms are gray, nitrogens are blue, and oxygens are red. Hydrogen atoms are omitted for clarity except for those attached to heteroatoms. Several contact surfaces are available, including those between the 2,4; 4,5; 5,7; and 2,7 positions.
For the transcription factor–DNA target, the xanthene tetraureas (Fig. 2) present an electron-rich aromatic scaffold capable of intercalating into DNA (15) and of forming π–π stacks with aromatic side chains of proteins. The thickness of the 9,9-dimethyl center may disfavor some of these binding modes, but the ureas at the 4 and 5 positions provide a cavity (see Figs. 2 and 3) that is known to bind carboxylates strongly (16). In addition, all four ureas provide numerous hydrogen bonding donor and acceptor sites for molecular recognition. The attached building blocks were also expected to enhance interactions with biological targets, because most were based on biologically relevant amino acids. Molecular modeling calculations (17) were used to minimize a few tetraurea structures and revealed a planar, disc-shaped presentation of the building blocks (see Fig. 2). Finally, the use of the urea linkage conformationally limits the building blocks, because rotation about the urea N–C bond and the core–carbonyl bonds at the 4 and 5 positions is restricted. For example, the solution and solid-state structure of a 4,5-diurea xanthene has been determined (Fig. 3) (18). These studies showed that the urea N–H groups are directed inward (toward the xanthene oxygen) and create a binding pocket between them. The carbonyl oxygens are directed outward and are available for intermolecular hydrogen bonding. This preorganization reduces the entropic cost of binding to target molecules.
Figure 3.
A polytube rendering of the x-ray crystal structure of a xanthene diurea (Right) (18). Note the orientation of the ureas. Nuclear Overhauser effect NMR studies suggest a similar solution-phase structure (16).
MATERIALS AND METHODS
Synthesis.
9,9-Dimethylxanthene-2,4,5,7-tetraacid chloride 1; 9,9-dimethylxanthene-4,5-diacid chloride 5; and 2,7-dibenzyl-9,9-dimethylxanthene-4,5-diacid chloride dicarboxylate 7 were prepared according to literature procedures (3, 7). Other experimental details are outlined in the Ph.D. thesis of one of the authors (19).
HPLC.
For tetraester tetraureas (before deprotection) a C18 reverse-phase column (Rainin Instruments Microsorb-MV, 4.6 × 250 mm, 5 μm, 100 Å) was used for HPLC analysis with a flow rate of 1 ml/min. A linear gradient of 100% water to 100% MeCN over 30 min with a total run time of 45–50 min was used for most runs. The detection wavelength was 268 nm (λmax for the xanthene tetraurea fragment), and the extinction coefficients of the tetraureas were assumed to be equal. After deprotection, a C8 reverse-phase column (Waters Symmetry, 3.9 × 150 mm, 5 μm, 100 Å) was used employing similar gradients and a 1 ml/min flow rate.
Molecular Modeling.
All molecular modeling energy minimizations were done with MacroModel 5.5 (17) and the MM2* force field, without solvent parameters.
RESULTS
Library Synthesis.
The primary synthetic challenge involved finding a reaction path from a functionalized xanthene to the tetraisocyanate with high yield (>90% overall, >97% per site). The xanthene tetraamine itself is too readily oxidized to be manipulated, and after a number of approaches failed, it was found that conversion of the xanthene tetraacid chloride (3) 1 to the tetraacyl azide, followed by heating, effected the Curtius rearrangement to the desired tetraisocyanate (Scheme S1) (20). To confirm that this reaction sequence would be compatible with a number of building blocks, we synthesized several tetraureas derived from a single type of building block. These homotetraureas were characterized by NMR and HPLC and were found to be ≥90% pure. Isolated yields were also above 90%.
Scheme 1.
The tetraurea libraries were then synthesized by reacting the tetraisocyanate core molecule simultaneously with a mixture of amines in a single reaction vessel. Simple liquid–liquid extraction (dichloromethane–1 N citric acid) eliminated excess amines after the reaction; the libraries were then treated with neat trifluoroacetic acid to cleave the acid-labile protecting groups. This deprotection liberates hydrophilic functions and enhances solubility in aqueous solutions for screening. Following deprotection, the libraries could be precipitated upon addition of ether/hexanes (1:1); filtration allowed for removal of the soluble remnants of the amino acid side-chain protecting groups and isolation of the tetraureas.
Because the xanthene 4 and 5 positions are close in space, steric interactions between a building block attached at one of these positions and a nucleophile attacking at the other could introduce concentration biases during library synthesis. Previously published methods (7) were used to verify that statistical mixtures of compounds were indeed formed in the libraries. Accordingly, we synthesized the 4,5-xanthenediisocyanate (4, via Scheme S2), which was used to synthesize diureas for HPLC and capillary electrophoresis-MS analysis. The results of these diurea studies supported the claims of predictable, approximately statistical product distributions. This diisocyanate 4 proved to be such a versatile intermediate that it was used to synthesize diureas for studies in molecular recognition reported elsewhere (18).
Scheme 2.
Synthesis of a Core with Two Addresses.
The development of a method for obtaining unsymmetrical tetraureas is shown in Scheme S3. The dibenzylester diacid chloride 5 (3) served as a precursor to a urea-based core structure that distinguished the “top” two sites from those at the bottom. The benzyl ester protecting groups are easily removed using hydrogenolysis (21), a procedure compatible with most peptide protecting groups. The diacid chloride 5 was converted to the diacyl azide using sodium azide in acetone (20) and then heated to give the diisocyanate dibenzyl ester 6.
Scheme 3.
Xanthene 6 could then be reacted with one or more amines, depending on the stage of the deconvolution sequence (Scheme S4). If one (in the case shown, Leu t-butyl ester) or two different amines are condensed with 6, a separation can be performed to give purified diureas (7). Hydrogenolysis cleaves the benzyl ester groups and gives the diacid diurea 8. Activation of the acid groups as acid chlorides proved, unfortunately, to be incompatible with acid-sensitive functionality on the protected amino acids (i.e., the t-butyl esters) or the urea function itself, for that matter. Instead, the diacid was converted to the mixed anhydride with ethyl chloroformate using triethylamine as the base in acetone (Scheme S4). Without purification, the mixed anhydride was treated with sodium azide to furnish the diacyl azide (22). Rearrangement was effected by heating the diacyl azide in toluene, and the diisocyanate dileucine diurea 9 was obtained. Then, one (Phe t-butyl ester in the case of 10) or more amines could be reacted with 9 to give the desired tetraurea(s).
Scheme 4.
Synthesis of a tetraurea using the deconvolution protocol.
Screening.
Two dissimilar DNA oligonucleotides were prepared to screen the libraries. The two were derived from naturally occurring target sites for a transcription factor, SpP3A2, a member of a small family of novel transcription factors identified so far in sea urchins, Drosophila, and humans (23), and for the sea urchin Zn-finger transcription factor SpZ12–1 (24). A gel-shift assay (25) was used to quantify the apparent inhibition of the transcription factor–DNA binding event (Fig. 4). During the screening process it emerged that the mechanism of inhibition was non-sequence-specific binding of library components to DNA, because the libraries proved to be active against both DNA–transcription factor complexes at comparable concentrations, despite the entirely unrelated sequences of their respective target sites.
Figure 4.
Gel-shift assays of combinatorial libraries and unique compounds resolved by deconvolution. Procedures followed were as described (25). Reactions were made up in 10-μl volumes and contained either recombinant SpP3A2 (23) or SpZ12–1 (24) transcription factors (which earlier had been isolated and cloned from sea urchin embryo nuclear extract); an aliquot of the combinatorial library being tested; poly(dA/dT) or poly(dT/dC); double-stranded [32P]oligonucleotide probes labeled by the kinase reaction, which contained target sites for the appropriate transcription factor; and binding buffer (see refs. 23–25 for details). Arrows indicate transcription factor–DNA complexes; arrowheads indicate free probe. (A) Initial library assay. Only the activity of the positive one out of the five libraries is illustrated here (see Table 1 and text). Lanes: 1, probe alone; 2, plus SpP3A2; 3, plus combinatorial library 1. The library was present in the reaction at 5.4 mM total; because it contained 2,080 compounds at approximately equal concentrations (see text), the concentration of each was nominally about 0.3 μM. (B) Assays of first-stage deconvolution. Each sublibrary now contains 666 compounds, built from six of the initial eight amino acids; the effect of the libraries on SpP3A2–DNA complex formation is shown as in A. Only the active sublibrary is shown. Lanes: 1, free probe; 2, plus SpP3A2; 3–6, plus combinatorial libraries at increasing concentrations. Activity is observed (lane 5) at a nominal level of 0.8 nM for individual compounds. (C) Assays of a further stage of deconvolution. Activity is tested against SpZ12–1–DNA complex formation. Each sublibrary now contains only three compounds. Lanes: 1, probe only; 2, plus SpZ12–1; 3–6, SpZ12–1 with four different sublibraries at total concentrations of 14 μM, or ≈5 μM per compound, except for lane 7, which contains the same compounds as lane 2 at five times higher concentration.
Activity is clearly seen in lanes 4 and 6, which correspond to sublibraries 1A and 2A in Scheme S5 and text, and not in lanes 3 and 5, which correspond to sublibraries 1C and 2C in Scheme S5. (D) Final deconvolution. Single compounds of sublibraries 1A and 2A, assayed against SpP3A2–DNA complex formation. Lanes: 1, probe alone; 2, plus SpP3A2; 3–6, individual compounds, assayed at 50 μM concentration. The compound in lane 5 is 12 of Scheme S5; other experiments demonstrated activity at 5- to 10-μM concentration.
Five libraries, each containing a calculated 2,080 tetraureas, were synthesized according to a previously published procedure (7); the amino acid derivatives used to make these libraries are shown in Table 1. Of the five libraries screened, only one proved active. Of particular significance was the failure to observe any inhibition by the other, seemingly similar, libraries. The active mixture (library 1) contained the amino acid derivatives Asp (α) methyl ester; methyl esters of Gly, Lys, Ser, and Tyr; and the free acids Leu, Phe, and Pro, and the level of its activity was sufficiently high that it warranted deconvolution through the synthesis of sublibraries.
Table 1.
Five 2,080-member tetraurea libraries were synthesized for initial screening by the condensation of xanthene tetraisocyanate and the tabulated amino acid derivatives
Library 1 | Library 2 | Library 3 | Library 4 | Library 5 |
---|---|---|---|---|
Gly-OMe | Gly-OMe | Ala-OMe | Asp(tBu)-OtBu | Ala-OtBu |
Leu-OtBu | Val-OtBu | Ile-OtBu | Gly-OMe | Asp(tBu)-OtBu |
Phe-OtBu | Trp-OMe | Phe-OtBu | His-OMe⋅2 HCl | Glu(OtBu)-OMe |
Tyr-OMe | Thr(tBu)-OMe | Arg-OMe⋅2 HCl | Ile-OtBu | Gly-OMe |
Ser-OMe | His-OMe⋅2 HCl | Ser-OMe | Lys(Boc)-OMe | Ile-OtBu |
Lys(Boc)-OMe | Met-OMe | Lys(Boc)-OMe | Met-OMe | Lys(Boc)-OMe |
Asp(Me)-OtBu | Asp(tBu)-OMe | Asn-OtBu | Pro-OtBu | Ser-OMe |
Pro-OtBu | Asn-OtBu | Pro-OtBu | Val-OtBu | Tyr-OtBu |
All are monohydrochloride salts except as noted. Side-chain protecting groups are noted in parentheses.
For the first round of deconvolution, four sublibraries were synthesized, each missing two out of the eight building blocks listed above. Each of these four sublibraries contains only 666 compounds, and screening revealed that only two sublibraries were active (those that omitted Gly, Lys, and Ser methyl esters, and Leu free acid) at essentially the same concentrations as the initially active library. The implication was that these components were not essential. The other two sublibraries (those that omitted Asp (α) methyl ester, Phe and Pro free acids, and Tyr methyl ester) were inactive.
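The reasoning behind this omission round can be written as simple set logic; the following is a minimal sketch of it. If a sublibrary stays active, none of its omitted building blocks is required; if activity is lost, at least one omitted block matters. The text gives only the union of blocks omitted from the active and inactive sublibraries, so the pairing used here is a hypothetical assumption, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sublibrary:
    omitted: frozenset  # building blocks left out of the synthesis
    active: bool        # did the mixture still inhibit in the gel-shift assay?


def classify_blocks(all_blocks, sublibraries):
    """Split building blocks into dispensable, candidate, and untested sets."""
    dispensable = set()
    candidates = set()
    for sub in sublibraries:
        if sub.active:
            # Activity survives without these blocks, so none is required.
            dispensable |= sub.omitted
        else:
            # Activity was lost, so at least one omitted block matters.
            candidates |= sub.omitted
    candidates -= dispensable
    untested = set(all_blocks) - dispensable - candidates
    return dispensable, candidates, untested


BLOCKS = ["Asp", "Gly", "Leu", "Lys", "Phe", "Pro", "Ser", "Tyr"]

# Hypothetical pairing (assumption): the source does not say which two
# blocks were dropped together in each sublibrary.
ROUND_ONE = [
    Sublibrary(frozenset({"Gly", "Lys"}), active=True),
    Sublibrary(frozenset({"Ser", "Leu"}), active=True),
    Sublibrary(frozenset({"Asp", "Phe"}), active=False),
    Sublibrary(frozenset({"Pro", "Tyr"}), active=False),
]

dispensable, candidates, untested = classify_blocks(BLOCKS, ROUND_ONE)
print(sorted(dispensable))
print(sorted(candidates))
```

Whatever pairing is assumed within the active and inactive groups, the outcome is the same: Gly, Leu, Lys, and Ser are dispensable, and Asp, Phe, Pro, and Tyr remain as candidates for the next round.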
With four building blocks only 136 combinations are possible, and because two or more building blocks were important, 4 of the 136 possibilities could be excluded (those presenting the same building block at all four positions), leaving 132 different compounds. At this point assumptions were made to reduce the remaining possibilities and minimize synthesis. The first one (admittedly, no more than a guess) was that no single building block occupied three out of the four positions. The validity of the assumption would soon be put to the test, and, if it failed, the problem could be revisited and quickly corrected. This eliminated a large number of structures (48, more than a third), leaving 84 possibilities. The deconvolution sequence (Scheme S4), which specifies the building blocks first at the 4 and 5 positions and then at the 2 and 7 positions, was followed. This method accessed 56 of 84 (two-thirds) of the remaining structural possibilities. If the activity was not due to any of these 56 compounds, those that were excluded by the above assumptions would then be synthesized. Happily, two of the resulting sublibraries retained the previous activity; an estimate for the activity could be made at micromolar concentrations. All of the 19 possibilities are listed in Scheme S5. In short, those compounds that featured a combination of Phe and/or Asp derivatives at the 4,5 positions with Pro and/or Tyr derivatives at the 2,7 positions or those that contained Phe and/or Pro at the 4,5 positions and Asp and/or Tyr derivatives at the 2,7 positions were identified as being responsible for the activity. One structure is common to both active mixtures (compound 11), and it indeed was one of the most active molecules in the assay, but all of the other compounds were accessed by synthesis.
Scheme 5.
Deconvolution of an active tetraurea library. D, Asp (α) methyl ester; F, Phe; P, Pro; Y, Tyr methyl ester.
The six sublibraries were obtained by reacting 6 with two components, then separating the three disubstituted derivatives at the 4 and 5 positions (two symmetrical and one unsymmetrical), and then reacting the pure compounds with the other two building blocks at the 2,7 positions. This resulted in mixtures of three or four compounds or six sets of sublibraries (mixtures 1A–1C and 2A–2C, Scheme S5). The screening of these six revealed that the two sets of four tetraureas (1B and 2B), as well as two sets of three (1C and 2C), were inactive. Activity was found in two groups (mixtures 1A and 2A, Scheme S5) containing Phe at both the 4 and 5 positions. Synthesis and isolation of pure compounds according to the methods described in Scheme S3, followed by a final round of screening, indicated that two compounds (Fig. 5) were active in the 5- to 10-μM concentration range—11 and the slightly more active 12.
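The composition of the six sublibraries can be checked by enumeration. The sketch below assumes that the mirror symmetry of the core (positions 2 and 7 exchanged together with positions 4 and 5) is the only source of duplicate structures, and it uses the one-letter codes of Scheme S5. The function names are illustrative.

```python
from itertools import combinations_with_replacement, product


def canonical(pos2, pos4, pos5, pos7):
    """Canonical label under the assumed mirror symmetry (2<->7, 4<->5)."""
    return min((pos2, pos4, pos5, pos7), (pos7, pos5, pos4, pos2))


def sublibraries(inner_blocks, outer_blocks):
    """Map each purified 4,5-diurea to the set of tetraureas it gives.

    inner_blocks: the two blocks reacted first (4 and 5 positions).
    outer_blocks: the two blocks reacted afterwards (2 and 7 positions).
    """
    result = {}
    for inner in combinations_with_replacement(inner_blocks, 2):
        pos4, pos5 = inner
        result[inner] = {
            canonical(pos2, pos4, pos5, pos7)
            for pos2, pos7 in product(outer_blocks, repeat=2)
        }
    return result


# D = Asp (alpha) methyl ester, F = Phe, P = Pro, Y = Tyr methyl ester.
series_1 = sublibraries(inner_blocks="FD", outer_blocks="PY")
series_2 = sublibraries(inner_blocks="FP", outer_blocks="DY")

sizes = sorted(len(s) for series in (series_1, series_2) for s in series.values())
all_compounds = set().union(*series_1.values(), *series_2.values())
shared = series_1[("F", "F")] & series_2[("F", "F")]

print(sizes)
print(len(all_compounds))
print(shared)
```

Under these assumptions each symmetrical 4,5-diurea gives three tetraureas and each unsymmetrical one gives four, for 19 unique structures in total. The two mixtures built on the Phe/Phe diurea share exactly one compound, carrying Phe at the 4 and 5 positions and the Tyr derivative at the 2 and 7 positions.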
Figure 5.
Structures of the two xanthene tetraureas active in binding to DNA.
DISCUSSION
The most likely mode of binding, given the lack of sequence specificity, was intercalation between the DNA base pairs. The structures of the two active compounds were consistent with this hypothesis. The core itself could intercalate, and both of the actives contained aromatic Phe residues at the 4 and 5 positions. In fact, 11 contained four building blocks capable of intercalation. In addition, two families of intercalators are known that have skeletons resembling xanthene (acridines and actinomycins) (26). Studies of these systems indicated that seemingly small structural changes can lead to large differences in potency (27), similar to the observations made here. In the case of xanthene tetraureas the most active compound 12 is, like DNA, negatively charged under the assay conditions. The selection process that identified two active compounds from a pool of approximately 10,000 required only 15 assays and is therefore one of the most efficient procedures in that regard.
That only two residues ended up being essential was surprising but was consistent with the screening results of the initial large libraries, because none of the inactive 2,080-component libraries contained Phe with Tyr methyl ester or Asp methyl ester. Although assumptions were made during the deconvolution, the activity remained roughly consistent throughout the process.
In conclusion, approximately statistical ensembles of tetraureas were synthesized in solution by reacting the isolated tetraisocyanate of xanthene with a number of amine building blocks. A successful synthetic protocol was devised to allow access to individual library components. The tetraureas were tailored to produce compounds that were likely to interact with biomolecules, and activity was found in an assay measuring binding of DNA to two transcription factors. Although the net result of our screening and deconvolution efforts was the discovery of two DNA intercalators, rather than more novel transcription factor binders, our efforts nevertheless demonstrated the use of a synthetic and numerical deconvolution strategy to identify single, active components in a library of more than 2,000 tetraureas.
Acknowledgments
We thank the Skaggs Foundation for funding and the National Science Foundation for a fellowship to K.E.P. Work at Caltech was supported by the Beckman Institute of the California Institute of Technology, by National Institutes of Health Grant HD-05753, and by a subcontract from National Science Foundation Science and Technology Grant BIR9214821. J.X. was supported by National Institutes of Health Training Grant HD-07257.
References
- 1. Balkenhol F, von dem Bussche-Hünnefeld C, Lansky A, Zechel C. Angew Chem Int Ed Engl. 1997;35:2288–2337.
- 2. Boger D L, Chai W, Ozer R S, Anderson C. Bioorg Med Chem. 1997;7:463–468, and references therein.
- 3. Carell T, Wintner E A, Sutherland A J, Rebek J Jr, Dunayevskiy Y M, Vouros P. Chem Biol. 1995;2:171–183. doi: 10.1016/1074-5521(95)90072-1.
- 4. Gravert D J, Janda K D. Chem Rev. 1997;97:489–509. doi: 10.1021/cr960064l.
- 5. An H, Cummins L L, Griffey R H, Bharadwaj R, Haly B D, Fraser A S, Wilson-Lingardo L, Risen L M, Wyatt J R, Cook P D. J Am Chem Soc. 1997;119:3696–3708.
- 6. Curran D P. Chemtracts: Org Chem. 1996;9:75–87.
- 7. Shipps G W Jr, Spitz U P, Rebek J Jr. Bioorg Med Chem. 1996;4:655–657. doi: 10.1016/0968-0896(96)00059-4.
- 8. Dunayevskiy Y M, Vouros P, Wintner E A, Shipps G W, Carell T, Rebek J Jr. Proc Natl Acad Sci USA. 1996;93:6152–6157. doi: 10.1073/pnas.93.12.6152.
- 9. Olsen G L, Bolin D R, Bonner M P, Bös M, Cook C M, Fry D C. J Med Chem. 1993;36:3039–3049. doi: 10.1021/jm00073a001.
- 10. DeLucca G V. Bioorg Med Chem Lett. 1997;7:495–504.
- 11. Carell T, Wintner E A, Bashir-Hashemi A, Rebek J Jr. Angew Chem Int Ed Engl. 1994;33:2059–2061.
- 12. Gottesfeld J M, Neely L, Trauger J W, Baird E E, Dervan P B. Nature (London) 1997;387:202–205. doi: 10.1038/387202a0.
- 13. Shimizu K D, Dewey T M, Rebek J Jr. J Am Chem Soc. 1994;116:5145–5149.
- 14. Birman V B, Chopra A, Ogle C A. Tetrahedron Lett. 1996;37:5073–5076.
- 15. Dugas H. Bioorganic Chemistry. New York: Springer; 1996.
- 16. Hamann B C, Branda N R, Rebek J Jr. Tetrahedron Lett. 1993;34:6837–6840.
- 17. Mohamadi F, Richards N G J, Guida W C, Liskamp R, Lipton M, Caufield C, Chang G, Hendrickson T, Still W C. J Comput Chem. 1990;11:440–467.
- 18. Hamann B C. Ph.D. Thesis. Cambridge: Massachusetts Institute of Technology; 1996.
- 19. Shipps G W. Ph.D. Thesis. Cambridge: Massachusetts Institute of Technology; 1997.
- 20. Capson T L, Poulter C D. Tetrahedron Lett. 1984;25:3515–3518.
- 21. Greene T W, Wuts P G M. Protective Groups in Organic Synthesis. New York: Wiley; 1991.
- 22. Kaiser C, Weinstock J. Org Syn Coll Vol. 6:910–913.
- 23. Calzone F J, Höög C, Teplow D B, Cutting A E, Zeller R W, Britten R J, Davidson E H. Development. 1991;112:335–350. doi: 10.1242/dev.112.1.335.
- 24. Wang D G-W, Kirchhamer C V, Britten R J, Davidson E H. Development. 1995;121:1111–1122. doi: 10.1242/dev.121.4.1111.
- 25. Calzone F J, Thézé N, Thiebaud P, Hill R L, Britten R J, Davidson E H. Genes Dev. 1988;2:1074–1078. doi: 10.1101/gad.2.9.1074.
- 26. Silverman R B. The Organic Chemistry of Drug Design and Drug Action. San Diego: Academic; 1992.
- 27. Cain B F, Atwell G J, Denny W A. J Med Chem. 1975;18:1110–1116. doi: 10.1021/jm00245a013.