Author manuscript; available in PMC: 2010 Jul 15.
Published in final edited form as: Bioorg Med Chem Lett. 2009 May 3;19(14):3779–3782. doi: 10.1016/j.bmcl.2009.04.097

CSI-FID: High throughput label-free detection of DNA binding molecules

Karl E Hauschild a, James S Stover b, Dale L Boger b,*, Aseem Z Ansari a,*
PMCID: PMC2760306  NIHMSID: NIHMS113257  PMID: 19435662

Abstract

Determining the sequence specificity of DNA binding molecules is a nontrivial task. Here we describe the development of a platform for assaying the sequence specificity of DNA ligands using label-free detection on high density DNA microarrays. This is achieved by combining Cognate Site Identification (CSI) with Fluorescence Intercalation Displacement (FID) to create CSI-FID. We use the well-studied small molecule DNA ligand netropsin to develop this high throughput platform. Analysis of the DNA binding properties of protein- and small molecule-based libraries with CSI-FID will advance the development of genome-anchored molecules for therapeutic purposes.


A critical challenge at the interface of biology, chemistry, and molecular medicine is developing highly specific small molecules that target the genome to regulate its function1–10. A greater understanding of the principles that govern specificity will enhance our ability to predict their biological action on genomes, advancing the development of genome-anchored therapeutics. Similarly, understanding natural DNA binding proteins will help elucidate their regulatory function in cells. Given its importance, many methods have been developed to assess the binding of small molecules and proteins to DNA, including low throughput methods, such as nuclease protection11, 12, affinity cleavage13, and electrophoretic mobility shift assays14, 15 (EMSAs), mid throughput assays including fluorescence anisotropy16 and fluorescence resonance energy transfer17, label-free methods such as surface plasmon resonance (SPR)18 and photonics based approaches19, and high throughput assays including SELEX20 and DNA microarrays21–24. Among these, two high throughput methods that can determine the DNA binding specificity of biomolecules or synthetic ligands in a rapid and unbiased manner are the Cognate Site Identifier (CSI) arrays and the Fluorescent Intercalator Displacement (FID) assay (Fig 1A)23, 25, 26. CSI arrays determine the specificity and affinity of DNA ligands using a microarray displaying double-stranded DNA (dsDNA) hairpin oligonucleotides containing all permutations of up to 12 positional variants (~2 million sequences). For CSI, a fluorescently labeled DNA ligand of interest is applied to the array to provide a distribution of intensities related to DNA binding affinity. Sequences with the highest intensities are evaluated to identify a consensus motif. The array data therefore provide the full spectrum of binding specificities across the entire sequence space of a 12mer.
The FID assay is a plate-based technique that measures the amount of ligand-induced displacement of an intercalated dye (commonly ethidium bromide, EtBr) from a DNA hairpin to determine the sequence preference of an unlabeled DNA binding ligand. The assay measures affinities in solution and provides a rapid means of measuring binding affinities (Fig 1A).

Figure 1. CSI-FID combines CSI and FID for label free detection on highly complex DNA libraries.

A. The left panel illustrates the CSI array technology. First, a high density oligonucleotide array (up to two million sequence variants) is synthesized using maskless array synthesis. The oligos are hairpinned to form dsDNA, and a fluorescently tagged DNA-binding molecule is applied to the array. A readout of fluorescent intensities is then used to generate a sequence motif from the most highly bound DNA sequences. The right panel shows the procedure for the FID assay. First, a library of hairpinned DNA oligos is arrayed in 96 well plates, with one DNA hairpin sequence per well. EtBr is then added to each well and the fluorescence is measured. Next, an unlabeled DNA-binding molecule of interest is applied to the plate and the displacement of EtBr from the DNA is determined. The sequences with the highest EtBr displacement bind the molecule of interest with the highest affinity, and these sites are used to generate a consensus motif for the molecule. B. Depiction of FID being adapted for use on CSI arrays to generate the CSI-FID assay. First, EtBr is incubated with the CSI array, then an unlabeled molecule of interest is applied to the array. The ratio of the displaced to the undisplaced EtBr intensity for each DNA feature provides a measurement of the affinity of the small molecule for each DNA sequence. This technology measures the sequence specific binding of unlabeled DNA-binding molecules on CSI microarrays bearing highly complex DNA libraries.

Both CSI and FID assays have been used successfully to determine the specificity and affinity of several DNA binding small molecules, as well as triplex forming oligonucleotides, proteins, and polyamides23, 26–30. These methods offer complementary strengths toward the goal of understanding DNA ligand specificity. CSI can be used to interrogate the entire sequence space of at least 12mer DNA binding sites with a high dynamic range, and FID offers the benefit of label-free detection. Both approaches can also be used to determine DNA dissociation constants for the DNA ligand of interest23, 25–28. However, with CSI the detection of DNA binding factors on the dsDNA array depends on either direct fluorescent labeling or indirect detection methods. This hinders its ability to analyze the DNA binding of unlabeled proteins and small molecules. Also, the labeling of small molecules or proteins with fluorescent tags may perturb their DNA binding properties. With FID, the major limitation is the DNA library size, with most assays being performed on all permutations of 5mer DNA (512 sequences). Larger DNA binding sites are possible; however, these typically use only a subset of the total library members to avoid an exponential increase in the cost of synthesis and purification of complex (>5mer) libraries. By combining CSI and FID we can overcome the limitations of both, utilizing the label-free detection of ligand-DNA interactions on arrays with a 3–4 orders of magnitude increase in DNA library complexity, which is the genesis of CSI-FID (Fig 1B). Using CSI-FID, we examined the comprehensive binding profile of netropsin, a minor groove DNA binding small molecule that exhibits antiviral and antitumor properties (Fig 2A)31. Netropsin has served as a model for a class of sequence specific minor groove DNA binding agents32, 33, and previous FID assays have shown that it is an excellent candidate for assay validation of CSI-FID26.
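The library sizes quoted for FID can be reproduced by counting each duplex once. The following is a minimal Python sketch, assuming that a sequence and its reverse complement are treated as the same double-stranded site; the function names are illustrative and do not come from the original work.

```python
from itertools import product

# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_duplex_sites(k: int) -> int:
    """Count k-mers, merging each sequence with its reverse complement.

    Assumption: both strands of a duplex describe the same binding site,
    so only the lexicographically smaller of the two is kept.
    """
    canonical = set()
    for bases in product("ACGT", repeat=k):
        seq = "".join(bases)
        canonical.add(min(seq, reverse_complement(seq)))
    return len(canonical)


for k in (4, 5):
    print(f"{k}mer: {4 ** k} strand sequences, {count_duplex_sites(k)} duplex sites")
```

Under this assumption the count for 5mers is 512, matching the library size cited for plate-based FID.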

Figure 2. CSI-FID analysis of the small molecule DNA ligand netropsin.

A. The structure of netropsin. B. Histograms of the intensities derived from CSI-FID with EtBr (blue) and EtBr + netropsin (yellow). On the left side of the histogram the yellow region shows the displacement of EtBr by netropsin; green shows the overlap between both arrays. C. A displacement display of all 7mer probes (using FD = 1−(1−Rseq)/(1−RMax), where Rseq is the ratio of the displaced to the undisplaced EtBr intensity of the sequence, and RMax is the ratio of the sequence with the highest fluorescent displacement). This display shows a clear trend of a preference of netropsin for increasingly AT-rich DNA. D. A logo diagram constructed from the top netropsin DNA binding sequences (bracket). E. A sequence specificity landscape (SSL) display generated for the motif 5′-WWWWWW-3′ (W = A/T). SSLs display sequences that match the consensus on the inner ring, with mismatches to the consensus on the outer rings. The height of each sequence is calculated using 1−FD. Netropsin shows a clear preference for DNA containing AT stretches greater than 4 bp (center ring red and yellow peaks, 1 mismatch ring ridgeline and peaks). The valleys indicated by the red arrows in the 1 mismatch ring are regions of weak netropsin DNA binding and are dominated by sequences that contain 2–3 bp AT stretches (WWSWWW or WWWSWW).
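The FD expression in the Figure 2 caption can be written out as a short calculation. This is a sketch under stated assumptions: the ratio R is taken as displaced intensity divided by undisplaced intensity, so the sequence with the highest displacement has the lowest ratio, and the three ratios below are made-up illustrative values, not measured data.

```python
def fluorescence_displacement(r_seq: float, r_max: float) -> float:
    """FD = 1 - (1 - R_seq) / (1 - R_max), as written in the Figure 2 caption.

    r_seq: displaced/undisplaced EtBr intensity ratio for one sequence.
    r_max: the same ratio for the sequence with the highest displacement.
    """
    if r_max >= 1.0:
        raise ValueError("r_max must be below 1, i.e. some EtBr must be displaced")
    return 1.0 - (1.0 - r_seq) / (1.0 - r_max)


def ssl_height(r_seq: float, r_max: float) -> float:
    """Height used in the landscape display, 1 - FD."""
    return 1.0 - fluorescence_displacement(r_seq, r_max)


# Hypothetical ratios for illustration only.
ratios = {"AATTAAT": 0.80, "AAGTTAA": 0.90, "GCGCGCG": 1.00}
r_max = min(ratios.values())  # lowest ratio = highest displacement (assumption)

for seq, r in ratios.items():
    fd = fluorescence_displacement(r, r_max)
    print(f"{seq}: FD = {fd:.2f}, SSL height = {ssl_height(r, r_max):.2f}")
```

With this definition the best-displacing sequence has FD near 0 and a landscape height near 1, while a sequence showing no displacement has FD of 1 and a height of 0.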

Development and optimization of CSI-FID

Successful implementation of CSI-FID arrays required adapting the dye displacement ability of FID for a CSI microarray platform. The intercalating dye EtBr was chosen as it has several desirable properties: it increases in fluorescence upon DNA binding, thereby decreasing background signal, equilibrates rapidly with DNA, and has low sequence specificity26, 34. For the CSI-FID assay we designed an array that contained all possible 9 base pair (bp) DNA permutations. This makes the low sequence specificity of EtBr especially important for optimal performance of the assay, as the EtBr dye should be bound to each dsDNA probe. Initial titrations of EtBr concentrations in binding buffer (100 mM NaCl, 100 mM Tris pH 8.0) with the CSI-FID arrays indicated an optimal range of 3–6 μM EtBr, above its micromolar KD35. This EtBr amount generated a 10-fold signal-to-noise (S/N) ratio for the intensity of EtBr binding to dsDNA over the array surface. Larger concentrations of EtBr increased surface binding and decreased the S/N ratio, while lower concentrations showed minimal dsDNA binding.

CSI-FID for the analysis of netropsin DNA binding

To assess assay performance, netropsin DNA binding was examined using a CSI-FID array. First, EtBr at 6 μM was incubated with the array for 1 hour. Subsequently, 3 μM netropsin (KD of 1–100 nM26, 36) was added to the EtBr solution, and the array was incubated for another hour to allow the binding reaction to reach equilibrium. To account for any sequence bias of EtBr, and as a control for displacement, a second array was run with EtBr alone. The CSI-FID arrays were then imaged with a readily available 5 micron microarray scanner using a standard 532 nm (Cy3 compatible) laser.

For both arrays we obtained a distribution of intensities indicating EtBr bound to the dsDNA probes. The histogram for the EtBr array shows a Gaussian distribution centered on 100% EtBr binding (Fig 2B), whereas the histogram for the netropsin array shows a subset of sequences with a distinct decrease in EtBr binding (Fig 2B, yellow bars on left of center), demonstrating the sequence dependent displacement of EtBr by netropsin. For full displacement of EtBr by netropsin at its binding site, a fluorescence decrease of 23–29% was expected, given a 4–5 bp netropsin binding site within the 17 bp of each DNA hairpin (9 bp of variable region plus 4 bp of constant flanking sequence on either side). The data indicate that a displacement of 20% was obtained for the best netropsin probes, close to the maximum allowable for this array design. Further analysis of the netropsin CSI-FID data indicated that a library of 7mer sequences (16,384 members) was sufficient to represent the full binding profile of netropsin. Using 9mer arrays therefore had the added benefit of increasing the number of internal replicates and adding greater sequence context for each 7mer probe. When the netropsin displacement data are plotted, there is a clear preference for netropsin binding to AT-rich DNA (Fig 2C). A sequence motif obtained from the strongest netropsin binding sites further confirms this result (Fig 2D).
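The expected 23–29% decrease and the 7mer library size follow from simple arithmetic. The Python sketch below assumes that EtBr signal is spread evenly over the 17 bp of each hairpin, so that the fractional loss equals the fraction of base pairs covered by netropsin; the replicate count is a strand-specific tally that does not merge reverse complements. Variable names are illustrative.

```python
VARIABLE_BP = 9   # variable region of each hairpin
FLANK_BP = 4      # constant flanking sequence on either side
hairpin_bp = VARIABLE_BP + 2 * FLANK_BP  # 17 bp in total


def expected_displacement(site_bp: int, duplex_bp: int = hairpin_bp) -> float:
    """Fraction of EtBr signal lost if a site of site_bp is fully cleared.

    Assumption: EtBr is distributed uniformly along the duplex.
    """
    return site_bp / duplex_bp


low = expected_displacement(4)
high = expected_displacement(5)
print(f"expected decrease: {low:.1%} to {high:.1%}")

# Size of the 7mer library and how often each 7mer occurs on a 9mer array.
n_7mers = 4 ** 7                              # 16384
windows_per_9mer = VARIABLE_BP - 7 + 1        # 3 overlapping 7mer windows
occurrences_per_7mer = (4 ** 9) * windows_per_9mer // n_7mers  # 48
print(n_7mers, windows_per_9mer, occurrences_per_7mer)
```

Under these assumptions each 7mer is sampled in 48 probe windows on an all-9mer array, which illustrates the internal replication described in the text.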

Sequence specificity landscapes reveal insight into netropsin binding specificity

To further distill the sequence binding preferences of netropsin, we displayed all binding intensities in a sequence specificity landscape (SSL) format (Fig 2E)37. SSLs display all sequences that are a perfect match for a chosen motif in the innermost ring, and subsequent rings display those sequences which contain mismatches to the chosen motif. The height of each peak corresponds to the fluorescent displacement (FD) of sequences on the CSI-FID array. The SSL display allows an unbiased and comprehensive analysis of the entire binding data, which is particularly beneficial for netropsin as most motif finding algorithms are unable to identify motifs using input sequences less than 8 bp in length38, 39.

For netropsin, the highest peaks (Fig 2E, red to yellow) are present in the innermost ring, indicating that netropsin prefers DNA regions with a high AT content (perfect match to consensus 5′-WWWWWW-3′). Interestingly, in the 1 mismatch ring, the majority of the ridgeline (Fig 2E, light blue) contains DNA sequence stretches with 4 or more AT bp (WSWWWW or SWWWWW), with some higher peaks interspersed on the ridge (Fig 2E). However, the sequences present in the valley regions (dark blue, red arrows in Fig 2E) are dominated by DNA sequences with only 2 to 3 bp AT stretches (WWSWWW or WWWSWW). These results indicate a strong preference of netropsin for DNA with 4 or more AT bp and agree well with previous studies on the sequence specificity of netropsin13, 26, 31.
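The relationship between the landscape rings and AT stretch length can be illustrated with a short Python sketch. It assumes that the ring index is simply the number of S (G/C) positions relative to the all-W motif, and the example sequences are arbitrary; none of these helper names come from the original work.

```python
AT = set("AT")


def to_ws(seq: str) -> str:
    """Reduce a sequence to W (A/T) and S (anything else, i.e. G/C)."""
    return "".join("W" if base in AT else "S" for base in seq.upper())


def mismatches_to_all_w(seq: str) -> int:
    """Number of positions that break an all-W motif (the ring index)."""
    return to_ws(seq).count("S")


def longest_at_run(seq: str) -> int:
    """Length of the longest uninterrupted A/T stretch."""
    best = run = 0
    for base in seq.upper():
        run = run + 1 if base in AT else 0
        best = max(best, run)
    return best


# Arbitrary example 6mers, one per pattern discussed in the text.
for seq in ("AATTAA", "AGTTAA", "GATTAA", "AAGTTA", "AATGAA"):
    print(seq, to_ws(seq), mismatches_to_all_w(seq), longest_at_run(seq))
```

In this sketch the ridgeline patterns WSWWWW and SWWWWW keep an AT run of 4 or 5 bp, whereas the valley patterns WWSWWW and WWWSWW split the site into runs of at most 3 bp.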

CSI-FID versus solution-based assays

Binding to solid-surface immobilized oligonucleotides, as in CSI-FID, can be affected by mass transfer, probe density, surface characteristics, and washing steps. The data obtained by CSI-FID were therefore compared to previously obtained solution-based netropsin FID data26. For this analysis we parsed both datasets to represent all 4mer binding sites (136 members). Comparisons indicate that a clear correlation exists between both datasets (R² of 0.76, Fig 3A). Of note is that all 10 of the possible 4mer AT-rich sequences are represented in the top 10 binders for both datasets. There is also a distinct step (decrease in affinity) when moving from the top CSI-FID sequences to those containing even one GC bp (Fig 3B). This comparison indicates that surface-tethered probes yield results similar to those of solution-based methodologies. Taken together, these results represent a strong validation of the specificity data obtained by the CSI-FID assay.
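The count of 10 AT-only 4mer sites, and the kind of R² statistic used for the comparison, can be sketched in Python. The assumptions are that reverse complements are merged into one site and that R² is the squared Pearson correlation coefficient; the source does not state how R² was computed, and the function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def at_only_sites(k: int = 4) -> list:
    """All A/T-only k-mers, with reverse complements merged into one site."""
    sites = set()
    for bases in product("AT", repeat=k):
        seq = "".join(bases)
        sites.add(min(seq, reverse_complement(seq)))
    return sorted(sites)


def r_squared(x, y) -> float:
    """Squared Pearson correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


print(len(at_only_sites(4)))  # number of distinct AT-only 4mer duplex sites
```

In use, `r_squared` would be called with the per-site scores from the two assays listed in the same site order; no such scores are reproduced here.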

Figure 3. A comparison of netropsin data from the solution-based FID assay to the CSI-FID assay.

A. A correlation plot of all 4mer DNA sequences for CSI-FID versus FID. The plot shows a clear correlation (R² = 0.76) between the two netropsin datasets and indicates that the data obtained from CSI-FID are largely unaffected by possible surface-based effects. B. A displacement display for the 4mer CSI-FID data showing a distinct step when moving from sites with 4 bp AT stretches (10 sites) to sites with 3 and 2 bp AT stretches.

CSI-FID: complex dsDNA libraries and label free detection

While there are many assays available for the study of small molecule-DNA and protein-DNA interactions, CSI-FID surmounts several inherent shortcomings of these techniques. CSI-FID can overcome the throughput and library complexity limitations inherent with other label-free detection assays by providing a high throughput assay capable of assessing ligand binding to large DNA libraries. CSI-FID is a rapid, technically non-challenging, cost effective, and adaptable assay for the label-free detection of DNA binding by natural or engineered DNA binding molecules.

In the future, CSI-FID will be applied to additional DNA targets, including complex mixtures of proteins and small molecule ligands. CSI-FID should therefore greatly enhance our ability to determine DNA binding motifs for unlabeled proteins and small molecules, which has direct applications for proteomic approaches and small molecule screening. We expect CSI-FID to contribute substantially to the understanding of ligand-DNA binding toward the development of genome-anchored therapeutics.

Supplementary Material

01

Acknowledgments

We thank Mary Ozers and Christopher Warren for helpful discussions of the manuscript, and Clayton Carlson for assistance with specificity landscapes. We gratefully acknowledge the support of the NIH (AZA: GM069420, DLB: CA041986, CA078045), March of Dimes (AZA: FY07-511), USDA (AZA: HATCH) as well as Shaw Scholar, W. M. Keck Foundation and Vilas Associate awards (A.Z.A.). K.E.H. was supported by an NSERC PGS fellowship.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References and Notes

1. Geierstanger BH, Wemmer DE. Annu Rev Biophys Biomol Struct. 1995;24:463. doi: 10.1146/annurev.bb.24.060195.002335.
2. Chaires JB. Curr Opin Struct Biol. 1998;8:314. doi: 10.1016/s0959-440x(98)80064-x.
3. Yang XL, Wang AHJ. Pharmacol Ther. 1999;83:181. doi: 10.1016/s0163-7258(99)00020-0.
4. Gottesfeld JM, Turner JM, Dervan PB. Gene Expr. 2000;9:77. doi: 10.3727/000000001783992696.
5. Neidle S. Nat Prod Rep. 2001;18:291. doi: 10.1039/a705982e.
6. Ptashne M, Gann A. Genes & Signals. Cold Spring Harbor Laboratory Press; New York: 2002.
7. Darnell JE Jr. Nat Rev Cancer. 2002;2:740. doi: 10.1038/nrc906.
8. Ansari AZ, Mapp AK. Curr Opin Chem Biol. 2002;6:765. doi: 10.1016/s1367-5931(02)00377-0.
9. Waring MJ. Sequence-Specific DNA Binding Agents. Royal Society of Chemistry; Cambridge: 2006.
10. Hauschild KE, Carlson CD, Donato LJ, Moretti R, Ansari AZ. In: Wiley Encyclopedia of Chemical Biology. Begley T, editor. John Wiley & Sons, Inc; New York: 2008.
11. Galas DJ, Schmitz A. Nucleic Acids Res. 1978;5:3157. doi: 10.1093/nar/5.9.3157.
12. Trauger JW, Dervan PB. Methods Enzymol. 2001;340:450. doi: 10.1016/s0076-6879(01)40436-8.
13. Dyke MWV, Hertzberg RP, Dervan PB. Proc Natl Acad Sci U S A. 1982;79:5470. doi: 10.1073/pnas.79.18.5470.
14. Fried M, Crothers DM. Nucleic Acids Res. 1981;9:6505. doi: 10.1093/nar/9.23.6505.
15. Garner MM, Revzin A. Nucleic Acids Res. 1981;9:3047. doi: 10.1093/nar/9.13.3047.
16. Heyduk T, Ma Y, Tang H, Ebright RH. Methods Enzymol. 1996;274:492. doi: 10.1016/s0076-6879(96)74039-9.
17. Heyduk T, Heyduk E. Nature Biotechnol. 2002;20:171. doi: 10.1038/nbt0202-171.
18. Brockman JM, Frutos AG, Corn RM. Anal Biochem. 1993;214:251.
19. Chan LL, Pineda M, Heeres JT, Hergenrother PJ, Cunningham BT. ACS Chem Bio. 2008;3:437. doi: 10.1021/cb800057j.
20. Tuerk C, Gold L. Science. 1990;249:505. doi: 10.1126/science.2200121.
21. Ren B, Robert F, Wyrick JJ, Aparicio O, Jennings EG, Simon I, Zeitlinger J, Schreiber J, Hannett N, Kanin E, Volkert TL, Wilson WJ, Bell SP, Young RA. Science. 2000;290(5500):2306–2309. doi: 10.1126/science.290.5500.2306.
22. Wang JK, Li TX, Lu ZH. J Biochem Biophys Methods. 2005;63:100. doi: 10.1016/j.jbbm.2005.03.006.
23. Warren CL, Kratochvil NC, Hauschild KE, Foister S, Brezinski ML, Dervan PB, Phillips GN Jr, Ansari AZ. Proc Natl Acad Sci U S A. 2006;103:867. doi: 10.1073/pnas.0509843102.
24. Mukherjee S, Berger MF, Jona G, Wang XS, Muzzey D, Snyder M, Young RA, Bulyk ML. Nature Genet. 2004;36:1331. doi: 10.1038/ng1473.
25. Tse WC, Boger DL. Acc Chem Res. 2004;37:61. doi: 10.1021/ar030113y.
26. Boger DL, Fink BE, Brunette SR, Tse WC, Hedrick MP. J Am Chem Soc. 2001;123:5878. doi: 10.1021/ja010041a.
27. Puckett JW, Muzikar KA, Tietjen J, Warren CL, Ansari AZ, Dervan PB. J Am Chem Soc. 2007;129:12310. doi: 10.1021/ja0744899.
28. Tse WC, Ishii T, Boger DL. Bioorg Med Chem. 2003;11:4479. doi: 10.1016/s0968-0896(03)00455-3.
29. Keles S, Warren CL, Carlson CD, Ansari AZ. Nucleic Acids Res. 2008;36:3171. doi: 10.1093/nar/gkn057.
30. Yeung BKS, Tse WC, Boger DL. Bioorg Med Chem Lett. 2003;13:3801. doi: 10.1016/j.bmcl.2003.07.005.
31. Kopka ML, Yoon C, Goodsell D, Pjura P, Dickerson RE. Proc Natl Acad Sci U S A. 1985;82:1376. doi: 10.1073/pnas.82.5.1376.
32. Dervan PB, Edelson BS. Curr Opin Struct Biol. 2003;13:284. doi: 10.1016/s0959-440x(03)00081-2.
33. White S, Szewczyk JW, Turner JM, Baird EE, Dervan PB. Nature. 1998;391:468. doi: 10.1038/35106.
34. Waring MJ. J Mol Biol. 1965;13:269. doi: 10.1016/s0022-2836(65)80096-1.
35. LePecq JB, Paoletti C. J Mol Biol. 1967;27:87. doi: 10.1016/0022-2836(67)90353-1.
36. Rentzeperis D, Marky LA, Dwyer TJ, Geierstanger BH, Pelton JG, Wemmer DE. Biochemistry. 1995;34:2937. doi: 10.1021/bi00009a025.
37. Warren CL, Carlson CD, Hauschild KE, Ozers MS, Qadir N, Bhimsaria D, Ansari AZ. doi: 10.1073/pnas.0914023107. Submitted for publication.
38. Das MK, Dai HK. BMC Bioinformatics. 2007;8(Suppl 7):S21. doi: 10.1186/1471-2105-8-S7-S21.
39. Tompa M, Li N, Bailey TL, Church GM, De Moor B, Eskin E, Favorov AV, Frith MC, Fu Y, Kent WJ. Nat Biotechnol. 2005;23:137. doi: 10.1038/nbt1053.
