Author manuscript; available in PMC: 2018 Sep 6.
Published in final edited form as: Nat Protoc. 2016 Mar 17;11(4):764–780. doi: 10.1038/nprot.2016.039

Automated screening for small organic ligands using DNA-encoded chemical libraries

Willy Decurtins 1, Moreno Wichert 1, Raphael M Franzini 2, Fabian Buller 3, Michael A Stravs 4, Yixin Zhang 5, Dario Neri 1,*, Jörg Scheuermann 1,*
PMCID: PMC6126613  EMSID: EMS79202  PMID: 26985574

Abstract

DNA-encoded chemical libraries (DECLs) are collections of organic compounds, which are individually linked to different oligonucleotides, serving as amplifiable identification barcodes. Since all compounds in the library can be identified by their DNA tag, they can be mixed and used in affinity capture experiments on target proteins of interest. In this protocol, we describe the screening process that allows the identification of the few binding molecules within the multiplicity of library members. First, the automated affinity selection process physically isolates binding library members. Second, the DNA codes of the isolated binders are PCR-amplified and subjected to high-throughput DNA sequencing. Third, the obtained sequencing data are evaluated using a C++ program and the results displayed using MATLAB software. The resulting selection fingerprints facilitate the discrimination of binding from non-binding library members. The described procedures allow the identification of small organic ligands to biological targets from a DECL within 10 days.

Keywords: Affinity-based selection, automated screening, combinatorial chemistry, DNA-encoded chemical libraries, drug discovery, enrichment of binders, high-throughput DNA sequencing, lead identification, magnetic particle processor, single- and dual-pharmacophore libraries

Introduction

Currently available methods for the discovery of novel small molecule protein binders

Virtually all drugs are characterized by their ability to selectively bind to one or more target proteins. In conventional drug discovery, once a protein has been validated as a drug target, the hit discovery process for small molecules begins with the high-throughput screening (HTS, nomenclature in Box 1)1–3 of individual compounds (one by one) in large collections of molecules, generally called chemical libraries. To discover binding molecules, an assay is needed that is compatible with automation and reveals a desired biomolecular interaction with the target protein of interest. The conventional screening of large chemical libraries by high-throughput methods works well for certain classes of protein targets (e.g., kinases3,4) and represents one of the drug discovery backbones in the pharmaceutical industry.

Box 1. Nomenclature.

DECL: DNA-encoded chemical library. General term for all types of DNA-encoded compound collections.
ESAC: Encoded self-assembling chemical library. Term for dual-pharmacophore libraries, which are constructed by self-assembly of two complementary sub-libraries.
HTDS: High-throughput DNA sequencing using a next-generation DNA sequencing technology.
HTS: High-throughput screening. One-by-one screening of large compound collections (= chemical libraries).
Set of building blocks: Indicates the spatial position where building blocks are incorporated (= diversity element20,21).
Building block: A chemical moiety which is chemically coupled to a growing molecule (the displayed compound).
Displayed compound: A chemical compound (resulting from building block conjugations) covalently linked to an oligonucleotide.
Single-pharmacophore DECL: A chemical compound displayed at the end of one oligonucleotide strand.
Dual-pharmacophore DECL: A pair of chemical moieties simultaneously displayed at the end of two assembled oligonucleotide strands.
Recorded DECL: DECL where the oligonucleotide part serves the purpose of identifying the attached compound.
Templated DECL: DECL where the oligonucleotide assembly also directs the building block in close proximity to the nascent structure, thus allowing its conjugation.

However, HTS procedures have several limitations: the synthesis, quality control and management of conventional chemical libraries are associated with high costs, lengthy procedures and complex logistics. These constraints de facto restrict the practice of HTS to large industries or to a few large consortia with sufficient economic resources. Furthermore, not all protein targets can be “drugged” with this procedure or can be produced in sufficient amounts for the individual screening of hundreds of thousands of compounds.

Further developments in drug discovery include fragment-based discovery approaches5,6 and virtual drug discovery7–9, where small libraries of soluble chemical fragments or large virtual collections of molecules, respectively, are screened for binders.

DNA-encoded chemical libraries

The recent development of DNA-encoded chemical libraries (DECLs)10–12 (Figure 1) has allowed for the creation and screening of libraries of very large size, which can no longer be handled in a “one well, one compound” fashion as in HTS. The main principle of DECLs, as suggested by Brenner and Lerner13 as well as by Gallop and co-workers14 in the early 1990s, is to directly link chemical building blocks to oligonucleotides. The identity of the linked building blocks can easily be determined from the DNA coding tags, since it is known which DNA sequence is associated with which building block. A number of different types of DECLs have been developed and can be classified as single-pharmacophore and dual-pharmacophore chemical libraries, depending on whether the displayed compounds are attached to one or two strands of a double-stranded oligonucleotide.

Figure 1.

Comparison of different DECL types. (a) Single-pharmacophore two building block “DNA-recorded” library20. (b) Single-pharmacophore three building block “DNA-templated” library27. (c) Dual-pharmacophore two building block ESAC library31. (d) Dual-pharmacophore two building block PNA/DNA hybrid library37.

In DNA-encoded single-pharmacophore libraries, small organic molecules are coupled to one DNA strand. In “DNA-recorded” single-pharmacophore libraries15–21 (Figure 1a), the oligonucleotide part mainly serves as a DNA barcode, allowing the identification of the individual chemical compounds. The split-and-pool synthesis approach18,20,22 has been shown to be a versatile tool for the incorporation of various types of chemical moieties into encoded libraries. In this synthetic strategy, sets of building blocks are chemically coupled to form more complex molecular structures. After each synthesis step, suitable oligonucleotides containing a coding sequence are added to the molecular entity, thus “recording” the identity of the individual compounds. Alternatively, when performing “DNA-templated” synthesis23–27, single-pharmacophore libraries are generated on a library of pre-formed oligonucleotide templates, which contain the DNA codes for the identification of the individual compounds (Figure 1b). At the same time, the hybridization of complementary oligonucleotide derivatives to the DNA template facilitates the transfer of building blocks to a nascent molecular structure. By bringing pairs of building blocks into close spatial proximity, the DNA hybridization step enables chemical reactions which normally do not work efficiently in water23. Furthermore, “DNA-templated” synthesis is potentially compatible with the execution of multiple rounds of library synthesis and selection. The use of a universal template28, e.g., by using the ambiguous base-pairing property of deoxyinosine, could enable the generation of larger libraries by DNA-templated synthesis. Alternative ways for constructing single-pharmacophore libraries are the yoctoreactor system18 and fluidic routing29.

Dual-pharmacophore DECLs feature pairs of chemical building blocks, attached to adjacent sites on oligonucleotide assemblies, such as the extremities of complementary DNA strands or the junction of two oligonucleotides hybridized to a common template. Individual chemical moieties are typically brought into close spatial proximity with flexible linkers, thereby facilitating their interaction with cognate binding sites on the target protein of interest. Once synergistic building blocks are identified, some synthetic effort is needed in order to find optimal linkers for the generation of binding molecules in the absence of DNA. Encoded self-assembling chemical (ESAC) libraries30,31 (Figure 1c) are formed by the combinatorial self-assembly of two complementary sub-libraries, carrying chemical moieties on the 5′ and the 3′ end, respectively, at the same side of a double-stranded DNA heteroduplex. For some applications, the use of peptide nucleic acids (PNAs)32,33 rather than DNA offers certain advantages, such as a larger variety of compatible chemical reactions. In dual-pharmacophore PNA libraries (Figure 1d), two PNA sub-libraries, each carrying a coding sequence and a chemical fragment, are hybridized to a complementary DNA template library, containing two specific coding regions32,34–38.

Advantages of DECLs for drug discovery applications

The use of DECL technology presents distinctive advantages, compared to classical HTS lead discovery:

  • A very large library size18,39 can be obtained, depending on the number of sets of building blocks used12. For example, 3 sets of building blocks with 1000 members each will yield a library with 1000³ = 10⁹ displayed compounds. A variety of DNA-compatible synthesis approaches is available40,41.

  • A DECL may be stored in only one small vessel, since the DNA barcodes allow the unambiguous identification of each library member. In contrast, members of conventional chemical libraries must be stored separately, typically in microtiter plates.

  • Only a minute amount of a DECL is necessary to perform affinity-based selections against a target protein. One selection is performed in only one reaction vessel. This reduces costs significantly and facilitates the parallel screening of different target proteins.

  • No protein-specific assays are needed, as selections with DECLs are solely based on affinity. The principle of affinity selections has been validated extensively for the discovery of target-specific biomacromolecules using a multitude of display technologies, e.g. antibody phage display technology42–44, mRNA display45,46, ribosome display47, yeast display48 as well as systematic evolution of ligands by exponential enrichment (SELEX) technology49,50.

  • The DNA linkage site of the small molecule compound serves as a “modifiable handle” for further medicinal chemistry optimization steps (e.g., introduction of solubility enhancing groups).

  • Structure-activity relationships (SAR) can be obtained from the selection results if structurally related compounds are incorporated into the library20.

  • Simultaneously binding fragment pairs can be obtained from dual-pharmacophore fragment-based selections31.

  • The affordable setup makes DECLs an ideal tool for the academic community and for small companies.
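The library-size arithmetic in the first point above can be sketched in a few lines of Python. This is an illustration only and not part of the published protocol; the function name `library_size` is hypothetical.

```python
from math import prod

def library_size(set_sizes):
    """Number of displayed compounds in a combinatorial library.

    set_sizes: number of building blocks in each set, e.g. [1000, 1000, 1000].
    Assumes every combination of one building block per set is present.
    """
    return prod(set_sizes)

# Three sets of 1000 building blocks each, as in the example from the text.
print(library_size([1000, 1000, 1000]))  # prints 1000000000
```

The same function covers sets of unequal size, e.g. `library_size([200, 300])` for a two-set library.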

Challenges and limitations of DECLs

  • Sets of building blocks used for library construction are typically limited to 2–3, as larger compounds are less “drug-like”. In libraries based on the combinatorial assembly of multiple sets of building blocks, library size grows exponentially with the number of sets of building blocks. However, the incorporation of multiple building blocks in a molecule leads to properties that are more likely to violate Lipinski’s rule of five51 and may limit pharmaceutical development possibilities.

  • The functional relevance of library size is well illustrated by the following analogy: Human monoclonal antibodies are routinely discovered using combinatorial phage display libraries, containing billions of different antibody clones42,43. The technology works exceedingly well and it is virtually always possible to isolate a specific antibody against any target protein of interest. However, if sub-aliquots of the antibody libraries are used, which contain only a few million antibodies, the process becomes much less efficient and good-quality antibodies are rarely isolated52.

  • In addition to library size, molecular design and library purity also play an important role in the functional quality of a DECL. While it is possible to purify conjugates containing only one building block to very high purity by HPLC, this procedure cannot be repeated after the introduction of a second building block in a split-and-pool synthesis approach. Although methods for enhancing library purity have been proposed53, the preparation of high-quality single-pharmacophore libraries remains challenging. In this respect, dual-pharmacophore ESAC libraries31 offer the advantage of superior purity, as each member of sub-libraries capable of self-assembly can be individually purified by HPLC and subsequently characterized by mass spectrometry.

  • The total number and diversity of the building blocks used may be as important as the total number of compounds in a library for selecting molecules with the desired functional properties.

  • Affinity-based selection assays may lead to target binders which do not exert a functional agonistic or antagonistic effect. On the other hand, allosteric binders may be obtained which might stay undiscovered in a target-specific functional assay.

  • For statistically relevant results, analysis of selections by high-throughput DNA sequencing (HTDS)54,55 should ideally be performed by oversampling the library, i.e. with sequence counts exceeding the library size56,57. Despite the advances in HTDS technology, it may be difficult to achieve oversampling with very large libraries.

  • Hit validation of selected binders may be cumbersome if too many hits are obtained from the affinity-based selection. Narrowing down the number of small organic molecules to be synthesized without the DNA handle will speed up hit-to-lead development. Technologies based on locked nucleic acid (LNA) display31 or hybridization onto DNA-slides30,34 have been proposed.

Selection types for DECLs

Different methodologies for affinity-based selections using DNA-encoded chemical libraries have been developed58,59. In this protocol, we describe affinity-based selections on solid support: A target protein is immobilized on a surface matrix and subsequently incubated with the DECL. Washing steps remove non-binding library members, while binding compounds are eventually removed from the protein and identified. Alternatively, both binding partners are allowed to interact in solution, and the formed complex is then captured on solid support. These solid-phase selection approaches, widely used in the field, yielded hit compounds against many classes of target proteins16,18,20,26,31.

Solution-phase methods rely on the detection of ligand-target interactions without physical removal of non-binding library members, thereby obviating the need for a solid support. Interaction-dependent PCR (IDPCR)60 aims at revealing binding events between DNA-linked small molecules and DNA-linked target proteins. Enabled by close proximity of the binding partners, the DNA tags may anneal and form a hairpin structure which contains the information of both the ligand and the target. For interaction determination using unpurified proteins (IDUP)61, an advanced set-up of this methodology, the covalent DNA linkage to the target protein is replaced by antibody binding or fusion proteins. The indirect target linkage might facilitate the application of this method to unpurified targets in crude cell lysates. Binder trap enrichment (BTE)59, in analogy to IDPCR, may allow the selection of DNA-linked small molecules together with a DNA-linked target protein. After reaching equilibrium in solution, the ligand-protein solution is diluted and emulsified. The binding molecules are thus trapped in water-in-oil droplets, where the DNA fragments are ligated and eventually identified after breaking the emulsion.

DNA-programmed affinity labeling (DPAL)62 makes use of irradiation in order to covalently label previously unmodified target proteins. A single DNA-tagged compound, hybridized to a shorter capture probe, is incubated with a range of target proteins. Binding of the small molecule brings the capture probe in close proximity to the target protein, thereby facilitating its covalent attachment by irradiation. The ligand-specific target protein may then be identified using the introduced DNA-tag. The DPAL labeling system can be applied for selections63, when a library of encoded small molecules, hybridized to a photo-reactive probe, is incubated with one target protein. Non-binding library members can be removed by enzymatic digestion, while covalently attached library members would be protected from digestion by proximity to the target.

Development and application of the protocol

The screening procedure described in this protocol was first applied in 2004 by S. Melkko, J. Scheuermann, C.E. Dumelin and D. Neri30. Since then, the protocol has been used to screen different DECLs, allowing for the identification of small molecule binders against alpha-1-acid glycoprotein31, B-cell lymphoma-extra large15,64, bovine serum albumin19, carbonic anhydrase II30, carbonic anhydrase IX17,31, human serum albumin15,19,20,22,30, interleukin 2 (ref. 16), matrix metalloproteinase 3 (ref. 65), prostate-specific membrane antigen20, rabbit serum albumin19, streptavidin66,67, tankyrase 1 (refs. 20,21), trypsin15,68,69 and tumor necrosis factor alpha15. Over the past decade, virtually all parts of the protocol have been improved significantly: sepharose beads were replaced by magnetic beads, manual selections were succeeded by automated ones, and selection analysis evolved from microarray-based decoding to more cost-efficient HTDS approaches. In the initial protocol, sepharose beads were used for affinity selections15,17,22,30,56,64–69. Target proteins were either covalently immobilized on cyanogen bromide (CNBr)-activated sepharose beads or bound to streptavidin-sepharose, following protein biotinylation. Later, magnetic beads with cobalt-based chemistry16 for the immobilization of His-tagged proteins or streptavidin-coated beads19–21,31 (SA beads) for the immobilization of biotinylated proteins were introduced. Magnetic beads enabled the transition from manual to automated selections19–21,31, using the KingFisher magnetic particle processor.

The means employed for selection decoding have improved drastically over the past years. Selection decoding provides sequence counts for individual library members before and after selection, thereby enabling the assessment of the enrichment factor achieved with the selection step. While we formerly used DNA microarrays30, the advent of HTDS technology54,55 has greatly influenced decoding, as efficient DNA sequencing procedures are crucially important for the success of DECL technology with large libraries. Our DECL team was the first to report the use of Roche's 454 HTDS technology for the decoding of DECL selections56, replacing formerly practiced microarray-based methods. Nowadays, the Illumina/Solexa technology is more often used57, enabling the decoding of larger libraries. For smaller libraries, however, decoding by microarray hybridization continues to be a convenient methodology37.
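One common way to turn sequence counts before and after selection into an enrichment factor is to compare the fraction of all reads assigned to a library member in the two samples. The Python sketch below illustrates that idea. The normalization by total read count and the function name `enrichment_factor` are assumptions made for illustration; the text does not specify the formula used by the authors.

```python
def enrichment_factor(count_before, total_before, count_after, total_after):
    """Ratio of a compound's read fraction after selection to its fraction before.

    Assumption: counts are normalized by the total number of reads in each
    sample. A compound never observed before selection has no defined ratio.
    """
    if count_before == 0 or total_after == 0:
        raise ValueError("enrichment is undefined without reads in both samples")
    # Cross-multiplied form of (count_after / total_after) / (count_before / total_before)
    return (count_after * total_before) / (count_before * total_after)

# Hypothetical counts: 10 reads in 1,000,000 before, 500 reads in 1,000,000 after.
print(enrichment_factor(10, 1_000_000, 500, 1_000_000))  # prints 50.0
```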

Considerations on selection parameters

Many selection parameters may be adjusted to individual needs. For example, the solid support can consist of a streptavidin-coated magnetic bead, binding a biotinylated protein, or an activated sepharose bead to which the target protein is covalently linked. An alternative to the selection automation using magnetic beads, as described in this protocol, is the application of sepharose columns that fit on liquid handlers (e.g., PhyTips from Phynexus).

Besides covalent protein binding and immobilization via biotinylation, tagging strategies such as His-tag, FLAG-tag, Strep-tag and GST-fusion70–74 may be used. We prefer the streptavidin-biotin system, since the biotinylation of a target protein can easily and reliably be obtained and the interaction is strong enough to prevent dissociation during the course of the affinity-based selection.

In solid-phase selections, the capacity of the beads eventually determines the amount of tagged protein that can be displayed. We usually perform selections with at least two different protein concentrations: First, at conditions where the beads are saturated with protein and second, at conditions with lower protein display. The latter allows for more stringency and the identification of higher-affinity hits from the DECL75. Optimally, known protein-ligand systems (e.g., carbonic anhydrases/sulfonamides) should be included as positive control selections.

One of the parameters to be examined is the concentration of the DECL. A higher input generally leads to more DNA recovery from the selection, as monitored by quantitative PCR methods. If a high amount of DNA can be recovered, the selection eluate can be used as starting material for further rounds of affinity-based selections on fresh target beads (so-called pseudo-rounds of selection). In addition, different incubation times and buffer compositions may be tested and stringency may be increased with the number of washing steps75.

Experimental design

Protocol overview

With this protocol, we aim at providing a generally applicable, inexpensive and fast procedure to obtain small molecule protein ligands from DECLs for pharmaceutical or chemical biology applications. The protocol is divided into three parts (Figure 2). First, we describe the affinity selection step, which physically separates binding molecules from non-binding library members. Second, the DNA part of selected binders is amplified in a two-step polymerase chain reaction (PCR) and subjected to Illumina HTDS. Third, the resulting DNA sequences are processed and analyzed, thereby revealing the relative enrichment of individual library members in relation to the target protein used. The protocol delineates the procedure for libraries comprising two sets of building blocks, yet it can easily be adapted to DECLs comprising 3 or even 4 sets of building blocks, as described in the last section.

Figure 2.

Overview of the screening process. In Part I, binding library members are identified from the multiplicity of library members. Part II describes the amplification of the eluted library members' DNA tags, followed by Illumina HTDS. The sequencing results are analyzed in Part III using a C++ program and displayed in MATLAB.

Part I: Affinity selection

In affinity selection procedures, the target protein of interest is typically immobilized on magnetic beads and subsequently incubated with the encoded chemical library (Part I in Figure 2). If fitting small organic ligands happen to be present in the DECL, they will be retained on the magnetic beads due to a binding interaction with the immobilized target protein. Non-binding library members are removed by several washing steps, while binding library members remain bound to the solid support and are eventually dissolved in Tris buffer during the last step of the affinity selection. The use of Tris buffer is compatible with freezing of selection material and with the subsequent PCR amplification procedure.

We perform selections using biotinylated proteins. The chemical biotinylation procedure with N-hydroxysuccinimide-biotin reagents is mild and reliable. Our biotinylation protocol, adapted from the supplier's instructions, is provided as Supplementary Method 1. After the biotinylation reaction, the biotinylated protein needs to be purified from the free biotin by size-exclusion purification and should be quality-controlled on an SDS-PAGE gel. The success of the biotinylation reaction can be assessed using a bandshift assay, or, for some proteins, MS analysis.

The selection buffer used for the affinity selection depends on the target protein. Most proteins can readily be stored and screened in phosphate-buffered saline (PBS). Standard PBS types, as used for the majority of selections, are detailed in the reagent setup section. If PBS cannot be used as selection buffer, e.g. for the screening of phosphatases, alternative buffers such as HEPES may be employed. For simplicity, all steps of the affinity selection protocol are described for PBS.

At the beginning of the selection procedure, the biotinylated target protein is immobilized on magnetic beads in PBS, followed by two washing steps with PBST-Biotin. The biotin blocks free binding sites on the SA beads in order to reduce selection of streptavidin binders from the library. Tween-20 is added to the selection solution in order to prevent the coagulation of magnetic beads and the sticking of beads to the plastic of plate and tip comb. Tween-20 possibly reduces background from false positive hits, which bind to the target protein by unspecific hydrophobic interactions.

Different types of magnetic beads can be considered for DECL selections: SA beads19–21,31, beads with cobalt-based chemistry16 for the immobilization of His-tagged proteins as well as beads for covalent protein immobilization. We have had good experience with SA beads, and these procedures are detailed in this protocol. Four types of SA beads are commercially available from Life Technologies: Dynabeads M-270 SA, M-280 SA, MyOne SA C1 and MyOne SA T1, varying in diameter and in whether the beads are pre-blocked with bovine serum albumin. Depending on the DECL and target protein in use, a different type of SA beads may be optimal. We mostly use Dynabeads MyOne SA T1 (refs. 20,31) and Dynabeads M-270 SA (refs. 19–21). For each selection, 0.1 mg of magnetic beads19–21,31 is used. The typical binding capacity varies with the type of beads. In the case of Dynabeads MyOne SA T1, 0.1 mg of beads has a binding capacity of 40 pmol of biotinylated peptide. However, lower or higher amounts of protein may also be considered for immobilization.

The number of washing steps may be adjusted for the individual selection. If a high-affinity ligand can be expected, the stringency (e.g., number and duration) of the washing steps can be increased. By contrast, if a lower binding affinity is expected, fewer washing steps of shorter duration should be considered. A good compromise was found in the use of 5 washing steps with a duration of 30 seconds each19–21,30.

Before a selection experiment, the DECL working solution needs to be prepared and the magnetic beads equilibrated to PBS. The Neri lab DECLs are stored in water as concentrated stock solutions, which are diluted to 5 nM working solutions, defined as concentration of individual library members multiplied by the number of library members (Preparation of the DECL working solution, steps 1| - 2|). Herring sperm DNA is added to the DECL as blocking agent in order to prevent unspecific binding of the library-oligonucleotides to the target protein, streptavidin, or the beads themselves. The magnetic beads are washed and resuspended in the appropriate amount of PBS (Washing of the magnetic beads, steps 3| - 5|).
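Because the 5 nM working concentration refers to the whole library, the concentration of each individual member depends on the library size. The Python sketch below works through this arithmetic. It is an illustration rather than part of the protocol: the function names are hypothetical, equal representation of all library members is assumed, and the 100 µl volume is taken from the plate loading scheme in Figure 3.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def per_member_concentration_nM(total_nM, n_members):
    """Concentration of one library member, assuming equal representation."""
    return total_nM / n_members

def copies_per_member(total_nM, n_members, volume_ul):
    """Approximate number of molecules of each library member in the given volume."""
    moles_per_member = (total_nM * 1e-9) * (volume_ul * 1e-6) / n_members
    return moles_per_member * AVOGADRO

# Hypothetical library of one million compounds at 5 nM total, 100 µl per selection.
print(per_member_concentration_nM(5, 1_000_000))       # nM per member
print(round(copies_per_member(5, 1_000_000, 100)))     # molecules per member
```

For this hypothetical library each member is present at about 5 fM, which corresponds to roughly 3 · 10⁵ molecules per member in one selection well.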

Originally, affinity selection assays using magnetic beads were performed manually with a magnetic rack (Manual affinity selection, step 6|(B)). Manual handling may introduce some operator bias, especially in the washing steps and when handling many selections in parallel. If short washing times are required, only a few (twelve or fewer) samples may be handled in parallel. With the adaptation of the protocol to the Thermo Scientific KingFisher magnetic particle processor, the quality, reproducibility and throughput of the selection assays were improved significantly (Automated affinity selection, step 6|(A)). The KingFisher device can process up to 24 samples per run, which typically takes 135 minutes. A standard affinity-selection protocol on the KingFisher device is depicted in Figure 3, while the corresponding program parameters are summarized in Table 1. The complete program, ready for import into the KingFisher BindIt software, is provided as Supplementary Software 1, together with an example protocol status report as Supplementary Data 1.

Figure 3.

Plate loading scheme. The upper panel shows two KingFisher 200 µl plates from above. As the magnetic particle processor transfers all magnetic beads contained in a row (e.g., A1 to A12) during each step, the wells are loaded in a row-wise fashion (e.g., target proteins in row B of plate 1). The respective solutions are filled into the wells as depicted in the lower panel: 200 µl per well for all washing steps and 100 µl of washed beads, target protein, library and Tris buffer. The numbers above the arrows indicate the incubation time of the beads at each step. In this set-up, each column allows the performance of an independent selection. While the handling by the magnetic particle processor is identical for all plate columns, individual selection parameters may be varied in terms of target protein, DECL type and general buffer composition.

Table 1. Standard KingFisher program for DECL affinity selections.

Column groups: Beginning of step (release time, release speed); Mixing/pause (mixing time, mixing speed); End of step (collect count, collect time).

Step (plate, row, content) | Release time [s] | Release speed | Mixing time [min] | Mixing speed | Collect count | Collect time [s]
p1 A beads   | -  | -      | 5   | Bottom mix | 5 | 10
p1 B target  | 30 | Medium | 30  | Medium     | 5 | 10
p1 C wash1   | 30 | Medium | 3   | Medium     | 5 | 10
p1 D wash2   | 30 | Medium | 3   | Medium     | 5 | 10
p1 E wash3   | 30 | Medium | 3   | Medium     | 5 | 10
p1 F library | 30 | Medium | 60  | Medium     | 5 | 10
p1 G wash1   | 30 | Medium | 0.5 | Medium     | 5 | 10
p1 H wash2   | 30 | Medium | 0.5 | Medium     | 5 | 10
p2 A wash3   | 30 | Medium | 0.5 | Medium     | 5 | 10
p2 B wash4   | 30 | Medium | 0.5 | Medium     | 5 | 10
p2 C wash5   | 30 | Medium | 0.5 | Medium     | 5 | 10
p2 D elution | 30 | Medium | 3   | Medium     | - | -

The options “Precollect”, “Pause” and “Postmix” are not used.
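As a cross-check of the program in Table 1, the step timings can be written down as data and summed. The Python sketch below does this under two assumptions that are not stated in the table: that each collect cycle lasts the listed collect time (so one step spends collect count × collect time on collection), and that plate and tip-comb movements are ignored. The names `PROGRAM` and `minimum_runtime_seconds` are hypothetical and unrelated to the BindIt software.

```python
# (plate, row, content, release time [s], mixing time [min], collect count, collect time [s])
# None marks a parameter that is not used in that step ("-" in Table 1).
PROGRAM = [
    ("p1", "A", "beads",   None, 5,   5,    10),
    ("p1", "B", "target",  30,   30,  5,    10),
    ("p1", "C", "wash1",   30,   3,   5,    10),
    ("p1", "D", "wash2",   30,   3,   5,    10),
    ("p1", "E", "wash3",   30,   3,   5,    10),
    ("p1", "F", "library", 30,   60,  5,    10),
    ("p1", "G", "wash1",   30,   0.5, 5,    10),
    ("p1", "H", "wash2",   30,   0.5, 5,    10),
    ("p2", "A", "wash3",   30,   0.5, 5,    10),
    ("p2", "B", "wash4",   30,   0.5, 5,    10),
    ("p2", "C", "wash5",   30,   0.5, 5,    10),
    ("p2", "D", "elution", 30,   3,   None, None),
]

def minimum_runtime_seconds(program):
    """Sum release, mixing and collect times; mechanical movements are not included."""
    total = 0.0
    for _plate, _row, _content, release_s, mixing_min, collect_count, collect_s in program:
        total += release_s or 0
        total += mixing_min * 60
        if collect_count is not None and collect_s is not None:
            total += collect_count * collect_s
    return total

print(minimum_runtime_seconds(PROGRAM) / 60)  # programmed time in minutes
```

Under these assumptions the programmed time is about 124 minutes, somewhat below the typical run time of 135 minutes quoted in the text; the remainder would plausibly be accounted for by instrument movements, which this sketch does not model.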

Part II: PCR and high-throughput sequencing

The DNA part of eluted binding library members is amplified in a two-step PCR procedure (PCR amplification of oligonucleotide tags, steps 7| - 19|) and the resulting DNA amplicons are subjected to Illumina HTDS (Illumina high-throughput sequencing, steps 20| - 24|). The first PCR (Figure 4, nt sequences provided in Supplementary Figure 1) amplifies the selected encoded library members for each selection separately. The DNA fragments, linked to compounds which have been recovered on beads at the end of the selection procedure (row D of plate 2 in Figure 3), serve as template for these PCR reactions. Primers IlluminaPCR1a and IlluminaPCR1b introduce two additional DNA codes suitable for the identification of individual affinity selections, thus allowing the parallel HTDS of different selection experiments on the same Illumina flow lane. These PCR 1 reactions are purified using a PCR purification kit, pooled to equimolar concentration and used as template for the next amplification step. The number of selections which may be pooled and analyzed in parallel on the same Illumina flow lane depends on both the library size and the desired sequence count per selection. The second PCR (Figure 4, nt sequences provided in Supplementary Figure 1), using primers IlluminaPCR2a and IlluminaPCR2b, eventually introduces the TruSeq adapter sequence55, which is required for HTDS on the Illumina HiSeq devices.
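The role of the two selection-specific codes can be illustrated with a small demultiplexing sketch in Python. The read layout used here (code 1 at the start of the read, code 2 at its end), the 4-nt code length, the example sequences and the function name `demultiplex` are all hypothetical; the actual positions and sequences are defined in Supplementary Figure 1, and the evaluation program used in the protocol is written in C++.

```python
from collections import Counter

def demultiplex(reads, selections, code1_len, code2_len):
    """Count reads per selection based on the pair (code 1, code 2).

    reads: iterable of read sequences (strings).
    selections: dict mapping (code1, code2) tuples to a selection name.
    Hypothetical layout: code 1 is a prefix and code 2 a suffix of each read.
    """
    counts = Counter()
    for read in reads:
        key = (read[:code1_len], read[len(read) - code2_len:])
        counts[selections.get(key, "unassigned")] += 1
    return counts

# Made-up codes and reads, for illustration only.
selections = {("ACGT", "TTAG"): "selection_1", ("GGCA", "TTAG"): "selection_2"}
reads = ["ACGTCCCCCCTTAG", "GGCACCCCCCTTAG", "ACGTCCCCCCTTAG", "TTTTCCCCCCTTAG"]
print(demultiplex(reads, selections, 4, 4))
```

Reads whose code pair matches no known selection are counted as "unassigned" rather than discarded silently, which makes it easy to spot pooling or primer problems.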

Figure 4.

Layout of the two-step PCR. In the first PCR reaction, selection specific codes (code 1, code 2) are added. These reactions are purified, pooled, and used as template for the second PCR reaction, which introduces DNA sequences required for Illumina sequencing.

HTDS technologies54,55 have greatly improved over the years while costs per sequence have decreased, making the technology affordable for academic institutions as well. Driven by efforts in genomic research, different and even more powerful HTDS technologies have been, and continue to be, developed, which may serve as an alternative to Illumina sequencing. The choice of a given HTDS technology for DECL selections also depends on the length of the DNA oligonucleotide to be sequenced (varying between 70 and 150 nt) and on the desired number of sequence reads, as hundreds of millions of DNA sequences may be required for large libraries.

Illumina sequencing is based on the amplification of “clustered” DNA strands, individually confined to small portions on the surface of the reaction chamber (i.e., an eight-lane flow cell)55. Each DNA cluster, originating from one individual DNA sequence, can be analyzed by the sequential incorporation of fluorescently labelled DNA nucleotides, followed by iterative scanning procedures (DNA sequencing by synthesis). As a good spatial scanning resolution of the clusters cannot be obtained if all clusters incorporate the same base at the same time (as is the case when a constant region of a DECL is sequenced), dummy random genomic DNA (e.g., the “PhiX Control v3” library) needs to be added to the DNA amplicon to be analyzed, in order to obtain optimal sequencing results.

In the past, sequence length restrictions and obtained numbers of sequences posed severe constraints. Nowadays, Illumina HTDS constitutes a suitable platform for DECL selections. Using the fourth version of Illumina HiSeq chemistry, 2 × 10^9 reads with a length of 125 nt may be obtained per flow cell in single-read mode. Since one flow cell comprises eight flow lanes, one flow lane provides 250 × 10^6 reads. Due to the addition of 30% dummy PhiX DNA, 175 × 10^6 reads per flow lane can be used in the best case. In general, it is desirable to over-sample the library size (e.g., by a factor of 10). This means that, for a library of 10^6 compounds, approximately 15 selections should best be sequenced on the same flow lane. Undersampling may be necessary for very large libraries (e.g., those with 4 or more sets of building blocks), but may still yield valuable hits.
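The read budget described above can be reproduced with a few lines of arithmetic. The following Python sketch is only an illustration; the function names and default values are assumptions taken from the numbers quoted in the text (2 × 10^9 reads per flow cell, eight lanes, 30% PhiX, 10-fold over-sampling), not part of the supplementary software.

```python
def usable_reads_per_lane(reads_per_flow_cell=2_000_000_000, lanes=8, phix_percent=30):
    """Reads per lane that remain for the DECL amplicon after the PhiX spike-in."""
    reads_per_lane = reads_per_flow_cell // lanes
    # Integer arithmetic avoids floating-point rounding for percentages.
    return reads_per_lane * (100 - phix_percent) // 100


def selections_per_lane(library_size, oversampling=10, **lane_kwargs):
    """Upper bound on the number of selections that fit on one flow lane."""
    reads_per_selection = library_size * oversampling
    return usable_reads_per_lane(**lane_kwargs) // reads_per_selection


if __name__ == "__main__":
    print(usable_reads_per_lane())      # 175000000
    print(selections_per_lane(10**6))   # 17
```

The best-case bound of 17 selections for a library of 10^6 compounds is consistent with the more conservative recommendation of approximately 15 selections per lane.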

Part III: Data analysis

HTDS delivers very large raw data files of up to 50 gigabytes per Illumina flow lane, which need to be analyzed (Data analysis, steps 25| - 35|). An overview of the analysis process is given in Figure 5. First, we convert the standard output *.fastq data files into *.fasta files, thereby deleting the two out of four lines per sequence containing the sequence quality information. This information is not needed for our purpose, since, by default, the evaluation program will count only intact sequences, i.e. full-length sequences with correct constant parts and codes of proper length.
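The conversion amounts to keeping the first two of every four lines. The Python sketch below illustrates this under the assumption that each FASTQ record occupies exactly four unwrapped lines, as is usual for Illumina output; the function name is hypothetical, and the exact FASTA header layout expected by the evaluation program is not specified here.

```python
def fastq_to_fasta(fastq_lines):
    """Yield FASTA lines from an iterable of FASTQ lines (4 lines per record).

    Line 1 (header, '@...') becomes '>...', line 2 (sequence) is kept,
    lines 3 ('+') and 4 (quality string) are dropped.
    """
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        position = i % 4
        if position == 0:
            yield ">" + line[1:]
        elif position == 1:
            yield line


if __name__ == "__main__":
    example = ["@read1", "ACGT", "+", "IIII"]
    print("\n".join(fastq_to_fasta(example)))
```

Because the function is a generator, it can be fed an open file handle and written out line by line, so even a raw data file of tens of gigabytes does not have to be held in memory.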

Figure 5.

Data processing workflow. Illumina HTDS raw data is converted from *.fastq to *.fasta. HTDS data, the structure file and the code lists are provided as input for the C++ program, which generates three output files. The normalized output file is imported into MATLAB. The selection fingerprint is obtained using a MATLAB script.

We developed the C++ evaluation program “count.cpp”, provided as Supplementary Software 2, which processes the *.fasta HTDS data based on a definition structure file (Box 2 and Supplementary Software 3). This C++ program can be compiled and used on the platform of choice (PC, Mac, Unix). The structure file contains the path to the *.fasta file (path_to_sequence_file) to be evaluated as well as a definition of the minimum length of the HTDS sequence (minimum_line_length) to be analyzed. The user may specify whether the applied DECL contains 2, 3 or 4 sets of building blocks (output_type) and how many errors are allowed per constant region (mismatch_limit, default=0). Four different coding positions (code1-4) are defined, together with the number of coding sequences per code (code1-4_count) and the individual start (code1-4_startpos) and end (code1-4_endpos) positions in the nt sequence (Supplementary Figure 1). Separate code lists, stored at the path indicated at path_to_code1-4_list, provide the sequences used in the coding positions. An example codelist can be found in Supplementary Software 4. Further, the constant regions between the codes are defined (const1-3), their start (const_region_1-3_startpos) and end (const_region_1-3_endpos) positions as well as the corresponding DNA sequence (const_region_1-3_seq). The example DECL setup provided in Supplementary Figure 1 and the corresponding structure file (Box 2 and Supplementary Software 3) show a 2-building block library31.

Box 2. Structure file.

Template structure file
path_to_sequence_file
minimum_line_length
code1_count  code2_count                 code3_count                code4_count
output_type
mismatch_limit
code1        code1_startpos              code1_endpos               path_to_code1_list
code2        code2_startpos              code2_endpos               path_to_code2_list
code3        code3_startpos              code3_endpos               path_to_code3_list
code4        code4_startpos              code4_endpos               path_to_code4_list
const1       const_region_1_startpos     const_region_1_endpos      const_region_1_seq 
const2       const_region_2_startpos     const_region_2_endpos      const_region_2_seq
const3       const_region_3_startpos     const_region_3_endpos      const_region_3_seq

Example structure file
sequences/HTDS_data.fasta
50
20           5                           575                        213
2bb 
0
x            1                           6                          codelists/codelist1.txt
y            87                          92                         codelists/codelist2.txt
z            31                          36                         codelists/codelist3.txt
$            58                          65                         codelists/codelist4.txt
1            7                           30                         GGAGCTTCTGAATTCTGTGTGCTG
2            37                          57                         CGAGTCCCATGGCGCAGCTGC
3            66                          86                         CACGGATCCATTCGATGCAGG
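To make the layout of Box 2 concrete, the following Python sketch reads a structure file into a dictionary. It is not the parser used in count.cpp; the function name and dictionary keys are assumptions, fields are assumed to be whitespace-separated (so paths must not contain spaces), and positions are assumed to be 1-based and inclusive, which matches the lengths of the constant regions in the example.

```python
EXAMPLE = """\
sequences/HTDS_data.fasta
50
20 5 575 213
2bb
0
x 1 6 codelists/codelist1.txt
y 87 92 codelists/codelist2.txt
z 31 36 codelists/codelist3.txt
$ 58 65 codelists/codelist4.txt
1 7 30 GGAGCTTCTGAATTCTGTGTGCTG
2 37 57 CGAGTCCCATGGCGCAGCTGC
3 66 86 CACGGATCCATTCGATGCAGG
"""


def parse_structure(text):
    """Parse the 12 non-empty lines of a structure file (layout as in Box 2)."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) != 12:
        raise ValueError("expected 12 non-empty lines, found %d" % len(rows))
    config = {
        "path_to_sequence_file": rows[0][0],
        "minimum_line_length": int(rows[1][0]),
        "code_counts": [int(value) for value in rows[2]],
        "output_type": rows[3][0],
        "mismatch_limit": int(rows[4][0]),
        "codes": [],
        "consts": [],
    }
    for name, start, end, path in rows[5:9]:
        config["codes"].append(
            {"name": name, "start": int(start), "end": int(end), "list": path}
        )
    for name, start, end, seq in rows[9:12]:
        start, end = int(start), int(end)
        # Assumption: 1-based inclusive positions, so the length is end - start + 1.
        if end - start + 1 != len(seq):
            raise ValueError("constant region %s: length does not match positions" % name)
        config["consts"].append({"name": name, "start": start, "end": end, "seq": seq})
    return config


if __name__ == "__main__":
    print(parse_structure(EXAMPLE)["code_counts"])
```

The length check catches a common editing mistake, namely a constant-region sequence that no longer fits its start and end positions after the library design has been changed.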

The C++ program checks, for each of the millions of sequences obtained from HTDS, whether the constant domains match and counts the occurrence of the individual code combinations. Three output files are created: (1) “name_datum_Codecounts.txt”, (2) “name_datum_eval.txt” and (3) “name_datum_evalNorm.txt”. The output file “name_datum_Codecounts.txt” gives an overview of the evaluation: the number of sequences contained in the *.fasta file, the number of sequences that could be evaluated according to the structure file settings, the codelists used and the assignment to individual codes within the codelists. A code “0” means that no assignment was possible to a code of the respective codelist. The individual evaluation of a selection, i.e. the counting of code combinations, is given in the evaluation file “name_datum_eval.txt”. This text file contains comma-separated values and includes a header row with all the code1_code2 combinations (i.e., the individual selections are given as columns), followed by a column with the codeA__codeB (= code3__code4) combinations and two columns with the code A (= code 3) and the code B (= code 4), respectively.
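The principle of this evaluation (verify the constant regions, then count code combinations) can be sketched in Python as follows. This is a simplified illustration and not the logic of count.cpp: the function names are hypothetical, positions are assumed to be 1-based and inclusive, mismatches are counted as a plain Hamming distance per constant region, and the raw code sequences are counted directly instead of being mapped to code numbers via codelists.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equally long strings differ."""
    return sum(x != y for x, y in zip(a, b))


def extract_codes(read, code_positions, const_regions, mismatch_limit=0):
    """Return the tuple of code sequences, or None if the read is rejected."""
    ends = [end for _, end in code_positions] + [end for _, end, _ in const_regions]
    if len(read) < max(ends):
        return None  # read too short to contain all regions
    for start, end, seq in const_regions:
        if hamming(read[start - 1:end], seq) > mismatch_limit:
            return None  # constant region does not match
    return tuple(read[start - 1:end] for start, end in code_positions)


def count_combinations(reads, code_positions, const_regions, mismatch_limit=0):
    """Count how often each code combination occurs among the accepted reads."""
    counts = Counter()
    for read in reads:
        codes = extract_codes(read, code_positions, const_regions, mismatch_limit)
        if codes is not None:
            counts[codes] += 1
    return counts


if __name__ == "__main__":
    # Toy layout: code at nt 1-3, constant region at nt 4-7, code at nt 8-10.
    reads = ["AAAACGTCCC", "AAAACGTCCC", "GGGACGTTTT", "AAAACCTCCC", "AAAAC"]
    print(count_combinations(reads, [(1, 3), (8, 10)], [(4, 7, "ACGT")]))
```

In the toy data, the fourth read carries one error in the constant region and the fifth is truncated, so both are discarded at the default mismatch limit of 0.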

Finally, each occurring codeA__codeB combination is counted for a given selection code1_code2. As the number of sequences obtained from HTDS per selection can vary considerably, we work with an internal normalization of a given selection (output file “name_datum_evalNorm.txt”): the average count of all compounds (i.e., the codeA__codeB combinations) in a given selection (code1_code2) is determined, and the count for each individual compound codeA__codeB is then divided by this average, multiplied by 100 and rounded to an integer value. This internal normalization, in our opinion, allows for an optimal comparison of selections for enrichment of individual code combinations. An alternative approach is external normalization, where target selections are divided by empty bead selections or the naïve library. Of course, different ways of normalization can be considered, and the user may prefer to process the raw evaluation file differently for a desired normalization.
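As a worked illustration of the internal normalization, the Python sketch below divides each count by the average count of the selection and scales to 100. Two points are assumptions rather than documented behaviour of the C++ program: the average is taken over the full library size (so compounds with zero counts contribute), and values ending in .5 are rounded up.

```python
def normalize_selection(counts, library_size):
    """Scale raw counts so that the average compound has a value of 100.

    counts: dict mapping a compound (e.g. a (codeA, codeB) tuple) to its raw count.
    library_size: number of compounds over which the average is taken (assumption).
    """
    total = sum(counts.values())
    if total == 0 or library_size <= 0:
        raise ValueError("cannot normalize an empty selection")
    average = total / library_size
    # int(x + 0.5) rounds non-negative values half up (assumed rounding mode).
    return {compound: int(count / average * 100 + 0.5) for compound, count in counts.items()}


if __name__ == "__main__":
    raw = {("A1", "B1"): 30, ("A1", "B2"): 10, ("A2", "B1"): 20}
    print(normalize_selection(raw, library_size=3))
```

With an average of 20 counts per compound, the three compounds in this toy example receive normalized values of 150, 50 and 100.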

The normalized file can be imported into the MATLAB software for graphical visualization of the selection results in a Cartesian coordinate system plot (Figure 6). Our MATLAB script for 2-building block DECL selections is provided as Supplementary Software 5 and needs to be adjusted for input file name, the selection of interest and the desired count cutoff value (highlighted in yellow). Only z-values exceeding the cutoff are plotted. After affinity selection, certain library members may be enriched, as a result of their preferential interaction with the protein immobilized on the solid support (Figure 6a). This visualization enables the display of structure-activity relationships (SAR): binding building blocks are visible as lines for libraries of 2 sets of building blocks, while crosspoints indicate contributions to binding from both building blocks. The latter situation is especially remarkable for dual-pharmacophore31 libraries, profiting from the chelate effect67. Apart from counting the codes in selection experiments, it is important to also sequence the unselected, naïve library as well as selections performed with “empty” beads, in order to define enrichment factors and to check library homogeneity. In an unselected library (Figure 6b), all compounds should be present in comparable amounts57.
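The idea of a cutoff and of building blocks appearing as "lines" can also be expressed numerically. The Python sketch below is an illustrative helper that is not part of the provided MATLAB script: it keeps the compounds whose normalized count exceeds the cutoff and tallies, for each building block, how many of its combinations remain. The function names and the toy data are assumptions.

```python
from collections import Counter


def points_above_cutoff(normalized, cutoff):
    """Keep only compounds whose normalized sequence count exceeds the cutoff."""
    return {pair: value for pair, value in normalized.items() if value > cutoff}


def partners_per_building_block(points):
    """Count, per building block, how many of its combinations passed the cutoff.

    A building block with many surviving partners would show up as a line in
    the fingerprint plot; a single surviving pair corresponds to an isolated point.
    """
    a_hits = Counter(code_a for code_a, _ in points)
    b_hits = Counter(code_b for _, code_b in points)
    return a_hits, b_hits


if __name__ == "__main__":
    normalized = {(1, 1): 1200, (1, 2): 1500, (1, 3): 1100, (2, 2): 90, (3, 2): 1300}
    kept = points_above_cutoff(normalized, cutoff=1000)
    print(partners_per_building_block(kept))
```

In this toy example, building block 1 of set A passes the cutoff with all three partners, which corresponds to a line in the plot, and the pair (1, 2) lies where that line meets the hits of building block 2 of set B.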

Figure 6.

Selection fingerprints. (a) Fingerprint of a 2-building block DECL selection against horseradish peroxidase. NSC: normalized sequence count. Sequence counts normalized to 100; cutoff level = 1000. Binding building blocks have an elevated NSC value and are visible as lines. The crosspoints of binding building blocks feature the highest enrichment, indicating that both building blocks contribute to the binding (chelate effect). (b) Unselected, naïve 2-building block DECL library (normalized to 100; cutoff level = 200). Before the affinity selection, all library members are present in comparable amounts. (c) Graphical representation of a 3-building block DECL selection. The axes represent the three sets of building blocks, while dot colour and size represent the sequence counts of a compound according to the heat scale given at the right.

After analysis of the selection results, the obtained hits need to be validated on- and off-DNA31,76 in different, target-specific assays and optimized by medicinal chemistry to become valuable lead structures and eventually drugs. These procedures cannot easily be generalized and are beyond the scope of this protocol.

Adaptation of the protocol to different types of DECLs

The presented protocol has been optimized for selections using dual-pharmacophore DECLs with two sets of building blocks. Following affinity-based selection, the primers used for the first PCR step introduce two additional coding regions, which are used to tag the individual selection (nt sequences provided in Supplementary Figure 1). The same set-up can be used for single-pharmacophore libraries containing two sets of building blocks. The encoding of selections with two additional coding domains allows the pooling of 100 or even more selections conducted with relatively small 2-building block libraries (typical library sizes of 10^5 - 10^6 library members) for analysis on one Illumina flow lane, making the selection read-out a low-cost effort.

Libraries with three and four sets of building blocks can be treated analogously. As our evaluation software can analyze four coding regions, DECL libraries consisting of three sets of building blocks can be decoded and one additional code for the selection encoding can be introduced in the first PCR step. Since 3-building block libraries typically have sizes in the millions range, only a handful of selections should be pooled and analyzed on one Illumina flow lane. Libraries with four sets of building blocks can theoretically be very large (in the billions range), and their selections cannot be barcoded in the first PCR step of our protocol. However, if desired, the individual selections can be tagged at the second PCR step by using barcoded Illumina TruSeq primers.

The structure file for our evaluation program defines whether a library of 2, 3 or 4 sets of building blocks is evaluated (Box 2 and Supplementary Software 3). For 3-building block libraries, the evaluation file can also be imported into MATLAB. We provide a script for the graphical 3-dimensional representation of a 3-building block library in Supplementary Software 6. Here, the 3 sets of building blocks form the x-, y- and z-axes and the count for each compound is given as a coloured dot of different size (Figure 6c shows an example plot). A heatmap scale at the right correlates the colour and size of the dot with the normalized sequence counts. A graphical representation of selections with four sets of building blocks is beyond the scope of this protocol.

Materials

Reagents

Acetic acid (Sigma-Aldrich, cat. no. 33209) ! CAUTION Concentrated acetic acid is corrosive. Use appropriate personal protective equipment such as nitrile gloves, protective eye goggles, chemical-resistant clothing and shoes. Handle in a fume hood.

Agarose (Sigma-Aldrich, cat. no. A9539)

D-Biotin (ChemPep, cat. no. 270201)

EZ-Link NHS-LC-Biotin (Thermo Fisher Scientific, cat. no. 21336)

Boric acid (Sigma-Aldrich, cat. no. B6768)

Deoxynucleoside Triphosphate Set (Roche, cat. no. 11969064001)

Dimethyl sulfoxide (Fluka, cat. no. 41641)

DNA-encoded chemical library (DECL). DECLs have been reported by various academic groups26,28,31,37,77 and by different companies18,21,39,78. Aliquots of DECLs for affinity-based selections may be obtained from ETH Zurich or Philochem AG upon contractual agreement.

Dynabeads MyOne Streptavidin C1 (Thermo Fisher Scientific, cat. no. 65001) CRITICAL For this protocol, only these beads have been evaluated. Products of other suppliers may lead to different selection results.

Dynabeads MyOne Streptavidin T1 (Thermo Fisher Scientific, cat. no. 65601) CRITICAL For this protocol, only these beads have been evaluated. Products of other suppliers may lead to different selection results.

Ethylenediaminetetraacetic acid (Sigma-Aldrich, cat. no. E9884)

Ethanol absolute (Fluka, cat. no. 02860)

GeneRuler Ultra Low Range DNA Ladder (Thermo Fisher Scientific, cat. no. SM1213)

HCl (Sigma-Aldrich, cat. no. 30721) ! CAUTION Hydrochloric acid is highly corrosive. Use appropriate personal protective equipment such as nitrile gloves, protective eye goggles, chemical-resistant clothing and shoes. Handle in a fume hood.

Herring Sperm DNA (Thermo Fisher Scientific, cat. no. 15634-017)

HiSeq SBS Kit V4 (Illumina, cat. no. FC-401-4002)

HiSeq SR Cluster Kit v4 cBot (Illumina, cat. no. GD-401-4001)

Isopropanol (Sigma-Aldrich, cat. no. 33539-1L-R)

NaCl (Merck, cat. no. 1.06404.1000)

Na2HPO4·2H2O (Merck, cat. no. 1.06580.1000)

NaH2PO4·2H2O (Merck, cat. no. 1.06345.1000)

NaOH (Fisher Chemical, cat. no. 10675692) ! CAUTION Sodium hydroxide is highly corrosive. Use appropriate personal protective equipment such as nitrile gloves, protective eye goggles, chemical-resistant clothing and shoes. Handle in a fume hood.

NucleoSpin Gel and PCR Clean-up (Macherey-Nagel, cat. no. 740609.250)

PD-10 Desalting Columns (GE Healthcare, cat. no. 17-0851-01)

PhiX Control v3 (Illumina, cat. no. FC-110-3001)

Phusion High-Fidelity DNA Polymerase (New England Biolabs, cat. no. M0530S)

Sodium acetate (Sigma-Aldrich, cat. no. S2889)

Target protein(s) of choice, e.g.:

  • Alpha-1-Acid Glycoprotein31 (Athens Research & Technology, cat. no. 16-16-010700)

  • Carbonic Anhydrase IX31,79, amino acids 120-397

  • Human Serum Albumin20 (Sigma-Aldrich, cat. no. A3782)

  • Tankyrase 120,21, amino acids 1106-1325

Tris (Sigma-Aldrich, Trizma base, cat. no. T6066)

Tween-20 (Sigma-Aldrich, cat. no. P1379)

Water, deionized, filtered through a 0.2 μm filter (Millipore, Millipak 40, cat. no. MPGP04001)

Equipment

Agarose gel electrophoresis equipment

High-throughput sequencing service (Illumina, HiSeq 2500)

KingFisher magnetic particle processor (Thermo Fisher Scientific, cat. no. 5400000)

KingFisher plate 200 μl (Thermo Fisher Scientific, cat. no. 97002084)

KingFisher tip comb (Thermo Fisher Scientific, cat. no. 97002070)

Magnetic rack, MagRack 6 (GE Healthcare, cat. no. 28-9489-64)

Microcentrifuge (Eppendorf, Centrifuge 5424)

NanoDrop spectrophotometer (Thermo Fisher Scientific, NanoDrop 2000)

pH meter (Mettler-Toledo, FiveEasy Plus, FEP20)

Reaction tubes, 0.2 ml capped 8-strips (Sarstedt, cat. no. 72.991.002)

Reaction tubes, 1.5 ml (Sarstedt, cat. no. 72.690.001)

Reaction tubes, 2.0 ml (Greiner, cat. no. 623 201)

Rotator (Life Technologies, HulaMixer Sample Mixer, cat. no. 15920D)

Thermal cycler (Biometra, T-Gradient Thermoblock)

UV imaging system (Raytest, Diana3)

Vacuum filtration flask, PES membrane, 0.22 µm pore size (TPP, cat. no. 99500)

Software

BindIt (Thermo Fisher Scientific, provided with the KingFisher instrument)

Word (Microsoft)

MATLAB (MathWorks)

Compiler for C++ program

REAGENT SETUP

CRITICAL For optimal purity, filter all PBS buffers as well as the Tris and NaOAc buffers using a 0.22 µm PES membrane.

HCl 5 M HCl. Dilute the 37% solution in a fume hood. First, add water to a vessel, then add the 37% HCl solution to yield a 5 M solution. HCl is stable at room temperature (22 °C) for at least 12 months. ! CAUTION Hydrochloric acid is highly corrosive. Use appropriate personal protective equipment such as nitrile gloves, protective eye goggles, chemical-resistant clothing and shoes.

NaOH 5 M NaOH. Dissolve the white pellets in a fume hood. First, add water to a vessel (place it on ice), then add the pellets to yield a 5 M solution. Ensure constant stirring. NaOH is stable at room temperature for at least 12 months. ! CAUTION Sodium hydroxide is highly corrosive. Use appropriate personal protective equipment such as nitrile gloves, protective eye goggles, chemical-resistant clothing and shoes.

D-Biotin 200 mM D-Biotin. Dissolve D-Biotin in dimethyl sulfoxide to yield a 200 mM solution. D-Biotin is stable at -20 °C for at least 12 months.

PBS 50 mM NaPi, 100 mM NaCl, pH 7.4. Dissolve sodium phosphate and sodium chloride in water under constant stirring. Adjust pH with 5 M NaOH. The buffer should be prepared freshly.

PBST 50 mM NaPi, 100 mM NaCl, 0.05% Tween-20 (v/v), pH 7.4. Take PBS as described above, add 0.05% Tween-20 (v/v) under constant stirring. The buffer should be prepared freshly.

PBST-Biotin 50 mM NaPi, 100 mM NaCl, 0.05% Tween-20 (v/v), 100 µM D-Biotin, pH 7.4. Take PBST as described above, add 200 mM D-Biotin to yield 100 µM final concentration. The buffer should be prepared freshly.

PBST-HS 50 mM NaPi, 100 mM NaCl, 0.05% Tween-20 (v/v), 0.2 mg/ml Herring Sperm DNA, pH 7.4. Take PBST as described above, add 10 mg/ml Herring Sperm DNA solution to yield 0.2 mg/ml final concentration. The buffer should be prepared freshly.

Tris 10 mM Tris, pH 8.5. Dissolve Tris in water under constant stirring. Adjust pH with 5 M HCl. The buffer is stable at room temperature for at least 12 months.

NaOAc 3 M sodium acetate, pH 4.7. Dissolve sodium acetate in water. Adjust pH using glacial acetic acid in a fume hood. NaOAc is stable at room temperature for at least 12 months. ! CAUTION Concentrated acetic acid is corrosive. Use appropriate personal protective equipment such as nitrile gloves, protective eye goggles, chemical-resistant clothing and shoes.

EQUIPMENT SETUP

KingFisher KingFisher magnetic particle processor. A personal computer with Windows operating system is required for running the protocol.

Illumina sequencing Illumina HiSeq 2500. The sequencing was conducted by the Functional Genomics Center Zurich.

Hard disk space The storage and processing of high-throughput DNA sequencing data requires a large amount of hard disk space. Make sure to have at least 1 terabyte of free space available.

Procedure

Preparation of the DECL working solution ● TIMING 10 min

CRITICAL Library concentration is indicated for the total of all library members.

  • 1|

    Dilute the DECL stock solution to a concentration of 100 nM using deionized water.

  • 2|
    Mix the 100 nM DECL solution with PBST-HS and PBST in order to obtain the DECL working solution.
    Component                  Volume per reaction (µl)   Final concentration
    DECL in mQ (100 nM)        5                          5 nM
    PBST-HS (200 µg/ml HS)     5                          10 µg/ml
    PBST                       up to 100                  -
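The volumes in the table of step 2| follow from the dilution relation C1 · V1 = C2 · V2. The short Python sketch below reproduces them; the function name is hypothetical and the values are those of the table.

```python
def stock_volume(stock_conc, final_conc, final_volume):
    """Volume of stock needed so that final_volume has final_conc (C1*V1 = C2*V2).

    Concentrations must share one unit; the result has the unit of final_volume.
    """
    if stock_conc < final_conc:
        raise ValueError("stock must be at least as concentrated as the final solution")
    return final_conc * final_volume / stock_conc


if __name__ == "__main__":
    total_ul = 100
    decl_ul = stock_volume(100, 5, total_ul)    # 100 nM stock -> 5 nM final
    hs_ul = stock_volume(200, 10, total_ul)     # 200 µg/ml HS -> 10 µg/ml final
    pbst_ul = total_ul - decl_ul - hs_ul        # fill up with PBST
    print(decl_ul, hs_ul, pbst_ul)
```

For one 100 µl reaction this gives 5 µl DECL, 5 µl PBST-HS and 90 µl PBST; for several selections, the volumes are simply multiplied by the number of reactions.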

Washing of the magnetic beads ●TIMING 20 min

CRITICAL The magnetic beads stay in suspension for a short time, then they sink to the bottom of the vessel. Before pipetting a defined amount of beads, ensure that the beads are resuspended thoroughly.

  • 3|

    Pipet the desired amount of magnetic beads from the manufacturer's vial into a new 2 ml reaction tube.

  • 4|

    Place the 2 ml reaction tube in the magnetic rack, discard the buffer and resuspend the magnetic beads in 1 ml PBS. Repeat three times in total.

  • 5|

    Place the 2 ml reaction tube in the magnetic rack, discard the buffer and resuspend the magnetic beads in the final volume of PBS.

Affinity selection

CRITICAL In addition to the selections against target proteins, it is vital to also perform selections against uncoated magnetic beads in order to identify potential signals not originating from the target protein.

  • 6|

    In this step, an automated affinity selection procedure using the KingFisher magnetic particle processor can be used by following option A. If this device is not available, a manual selection procedure can be performed as described in option B.

    • (A)

      Automated affinity selection ● TIMING 3 h

      Timing indicates the time required for the performance of 24 selections.

      • (i)

        Ready the KingFisher magnetic particle processor and load the appropriate program on the connected computer.

      • (ii)

        Place a new tip comb into the KingFisher.

      • (iii)

        Take two new KingFisher 200 µl plates and label them “plate 1” and “plate 2”, respectively.

      • (iv)

        Pipet buffers PBST-Biotin (200 µl/well), PBST (200 µl/well) and Tris (100 µl/well) into the plates as described in Figure 3.

      • (v)

        Add the DECL working solution (100 µl/well), prepared in step 2|, to the appropriate wells of plate 1 (Row F, see Figure 3).

      • (vi)

        Dilute the biotinylated target protein using PBS to the final concentration and add it (100 µl/well) to plate 1 (Row B, see Figure 3).

      • (vii)

        Resuspend the washed magnetic beads (from step 5|) thoroughly.

      • (viii)

        Distribute the washed magnetic beads (100 µl/well) to plate 1 (Row A, see Figure 3) and immediately start the KingFisher program.

        CRITICAL STEP The magnetic beads stay in suspension for a short time, then they sink to the bottom of the well, where they can be resuspended by manual pipetting but not by the KingFisher device. Thus, it is crucial to thoroughly mix the beads before distribution to plate 1 and to start the KingFisher program immediately after addition of the beads.

      • (ix)

        The KingFisher transfers the beads from row A to row H (Figure 3). Once the process in row H is completed, remove plate 1 from the KingFisher and insert plate 2 in the same position. Immediately continue with the program.

        CRITICAL STEP While the KingFisher waits for the user to change plates, the magnetic beads are being held on the tip comb in the air, outside of the plate. Thus, it is crucial to immediately change the plates and continue the program, as the protein may degrade if held outside of buffer for a prolonged period of time.

        ? TROUBLESHOOTING

      • (x)

        Upon completion of the program, the KingFisher releases the magnetic beads into the wells containing Tris buffer. Transfer the Tris buffer with the magnetic beads into capped PCR strips.

      • (xi)

        Heat the magnetic beads for 10 min at 95 °C in order to denature the target protein.

        PAUSE POINT The oligonucleotide part of the library members is stable and can be stored in Tris buffer up to 1 year at -20 °C.

    • (B)

      Manual affinity selection ● TIMING 6 h

      Timing indicates the time required for the performance of 24 selections.

      • (i)

        Dilute the biotinylated target protein using PBS to the final concentration.

      • (ii)

        Ready one 1.5 ml reaction tube per selection.

      • (iii)

        Distribute 100 µl washed magnetic beads (from step 5|) to each 1.5 ml reaction tube.

      • (iv)

        Place the 1.5 ml reaction tube in the magnetic rack and discard the buffer.

      • (v)

        Resuspend the magnetic beads in 100 µl protein solution.

      • (vi)

        Incubate 30 min on a rotator.

        CRITICAL STEP It is important to use a rotator that provides end-over-end rotation. If the incubation step is not done using a rotator (e.g., using a shaker), the beads may sink down to the bottom of the reaction tube, resulting in insufficient protein immobilization.

      • (vii)

        Place the 1.5 ml reaction tube in the magnetic rack, discard the buffer and resuspend the beads in 200 µl PBST-Biotin. Repeat two times in total.

      • (viii)

        Place the 1.5 ml reaction tube in the magnetic rack, discard the buffer and resuspend the beads in 200 µl PBST.

      • (ix)

        Place the 1.5 ml reaction tube in the magnetic rack and discard the buffer.

      • (x)

        Resuspend the magnetic beads in 100 µl DECL working solution.

      • (xi)

        Incubate 1 h on a rotator.

        CRITICAL STEP It is important to use a rotator that provides end-over-end rotation. If the incubation step is not done using a rotator (e.g., using a shaker), the beads may sink down to the bottom of the reaction tube, resulting in suboptimal selection results.

      • (xii)

        Place the 1.5 ml reaction tube in the magnetic rack, discard the buffer and resuspend the beads in 200 µl PBST. Repeat three times in total.

      • (xiii)

        Place the 1.5 ml reaction tube in the magnetic rack and discard the buffer.

      • (xiv)

        Resuspend the magnetic beads in 100 µl Tris buffer.

      • (xv)

        Heat the magnetic beads for 10 min at 95 °C in order to denature the target protein.

        PAUSE POINT The oligonucleotide part of the library members is stable and can be stored in Tris buffer up to 1 year at -20 °C.

PCR amplification of oligonucleotide tags ● TIMING 2 d

Timing indicates the time required for the processing of 24 samples.

CRITICAL In addition to performing PCR amplification reactions using your selections against target proteins and magnetic beads as template, also include PCR amplification reactions using the unselected library as template. For this purpose, take an aliquot of the DECL working solution, dilute it 1:10, and process it as additional sample.

  • 7|

    Perform PCR 1 using the eluted library members as template. It is optional to remove the magnetic beads, as they do not impede the PCR reaction.

    CRITICAL STEP Use a different primer combination for every selection performed. This way, after HTDS on the same Illumina flow lane, all selections can be evaluated independently.
    Component                         Volume per reaction (µl)   Final concentration
    Eluted library members            5                          -
    Phusion HF buffer (5x)            10                         1x
    Phusion MgCl2 Solution (50 mM)    2                          2 mM
    dNTPs (5 mM)                      2.5                        250 µM
    Primer IlluminaPCR1a (10 µM)      3                          0.6 µM
    Primer IlluminaPCR1b (10 µM)      3                          0.6 µM
    Phusion (2 U/µl)                  0.25                       0.5 U per 50 µl
    Water                             up to 50                   -
  • 8|
    Run the Phusion PCR program.
    Cycle number   Denature            Anneal              Extend
    1              98 °C for 3 min     -                   -
    2-36           98 °C for 45 sec    69 °C for 45 sec    72 °C for 45 sec
    37             -                   -                   72 °C for 5 min

    PAUSE POINT Amplified DNA is stable. All intermediate steps can be stored at -20 °C for up to 1 year.

  • 9|

    Analyze the length and purity of the PCR 1 products by agarose gel (2%, wt/vol) electrophoresis.

    CRITICAL STEP Check whether the PCR 1 product shows a band of the expected size (example set-up provided in Supplementary Figure 1). There should not be any additional bands or smears visible on the gel.

    ? TROUBLESHOOTING

  • 10|

    Purify each PCR product using the Macherey-Nagel NucleoSpin Gel and PCR clean-up kit, according to the manufacturer's instructions. Elute in 20 µl buffer NE.

    CRITICAL STEP Perform two washing steps with buffer NT3 for highest DNA purity.

  • 11|

    Measure the concentration of the purified PCR 1 products using the NanoDrop spectrophotometer.

  • 12|

    Pool the PCR 1 amplification products in one 1.5 ml reaction tube to equimolar concentration (250 nM total). Dilute to 10 nM using buffer NE.

  • 13|
    Perform PCR 2 using the pooled PCR 1 products (10 nM) as template.
    Component                         Volume per reaction (µl)   Final concentration
    Pooled PCR 1 from step 12|        10                         1 nM
    Phusion HF buffer (5x)            20                         1x
    Phusion MgCl2 Solution (50 mM)    4                          2 mM
    dNTPs (5 mM)                      5                          250 µM
    Primer IlluminaPCR2a (10 µM)      6                          0.6 µM
    Primer IlluminaPCR2b (10 µM)      6                          0.6 µM
    Phusion (2 U/µl)                  0.5                        1.0 U per 100 µl
    Water                             up to 100                  -
  • 14|
    Run the Phusion PCR program.
    Cycle number   Denature            Anneal              Extend
    1              98 °C for 3 min     -                   -
    2-16           98 °C for 45 sec    69 °C for 45 sec    72 °C for 45 sec
    17             -                   -                   72 °C for 5 min
  • 15|

    Analyze the length and purity of the PCR 2 products by agarose gel (2%, wt/vol) electrophoresis.

  • 16|

    Perform agarose gel (2%, wt/vol) purification of the PCR 2 products using the Macherey-Nagel NucleoSpin Gel and PCR clean-up kit, according to the manufacturer's instructions. Elute in 40 µl buffer NE.

    CRITICAL STEP Perform two washing steps with buffer NT3 for highest DNA purity.

  • 17|

    Perform an ethanol precipitation of the gel extracted PCR 2 product. Detailed instructions are available in Supplementary Method 2.

  • 18|

    Measure the concentration of the purified PCR 2 product using the NanoDrop spectrophotometer.

  • 19|

    Dilute the PCR 2 product with buffer NE to a final concentration of 100 nM.
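Steps 11| and 12| require converting NanoDrop readings into molar concentrations before pooling. The Python sketch below shows one way to do this; it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the amplicon length, the function names and the example values are hypothetical inputs that have to be replaced by those of the actual library design.

```python
def ng_per_ul_to_nm(ng_per_ul, length_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/µl) into a molar concentration (nM)."""
    return ng_per_ul * 1e6 / (g_per_mol_per_bp * length_bp)


def equimolar_volumes(concentrations_nm, fmol_each):
    """Volume (µl) of each sample that contributes fmol_each to the pool.

    1 nM corresponds to 1 fmol/µl, so the volume is simply amount / concentration.
    """
    return [fmol_each / conc for conc in concentrations_nm]


if __name__ == "__main__":
    # Hypothetical 100 bp amplicon measured at 6.6 ng/µl.
    print(ng_per_ul_to_nm(6.6, 100))
    # Three purified PCR 1 products, each contributing 500 fmol to the pool.
    print(equimolar_volumes([100.0, 50.0, 200.0], fmol_each=500))
```

Samples with a lower concentration contribute a proportionally larger volume, so that every selection is represented by the same number of template molecules in PCR 2.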

Illumina high-throughput sequencing ● TIMING 6 d

Timing indicates the time required for the processing of one Illumina flow cell in single-read mode.

  • 20|

    Submit 50 µl of 100 nM PCR 2 product to Illumina HTDS.

  • 21|

    Choose the appropriate read length. In case of the 2-building block ESAC library, the read length needs to be 100 nt or higher.

  • 22|

    Add PhiX Control v3 to a final concentration of 30%.

  • 23|

    Run high-throughput sequencing using Illumina HiSeq 2500 or comparable device as single-read sequencing assay.

  • 24|

    Sequencing results are prefiltered according to the Illumina sequencing primer and arrive in the *.fastq format. The expected file size is up to 50 gigabytes per flow lane. One flow cell comprises eight flow lanes.

Data analysis ● TIMING 1 d

Timing indicates the time required for the analysis of 24 selections.

CRITICAL The following protocol is given for Macintosh/Unix platforms. An overview of the data analysis process is given in Figure 5.

  • 25|

    Convert the *.fastq file to a *.fasta file using an open source program (e.g., FASTX-Toolkit: http://hannonlab.cshl.edu/fastx_toolkit/).

  • 26|

    Compile the C++ program “count.cpp” (code provided in Supplementary Software 2) for your platform (Windows, Mac, Unix) using an appropriate compiler (e.g., using “g++ -o count count.cpp -lpthread”). Place the compiled program into a new folder, e.g. named “evaluation”, and create the empty sub-folders “codelists” and “sequences”.

    ? TROUBLESHOOTING

  • 27|

    Prepare the structure file “structure.txt” (see Box 2 and Supplementary Software 3) and place it in the “evaluation” folder.

    ? TROUBLESHOOTING

  • 28|

    Prepare the codelists as *.txt files (example codelist as Supplementary Software 4) and place them in the “codelists” folder.

  • 29|

    Place the raw data *.fasta file in the “sequences” folder.

  • 30|

    Open a shell/terminal and change the directory to the “evaluation” folder. Type “./count”.

  • 31|

    Input the desired evaluation filename "name".

  • 32|

    Check if the following evaluation files have been created: name_datum_Codecounts.txt, name_datum_eval.txt and name_datum_evalNorm.txt.

    ? TROUBLESHOOTING

  • 33|

    In MATLAB, import the normalized evaluation file using “Home -> Import Data”. A matrix window opens. Go to “Import”, make sure to select “Matrix” in the list, and then click on “Import Selection”. After importing, which may take some time for large files, close the matrix window.

  • 34|

    The MATLAB processing script for 2-building block libraries is provided in Supplementary Software 5; the one for 3-building block libraries in Supplementary Software 6. Open the appropriate script in Word and adjust it according to your needs: define the input file, the selection number to be displayed and the desired cutoff value.

  • 35|

    Copy the script and paste it into the command window of MATLAB. A new window will open and display the 3D plot of the chosen selection. For 2-building block libraries, the x- and y-axes represent code A and code B, respectively, while the z-axis represents the normalized sequence counts (average count = 100). The 3D plot for 3-building block libraries shows the codes A, B and C as x-, y- and z-axes. Dot color and size represent sequence counts.

    ? TROUBLESHOOTING
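
Step 25 names the FASTX-Toolkit as one option for converting *.fastq to *.fasta. To make clear what this conversion does, the following Python sketch performs it directly: the header line is kept (with “@” replaced by “>”), the sequence line is kept, and the “+” and quality lines are dropped. This is an illustration only, not the tool used in the protocol. It assumes four-line FASTQ records with unwrapped sequences, and the function name is hypothetical. Whether “count” expects any particular FASTA layout beyond this is not stated here and should be checked against Supplementary Software 2.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Write each FASTQ record as a two-line FASTA record; return the record count."""
    records = 0
    while True:
        header = fastq_handle.readline()
        if not header:  # end of file
            break
        if not header.strip():  # tolerate blank lines between records
            continue
        sequence = fastq_handle.readline()
        fastq_handle.readline()  # '+' separator line, discarded
        fastq_handle.readline()  # quality line, discarded
        if not header.startswith("@"):
            raise ValueError("FASTQ header does not start with '@': " + header.strip())
        fasta_handle.write(">" + header[1:].rstrip("\r\n") + "\n")
        fasta_handle.write(sequence.rstrip("\r\n") + "\n")
        records += 1
    return records


# Hypothetical two-read example.
source = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
target = io.StringIO()
print(fastq_to_fasta(source, target))  # prints 2
print(target.getvalue(), end="")
```

Dropping the quality lines roughly halves the file size, which matters given the expected size of the raw sequencing files.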

    ? TROUBLESHOOTING

    Troubleshooting advice can be found in Table 2.

Table 2 | Troubleshooting table.

Step | Problem | Possible reason | Solution
--- | --- | --- | ---
6(A)(ix) | The beads clump and are not carried efficiently | The immobilized protein alters the properties of the beads | Increase the percentage of Tween-20 in the buffers
9 | The negative controls show the same band as the samples | Contamination with template | Handle the KingFisher plates with care. Change gloves if appropriate. Use filter tips for pipetting the PCR reactions
26 | Compiling problems in Windows | Lack of a pthread library | Make sure that the “POSIX threading library for Win32” (mingw32-pthreads-w32) is installed
27 | The structure file or the code lists are not properly recognized | Line feeding is not according to MS-DOS convention | Make sure that MS-DOS line-feeding is used: save as plain text file *.txt from Word choosing MS-DOS, CR/LF
32 | Segmentation fault error occurs | Stack size is too small | It may be necessary to first increase the stack size. In the shell/terminal, type: “ulimit -s 16384”
35 | Not enough hits can be detected | Stringency may be too high | Redo selection with quality-controlled protein at higher concentration and at less stringent conditions (fewer washing steps of shorter duration)
35 | Too many hits are detected | Stringency may be too low | Use more stringent selection conditions (more washing steps of longer duration)
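
For the Step 27 problem, Table 2 recommends saving the files from Word with MS-DOS (CR/LF) line endings. As an alternative for files that already exist as plain text, the line endings could be normalized with a short script. The Python sketch below is an assumption-laden illustration, not part of the Supplementary Software: the function names are hypothetical, and the file is rewritten in place, so a backup copy is advisable.

```python
import re


def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending (LF, CR or CR/LF) converted to CR/LF."""
    # '\r\n' is listed first so an existing CR/LF pair is matched as one unit.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


def convert_file_to_crlf(path):
    """Rewrite the file at path in place with CR/LF line endings."""
    with open(path, "rb") as handle:  # binary mode avoids any newline translation
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(to_crlf(data))


# Mixed line endings are all normalized; already-correct input is unchanged.
print(to_crlf(b"a\nb\r\nc\rd"))  # prints b'a\r\nb\r\nc\r\nd'
```

Working on bytes keeps the text encoding untouched, so only the line endings change.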

TIMING

Steps 1| - 2|, Preparation of the DECL working solution: 10 min

Steps 3| - 5|, Washing of the magnetic beads: 20 min

Step 6| (A), Automated affinity selection (24 selections): 3 h

Step 6| (B), Manual affinity selection (24 selections): 6 h

Steps 7| - 19|, PCR amplification of oligonucleotide tags (24 samples): 2 d

Steps 20| - 24|, Illumina high-throughput sequencing (per run): 6 d

Steps 25| - 35|, Data analysis (24 selections): 1 d

Anticipated Results

The first PCR (Figure 4) uses the selected library as template (row D of plate 2 in Figure 3) and adds two selection codes using the primers IlluminaPCR1a (48 nt) and IlluminaPCR1b (46 nt). With the design of the ESAC library31, as shown in Supplementary Figure 1, the PCR 1 product has a length of 134 nt. The agarose gel electrophoresis of the PCR 1 product ideally shows a clear band and thus requires no gel purification. Using the first PCR amplification product as template, the second PCR (Figure 4) introduces sequences required for HTDS through the primers IlluminaPCR2a (58 nt) and IlluminaPCR2b (63 nt), resulting in a final PCR 2 product of 213 nt length. In order to guarantee optimal purity for HTDS, the second PCR product is purified by agarose gel extraction, followed by ethanol precipitation.

The final products of this protocol are individual selection fingerprints, which optimally have the properties of the ones shown in Figure 6. As a control experiment, we process the unselected 2-building block library with PCR and HTDS. The resulting fingerprint in Figure 6b demonstrates that all library members are present in comparable amounts. The fingerprint of a 2-building block DECL selection against horseradish peroxidase (HRP) in Figure 6a shows enriched building blocks, visible as lines, in both sub-libraries. Importantly, the crosspoints feature the highest enrichment, indicating that both building blocks contribute to the binding. Figure 6c displays the result of a selection using a 3-building block DECL. The normalized sequence count (NSC) is represented by dot colour and dot size.
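
The fingerprints rely on normalized sequence counts, for which the protocol states only that the average count equals 100. The Python sketch below shows one way such a normalization and a subsequent cutoff could work. It is an illustration under stated assumptions, not the calculation performed by “count” or the MATLAB scripts: it assumes that the average is taken over all code combinations of a single selection (including those with zero counts), and the function names and example counts are hypothetical.

```python
def normalize_counts(counts, target_mean=100.0):
    """Scale raw counts so that their mean over all code combinations is target_mean."""
    total = sum(counts.values())
    if total == 0:
        return {codes: 0.0 for codes in counts}
    scale = target_mean * len(counts) / total
    return {codes: count * scale for codes, count in counts.items()}


def above_cutoff(normalized, cutoff):
    """Keep only code combinations whose normalized count reaches the cutoff."""
    return {codes: value for codes, value in normalized.items() if value >= cutoff}


# Hypothetical raw counts for a 2 x 2 library (code A, code B).
raw = {("A1", "B1"): 10, ("A1", "B2"): 30, ("A2", "B1"): 20, ("A2", "B2"): 140}
normalized = normalize_counts(raw)
print(normalized[("A2", "B2")])  # prints 280.0
print(above_cutoff(normalized, cutoff=200))  # prints {('A2', 'B2'): 280.0}
```

With this scaling, a value of 100 corresponds to an average library member, so a combination at 280 is enriched 2.8-fold over the average of that selection.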

Supplementary Material

Supplementary information
Supplementary software 1
Supplementary software 2
Supplementary software 3
Supplementary software 4
Supplementary software 5
Supplementary software 6
Supplementary materials

Acknowledgments

This work was supported by ETH Zurich, the Swiss National Science Foundation (SNSF), the Commission for Technology and Innovation (CTI) of the Swiss Confederation, Krebsliga Schweiz/Krebsforschung Schweiz, the European Union's Research and Innovation funding programme and Philochem AG. R.M.F. acknowledges a VPFW-ETH postdoctoral fellowship endowed by ETH Zurich and Marie-Curie actions. The authors thank S. Melkko, C.E. Dumelin, L. Mannocci, M. Jaggi, M. Leimbacher, M. Steiner, H. Röst, A. Nauer, D. Bajic, M. Valk, S. Biendl, and F. Samain for their help with establishing and improving the protocol. The authors thank C. Aquino, L. Poveda and L. Opitz (Functional Genomics Center Zurich) for help with high-throughput DNA sequencing.

Footnotes

Author Contributions

W.D. established the selection protocol using magnetic beads and the KingFisher magnetic particle processor. W.D., M.W. and R.F. applied the protocol and optimized affinity-based selections. W.D. and J.S. optimized the PCR encoding system. F.B. adapted the protocol for use with Illumina sequencing and developed the MATLAB scripts. M.S. and Y.Z. wrote and optimized the evaluation software. W.D., D.N. and J.S. wrote the manuscript.

Competing Financial Interests

D.N. is a co-founder and shareholder of Philochem AG (Otelfingen, Switzerland) and J.S. is a board member of Philochem AG.

References

  • 1. Mayr LM, Bojanic D. Novel trends in high-throughput screening. Curr Opin Pharmacol. 2009;9:580–588. doi: 10.1016/j.coph.2009.08.004.
  • 2. Wigglesworth MJ, Murray DC, Blackett CJ, Kossenjans M, Nissink JW. Increasing the delivery of next generation therapeutics from high throughput screening libraries. Curr Opin Chem Biol. 2015;26:104–110. doi: 10.1016/j.cbpa.2015.04.006.
  • 3. Macarron R, et al. Impact of high-throughput screening in biomedical research. Nat Rev Drug Discov. 2011;10:188–195. doi: 10.1038/nrd3368.
  • 4. Zhang J, Yang PL, Gray NS. Targeting cancer with small molecule kinase inhibitors. Nat Rev Cancer. 2009;9:28–39. doi: 10.1038/nrc2559.
  • 5. Shuker SB, Hajduk PJ, Meadows RP, Fesik SW. Discovering high-affinity ligands for proteins: SAR by NMR. Science. 1996;274:1531–1534. doi: 10.1126/science.274.5292.1531.
  • 6. Rees DC, Congreve M, Murray CW, Carr R. Fragment-based lead discovery. Nat Rev Drug Discov. 2004;3:660–672. doi: 10.1038/nrd1467.
  • 7. Jorgensen WL. The many roles of computation in drug discovery. Science. 2004;303:1813–1818. doi: 10.1126/science.1096361.
  • 8. Reker D, et al. Revealing the macromolecular targets of complex natural products. Nature Chem. 2014;6:1072–1078. doi: 10.1038/nchem.2095.
  • 9. Reker D, Schneider G. Active-learning strategies in computer-assisted drug discovery. Drug Discov Today. 2015;20:458–465. doi: 10.1016/j.drudis.2014.12.004.
  • 10. Mannocci L, Leimbacher M, Wichert M, Scheuermann J, Neri D. 20 years of DNA-encoded chemical libraries. Chem Commun (Cambridge, U. K.). 2011;47:12747–12753. doi: 10.1039/c1cc15634a.
  • 11. Kleiner RE, Dumelin CE, Liu DR. Small-molecule discovery from DNA-encoded chemical libraries. Chem Soc Rev. 2011;40:5707–5717. doi: 10.1039/c1cs15076f.
  • 12. Franzini RM, Neri D, Scheuermann J. DNA-encoded chemical libraries: advancing beyond conventional small-molecule libraries. Acc Chem Res. 2014;47:1247–1255. doi: 10.1021/ar400284t.
  • 13. Brenner S, Lerner RA. Encoded combinatorial chemistry. Proc Natl Acad Sci U S A. 1992;89:5381–5383. doi: 10.1073/pnas.89.12.5381.
  • 14. Needels MC, et al. Generation and screening of an oligonucleotide-encoded synthetic peptide library. Proc Natl Acad Sci U S A. 1993;90:10700–10704. doi: 10.1073/pnas.90.22.10700.
  • 15. Buller F, et al. Discovery of TNF inhibitors from a DNA-encoded chemical library based on diels-alder cycloaddition. Chem Biol. 2009;16:1075–1086. doi: 10.1016/j.chembiol.2009.09.011.
  • 16. Leimbacher M, et al. Discovery of small-molecule interleukin-2 inhibitors from a DNA-encoded chemical library. Chemistry. 2012;18:7729–7737. doi: 10.1002/chem.201200952.
  • 17. Buller F, et al. Selection of Carbonic Anhydrase IX Inhibitors from One Million DNA-Encoded Compounds. ACS Chem Biol. 2011;6:336–344. doi: 10.1021/cb1003477.
  • 18. Clark MA, et al. Design, synthesis and selection of DNA-encoded small-molecule libraries. Nat Chem Biol. 2009;5:647–654. doi: 10.1038/nchembio.211.
  • 19. Franzini RM, Nauer A, Scheuermann J, Neri D. Interrogating target-specificity by parallel screening of a DNA-encoded chemical library against closely related proteins. Chem Commun (Cambridge, U. K.). 2015;51:8014–8016. doi: 10.1039/c5cc01230a.
  • 20. Franzini RM, et al. Identification of structure-activity relationships from screening a structurally compact DNA-encoded chemical library. Angew Chem Int Ed Engl. 2015;54:3927–3931. doi: 10.1002/anie.201410736.
  • 21. Samain F, et al. Tankyrase 1 Inhibitors with Drug-like Properties Identified by Screening a DNA-Encoded Chemical Library. J Med Chem. 2015. doi: 10.1021/acs.jmedchem.5b00432.
  • 22. Buller F, et al. Design and synthesis of a novel DNA-encoded chemical library using Diels-Alder cycloadditions. Bioorg Med Chem Lett. 2008;18:5926–5931. doi: 10.1016/j.bmcl.2008.07.038.
  • 23. Gartner ZJ, et al. DNA-templated organic synthesis and selection of a library of macrocycles. Science. 2004;305:1601–1605. doi: 10.1126/science.1102629.
  • 24. Kleiner RE, Dumelin CE, Tiu GC, Sakurai K, Liu DR. In vitro selection of a DNA-templated small-molecule library reveals a class of macrocyclic kinase inhibitors. J Am Chem Soc. 2010;132:11779–11791. doi: 10.1021/ja104903x.
  • 25. Georghiou G, Kleiner RE, Pulkoski-Gross M, Liu DR, Seeliger MA. Highly specific, bisubstrate-competitive Src inhibitors from DNA-templated macrocycles. Nat Chem Biol. 2012;8:366–374. doi: 10.1038/nchembio.792.
  • 26. Maianti JP, et al. Anti-diabetic activity of insulin-degrading enzyme inhibitors mediated by multiple hormones. Nature. 2014;511:94–98. doi: 10.1038/nature13297.
  • 27. Tse BN, Snyder TM, Shen Y, Liu DR. Translation of DNA into a library of 13,000 synthetic small-molecule macrocycles suitable for in vitro selection. J Am Chem Soc. 2008;130:15611–15626. doi: 10.1021/ja805649f.
  • 28. Li Y, Zhao P, Zhang M, Zhao X, Li X. Multistep DNA-templated synthesis using a universal template. J Am Chem Soc. 2013;135:17727–17730. doi: 10.1021/ja409936r.
  • 29. Weisinger RM, Wrenn SJ, Harbury PB. Highly parallel translation of DNA sequences into small molecules. PLoS One. 2012;7:e28056. doi: 10.1371/journal.pone.0028056.
  • 30. Melkko S, Scheuermann J, Dumelin CE, Neri D. Encoded self-assembling chemical libraries. Nat Biotechnol. 2004;22:568–574. doi: 10.1038/nbt961.
  • 31. Wichert M, et al. Dual-display of small molecules enables the discovery of ligand pairs and facilitates affinity maturation. Nature Chem. 2015;7:241–249. doi: 10.1038/nchem.2158.
  • 32. Zambaldo C, Barluenga S, Winssinger N. PNA-encoded chemical libraries. Curr Opin Chem Biol. 2015;26:8–15. doi: 10.1016/j.cbpa.2015.01.005.
  • 33. Eberhard H, Diezmann F, Seitz O. DNA as a molecular ruler: interrogation of a tandem SH2 domain with self-assembled, bivalent DNA-peptide complexes. Angew Chem Int Ed Engl. 2011;50:4146–4150. doi: 10.1002/anie.201007593.
  • 34. Winssinger N, et al. PNA-encoded protease substrate microarrays. Chem Biol. 2004;11:1351–1360. doi: 10.1016/j.chembiol.2004.07.015.
  • 35. Ciobanu M, et al. Selection of a synthetic glycan oligomer from a library of DNA-templated fragments against DC-SIGN and inhibition of HIV gp120 binding to dendritic cells. Chem Commun (Cambridge, U. K.). 2011;47:9321–9323. doi: 10.1039/c1cc13213j.
  • 36. Gorska K, Huang KT, Chaloin O, Winssinger N. DNA-templated homo- and heterodimerization of peptide nucleic acid encoded oligosaccharides that mimick the carbohydrate epitope of HIV. Angew Chem Int Ed Engl. 2009;48:7695–7700. doi: 10.1002/anie.200903328.
  • 37. Daguer JP, et al. DNA display of fragment pairs as a tool for the discovery of novel biologically active small molecules. Chemical Science. 2015;6:739–744. doi: 10.1039/C4sc01654h.
  • 38. Novoa A, Machida T, Barluenga S, Imberty A, Winssinger N. PNA-encoded synthesis (PES) of a 10 000-member hetero-glycoconjugate library and microarray analysis of diverse lectins. ChemBioChem. 2014;15:2058–2065. doi: 10.1002/cbic.201402280.
  • 39. Litovchick A, et al. Encoded Library Synthesis Using Chemical Ligation and the Discovery of sEH Inhibitors from a 334-Million Member Library. Sci Rep. 2015;5:10916. doi: 10.1038/srep10916.
  • 40. Franzini RM, et al. Systematic evaluation and optimization of modification reactions of oligonucleotides with amines and carboxylic acids for the synthesis of DNA-encoded chemical libraries. Bioconj Chem. 2014;25:1453–1461. doi: 10.1021/bc500212n.
  • 41. Satz AL, et al. DNA Compatible Multistep Synthesis and Applications to DNA Encoded Libraries. Bioconj Chem. 2015. doi: 10.1021/acs.bioconjchem.5b00239.
  • 42. Smith GP. Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science. 1985;228:1315–1317. doi: 10.1126/science.4001944.
  • 43. McCafferty J, Griffiths AD, Winter G, Chiswell DJ. Phage antibodies: filamentous phage displaying antibody variable domains. Nature. 1990;348:552–554. doi: 10.1038/348552a0.
  • 44. Kang AS, Barbas CF, Janda KD, Benkovic SJ, Lerner RA. Linkage of recognition and replication functions by assembling combinatorial antibody Fab libraries along phage surfaces. Proc Natl Acad Sci U S A. 1991;88:4363–4366. doi: 10.1073/pnas.88.10.4363.
  • 45. Liang P, Pardee AB. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science. 1992;257:967–971. doi: 10.1126/science.1354393.
  • 46. Josephson K, Ricardo A, Szostak JW. mRNA display: from basic principles to macrocycle drug discovery. Drug Discov Today. 2014;19:388–399. doi: 10.1016/j.drudis.2013.10.011.
  • 47. Hanes J, Pluckthun A. In vitro selection and evolution of functional proteins by using ribosome display. Proc Natl Acad Sci U S A. 1997;94:4937–4942. doi: 10.1073/pnas.94.10.4937.
  • 48. Boder ET, Wittrup KD. Yeast surface display for screening combinatorial polypeptide libraries. Nat Biotechnol. 1997;15:553–557. doi: 10.1038/nbt0697-553.
  • 49. Ellington AD, Szostak JW. In vitro selection of RNA molecules that bind specific ligands. Nature. 1990;346:818–822. doi: 10.1038/346818a0.
  • 50. Tuerk C, Gold L. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science. 1990;249:505–510. doi: 10.1126/science.2200121.
  • 51. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Del Rev. 2001;46:3–26. doi: 10.1016/s0169-409x(00)00129-0.
  • 52. Griffiths AD, et al. Isolation of high affinity human antibodies directly from large synthetic repertoires. EMBO J. 1994;13:3245–3260. doi: 10.1002/j.1460-2075.1994.tb06626.x.
  • 53. Franzini RM, et al. "Cap-and-Catch" Purification for Enhancing the Quality of Libraries of DNA Conjugates. ACS Comb Sci. 2015;17:393–398. doi: 10.1021/acscombsci.5b00072.
  • 54. Margulies M, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437:376–380. doi: 10.1038/nature03959.
  • 55. Bentley DR, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456:53–59. doi: 10.1038/nature07517.
  • 56. Mannocci L, et al. High-throughput sequencing allows the identification of binding molecules isolated from DNA-encoded chemical libraries. Proc Natl Acad Sci U S A. 2008;105:17670–17675. doi: 10.1073/pnas.0805130105.
  • 57. Buller F, et al. High-throughput sequencing for the identification of binding molecules from DNA-encoded chemical libraries. Bioorg Med Chem Lett. 2010;20:4188–4192. doi: 10.1016/j.bmcl.2010.05.053.
  • 58. Chan AI, McGregor LM, Liu DR. Novel selection methods for DNA-encoded chemical libraries. Curr Opin Chem Biol. 2015;26:55–61. doi: 10.1016/j.cbpa.2015.02.010.
  • 59. Blakskjaer P, Heitner T, Hansen NJ. Fidelity by design: Yoctoreactor and binder trap enrichment for small-molecule DNA-encoded libraries and drug discovery. Curr Opin Chem Biol. 2015;26:62–71. doi: 10.1016/j.cbpa.2015.02.003.
  • 60. McGregor LM, Gorin DJ, Dumelin CE, Liu DR. Interaction-dependent PCR: identification of ligand-target pairs from libraries of ligands and libraries of targets in a single solution-phase experiment. J Am Chem Soc. 2010;132:15522–15524. doi: 10.1021/ja107677q.
  • 61. McGregor LM, Jain T, Liu DR. Identification of ligand-target pairs from combined libraries of small molecules and unpurified protein targets in cell lysates. J Am Chem Soc. 2014;136:3264–3270. doi: 10.1021/ja412934t.
  • 62. Li G, et al. Photoaffinity labeling of small-molecule-binding proteins by DNA-templated chemistry. Angew Chem Int Ed Engl. 2013;52:9544–9549. doi: 10.1002/anie.201302161.
  • 63. Zhao P, et al. Selection of DNA-encoded small molecule libraries against unmodified and non-immobilized protein targets. Angew Chem Int Ed Engl. 2014;53:10056–10059. doi: 10.1002/anie.201404830.
  • 64. Melkko S, et al. Isolation of a small-molecule inhibitor of the antiapoptotic protein Bcl-xL from a DNA-encoded chemical library. ChemMedChem. 2010;5:584–590. doi: 10.1002/cmdc.200900520.
  • 65. Scheuermann J, et al. DNA-encoded chemical libraries for the discovery of MMP-3 inhibitors. Bioconj Chem. 2008;19:778–785. doi: 10.1021/bc7004347.
  • 66. Dumelin CE, Scheuermann J, Melkko S, Neri D. Selection of streptavidin binders from a DNA-encoded chemical library. Bioconj Chem. 2006;17:366–370. doi: 10.1021/bc050282y.
  • 67. Melkko S, Dumelin CE, Scheuermann J, Neri D. On the magnitude of the chelate effect for the recognition of proteins by pharmacophores scaffolded by self-assembling oligonucleotides. Chem Biol. 2006;13:225–231. doi: 10.1016/j.chembiol.2005.12.006.
  • 68. Melkko S, Zhang Y, Dumelin CE, Scheuermann J, Neri D. Isolation of high-affinity trypsin inhibitors from a DNA-encoded chemical library. Angew Chem Int Ed Engl. 2007;46:4671–4674. doi: 10.1002/anie.200700654.
  • 69. Mannocci L, et al. Isolation of potent and specific trypsin inhibitors from a DNA-encoded chemical library. Bioconj Chem. 2010;21:1836–1841. doi: 10.1021/bc100198x.
  • 70. Kohn M. Immobilization strategies for small molecule, peptide and protein microarrays. J Pept Sci. 2009;15:393–397. doi: 10.1002/psc.1130.
  • 71. Waugh DS. Making the most of affinity tags. Trends Biotechnol. 2005;23:316–320. doi: 10.1016/j.tibtech.2005.03.012.
  • 72. Terpe K. Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems. Appl Microbiol Biotechnol. 2003;60:523–533. doi: 10.1007/s00253-002-1158-6.
  • 73. Smith DB. Generating fusions to glutathione S-transferase for protein studies. Methods Enzymol. 2000;326:254–270. doi: 10.1016/s0076-6879(00)26059-x.
  • 74. Bornhorst JA, Falke JJ. Purification of proteins using polyhistidine affinity tags. Methods Enzymol. 2000;326:245–254. doi: 10.1016/s0076-6879(00)26058-8.
  • 75. Satz AL. DNA Encoded Library Selections and Insights Provided by Computational Simulations. ACS Chem Biol. 2015. doi: 10.1021/acschembio.5b00378.
  • 76. Lin W, Reddavide FV, Uzunova V, Gur FN, Zhang Y. Characterization of DNA-conjugated compounds using a regenerable chip. Anal Chem. 2015;87:864–868. doi: 10.1021/ac503960z.
  • 77. Wrenn SJ, Weisinger RM, Halpin DR, Harbury PB. Synthetic ligands discovered by in vitro selection. J Am Chem Soc. 2007;129:13137–13143. doi: 10.1021/ja073993a.
  • 78. Seigal BA, et al. The discovery of macrocyclic XIAP antagonists from a DNA-programmed chemistry library, and their optimization to give lead compounds with in vivo antitumor activity. J Med Chem. 2015;58:2855–2861. doi: 10.1021/jm501892g.
  • 79. Ahlskog JK, et al. Human monoclonal antibodies targeting carbonic anhydrase IX for the molecular imaging of hypoxic regions in solid tumours. Br J Cancer. 2009;101:645–657. doi: 10.1038/sj.bjc.6605200.
