Author manuscript; available in PMC: 2016 Aug 19.
Published in final edited form as: Bioconjug Chem. 2015 Jul 28;26(8):1811–1817. doi: 10.1021/acs.bioconjchem.5b00344

Efficient identification of murine M2 macrophage peptide targeting ligands by phage display and next-generation sequencing

Gary W Liu †,#, Brynn R Livesay †,#, Nataly A Kacherovsky, Maryelise Cieslewicz, Emi Lutz, Adam Waalkes, Michael C Jensen †,§, Stephen J Salipante ‡,*, Suzie H Pun †,*
PMCID: PMC4640889  NIHMSID: NIHMS734896  PMID: 26161996

Abstract

Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (< 0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by non-specific, preferentially amplifying “parasitic sequences” and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively-activated (M2) macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.

Keywords: phage display, next-generation sequencing, peptide, M2 macrophage, MEME

INTRODUCTION

Targeting ligands facilitate cell recognition and are key technologies in targeted drug delivery and molecular imaging. In particular, peptides are an attractive class of targeting ligands: compared to proteins and antibodies, peptides are smaller and can therefore better penetrate tissues to reach target cells; peptides are less prone to clearance by the reticuloendothelial system; finally, peptides can be easily functionalized for well-defined conjugation to cargo. Indeed, several peptides have been used clinically for various targets, ranging from cancer to thrombi.1-3

A core methodology for ligand identification, phage display, has led to the discovery of new functional ligands through high-throughput screening of peptide and antibody fragment libraries for binding interactions with targets of interest.4-6 In peptide phage display, bacteriophage are engineered to display random peptide sequences off of coat proteins to create a peptide library. One popular commercial platform, the Phage Display (Ph.D.) system, uses M13 bacteriophage modified for pentavalent peptide display, where the randomized peptide is fused to each of the five pIII bacteriophage coat proteins.4 The input phage library is enriched for target binders through iterative rounds of negative (removal of non-target binders) and positive (enrichment of target binders) selection and amplification. Since displayed peptides are genetically encoded by the phage genome, the identity of target-binding peptides expressed by adherent phage may be retrieved by DNA sequencing. Using standard Sanger sequencing workflows, 20-100 randomly-selected phage clones are typically isolated for sequencing. The strategy used to select potential target-binding peptides relies on identifying consensus sequences or motifs, which appear multiple times within a given subset of the phage population, that are presumptive of a target-binding sequence enriched during selection.

Given the vast diversity encoded by peptide phage display libraries (~10⁹ unique sequences), the current sequence analysis scheme is low-throughput (~8 h to isolate 20-40 clones) and provides a limited perspective (< 0.01%) of the complete sequence space. This small sample size is particularly sensitive to bias from non-specific, preferentially-amplifying “parasitic sequences”7 and plastic-binding sequences,8 leading to identification of false positives and exclusion of target-binding sequences (Figure 1). These limitations impose a major bottleneck in ligand identification, as functionally evaluating potential binding sequences can be costly in both time and resources.

Figure 1.


Relative perspectives offered by Sanger (left) and next-generation (right) sequencing of the sequence space. Represented in purple and yellow are target-binding sequences; represented in red are nonspecific sequences. Sanger sequencing may be biased by nonspecific, highly-proliferating sequences (red) that result in false positives; next-generation sequencing provides a much broader perspective.

To address these limitations, several groups have recently employed next-generation sequencing technologies to analyze multiple phage libraries on a high-throughput scale (~10⁵-10⁷ reads/sample), using a variety of strategies to define and identify target-binding peptides. ‘t Hoen and colleagues used Illumina sequencing to analyze libraries selected against KS483 osteoblasts.8 To eliminate potential parasitic sequences, they discarded sequences that appeared at least twice in an unselected, amplified library, and subsequently identified and confirmed KS483-binding peptides by confocal microscopy. Ngubane and colleagues similarly used Illumina sequencing, and compared sequence enrichment of the selected libraries to the input library with a Pearson Chi-squared test to identify an M. tuberculosis-binding peptide.9 Rebollo and colleagues utilized Ion Torrent sequencing and developed MATLAB scripts to identify potential binding motifs; however, these binding motifs were not experimentally validated.10

These reports and others have mostly been prospective, in which the identity of target-binding sequences was unknown during data processing and analysis. However, applying data filters in prospective analyses may potentially eliminate legitimate target-binding sequences; for example, discarding sequences that appear at least twice in an unselected, amplified library8 may inadvertently remove target-binding sequences displayed on phage that also amplify well in bacteria. As such, development of an empiric, experimentally validated method for identifying target-binding peptide sequences by next-generation sequencing could substantially improve the yield and diversity of biologically relevant sequences identified.

In this study, we define the hallmarks of target-binding sequences in next-generation sequencing data, and utilize these hallmarks to develop a method that identifies target-binding sequences with a high success rate. We have previously used peptide phage display coupled with Sanger sequencing to identify a murine M2 “anti-inflammatory” macrophage-binding peptide that mediates M2 macrophage-specific delivery of pro-apoptotic peptides, and have employed this peptide to selectively eliminate M2-like tumor-associated macrophages and improve survival in murine tumor models.11 As we have established a successful phage display strategy that has yielded an experimentally-validated peptide, here we characterize how target-binding sequences are represented in high-throughput sequencing readouts of phage display selections performed in biological duplicate. Through this work, we present a recommended method for efficient identification of peptide targeting sequences and report six new peptide sequences that exhibit preferential binding to murine M2-like macrophages versus M1-like macrophages.

RESULTS AND DISCUSSION

Next-generation sequencing provides a broader perspective

Subtractive phage display using murine IL-4–polarized M2 (“anti-inflammatory”) macrophage for positive selection and IFN-γ– and LPS-polarized M1 (“pro-inflammatory”) macrophage for negative selection was completed in biological duplicate with one initial enrichment selection followed by three rounds of paired negative/positive selection. From the last round of each duplicate phage display experiment, 20 phage plaques were randomly selected for Sanger sequencing (Figure S1). While the exact M2pep sequence (YEQDPWGVKWWY) was not recovered from either replicate, sequences containing the M2pep motif DPWXXXXW (highlighted in orange), where X indicates any amino acid, were identified in two independent experiments. The absence of M2pep is likely due to batch variations between the phage libraries used: as New England BioLabs has produced and sold >10 independent lots of Ph.D.-12 libraries,7 the original M2pep sequence may not have been present in the initial phage display peptide library used in this study. While no consensus sequence was observed in the first replicate, multiple consensus sequences were observed in the second replicate (colored). Barcoded phage library amplicons were sequenced by Illumina, and the top 20 most abundant sequences from the last rounds are shown (6.3×10⁵ and 8.7×10⁵ reads, respectively) in Figure 2. The two replicates contained the same 13 sequences (highlighted in blue) in the top 20 most abundant sequences (65%), indicating a high degree of reproducibility between the two biological replicates.

Figure 2.


Illumina sequencing of the two biological duplicates. Common sequences from the duplicates are highlighted. Percentages indicate absolute abundance within the library; numbers indicate abundance in the unselected, 1× amplified (UA1) and unselected, 4× amplified (UA4) libraries.

Sanger sequencing provided only a limited representation of the phage population as revealed by next-generation sequencing: the most abundant sequence identified from next-generation sequencing data, WPTDHQMLRIPM (“WPT,” >40% and >32% of the selected phage library in replicates 1 and 2, respectively), appeared only 1 and 3 times, respectively, out of the 20 Sanger-sequenced clones in the first and second replicates. Based on the Sanger sequencing results in the first replicate, we would have been unable to identify a valid consensus sequence. In contrast, the second replicate revealed 5 distinct consensus sequences that did not reflect the proportions reported by next-generation sequencing. These observations underscore the high degree of variability inherent in a Sanger sequencing approach owing to sample size constraints, which are overcome by high-throughput next-generation sequencing.

We employed Clustal Omega alignment analysis of the top 20 sequences identified by next-generation sequencing in each biological replicate to identify potential motifs (defined as a minimum of three consecutive amino acid residues) for binding studies.12 The most abundant sequence, “WPT,” and motifs that were present in multiple sequences were considered strong candidates for binding studies. Multiple sequence alignment analysis identified four hypothesized motifs (reported along with their parent peptide sequence, in parentheses): “DPW” (QSDPWLVSRWFA), “YPS” (TYPSTQWFFAKF), “SEQ” (FFPSEQVLIAAL), and “LPS” (GLPSSAELERLW) (colored, Figure S2). In addition, a fifth sequence, YPSSEQLLAWWG, containing both the YPS and SEQ motifs in tandem, was also selected (“YPSSEQ”). To identify potential non-specific, preferentially-amplifying “parasitic” sequences,8 the absolute abundances of the top 20 sequences in unselected, 1× amplified (UA1) and 4× amplified (UA4) libraries were compared. Notably, many of the putative binding sequences were well represented in UA1 (>20). The most abundant sequence, “WPT,” was present in UA1 342 times. In contrast, only two of the top 20 sequences appeared in UA4, and these two sequences were present in only one of the biological replicates. Moreover, the common sequences of the two replicates (highlighted in blue) did not appear in UA4.

In general, an increase in phage titer throughout display rounds suggests an enrichment and amplification of target-binding sequences and concurrent collapse in library diversity. Indeed, these trends were confirmed by next-generation sequencing data. We observed that increases in phage titer corresponded with decreases in the number of unique peptide sequences (Figure S3), and that there was a 5- and 20-fold decrease in library diversity from rounds 1 to 4 in the two biological replicates, respectively. Collectively, these data suggest that the two phage display experiments enriched for M2 macrophage-binding phage with successive panning rounds.
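The fold decrease in library diversity described above can be illustrated with a minimal sketch. It assumes each round is available as a dictionary of peptide read counts and that rounds were sequenced to comparable depth (the number of unique sequences observed depends on read depth). The function names and the toy libraries are hypothetical and are not data from this study.

```python
def unique_peptides(counts):
    """Number of distinct peptide sequences observed at least once."""
    return sum(1 for n in counts.values() if n > 0)


def fold_decrease(counts_first, counts_last):
    """Ratio of unique sequences in an early round to those in a late round."""
    return unique_peptides(counts_first) / unique_peptides(counts_last)


# Hypothetical toy libraries: a diverse early round and a collapsed late round.
round1 = {f"pep{i}": 1 for i in range(100)}   # 100 unique sequences
round4 = {f"pep{i}": 50 for i in range(5)}    # 5 unique sequences

print(fold_decrease(round1, round4))
```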

Phage clones exhibit selective M2 macrophage binding

All of the strong candidate binding sequences identified from next-generation sequencing data, as well as M2pep and scrambled M2pep (scM2pep), were cloned into phage (Table S2) and experimentally tested for binding to M1 and M2 macrophages by flow cytometry. All of the tested phage clones, except for the scrambled control, exhibited significantly greater binding to M2 compared to M1 macrophages (Figure 3A). These findings were somewhat unexpected as the sequence “WPT,” given its high abundance in UA1, was considered a putative parasitic sequence. ‘t Hoen et al. identified binding peptides by filtering out presumptive false positive sequences (defined as sequences that appeared more than once in an unselected, 1× amplified library or in online peptide databases).8 Here, all of the tested sequences exhibited abundance >2 in UA1. In a prospective approach following the strategy recommended by ‘t Hoen, these selective binders would have been discarded.

Figure 3.


Phage clone binding of (A) selected phage clones and (B) a preferential amplifier to M1 and M2 macrophages by flow cytometry. Phage binding is expressed as median fluorescence intensity (MFI). Data reported are the mean MFI ± standard deviation (n = 3). AU, arbitrary units. Statistical analysis performed with a two-tailed Student’s t-test compared to phage-treated M1 macrophage. *p-value < 0.05.

An improved strategy to identify selective binding sequences may be to assess sequence abundance in a UA4 library, which better represents the amplification cycles of the selected libraries. With this approach, only the two sequences DWSSWVYRDPQT (“DWSS”) and WPLWSFDWPQNA (“WPLW”) would be identified as false positive sequences. Both “DWSS” and “WPLW” contain the plastic binding motif WXXW.13 We have also observed “DWSS” in the top 20 of next-generation sequencing results from phage display experiments against other targets (data not shown). To confirm non-specificity of parasitic sequences, we tested binding of VHWDFRQWWQPS (“VHW”), which was present in both the UA1 (abundance: 199, proportion of library: ~0.0069%) and UA4 (abundance: 41, proportion of library: ~0.0043%) libraries and has been identified as a consensus sequence in display strategies against other targets pursued by our group. As tested by flow cytometry, this sequence did not exhibit significant macrophage binding compared to controls (Figure 3B).

Binding sequences enrich throughout subtractive phage display rounds

We next utilized multiple analytical tools to observe how binding sequences emerge in next-generation sequencing data and to find a method that would successfully identify binding sequences while eliminating false positives. In this context, the six validated binding sequences served as positive controls and the known parasitic sequences as negative controls.

To characterize enrichment dynamics, the fractional abundances of the six binding sequences and three parasitic sequences were tracked throughout the four rounds of phage display for each biological replicate (Figure 4). Binding sequences exhibited lower abundance in the first and second rounds of phage display, but consistently increased in proportion after each round. Parasitic sequences generally exhibited higher abundance in early rounds of the panning experiments, but began to plateau or decrease in abundance in the last two rounds as binding sequences began to dominate the sequence space. Similar trends were observed when comparing the rank in abundance of binding and parasitic sequences throughout each round of phage display (Figure S4). Therefore, performing multiple rounds of phage display enhances discrimination between binding and parasitic sequences through changes in rank and proportion throughout the display rounds. While binding peptides may theoretically be identified without multiple subtractive rounds,8 in this report we find that additional rounds of selection enhanced the representation of desired target-binding phage versus undesired non-specific phage.
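As an illustration of the tracking described above, the following sketch computes the fraction of total reads and the abundance rank of one peptide in each round. It assumes per-round peptide counts are stored as dictionaries; the function names, the read counts, and the background clone label are hypothetical, and only the two peptide sequences are taken from the text.

```python
def fraction_and_rank(counts, peptide):
    """Return (fraction of total reads, 1-based abundance rank) for a peptide.

    Rank is None when the peptide is absent from the round. Ties in count
    keep the dictionary's insertion order.
    """
    total = sum(counts.values())
    ranked = sorted(counts, key=counts.get, reverse=True)
    fraction = counts.get(peptide, 0) / total
    rank = ranked.index(peptide) + 1 if peptide in counts else None
    return fraction, rank


def trajectory(rounds, peptide):
    """Fraction and rank of a peptide across successive panning rounds."""
    return [fraction_and_rank(counts, peptide) for counts in rounds]


# Hypothetical read counts for an early and a late round (not study data).
rounds = [
    {"WPTDHQMLRIPM": 5, "VHWDFRQWWQPS": 20, "background_clone": 75},
    {"WPTDHQMLRIPM": 50, "VHWDFRQWWQPS": 20, "background_clone": 30},
]

for name in ("WPTDHQMLRIPM", "VHWDFRQWWQPS"):
    print(name, trajectory(rounds, name))
```

In this toy example the binder rises in both fraction and rank while the parasitic sequence holds a constant fraction and falls in rank, which is the qualitative pattern the paragraph describes.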

Figure 4.


Fraction of total sequences in Illumina next-generation data for confirmed binding sequences (blue) and non-specific preferential amplifying sequences (red) throughout four rounds of phage display panning in both the first (left) and second (right) biological replicate.

These observations emphasize the importance of experimental design for successful enrichment of target-binding sequences. We have observed a much higher prevalence of parasitic sequences in phage display experiments following less-stringent selection protocols, such as using immobilized proteins for positive selection and the immobilizing substrate for negative selection (data not shown). The selection strategy and amplification in bacteria between rounds both contribute to the “evolutionary” pressure placed on bacteriophage, selecting for phage that both amplify (leading to parasitic sequences) and bind to non-target materials (plastic binders, non-specific sequences) used in the display throughout an experiment. Here, the UA4 library enabled discrimination between target-binding phage and parasitic phage.

Algorithmic identification of binding motifs

As we observed the greatest phage titers and the least library diversity in the fourth round of each replicate (Figure S3), these libraries provided data to explore the use of automated motif analysis algorithms to provide a clear and principled representation of target binding peptide motifs with minimal representation of parasitic peptide motifs.

The Multiple Em for Motif Elicitation (MEME) analysis algorithm developed by Bailey utilizes expectation maximization to fit a two-component, finite mixture model to identify multiple motifs of varying lengths within a DNA or protein sequence set.14,15 By using a “greedy” heuristic computation, MEME is able to sample a large number of sequences (up to 1000) in reasonable computing time while still identifying statistically significant motifs. Peptide sequences that appeared at least 10 times in the fourth round of both biological replicates were used to optimize analysis settings; we omitted sequences with abundance < 10, as this method equally weights all sequences and low abundant sequences could introduce noise. Scanning for motifs between 6 and 12 amino acids in length resulted in the most robust representation of confirmed binding sequences and hypothesized motifs. MEME analysis without any comparison of the phage display libraries with the UA4 library identified binding sequence motifs as well as a DWSS parasitic sequence motif in round 1.4 (Figure S5). Removal of sequences that were present in the UA4 library before MEME analysis resulted in a robust representation of motifs and their binding sequences and good agreement between the two biological replicates (Figure 5). This approach effectively captured all of the hypothesized motifs present in the validated M2-binding sequences “DPW,” “YPSSEQ,” “YPS,” “SEQ,” and “LPS.”
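A minimal sketch of preparing MEME input along the lines described above: each unique peptide with at least 10 reads is written once to a FASTA string, so every sequence is weighted equally. The function names, record labels, file paths, and read counts are hypothetical. The analysis in this work was run through the MEME web server; the command-line arguments below assume a local MEME Suite installation and should be checked against the installed version (`meme -h`). The command is only assembled here, not executed.

```python
def to_fasta(counts, min_count=10):
    """FASTA text with one record per unique peptide seen >= min_count times."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    lines = []
    for i, (peptide, n) in enumerate(ordered, start=1):
        if n >= min_count:
            lines.append(f">seq{i}_count{n}")
            lines.append(peptide)
    return "".join(line + "\n" for line in lines)


def meme_command(fasta_path, out_dir):
    """Argument list for a local `meme` run with the motif settings in the text."""
    return [
        "meme", fasta_path,
        "-protein",          # peptide (amino acid) alphabet
        "-nmotifs", "10",    # search for up to 10 motifs
        "-minw", "6",        # minimum motif width
        "-maxw", "12",       # maximum motif width
        "-evt", "0.05",      # stop once a motif's E-value exceeds 0.05
        "-oc", out_dir,      # output directory (overwritten if present)
    ]


# Hypothetical round-4 counts (sequences from the text, counts invented).
counts = {"QSDPWLVSRWFA": 120, "TYPSTQWFFAKF": 10, "GLPSSAELERLW": 9}
print(to_fasta(counts))
print(" ".join(meme_command("round4.fasta", "meme_round4")))
```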

Figure 5.


Logos and number of sequences in motifs identified by Multiple Em for Motif Elicitation (MEME). Analysis includes sequences that were present >10 times in the fourth round of each biological replicate after removal of preferentially amplified sequences present in the UA4 library. All experimentally confirmed binding sequences are represented in these motifs. Only statistically significant motifs (E-value < 0.05) are shown.

In addition, we observed that MEME was most effective after a substantial collapse of the library to diversity below 2×10⁵ unique sequences. MEME analysis was able to identify many of the hypothesized binding motifs using sequences from the third round of biological replicate 2 and the fourth round of both replicates, but was only able to identify one significant motif from the third round of biological replicate 1 (Figures S5 and S6). This further motivates the use of multiple rounds of subtractive phage display, which enhance the effectiveness of motif-finding algorithms.

Highly efficient identification of binding sequences by next-generation sequencing

Prior to this study, our group utilized Sanger sequencing to identify prospective target-binding peptide sequences. This conventional approach was low-throughput (20-40 sequences) and time-intensive (~8 h), often generating artifacts including sequencing errors due to poor DNA quality or sequencing of insertless phage that further diminished the power of an already small sample size. Moreover, sequencing costs accumulated as displays were repeated, costing $120-$240 per experiment. In contrast, next-generation sequencing technology has overcome these limitations due to its high throughput and economy of scale.16 The use of barcoded primers enables multiple library samples to be sequenced on the same sequencing run, bringing the absolute per-sample cost of next-generation sequencing to a level comparable with Sanger sequencing and increasing the sequencing capacity by four orders of magnitude. Indeed, in this study, we were able to identify and validate 6/6 (100%) of the candidate sequences in ~1 month, using data generated from a single next-generation sequencing run.

CONCLUSIONS

Performing phage display in biological replicate and validating binding of selected phage clones to their target enabled us to define characteristics of target-binding sequences in the context of next-generation sequencing data. Specifically, we noted that (i) sequences and binding motifs (3-7 amino acid residues) were reproducibly present across biological replicates; (ii) an unselected library amplified the same number of times as selected libraries enabled more accurate discrimination of parasitic sequences; and (iii) binding motifs could be identified by their presence in multiple unique sequences. Moreover, we validated MEME as a powerful tool for identifying target-binding sequences. In this particular study, six new sequences for targeting murine M2 macrophage were discovered by the combination of phage display selection, next-generation sequencing, and data analysis.

Based on our experience, we suggest the following workflow and methodological considerations to increase the productivity of phage display experiments and to improve the success rate of identifying target-binding sequences:

  1. Experimental design. A 100-fold representation of the naïve library should be amplified the same number of rounds as the experimental library (UAX, where X denotes the number of amplifications), with no selection step between amplifications to aid in identifying parasitic sequences. Performing experiments in biological replicates enables comparison analyses that can be helpful in confirming conserved binding motifs.

  2. Next-generation sequencing. Sample-specific barcodes can be used to multiplex multiple experimental samples onto a single sequencing run in order to minimize cost and time. In these studies, allocating 1×10⁶ sequencing reads per sample gave an adequate representation of the sequence space. Sequencing the last round of phage display performed and the UAX library is sufficient for identification of false-positives and can be used to identify conserved motifs. Fidelity of sequencing data is also a consideration, and efforts should be taken to utilize high-fidelity next-generation sequencing platforms17-19 to limit the number of artifactual sequences recovered.

  3. Data formatting and analysis. Translate quality-filtered sequencing data into counts of unique peptide sequences. Eliminate sequences from the selected library that are present in the UAX library and retain filtered sequences with abundance > 10 sequence reads. We have developed an analysis pipeline, “PepRS,” to facilitate these steps.

  4. MEME analysis. Perform MEME motif analysis over a range of 6-12 amino acid motif length (this may have to be adjusted depending on the size of the displayed peptide) and search for up to 10 motifs. Only consider motifs with E-values < 0.05.

  5. Candidate selection for binding validation. In general, strong candidates for binding exhibit consensus across the biological replicates and a high number of sequences within the displayed motif in MEME analysis. Select for the most abundant sequence within a strong candidate motif. Discard sequences that are present in the online databases SAROTUP (immunet.cn/sarotup)20,21 and PepBank (pepbank.mgh.harvard.edu).22
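Steps 3 and 5 of the workflow above can be sketched as follows. This is an illustrative simplification, not the PepRS pipeline: it assumes peptide counts are already tabulated as dictionaries, applies the UAX filter and the > 10 read threshold from step 3, and then keeps sequences that survive in both replicates. Sequence-level overlap stands in here for the motif-level consensus that step 5 refers to, and ranking by combined read count is an assumption made for this example. All read counts are invented.

```python
def filter_selected(selected, unselected, min_reads=10):
    """Drop peptides seen in the unselected (UAX) library, then keep > min_reads."""
    return {
        peptide: n
        for peptide, n in selected.items()
        if unselected.get(peptide, 0) == 0 and n > min_reads
    }


def shared_candidates(rep1, rep2):
    """Peptides present in both filtered replicates, most abundant first."""
    shared = set(rep1) & set(rep2)
    return sorted(shared, key=lambda p: (-(rep1[p] + rep2[p]), p))


# Hypothetical counts; the sequences themselves are taken from the text.
replicate1 = {"WPTDHQMLRIPM": 4000, "QSDPWLVSRWFA": 300,
              "DWSSWVYRDPQT": 500, "GLPSSAELERLW": 8}
replicate2 = {"WPTDHQMLRIPM": 3200, "QSDPWLVSRWFA": 150, "TYPSTQWFFAKF": 90}
ua4 = {"DWSSWVYRDPQT": 12}

filtered1 = filter_selected(replicate1, ua4)
filtered2 = filter_selected(replicate2, ua4)
print(shared_candidates(filtered1, filtered2))
```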

We have developed a method that allows for highly efficient identification of target-binding peptide sequences within a phage display library, and provide an automated software package that generates and pools peptide sequences from next-generation sequencing reads. This method improves laboratory workflow efficiency, increases data quality and volume, and expedites target-binding sequence identification.

MATERIALS AND METHODS

Peptide phage display

Peptide phage display was conducted against murine M2 macrophage as previously reported except using lot number 0131402 of the Ph.D.-12 linear dodecapeptide library (New England BioLabs).11 Briefly, negative and positive selection was performed using bone marrow-derived murine IFN-γ– and LPS-polarized M1 and IL-4–polarized M2 macrophage, respectively. An initial enrichment round (positive selection only) was performed followed by three subtractive rounds (negative and positive selection). For the first round, 2×10¹¹ PFU of the Ph.D. library was used as input; for each subsequent round, 2×10¹¹ PFU of the amplified eluate from the previous round was used as the input. Stringency was applied throughout the rounds by increasing the Tween-20 content (0.1, 0.3, 0.5%) in wash steps. Phage titering, amplification, and DNA sequencing were performed according to the New England BioLabs manual. The experiment was performed in biological duplicate. From each experiment, 20 plaques from the third subtractive round were randomly selected for DNA Sanger sequencing (GENEWIZ). In addition, 2×10¹¹ PFU of the Ph.D. library was amplified once or four times with no selection and used as the unselected library controls, UA1 and UA4, respectively.

Phage DNA isolation and purification for next-generation sequencing

DNA amplicons were prepared as previously described with some modifications.10 Single-stranded DNA (ssDNA) was isolated from 1×10¹¹ PFU of the amplified phage eluate from each round of peptide phage display using QIAprep Spin M13 Kit (QIAGEN), and 50 ng of ssDNA was amplified by PCR using primers containing Illumina-compatible sequencing adaptors and sample-specific barcodes (Table S1). For each library sample, the PCR reaction (50 μL) contained 500 nM of both the forward primer and reverse barcode primer unique to that sample, 250 μM dNTP, 1 U Phusion High Fidelity DNA Polymerase (New England BioLabs), and 1× Phusion High Fidelity buffer. Thirteen PCR cycles (10 s at 98 °C, 30 s at 59 °C, 30 s at 72 °C) were performed and PCR products were purified from a 2% agarose gel (AquaPor LE GTAC Agarose, National Diagnostics) using QIAquick Gel Extraction Kit (QIAGEN). The concentration of DNA was determined by a quantitation assay using a Qubit 3.0 Fluorometer (Life Technologies). Purified PCR products prepared from the unselected and selected phage libraries were combined in a 5:1 molar ratio, respective to each individual selected sample, prior to sequencing.

Illumina next-generation sequencing

Sequencing and data processing were performed as previously described with some modifications.8 Sequencing was performed on an Illumina MiSeq system in the presence of 7% PhiX control, using 150 bp single-end sequencing chemistries and custom sequencing primers (Read_1_primer, 5’-GCTCGACCTGTTCCTTTAGTGGTACCTTTCTATTCTCACTCT-3’; Index_read_primer, 5’-AGCAAAATCCCATACAGAAAATTCATTTACCGCAGGTCGCTCC-3’) according to manufacturer instructions. Raw sequence reads for all experiments are available from the Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra) under Study Accession Number SRP059048.

Illumina next-generation data processing

Sequence reads that exactly matched the first 10 nucleotides of the flanking sequence immediately downstream of the library insert variable region (5’-GGTGGAGGTT-3’) were retained for further analysis. Subsequences corresponding to the phage display variable region (excluding the terminal 3 glycines of the library insert) were extracted and translated into amino acids, substituting the amber stop codon (UAG) with glutamine to reflect the nonsense suppression activity of E. coli K12 ER2738. For each library, the count of sequence reads encoding each unique peptide sequence was tabulated. The relative abundance of specific peptides was calculated as the fraction of total reads in the library that encoded the peptide. Source code and executables for our analysis pipeline, “PepRS” (Peptide Retrieval Software), are freely available to academic users at https://bitbucket.org/stevesal/peprs.
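The processing steps above can be sketched in a few lines. This is an illustrative example and not the PepRS source code. It assumes reads are in the coding orientation, that the variable region is the 12 codons immediately upstream of the 5’-GGTGGAGGTT-3’ flank (a dodecapeptide insert), and that reads containing ambiguous bases are discarded; the function names are hypothetical. Stop codons other than amber are left as "*" because the text does not say how they were handled.

```python
from collections import Counter
from itertools import product

FLANK = "GGTGGAGGTT"   # first 10 nt downstream of the variable region (from the text)
N_CODONS = 12          # assumption: 12-codon (dodecapeptide) insert

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
CODON_TABLE["TAG"] = "Q"   # amber stop read through as glutamine


def read_to_peptide(read):
    """Translate the variable region of one read, or return None if unusable."""
    idx = read.find(FLANK)
    if idx < 3 * N_CODONS:          # flank missing (-1) or too close to read start
        return None
    region = read[idx - 3 * N_CODONS:idx]
    try:
        return "".join(CODON_TABLE[region[i:i + 3]] for i in range(0, len(region), 3))
    except KeyError:                # ambiguous base such as N
        return None


def count_peptides(reads):
    """Tabulate the number of reads encoding each unique peptide."""
    counts = Counter()
    for read in reads:
        peptide = read_to_peptide(read.strip().upper())
        if peptide is not None:
            counts[peptide] += 1
    return counts


def relative_abundance(counts):
    """Fraction of tabulated reads that encode each peptide."""
    total = sum(counts.values())
    return {peptide: n / total for peptide, n in counts.items()} if total else {}


# Two synthetic reads encoding WPTDHQMLRIPM (the second uses TAG for Q), one junk read.
wpt = "TGGCCTACTGATCATCAGATGCTTCGTATTCCTATG"
amber = wpt.replace("CAG", "TAG")
reads = [wpt + FLANK + "TCGGCC", amber + FLANK, "ACGT"]
print(count_peptides(reads))
```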

Secondary analysis

Clustal Omega Motif Analysis. Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo)12 was used for initial motif identification using the 20 most abundant peptide sequences of the fourth round from each biological replicate. Afterward, aligned sequences were visually inspected for binding motifs (defined as a minimum of three consecutive amino acids) that appeared multiple times across different peptide sequences. The most abundant peptide sequence that contained each motif was selected for cloning and subsequent in vitro binding studies.

Multiple Em for Motif Elicitation (MEME) Analysis. Unique sequences that were present over 10 times in the third or fourth rounds of phage display for each biological replicate were analyzed for conserved motifs using Multiple Em for Motif Elicitation (MEME, http://meme-suite.org/tools/meme).14,15 The normal motif discovery mode, binary site distribution, and zero-order model of sequences were used to scan for up to 10 motifs with a width of 6-12 amino acids. Motifs with an E-value < 0.05 were considered statistically significant. This analysis was repeated for the fourth round sequences after eliminating all sequences that also appeared in the UA4 library.
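The motif definition used for the visual inspection (at least three consecutive residues recurring in different peptides) can also be checked with a simple tally of shared tripeptides. The sketch below is a swapped-in illustration, not the Clustal Omega alignment or the authors' procedure: it counts exact, ungapped k-mers only, so it cannot recover gapped patterns such as DPWXXXXW. The function name is hypothetical; the five peptides are the cloned candidates named in the text.

```python
from collections import defaultdict


def shared_kmers(peptides, k=3, min_sequences=2):
    """Map each k-mer to the peptides containing it, keeping k-mers that
    occur in at least min_sequences different peptides."""
    owners = defaultdict(set)
    for peptide in peptides:
        for i in range(len(peptide) - k + 1):
            owners[peptide[i:i + k]].add(peptide)
    return {
        kmer: sorted(seqs)
        for kmer, seqs in owners.items()
        if len(seqs) >= min_sequences
    }


candidates = [
    "QSDPWLVSRWFA",
    "TYPSTQWFFAKF",
    "FFPSEQVLIAAL",
    "GLPSSAELERLW",
    "YPSSEQLLAWWG",
]

for kmer, seqs in sorted(shared_kmers(candidates).items()):
    print(kmer, seqs)
```

Applied to a full top-20 list rather than these five peptides, the same tally would give a quick cross-check of motifs picked out by eye from an alignment.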

Construction of phage clones

Oligonucleotides encoding the peptides of interest were designed with 5’ overhangs and an overlap region with Tm ~55-60 °C and synthesized by IDT (Table S2). Oligonucleotide pairs were annealed, filled in with Klenow (New England BioLabs), digested with Acc65I/EagI (New England BioLabs), and cloned into M13KE dsDNA vector according to the New England BioLabs Ph.D. manual. The resulting plaques were used for amplification and production of phage. Peptide sequences were confirmed by DNA sequencing.

Phage binding study

The selected phage clones (Table S2) were tested for differential binding to murine M1 and M2 macrophages. Macrophages (1×10⁵ cells/well) were incubated with 1×10⁹ PFU of phage diluted in PBS+1% BSA (PBSA)+Fc Block (BD Biosciences) for 30 min on ice. The cells were then washed twice with 1% PBSA, fixed with 4% paraformaldehyde for 15 min on ice, washed, stained with 1:1000 rabbit anti-M13 antibody (Sigma) for 20 min, washed, and stained with 1:500 goat anti-rabbit antibody (Sigma) for 20 min. After washing, cells were analyzed by a MACSQuant flow cytometer (Miltenyi Biotec). Gating was set with unstained and positively stained cells.

ACKNOWLEDGEMENTS

This work was supported by the National Institutes of Health [grant numbers 1R01CA177272 and R21NS086500]; and by National Science Foundation Graduate Research Fellowships [grant number DGE-1256082] to GWL and BRL. We would also like to thank Charles Grant for insightful discussion about MEME.

Footnotes

Supporting Information

Sequencing primers and peptide oligonucleotide sequences, Sanger sequencing results, Clustal Omega alignment, phage titer and rank in abundance of specific clones throughout rounds, and additional MEME analysis. This material is available free of charge via the Internet at http://pubs.acs.org.

REFERENCES

  • (1) Gaertner F, Kessler H, Wester H-J, Schwaiger M, Beer A. Radiolabelled RGD peptides for imaging and therapy. Eur. J. Nucl. Med. Mol. Imaging. 2012;39:126–138. doi: 10.1007/s00259-011-2028-1.
  • (2) Putzer D, Kroiss A, Waitz D, Gabriel M, Traub-Weidinger T, Uprimny C, von Guggenberg E, Decristoforo C, Warwitz B, Widmann G. Somatostatin receptor PET in neuroendocrine tumours: 68Ga-DOTA0, Tyr3-octreotide versus 68Ga-DOTA0-lanreotide. Eur. J. Nucl. Med. Mol. Imaging. 2013;40:364–372. doi: 10.1007/s00259-012-2286-6.
  • (3) Vymazal J, Spuentrup E, Cardenas-Molina G, Wiethoff AJ, Hartmann MG, Caravan P, Parsons EC Jr. Thrombus imaging with fibrin-specific gadolinium-based MR contrast agent EP-2104R: results of a phase II clinical study of feasibility. Invest. Radiol. 2009;44:697–704. doi: 10.1097/RLI.0b013e3181b092a7.
  • (4) Smith GP. Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science. 1985;228:1315–1317. doi: 10.1126/science.4001944.
  • (5) Hammers CM, Stanley JR. Antibody phage display: technique and applications. J. Invest. Dermatol. 2014;134:e17. doi: 10.1038/jid.2013.521.
  • (6) Dantas-Barbosa C, de Macedo Brigido M, Maranhao AQ. Antibody phage display libraries: contributions to oncology. Int. J. Mol. Sci. 2012;13:5420–5440. doi: 10.3390/ijms13055420.
  • (7) Matochko WL, Cory Li S, Tang SK, Derda R. Prospective identification of parasitic sequences in phage display screens. Nucleic Acids Res. 2014;42:1784–1798. doi: 10.1093/nar/gkt1104.
  • (8) 't Hoen PA, Jirka SM, Ten Broeke BR, Schultes EA, Aguilera B, Pang KH, Heemskerk H, Aartsma-Rus A, van Ommen GJ, den Dunnen JT. Phage display screening without repetitious selection rounds. Anal. Biochem. 2012;421:622–631. doi: 10.1016/j.ab.2011.11.005.
  • (9) Ngubane NA, Gresh L, Ioerger TR, Sacchettini JC, Zhang YJ, Rubin EJ, Pym A, Khati M. High-throughput sequencing enhanced phage display identifies peptides that bind mycobacteria. PLoS One. 2013;8:e77844. doi: 10.1371/journal.pone.0077844.
  • (10) Rentero Rebollo I, Sabisz M, Baeriswyl V, Heinis C. Identification of target-binding peptide motifs by high-throughput sequencing of phage-selected peptides. Nucleic Acids Res. 2014;42:e169. doi: 10.1093/nar/gku940.
  • (11) Cieslewicz M, Tang J, Yu JL, Cao H, Zavaljevski M, Motoyama K, Lieber A, Raines EW, Pun SH. Targeted delivery of proapoptotic peptides to tumor-associated macrophages improves survival. Proc. Natl. Acad. Sci. U. S. A. 2013;110:15919–15924. doi: 10.1073/pnas.1312197110.
  • (12) Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Soding J, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 2011;7:539. doi: 10.1038/msb.2011.75.
  • (13) Vodnik M, Zager U, Strukelj B, Lunder M. Phage display: selecting straws instead of a needle from a haystack. Molecules. 2011;16:790–817. doi: 10.3390/molecules16010790.
  • (14) Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 1994;2:28–36.
  • (15) Bailey TL, Williams N, Misleh C, Li WW. MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res. 2006;34:W369–373. doi: 10.1093/nar/gkl198.
  • (16) Dias-Neto E, Nunes DN, Giordano RJ, Sun J, Botz GH, Yang K, Setubal JC, Pasqualini R, Arap W. Next-generation phage display: integrating and comparing available molecular tools to enable cost-effective high-throughput analysis. PLoS One. 2009;4:e8338. doi: 10.1371/journal.pone.0008338.
  • (17) Lam HY, Clark MJ, Chen R, Natsoulis G, O'Huallachain M, Dewey FE, Habegger L, Ashley EA, Gerstein MB, Butte AJ, et al. Performance comparison of whole-genome sequencing platforms. Nat. Biotechnol. 2012;30:78–82. doi: 10.1038/nbt.2065.
  • (18) Loman NJ, Misra RV, Dallman TJ, Constantinidou C, Gharbia SE, Wain J, Pallen MJ. Performance comparison of benchtop high-throughput sequencing platforms. Nat. Biotechnol. 2012;30:434–439. doi: 10.1038/nbt.2198.
  • (19) Salipante SJ, Kawashima T, Rosenthal C, Hoogestraat DR, Cummings LA, Sengupta DJ, Harkins TT, Cookson BT, Hoffman NG. Performance comparison of Illumina and ion torrent next-generation sequencing platforms for 16S rRNA-based bacterial community profiling. Appl. Environ. Microbiol. 2014;80:7583–7591. doi: 10.1128/AEM.02206-14.
  • (20) Huang J, Ru B, Zhu P, Nie F, Yang J, Wang X, Dai P, Lin H, Guo FB, Rao N. MimoDB 2.0: a mimotope database and beyond. Nucleic Acids Res. 2012;40:D271–277. doi: 10.1093/nar/gkr922.
  • (21) Huang J, Ru B, Li S, Lin H, Guo FB. SAROTUP: scanner and reporter of target-unrelated peptides. J. Biomed. Biotechnol. 2010;2010:101932. doi: 10.1155/2010/101932.
  • (22) Duchrow T, Shtatland T, Guettler D, Pivovarov M, Kramer S, Weissleder R. Enhancing navigation in biomedical databases by community voting and database-driven text classification. BMC Bioinformatics. 2009;10:317. doi: 10.1186/1471-2105-10-317.
