Author manuscript; available in PMC: 2019 Jul 22.
Published in final edited form as: Biochemistry. 2018 Jun 19;57(31):4717–4725. doi: 10.1021/acs.biochem.8b00410

Kinase substrate profiling using a proteome-wide serine-oriented human peptide library

Karl W Barber †, Chad J Miller §, Jay W Jun ∥, Hua Jane Lou §, Benjamin E Turk §, Jesse Rinehart †,‡,⊥,*
PMCID: PMC6644682  NIHMSID: NIHMS1035477  PMID: 29920078

Abstract

The human proteome encodes over five hundred protein kinases and hundreds of thousands of potential phosphorylation sites. However, the identification of kinase-substrate pairs remains an active area of research because the relationships between individual kinases and these phosphorylation sites remain largely unknown. Many techniques have been established to discover kinase substrates but are often technically challenging to perform. Moreover, these methods frequently rely on substrate reagent pools that do not reflect human protein sequences or are biased by human cell line protein expression profiles. Here, we describe a new approach called SERIOHL-KILR (serine-oriented human library-kinase library reactions) to profile kinase substrate specificity and to identify candidate substrates for serine kinases. Using a purified library of >100,000 serine-oriented human peptides expressed heterologously in E. coli, we perform in vitro kinase reactions to identify phosphorylated human peptide sequences by liquid chromatography-tandem mass spectrometry. We compare our results for protein kinase A to a well-established positional scanning peptide library method, verifying that SERIOHL-KILR can identify the same predominant motif elements as traditional techniques. We then interrogate a small panel of cancer-associated PKCβ mutants using our profiling protocol and observe a shift in substrate specificity likely attributable to loss of key polar contacts between the kinase and its substrates. Overall, we demonstrate that SERIOHL-KILR can rapidly identify candidate kinase substrates that can be directly mapped to human sequences for pathway analysis. Since this technique can be adapted for various kinase studies, we believe that SERIOHL-KILR will have many new applications in the future.


The human genome encodes over 500 protein kinases, which phosphorylate proteins to post-translationally regulate substrate structure and function 1. A combination of high-throughput mass spectrometry experiments and classical biochemical techniques has uncovered over 10⁵ putative sites of phosphorylation on human proteins 2, 3. However, the observation of a phosphorylated protein within a complex mixture provides no direct information about the identity of the kinase responsible for the phosphorylation event. One solution to this problem is to use substrate motif analysis. Kinases have long been recognized to exhibit telltale patterns in substrate preference related to the primary sequence directly surrounding a phosphorylatable amino acid 4. Thus, phosphorylation occurring within a certain amino acid context may provide clues to the identity of the responsible kinase. To experimentally identify the preferred amino acid sequence motif for individual kinases, several in vitro methods have been developed. One common technique makes use of a positional scanning peptide library (PSPL), in which a purified kinase is reacted with an array of peptides harboring a single invariable amino acid located at a fixed distance from a central phosphoacceptor, permitting construction of substrate motifs 5, 6. However, kinase substrates are not directly identified and can only be predicted by the presence of motif elements in phosphoproteomic datasets. Other techniques that use mammalian lysate-based proteomes as in vitro substrate reagent pools have been established, but may be biased by the protein expression profile of the cell type being used, therefore presenting an incomplete set of possible substrates to a kinase 7, 8.

Recently, we described a heterologous peptide-based representation of the human serine phosphoproteome, in which we were able to demonstrate the production of tens of thousands of phosphorylated or nonphosphorylated human peptide sequences from a single plasmid library in E. coli 9. These peptides are based on previously-observed instances of serine phosphorylation in human proteins. We realized that the nonphosphorylated (or “phosphorylatable”) human peptide library would serve as an ideal reagent for human kinase profiling in order to identify candidate substrates for serine kinases of interest. Here, we use the serine-oriented human library of peptides (SERIOHL) to perform in vitro kinase-library reactions (SERIOHL-KILR) and use liquid chromatography-tandem mass spectrometry (LC-MS/MS) to identify phosphorylated human substrate peptides of various kinases. We evaluate the results of our platform in parallel with the PSPL technique, demonstrating that SERIOHL-KILR performs comparably to another rigorously established method. Overall, while our library presents some of the same limitations as other in vitro peptide-based methods, SERIOHL-KILR is rooted in common principles of recombinant protein expression and purification, making the technology very tractable for routine kinase substrate identification using techniques already employed in most molecular biology laboratories. Moreover, SERIOHL-KILR is capable of uncovering multiple motifs and combinatorial amino acid preferences in kinase substrate peptides and can be easily customized to encompass other user-defined peptide collections. Finally, the peptides in SERIOHL are based on nonrandom human sequences, thus enabling the direct identification of physiologically-relevant candidate kinase substrates and permitting gene network analysis.

Method Development

The design of the SERIOHL substrate reagent pool was described previously 9, but the resulting peptide library was not used in kinase reactions in that study. Briefly, 110,139 previously-observed instances of human serine phosphorylation were extracted from the PhosphoSitePlus database 3, and the amino acid sequences surrounding the phosphorylated serine residue were converted into ≤31 residue peptides centered around the phosphoacceptor serine (Figure 1a). An oligonucleotide library encoding these peptides was synthesized, amplified by PCR, and introduced into a bacterial expression vector in a single pool, which was shown to contain 94% of the anticipated DNA sequences by next-generation sequencing 9. In this plasmid library, the encoded peptides are fused to an N-terminal GST tag and a C-terminal 6xHis tag to facilitate expression and purification (Figure S1).
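The windowing step described above can be illustrated with a minimal Python sketch. It is not the code used to build the library; the function name `serine_window`, its arguments, and the toy sequence are hypothetical, and the handling of sites near protein termini (truncating the window) is an assumption based on the stated ≤31 residue length.

```python
def serine_window(protein_seq, site, flank=15):
    """Return (peptide, center_index) for a window around a phosphoserine.

    site is the 1-based position of the serine in the protein. The window
    keeps up to `flank` residues on each side and is truncated at the
    protein termini, so the peptide is at most 2 * flank + 1 residues long.
    """
    idx = site - 1
    if not 0 <= idx < len(protein_seq):
        raise ValueError("site lies outside the protein sequence")
    if protein_seq[idx] != "S":
        raise ValueError("residue at site is not serine")
    start = max(0, idx - flank)
    end = min(len(protein_seq), idx + flank + 1)
    return protein_seq[start:end], idx - start


# Toy protein (not a real human sequence) with a serine at position 21.
toy_protein = "A" * 20 + "S" + "G" * 20
peptide, center = serine_window(toy_protein, 21)
print(len(peptide), center)  # 31 15
```

Returning the index of the central serine alongside the peptide keeps truncated, terminus-proximal windows unambiguous in later steps.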

Figure 1.

SERIOHL-KILR workflow. (a) All instances of human serine phosphorylation were downloaded from the PhosphoSitePlus database 3. These sites were converted into DNA sequences encoding serine-oriented peptides retaining 15 amino acids N-terminal and C-terminal to the observed site of serine phosphorylation 9. These genes were introduced into a single vector library and transformed into E. coli for expression and subsequent purification of the serine-oriented human peptide library (SERIOHL). (b) The SERIOHL peptides were reacted in vitro with kinases of interest (SERIOHL-KILR). Phosphorylated peptides were enriched using TiO2 and then identified by LC-MS/MS.

To prepare the SERIOHL peptides, we made use of a recoded strain of E. coli known as C321.∆A (Addgene #68306) in which all genomic TAG codons have been replaced with TAA 10. Additionally, release factor 1 has been knocked out from this strain, such that UAG codons no longer serve as a cue for translational termination (i.e. the UAG codon is functionally “unassigned”). By introducing a plasmid encoding the serine amber suppressor supD tRNA (Addgene #68307), we are able to drive serine incorporation in response to UAG codons at the ribosome 11. Each SERIOHL peptide-encoding DNA sequence contains a TAG codon, corresponding to the previously observed phosphorylated serine position. Although this method of peptide expression using amber suppression is not necessary for the synthesis of human peptides in E. coli, we were able to use the same precursor DNA library that previously enabled the expression of recombinant phosphopeptide collections via genetic code expansion 9. The direct incorporation of phosphoserine into these peptides at UAG codons remains a possibility for control experiments (e.g. to create a phosphorylated peptide as a mass spectrometry standard or to investigate the biochemical properties of a particular phosphorylated peptide) 9, 11, 12.

The SERIOHL plasmid library was first transformed into C321.∆A with supD tRNA by standard electroporation methods. Serial dilutions of electroporated cells were plated on selective media (LB agar with 100 μg/mL ampicillin for the library plasmid and 25 μg/mL kanamycin for the supD tRNA plasmid), and colonies were counted to ensure that >10⁷ transformants were obtained in order to maintain high peptide library diversity. This number is approximately 100-fold greater than the number of variants in the plasmid library, which was shown previously to be sufficient to produce a peptide library with >56,000 unique members observable by mass spectrometry 9. Peptide library expression and sequential purification using Ni-NTA and glutathione resins were described previously 9. Purified GST/6xHis-fusion SERIOHL peptides were concentrated to ~500 μL using an Amicon Ultra-4 10 kDa molecular weight cutoff spin column (Millipore), transferred to an Amicon Ultra-0.5 10 kDa column (Millipore), and buffer exchanged (50 mM Tris pH 7, 150 mM NaCl, 20% glycerol) to a final volume of ~100 μL. This served as the SERIOHL substrate pool for kinase reactions.

We then proceeded to react the SERIOHL peptide pool with various kinases of interest and identify phosphorylated peptides by LC-MS/MS (SERIOHL-KILR, Figure 1b). To evaluate the performance of SERIOHL-KILR, we first reacted our peptide library with protein kinase A (PKA), an extensively characterized serine/threonine kinase with hundreds of known substrates. The human PKA catalytic domain was expressed and purified with an N-terminal 6xHis tag as previously described 13. 1 μg (~670 nM) purified PKA catalytic domain was reacted with approximately 20 μg (~630 nM) of GST-fused SERIOHL peptides in a buffer containing 50 mM Tris pH 7.4, 150 mM NaCl, 50 μM DTT, 20 mM MgCl2, and 200 μM ATP in a 50 μL final reaction volume. Triplicate kinase reactions were carried out at 30 °C for 4 hours, chilled on ice, and prepared for LC-MS/MS analysis. Peptides were then digested using trypsin, desalted, and phosphorylated peptides were enriched using titanium dioxide (TiO2) as described previously 9.

Dried TiO2-enriched peptides were then resuspended in 0.5 μL 70% formic acid, 0.3 μL 50% acetonitrile/0.1% formic acid, and 6.2 μL 0.1% trifluoroacetic acid. 5 μL of sample was injected onto a 50 cm, 75 μm ID PicoFrit column (New Objective) packed with 1.9 μm ReproSil-Pur 120 Å C18-AQ (Dr. Maisch) using an ACQUITY UPLC M-Class (Waters) paired with a Q Exactive Plus mass spectrometer (Thermo). Liquid chromatography gradients (for a 290 minute method), mass spectrometry operational parameters, and mass spectra search parameters (using MaxQuant v1.5.1.2) were described previously 9. Tryptic phosphopeptides identified by MS2 were then mapped back to the in silico cDNA database of the encoded SERIOHL peptides using a custom Python script (https://github.com/rinehartlab/synphospho). Only SERIOHL peptides phosphorylated at the central serine residue were considered for further analysis.
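As an illustration of the mapping and filtering logic described above, the following Python sketch locates a tryptic phosphopeptide within the encoded library peptides and keeps only matches in which the phosphorylated residue coincides with the central serine. It is a simplified stand-in and not the script in the cited repository; the function name `map_to_library`, the library data structure, and the toy peptide are hypothetical.

```python
def map_to_library(tryptic_seq, phospho_idx, library):
    """Return IDs of library peptides phosphorylated at their central serine.

    tryptic_seq -- sequence of the identified tryptic phosphopeptide
    phospho_idx -- 0-based index of the phosphosite within tryptic_seq
    library     -- dict mapping peptide ID to (sequence, center_index)
    """
    hits = []
    for pep_id, (pep_seq, center) in library.items():
        start = pep_seq.find(tryptic_seq)
        while start != -1:
            # Accept only if the phosphosite lands on the central serine.
            if start + phospho_idx == center:
                hits.append(pep_id)
                break
            start = pep_seq.find(tryptic_seq, start + 1)
    return hits


# Toy 31-residue library peptide (hypothetical) with its serine at index 15.
toy_peptide = "GAGAGAGAGAKLRRA" + "S" + "LSPEAGKAAAAAAAA"
library = {"toy_site": (toy_peptide, 15)}

print(map_to_library("ASLSPEAGK", 1, library))  # ['toy_site']
print(map_to_library("ASLSPEAGK", 3, library))  # []
```

The second call shows a phosphopeptide whose modification sits on a non-central serine; it matches the library sequence but is discarded by the positional check.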

Results and Method Validation

In total, 519 unique phosphopeptides were identified as PKA substrates in three parallel SERIOHL-KILR replicates (Figure S2). Previously, we had seen evidence for >56,000 unique human peptides in SERIOHL by mass spectrometry 9, indicating that PKA phosphorylated approximately 1% of the library across the SERIOHL-KILR replicates. The phosphorylated SERIOHL peptides were then analyzed by pLogo motif analysis 14, using the theoretical 110,139 SERIOHL peptides encoded in the plasmid library as background. The resulting sequence analysis revealed a strong −3R/−2R preference, which corresponds to the known, canonical R-R/K-x-S PKA consensus motif (Figure 2a) 4. Parallel PSPL analysis with PKA showed a similar pattern as previously reported 5, thus validating the ability of SERIOHL-KILR to accurately identify kinase substrate motifs (Figure 2b). While the −3 and −2 Arg signature is dominant in both methods, the overrepresentation of hydrophobic residues Leu and Val at the +1 position is present but slightly less evident in SERIOHL-KILR. We also observe a prominent deselection of proline at the +1 position by SERIOHL-KILR. PSPL shows this deselection as well, but at levels comparable to its deselection of acidic amino acids, which is not observed by SERIOHL-KILR. This interesting difference may reflect an improved signal-to-noise ratio gained by direct substrate interrogation/identification in SERIOHL-KILR compared to motif reassembly from the randomized peptides in PSPL. One of the advantages of SERIOHL-KILR is the ability to investigate combinatorial effects of residues surrounding the central phosphoacceptor Ser. Using motif-x, we observe overrepresentation of −5R/−3R and −3R/−2S dual-residue motifs (Figure S3) 15. Though slight preferences for −5R or −2S are also observed by PSPL, this technique cannot infer positional interdependence as each position is analyzed independently of the others.
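The idea of scoring a residue at a given position against a background library can be sketched as a binomial test. The Python example below is a simplified illustration and not the pLogo implementation; the function names, the toy 7-residue peptides, and the choice of a one-sided upper-tail probability are assumptions made for the example.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Probability of observing k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def position_enrichment(foreground, background, offset, residue, center=15):
    """Score over-representation of `residue` at `offset` from the center.

    Returns (foreground count, background frequency, upper-tail p-value).
    All peptides are assumed to be aligned, full-length windows.
    """
    idx = center + offset
    k = sum(1 for seq in foreground if seq[idx] == residue)
    p = sum(1 for seq in background if seq[idx] == residue) / len(background)
    return k, p, binomial_upper_tail(k, len(foreground), p)


# Toy 7-residue peptides (hypothetical) with the central serine at index 3.
background = ["RAASAAA", "AAASAAA", "AAASAAA", "AAASAAA"]
foreground = ["RAASAAA", "RAASAAA"]
k, p, pval = position_enrichment(foreground, background, -3, "R", center=3)
print(k, p, pval)  # 2 0.25 0.0625
```

When many position and residue combinations are tested, the significance threshold would need a multiple-testing correction such as the Bonferroni correction indicated in the figure legends.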

Figure 2.

Comparison of SERIOHL-KILR and PSPL techniques. (a) pLogo motif analysis 14 of human peptides phosphorylated by PKA and identified by SERIOHL-KILR. Results from triplicate experiments were combined (n = 490 unique, library-mapped phosphopeptides). Red line indicates p = 0.05 significance threshold with Bonferroni correction. (b) PSPL results using PKA. Log2-transformed selectivity scale shown at right.

SERIOHL-KILR identified 16 peptides corresponding to bona fide human cellular PKA targets according to the PhosphoSitePlus database 3. While the number of previously-identified PKA substrates observed by SERIOHL-KILR is enriched compared to the total SERIOHL reagent library (3.1% of SERIOHL-KILR peptides compared to 0.4% in original library are listed as PKA substrates in PhosphoSitePlus), only 3.6% of known PKA sites were observed by SERIOHL-KILR. However, we note that SERIOHL-KILR performed very favorably in terms of motif element recognition of substrate peptides containing −3R and/or −2R (Figure 3). We found that 26% of phosphopeptides detected by SERIOHL-KILR contained both −3R and −2R, compared with 31% of PKA substrates in PhosphoSitePlus. Our method fares similarly well for the enrichment of peptides containing Arg at either of these positions (69% and 48% of identified sequences contain −3R or −2R, respectively). While SERIOHL-KILR is an effective tool to elucidate important substrate motif elements, subsequent in vitro or in vivo validation using standard biochemical or genetic approaches must be performed to confirm that full-length proteins corresponding to the identified SERIOHL peptide substrates are true kinase substrates.
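Motif-element percentages of this kind can be tallied directly from aligned peptide sequences. The following Python sketch is illustrative only; the function name `arg_motif_fractions` and the toy 7-residue peptides are hypothetical, and all peptides are assumed to be full-length windows with the phosphoacceptor serine at a known index.

```python
def arg_motif_fractions(peptides, center=15):
    """Fraction of peptides with Arg at -3, at -2, and at both positions."""
    n = len(peptides)
    has_r3 = [seq[center - 3] == "R" for seq in peptides]
    has_r2 = [seq[center - 2] == "R" for seq in peptides]
    return {
        "-3R": sum(has_r3) / n,
        "-2R": sum(has_r2) / n,
        "-3R and -2R": sum(a and b for a, b in zip(has_r3, has_r2)) / n,
    }


# Toy 7-residue peptides (hypothetical) with the central serine at index 3.
toy_hits = ["RRASLAA", "RAASLAA", "ARASLAA", "AAASLAA"]
print(arg_motif_fractions(toy_hits, center=3))
# {'-3R': 0.5, '-2R': 0.5, '-3R and -2R': 0.25}
```

Applying the same function to the identified substrates and to a reference set gives directly comparable fractions.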

Figure 3.

Enrichment of canonical PKA motif using SERIOHL-KILR. Total SERIOHL precursor pool refers to all 110,139 theoretically encoded human peptides in the plasmid library. SERIOHL-KILR substrates refers to phosphopeptides identified by SERIOHL-KILR using PKA.

We then sought to use SERIOHL-KILR to characterize the effects of certain kinase mutations. Intriguingly, while cancer-associated kinase mutations typically cause hyperactivation, a subset of these mutations map to the catalytic cleft and in some cases appear to change substrate specificity. Recent studies have reported that kinase mutations observed in patient tumor samples can profoundly alter substrate recognition and may therefore contribute to pathogenesis by rewiring downstream signaling cascades 16, 17. It was also recently discovered that mutations in protein kinase C beta (PKCβ) were common in adult T cell leukemia/lymphoma 18. Several of the most recurrent mutations, including D427N, D470H, and E533K, occur in residues that likely confer kinase-substrate specificity 19, 20. We reasoned that SERIOHL-KILR could be leveraged to profile cancer-associated PKCβ mutations to better understand how these mutations might alter substrate choice and consequently rewire signaling networks.

We began by deploying SERIOHL-KILR to identify substrate recognition patterns and candidate substrates for WT PKCβ. We performed triplicate SERIOHL-KILR experiments using the same conditions as in the PKA reactions described above, but replacing PKA with 1 μg (~490 nM) full-length WT PKCβ and 5 μL 10x PKC lipid activator (EMD Millipore, 20–133A) per 50 μL reaction (Figure S4). None of the 50 previously-identified PKCβ substrate peptides listed in the PhosphoSitePlus database were observed by SERIOHL-KILR, which is unsurprising given the relatively low number of known targets and comparatively larger SERIOHL peptide population size. By mass spectrometry, we observe a very strong preference for Arg at −3, hydrophobic residues at +1, and, to a somewhat smaller extent, Arg at +2. This matches well to the canonical PKCβ consensus sequence R-x-x-pS-ϕ-R (where ϕ is a hydrophobic residue) 21. We also carried out PSPL for WT PKCβ, which revealed similar preferences for Arg at the −3 and +2 positions and hydrophobic residues at the +1 position, but also exhibited global preference for basic residues at all positions (Figure 4a).

Figure 4.

Comparison of SERIOHL-KILR and PSPL to investigate cancer-associated PKCβ mutants. pLogo 14 of SERIOHL-KILR results shown on left and PSPL data shown on right for (a) WT PKCβ (n = 77), (b) PKCβ D427N (n = 323), (c) PKCβ D470H (n = 303), and (d) PKCβ E533K (n = 100). Unique, library-mapped phosphopeptides from triplicate experiments were combined for pLogo. Red line indicates p = 0.05 significance threshold with Bonferroni correction. Quantified PSPL data were normalized by position and log2 transformed, with the selectivity scale shown to the right of the heat map.

We then used SERIOHL-KILR to characterize the D427N, D470H, and E533K mutations in PKCβ. We observed that the D427N and D470H PKCβ mutants completely lost specificity for substrates containing a −3R, seemingly ablating a major determinant of specificity within PKCβ (Figure 4b,c). Interestingly, the E533K mutant also showed decreased preference for −3R compared to WT PKCβ, but less so in comparison to the other mutants (Figure 4d). By PSPL analysis, we also observed a decrease in selectivity for Arg primarily at the −3 position in all mutants (Figure 4b–d). One difference between SERIOHL-KILR and PSPL analysis was the relative prominence of the −2R signature in PSPL for WT PKCβ and its decreased prevalence in the PSPL profile of the E533K mutant. While this effect is still observed to a lesser extent by SERIOHL-KILR, it is one of the most striking differences between the PKCβ variants according to the PSPL analysis. These differences between methods may be due to sequence bias and limited diversity within the SERIOHL substrate pool, which is not a perfectly randomized collection of candidate substrates that can reveal subtleties in substrate preference, as is possible with the PSPL technique. Alternatively, this difference could reflect representation bias in the PSPL peptide mixtures. To provide independent validation of these results, we performed assays using WT and mutant PKCβ with matched pairs of individual peptide substrates with single-residue substitutions at the appropriate positions (Figure S5). As anticipated, we observed reduced selectivity for Arg at the −2 position with the D470H and E533K mutants and at the −3 position with the D427N mutant (Figure S5). These assays further suggest that PKCβ mutations change specificity by reducing the phosphorylation rate of Arg-containing peptides preferred by the WT kinase, while having similar activity to WT on peptides lacking the key Arg residue.

The results obtained by both substrate profiling methods can be rationalized by the kinase-substrate interaction architecture from the published structure of PKA in complex with a peptide inhibitor 22. We noted that the acidic residues at all three tested mutational positions make direct interactions via polar contacts with Arg residues in a substrate peptide (Figure 5). D427 interacts with Arg at the −3 position, while D470 and E533 interact with −2R. This corresponds well with our SERIOHL-KILR and PSPL data, which show a strong decrease in preference for −3R due to the PKCβ D427N mutation. This decrease in selectivity for the −3R is not as stark in the E533K mutant, which is consistent with the lack of direct contact between these amino acids. PKCβ D470H also seemingly exhibits a decrease in preference for the −3R position by SERIOHL-KILR, which could be explained by changes in electrostatics incurred by the mutations. Since Arg residues are very important in the interaction between the kinase domain of PKCβ and the pseudosubstrate sequence N-terminal to it 23, 24, these kinase mutations have been proposed to decrease PKCβ autoinhibition, leading to activation. Overall, our results provide evidence that these mutations are likely to affect the substrate repertoire of PKCβ, both reducing phosphorylation of native substrates and acquiring new ones.

Figure 5.

Theoretical rationale for loss of substrate specificity in PKCβ mutants. Crystal structure of catalytic domain of PKA complexed with peptide inhibitor (PDB: 1ATP) 22. Aligned positions corresponding to relevant PKCβ mutations (D427, D470, E533) are shown. Green lines represent hydrogen bonds.

One important benefit of using the SERIOHL peptide collection is that all library members are derived from authentic phosphorylation sites in human proteins. As such, the candidate kinase substrates identified by SERIOHL-KILR can be used for pathway analysis to provide potential mechanistic insights into the effects of the various PKCβ mutations. We used gene ontology (GO) enrichment analysis to identify biological processes correlated with the human genes corresponding to identified phosphorylated SERIOHL peptides (http://geneontology.org/page/go-enrichment-analysis, Figure S6) 25. Genes associated with peptides observed in experiments using WT PKCβ were not substantially enriched for any biological processes. We then looked specifically at SERIOHL peptides that were observed exclusively in mutant samples and not with WT PKCβ, and used genes corresponding to all possible encoded SERIOHL peptides as the background. We observed several enriched functions in PKCβ D427N samples (spindle assembly, regulation of cell cycle, regulation of organelle organization, and intracellular signal transduction) and PKCβ D470H samples (cytoskeleton organization and microtubule-based process). Loss of regulation of mitotic spindle/microtubule organization and cell cycle disruption have been previously associated with adult T-cell leukemia 26–28, but these functional alterations have not been directly tied to PKCβ mutations. These results provide an interesting starting point for further studies into the implications of kinase mutations on signaling output, and provide clues to potential molecular underpinnings of cancer pathogenesis related to these specific substitutions within the PKCβ kinase domain. The PKCβ E533K mutant exhibited no significantly overrepresented terms, which could be due to either the limited number of candidate substrates identified by SERIOHL-KILR (Figure S4) or because of the downstream effects of this specific mutation (Figure 4). The candidate substrates identified in all SERIOHL-KILR experiments appeared to be uniformly enriched in sites predicted to be substrates of PKC family kinases by the NetPhorest algorithm, with no other obvious kinases selectively targeting the identified phosphorylation sites 29.
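Over-representation of a GO term in a gene list relative to a background is commonly assessed with a one-sided hypergeometric test. The Python sketch below illustrates that general calculation under this assumption; the online tool used here may apply a different test and multiple-testing correction, and the function name and toy numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N -- number of genes in the background
    K -- background genes annotated with the GO term
    n -- number of genes in the candidate substrate list
    k -- candidate genes annotated with the GO term
    """
    total = comb(N, n)
    favorable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return favorable / total


# Toy numbers: 10 background genes, 4 annotated, 3 candidates, 2 annotated.
print(hypergeom_upper_tail(2, 4, 3, 10))  # 40 of 120 draws, i.e. about 0.333
```

Using the genes encoded by the full peptide library as the background, rather than the whole genome, avoids scoring terms that are merely enriched among known phosphoproteins.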

Discussion

Overall, we have shown that SERIOHL-KILR is an effective technique to identify candidate substrates for serine kinases of interest. SERIOHL-KILR performs comparably to PSPL, identifying similar sequence elements that are essential to kinase-substrate interactions. Head to head, each technique offers several distinct advantages. PSPL reveals highly nuanced, quantitative information about kinase substrate selectivity by surveying individual amino acids surrounding a phosphoacceptor. These same measurements for all 20 natural amino acids at multiple positions cannot be observed by SERIOHL-KILR due to the limited number of substrates present in the reagent library and the non-randomized distribution of amino acid composition within SERIOHL peptides, which are based on human sequences. Because of the differences in abundance of various SERIOHL peptides as well as potential differences in peptide ionization, we note that rank-ordering or weighting intensities of phosphopeptides observed by mass spectrometry would not be an appropriate analytical technique for this method. On the other hand, since the SERIOHL peptides directly correspond to physiologically-relevant protein sequences and in vivo phosphorylation sites, they can be mapped to full-length human proteins for candidate substrate identification and gene pathway analysis. By the same token, the smaller diversity of the substrate pool based on all known possible serine kinase substrates may decrease the background of SERIOHL-KILR experiments (i.e. fewer false positives because there are no randomized or non-physiological candidate substrates present). SERIOHL-KILR can also notably identify multiple multi-residue motifs (Figure S3) or uncover preferred sequence elements up to 15 amino acids away from the central phosphoacceptor residue. However, PSPL can also uncover the importance of modified residues such as pThr and pTyr in kinase substrate motifs, which are not accounted for in the current SERIOHL-KILR platform.

Our new platform also offers several advantages compared to other established kinase-substrate discovery techniques, but there are certain tradeoffs. One shortcoming of SERIOHL-KILR is that peptide-based representations of proteins fail to capture certain important secondary, tertiary and quaternary structural elements that may be influential or essential in kinase-substrate recognition. Ideally, kinase-substrate pairings should be uncovered in the context of native systems or human cell lines, therefore retaining as much protein sequence and spatiotemporal information as possible. However, several challenges are presented in eukaryotic platforms. Parsing the role of an individual kinase in eukaryotic cells is difficult, as hundreds of other kinases may be present simultaneously. Chemical genetic and rescue approaches, which respectively enable the selective inhibition or activation of a chemically sensitive mutant kinase of interest, exhibit exquisite control over kinase activity 30–32 yet may not be amenable to all kinases. Although membrane-associated and transmembrane proteins are notoriously difficult to isolate and study, there are many SERIOHL peptides that correspond to regions of these proteins, enabling their potential identification as candidate kinase substrates. Besides PSPL, other in vitro approaches have sought to elucidate kinase-substrate pairings such as on-bead peptide library screening, which performs partial Edman degradation-mass spectrometry to identify individual substrate sequences from a randomized peptide library 33. This powerful technique can uncover sequence covariance and favored/disfavored residues in substrates, but the resulting optimal substrate sequences do not directly correspond to human proteins. Other techniques, such as high-density microchips containing full-length human proteins 34, are difficult to construct in most laboratory settings, challenging to customize, and fail to offer site-specific information about the location of protein phosphorylation. By contrast, SERIOHL-KILR is simple to perform and customizable, and the sequence of each possible phosphorylation site is genetically preprogrammed and easy to fully identify by mass spectrometry. In vitro reactions using a kinase of interest with intact or protease/phosphatase-treated mammalian lysates have also been very important tools in identifying kinase-substrate pairings 7, 8, but these methods can be biased in favor of the most abundant proteins in cell lines, and endogenous kinases can complicate experimental findings. Finally, other highly effective E. coli-based techniques for kinase substrate profiling have been developed, such as using bacterial surface display of hundreds of human peptides 35 or identifying E. coli substrates of human kinases 36. By comparison, our technology offers a very large collection of peptides (tens of thousands) that correspond to human protein sequences.

There are several reasons that the results obtained by SERIOHL-KILR may be incomplete or biased. A likely culprit in results bias is the distribution of peptides within the SERIOHL peptide reagent pool; the most abundant peptides may outcompete peptides that exist at lower concentrations within the mixture during kinase reactions. Highly prevalent phosphopeptides may also mask or suppress the signals of rarer phosphopeptides by mass spectrometry. For these reasons, SERIOHL-KILR may selectively represent peptides that exist in higher concentrations in the reagent peptide library (i.e. increasing the incidence of false negatives, but not necessarily impacting the false positive rate). The use of peptide libraries in which each constituent sequence is more rigorously normalized would therefore improve the performance of this platform, as would synthesis of smaller, targeted SERIOHL peptide pools. Still, the biochemical properties of certain phosphopeptides may make them difficult or impossible to observe by LC-MS/MS 9. Certain kinases that prefer substrates containing multiple basic residues, such as WT PKCβ which exhibits a −3R/+2R substrate motif preference, may yield tryptic fragments that are too short to observe by LC-MS/MS, decreasing the ability to detect certain candidate substrates by SERIOHL-KILR unless they are not properly cleaved by trypsin. This problem could be remediated by performing partial trypsin digests or by using alternative proteases. This is also a possible reason for why fewer phosphorylated SERIOHL peptides were identified as substrates of WT PKCβ compared to the mutant variants (Figure 4). This limitation exists, however, for most MS-based kinase substrate identification methods, and is not unique to SERIOHL-KILR. Additionally, only the catalytic domain of PKA was used in our experiments, and SERIOHL peptides correspond to only a small fragment of their corresponding human proteins. These structural simplifications likely eliminate key binding domains or other elements necessary for interaction coordination, limiting the scope of the SERIOHL-KILR platform as compared to in vivo systems. Any candidate substrates of interest identified using this technology will need to be further validated using full-length proteins and eukaryotic cellular assays.
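The short-fragment problem for substrates with multiple basic residues can be illustrated with an in silico tryptic digest. The Python sketch below assumes the common simplified rule that trypsin cleaves C-terminal to Lys or Arg except when the next residue is Pro; the function names, the toy peptide, and the minimum length of 7 residues (a typical search-engine default, not a parameter stated in this paper) are assumptions.

```python
import re


def trypsin_digest(seq):
    """Split a sequence after K or R, except when followed by P."""
    return [frag for frag in re.split(r"(?<=[KR])(?!P)", seq) if frag]


def central_fragment(seq, center):
    """Return the fully cleaved tryptic fragment covering index `center`."""
    start = 0
    for frag in trypsin_digest(seq):
        if start <= center < start + len(frag):
            return frag
        start += len(frag)
    raise ValueError("center lies outside the sequence")


# Toy peptide (hypothetical) with a -3R/+2R pattern around the serine at index 7.
toy = "GGGGRGGSLRGGGG"
frag = central_fragment(toy, 7)
print(trypsin_digest(toy))  # ['GGGGR', 'GGSLR', 'GGGG']
print(frag, len(frag) >= 7)  # GGSLR False
```

In this toy case the phosphoacceptor ends up on a five-residue fragment, which would fall below a seven-residue detection cutoff unless a cleavage is missed.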

There are several additional caveats to the SERIOHL-KILR method. The SERIOHL peptide reagent pool is very diverse and does not take into account tissue, cellular, or subcellular expression profiles that are encountered in mammalian systems. Therefore, our SERIOHL-KILR dataset likely contains proteins that can serve as kinase substrates but that do not come in contact with the tested kinases in intact eukaryotic cells. Future iterations of SERIOHL peptide collections could be constructed based on organ-, tissue-, or organelle-specific kinase/proteome expression profiles to better limit the scope of the identified candidate kinase substrates. The SERIOHL peptides are also not designed to study threonine or tyrosine phosphorylation or other classes of post-translational modifications. However, the same design principles could easily be extended to make peptide libraries that would be more applicable to other kinases or enzyme classes of interest, especially as large-scale single-stranded DNA library synthesis continues to dramatically drop in price. We do consider the low cost and ease of SERIOHL peptide isolation to be substantial platform advantages. Bacterial overexpression of the recombinant SERIOHL peptides is a convenient technique to renewably generate a reagent pool for substrate identification reactions. Using the GST-fused SERIOHL peptides, we obtain approximately 1 mg of the library per 1 L culture, which is enough reagent for 50 SERIOHL-KILR experiments. Finally, the reaction conditions for SERIOHL-KILR will likely need to be tailored to individual kinases. Time course SERIOHL-KILR assays could be performed in order to identify which SERIOHL substrates are preferred by a kinase of interest and to avoid false positives that may arise in long reactions. Labeled internal standards should also be utilized to allow phosphopeptide quantitation in SERIOHL-KILR 37.

In this work, we have described a new method for identifying candidate substrates for kinases of interest. We anticipate that this technique will be applicable to many different kinases and may be used to identify potential downstream effects of clinically relevant kinase mutations.

Supplementary Material

Figure S1
Figure S2
Figure S4
Figure S5
Supplement
supplemental data

Acknowledgements

We thank Shannon Hughes and the Cancer Systems Biology Consortium (CSBC) and Physical Sciences in Oncology Network (PS-ON) Summer Research Fellowship for financial support for J.W.J.

Funding

K.W.B. is supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1122492. B.E.T. is supported by the NIH (GM104047). J.R. is supported by the NIH (GM117230, GM125951, DK0174334, CA209992).

Footnotes

K.W.B. and J.R. have filed a provisional patent application with the US Patent and Trademark Office (US Patent Application No. 62/639,279) concerning phosphopeptide-encoding oligonucleotide libraries used in this work.

DESCRIPTION OF SUPPORTING INFORMATION

Included with this manuscript are the following supporting information files:

1) Supporting Information: Supplementary methods (concerning PSPL technique) and figures (replication analysis, motif-x analysis for PKA, peptide phosphorylation assays, and GO enrichment analysis).

2) Supporting Data: SERIOHL-KILR results (identified peptide substrates by LC-MS/MS for PKA and all tested PKCβ variants, and corresponding gene names).

REFERENCES

  • [1] Ubersax JA, and Ferrell JE (2007) Mechanisms of specificity in protein phosphorylation, Nature Reviews Molecular Cell Biology 8, 530–541.
  • [2] Diella F, Cameron S, Gemünd C, Linding R, Via A, Kuster B, Sicheritz-Pontén T, Blom N, and Gibson TJ (2004) Phospho.ELM: A database of experimentally verified phosphorylation sites in eukaryotic proteins, BMC Bioinformatics 5, 1–5.
  • [3] Hornbeck PV, Zhang B, Murray B, Kornhauser JM, Latham V, and Skrzypek E (2015) PhosphoSitePlus, 2014: mutations, PTMs and recalibrations, Nucleic Acids Research 43.
  • [4] Kemp BE, and Pearson RB (1990) Protein kinase recognition sequence motifs, Trends in Biochemical Sciences 15, 342–346.
  • [5] Hutti JE, Jarrell ET, Chang JD, Abbott DW, Storz P, Toker A, Cantley LC, and Turk BE (2004) A rapid method for determining protein kinase phosphorylation specificity, Nature Methods 1, 27–29.
  • [6] Miller CJ, and Turk BE (2016) Kinase Screening and Profiling, Methods in molecular biology (Clifton, N.J.) 1360, 203–216.
  • [7] Kettenbach AN, Wang T, Faherty BK, Madden DR, Knapp S, Bailey-Kellogg C, and Gerber SA (2012) Rapid Determination of Multiple Linear Kinase Substrate Motifs by Mass Spectrometry, Chemistry & Biology 19, 608–618.
  • [8] Cohen P, and Knebel A (2006) KESTREL: a powerful method for identifying the physiological substrates of protein kinases, Biochemical Journal 393, 1–6.
  • [9] Barber KW, Muir P, Szeligowski RV, Rogulina S, Gerstein M, Sampson JR, Isaacs FJ, and Rinehart J (2018) Encoding human serine phosphopeptides in bacteria for proteome-wide identification of phosphorylation-dependent interactions, Nature Biotechnology.
  • [10] Lajoie MJ, Rovner AJ, Goodman DB, Aerni H-R, Haimovich AD, Kuznetsov G, Mercer JA, Wang HH, Carr PA, Mosberg JA, Rohland N, Schultz PG, Jacobson JM, Rinehart J, Church GM, and Isaacs FJ (2013) Genomically Recoded Organisms Expand Biological Functions, Science 342, 357–360.
  • [11] Pirman NL, Barber KW, Aerni HR, Ma NJ, Haimovich AD, Rogulina S, Isaacs FJ, and Rinehart J (2015) A flexible codon in genomically recoded Escherichia coli permits programmable protein phosphorylation, Nature communications 6, 8130.
  • [12] Barber KW, and Rinehart J (2017) Kinase Signaling Networks, Methods in molecular biology (Clifton, N.J.) 1636, 71–78.
  • [13] Chen C, Ha B, Thévenin AF, Lou H, Zhang R, Yip KY, Peterson JR, Gerstein M, Kim PM, Filippakopoulos P, Knapp S, Boggon TJ, and Turk BE (2014) Identification of a Major Determinant for Serine-Threonine Kinase Phosphoacceptor Specificity, Molecular Cell 53, 140–147.
  • [14] O’Shea JP, Chou MF, Quader SA, Ryan JK, Church GM, and Schwartz D (2013) pLogo: a probabilistic approach to visualizing sequence motifs, Nature Methods 10.
  • [15] Schwartz D, and Gygi SP (2005) An iterative statistical approach to the identification of protein phosphorylation motifs from large-scale data sets, Nature Biotechnology 23, 1391–1398.
  • [16] Creixell P, Palmeri A, Miller CJ, Lou H, Santini CC, Nielsen M, Turk BE, and Linding R (2015) Unmasking Determinants of Specificity in the Human Kinome, Cell 163, 187–201.
  • [17] Creixell P, Schoof EM, Simpson CD, Longden J, Miller CJ, Lou H, Perryman L, Cox TR, Zivanovic N, Palmeri A, Wesolowska-Andersen A, Helmer-Citterich M, Ferkinghoff-Borg J, Itamochi H, Bodenmiller B, Erler JT, Turk BE, and Linding R (2015) Kinome-wide Decoding of Network-Attacking Mutations Rewiring Cancer Signaling, Cell 163, 202–217.
  • [18] Kataoka K, Nagata Y, Kitanaka A, Shiraishi Y, Shimamura T, Yasunaga J. i., Totoki Y, Chiba K, Sato-Otsubo A, Nagae G, Ishii R, Muto S, Kotani S, Watatani Y, Takeda J, Sanada M, Tanaka H, Suzuki H, Sato Y, Shiozawa Y, Yoshizato T, Yoshida K, Makishima H, Iwanaga M, Ma G, Nosaka K, Hishizawa M, Itonaga H, Imaizumi Y, Munakata W, Ogasawara H, Sato T, Sasai K, Muramoto K, Penova M, Kawaguchi T, Nakamura H, Hama N, Shide K, Kubuki Y, Hidaka T, Kameda T, Nakamaki T, Ishiyama K, Miyawaki S, Yoon S-S, Tobinai K, Miyazaki Y, Takaori-Kondo A, Matsuda F, Takeuchi K, Nureki O, Aburatani H, Watanabe T, Shibata T, Matsuoka M, Miyano S, Shimoda K, and Ogawa S (2015) Integrated molecular analysis of adult T cell leukemia/lymphoma, Nature Genetics 47, 1304–1315.
  • [19] Zhu G, Fujii K, Liu Y, Codrea V, Herrero J, and Shaw S (2005) A Single Pair of Acidic Residues in the Kinase Major Groove Mediates Strong Substrate Preference for P-2 or P-5 Arginine in the AGC, CAMK, and STE Kinase Families, Journal of Biological Chemistry 280, 36372–36379.
  • [20] Chen C, Nimlamool W, Miller CJ, Lou H, and Turk BE (2017) Rational Redesign of a Functional Protein Kinase-Substrate Interaction, ACS chemical biology 12, 1194–1198.
  • [21] Nishikawa K, Toker A, Johannes F-J, Songyang Z, and Cantley LC (1997) Determination of the Specific Substrate Sequence Motifs of Protein Kinase C Isozymes, Journal of Biological Chemistry 272, 952–960.
  • [22] Zheng J, Trafny EA, Knighton DR, Xuong N, Taylor SS, Eyck LF, and Sowadski JM (1993) 2.2 Å refined crystal structure of the catalytic subunit of cAMP-dependent protein kinase complexed with MnATP and a peptide inhibitor, Acta Crystallographica Section D: Biological Crystallography 49, 362–365.
  • [23] House C, and Kemp BE (1990) Protein kinase C pseudosubstrate prototope: Structure-function relationships, Cellular Signalling 2, 187–190.
  • [24] Newton AC (1995) Protein Kinase C: Structure, Function, and Regulation, Journal of Biological Chemistry 270, 28495–28498.
  • [25] Mi H, Muruganujan A, and Thomas PD (2013) PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees, Nucleic Acids Research 41.
  • [26] Kasai T, Iwanaga Y, Iha H, and Jeang K-T (2002) Prevalent Loss of Mitotic Spindle Checkpoint in Adult T-cell Leukemia Confers Resistance to Microtubule Inhibitors, Journal of Biological Chemistry 277, 5187–5193.
  • [27] Sieburg M, Tripp A, Ma J-W, and Feuer G (2004) Human T-Cell Leukemia Virus Type 1 (HTLV-1) and HTLV-2 Tax Oncoproteins Modulate Cell Cycle Progression and Apoptosis, Journal of Virology 78, 10399–10409.
  • [28] Nejmeddine M, Negi VS, Mukherjee S, Tanaka Y, Orth K, Taylor GP, and Bangham CRM (2009) HTLV-1–Tax and ICAM-1 act on T-cell signal pathways to polarize the microtubule-organizing center at the virological synapse, Blood 114, 1016–1025.
  • [29] Miller M, Jensen L, Diella F, Jørgensen C, Tinti M, Li L, Hsiung M, Parker SA, Bordeaux J, Sicheritz-Ponten T, Olhovsky M, Pasculescu A, Alexander J, Knapp S, Blom N, Bork P, Li S, Cesareni G, Pawson T, Turk BE, Yaffe MB, Brunak S, and Linding R (2008) Linear Motif Atlas for Phosphorylation-Dependent Signaling, Sci. Signal 1.
  • [30] Shah K, Liu Y, Deirmengian C, and Shokat KM (1997) Engineering unnatural nucleotide specificity for Rous sarcoma virus tyrosine kinase to uniquely label its direct substrates, Proceedings of the National Academy of Sciences 94, 3565–3570.
  • [31] Knight ZA, and Shokat KM (2007) Chemical Genetics: Where Genetics and Pharmacology Meet, Cell 128, 425–430.
  • [32] Qiao Y, Molina H, Pandey A, Zhang J, and Cole PA (2006) Chemical Rescue of a Mutant Enzyme in Living Cells, Science 311, 1293–1297.
  • [33] Trinh TB, Xiao Q, and Pei D (2013) Profiling the Substrate Specificity of Protein Kinases by On-Bead Screening of Peptide Libraries, Biochemistry 52, 5645–5655.
  • [34] Hall DA, Ptacek J, and Snyder M (2007) Protein microarray technology, Mechanisms of Ageing and Development 128, 161–167.
  • [35] Shah NH, Wang Q, Yan Q, Karandur D, Kadlecek TA, Fallahee IR, Russ WP, Ranganathan R, Weiss A, and Kuriyan J (2016) An electrostatic selection mechanism controls sequential kinase signaling downstream of the T cell receptor, eLife 5.
  • [36] Chou MF, Prisic S, Lubner JM, Church GM, Husson RN, and Schwartz D (2012) Using Bacteria to Determine Protein Kinase Specificity and Predict Target Substrates, PLoS ONE 7.
  • [37] Kubota K, Anjum R, Yu Y, Kunz RC, Andersen JN, Kraus M, Keilhack H, Nagashima K, Krauss S, Paweletz C, Hendrickson RC, Feldman AS, Wu C-L, Rush J, Villén J, and Gygi SP (2009) Sensitive multiplexed analysis of kinase activities and activity-based kinase identification, Nature Biotechnology 27, 933–940.
