Abstract
Tyrosine phosphorylation is a common protein posttranslational modification, which plays a critical role in signal transduction and the regulation of many cellular processes. Using a pro-peptide strategy to increase cellular uptake of O-phosphotyrosine (pTyr) and its nonhydrolyzable analog 4-phosphomethyl-L-phenylalanine (Pmp), we identified an orthogonal aminoacyl-tRNA synthetase/tRNA pair that allows the site-specific incorporation of both pTyr and Pmp into recombinant proteins in response to the amber stop codon in Escherichia coli in good yields. The X-ray crystal structure of the synthetase reveals a reconfigured substrate binding site formed by non-conservative mutations and substantial local structural perturbations. We demonstrate the utility of this method by introducing Pmp into a putative phosphorylation site whose corresponding kinase is unknown and determining the affinities of the individual variants for the substrate 3BP2. In summary, this work provides a useful recombinant tool to dissect the biological functions of tyrosine phosphorylation at specific sites in the proteome.
INTRODUCTION
Tyrosine phosphorylation is a key mechanism by which cells control a wide variety of biological processes1–5. However, the dynamic interconversion of phosphorylated and dephosphorylated isoforms of a protein and phosphorylation at multiple sites often complicate the study of the roles of individual phosphorylation events6,7. Access to site-specifically phosphorylated proteins has greatly facilitated mechanistic investigations of this posttranslational modification8–10. Unfortunately, there is a lack of robust methods for generating individual phospho-protein isoforms11. Native or engineered kinases lack the selectivity to modify specific tyrosine residues, and such modifications are subject to relatively nonselective removal by phosphatases in the cell. Semisynthetic methods allow the site-specific introduction of phosphotyrosine (pTyr) residues into proteins; however, application to large proteins is limited by synthetic complexity and difficulties in refolding proteins after chemical synthesis. An in vitro translation system using a chemically aminoacylated suppressor transfer RNA (tRNA) has also been attempted, but affords relatively low yields of the phospho-proteins12. Moreover, these approaches do not allow site-specific tyrosine phosphorylation in living cells; therefore, they cannot be used to study the biological functions of this important posttranslational modification in real time.
Several nonhydrolyzable phosphotyrosine analogs that are readily bioavailable in Escherichia coli (E. coli) have been genetically incorporated into proteins using cognate orthogonal amber suppressor tRNA/aminoacyl-tRNA synthetase (aaRS) pairs, including sulfotyrosine (sTyr) and 4-(carboxymethyl)phenylalanine (CMF)13–21. In addition, the nonhydrolyzable pTyr mimetic, 4-phosphomethyl-L-phenylalanine (Pmp) (Fig. 1a), has been selectively introduced into peptides to study the function of individual phosphorylation sites in SHP-1 and SHP-222–24. The ability to recombinantly incorporate Pmp at specific sites of proteins would simplify the study of native pTyr sites in the proteomes of living cells2. However, similar to pTyr, attempts to genetically encode Pmp have remained challenging due to poor cellular uptake of this amino acid.
Figure 1. Pro-peptide strategy to increase the intracellular concentrations of pTyr and Pmp.

(a) Structures of ncAAs used in this study. (b) The extracted-ion chromatogram for uptake of Lys-Pmp. 5 mM Lys-Pmp (orange) or Pmp (yellow) was added during incubation. Blue, 50 μM Pmp was added to the cell lysate as standard; gray, buffer from last wash before cell lysis.
Here, we report the use of a pro-peptide strategy to increase the cytoplasmic concentrations of pTyr and Pmp that allowed the identification of an aaRS that selectively incorporates pTyr or Pmp at desired sites in recombinant proteins in E. coli in response to an amber nonsense codon. The molecular basis of synthetase specificity was determined by X-ray crystallography and revealed a substantial reconfiguration of the active site. In addition, we demonstrated the utility of this method by the incorporation of Pmp into a putative phosphorylation site and determined the affinities of the individual variants for their substrate.
RESULTS
Dipeptides increase cellular availability of pTyr and Pmp
Many negatively charged molecules, including pTyr (1) and Pmp (2) (Fig. 1a), do not permeate the E. coli cell wall, resulting in poor cellular bioavailability25. On the other hand, it has been shown in both E. coli26,27 and C. elegans28 that poorly bioavailable amino acids can be transported into cells in dipeptide form by an ATP-binding cassette transporter. In E. coli, the dipeptide transporter DppA recognizes all twenty canonical amino acids as the N-terminal amino acid with little restriction towards the C-terminal residue29,30. Following uptake, the dipeptide is hydrolyzed by intracellular nonspecific peptidases to afford the free amino acids. To determine whether this strategy could be used to transport pTyr or Pmp into bacteria, we synthesized the two dipeptides (Supplementary Results, Supplementary Note), H2N-lysine-phosphotyrosine-COOH (Lys-pTyr, 3) and H2N-lysine-4-phosphomethyl-L-phenylalanine-COOH (Lys-Pmp, 4), which should be highly soluble and good substrates for DppA (Fig. 1a).
The metabolically stable Lys-Pmp was used for bacterial uptake studies since analysis is not complicated by the activity of endogenous phosphatases. Lys-Pmp and Pmp were incubated with E. coli DH10B cultures during exponential growth, and after repeated washing, the cytosolic concentrations of Pmp in each sample were determined by liquid chromatography–mass spectrometry (LC-MS) analysis after cell lysis (the buffer used for the last wash of each sample was also analyzed to exclude the possibility of contamination). Pmp (50 μM) was added to the lysate from cells as a standard for the LC-MS analysis. No signal corresponding to Pmp was observed in the extracted-ion chromatogram of cells treated with 5 mM Pmp, indicating poor uptake of Pmp by cells (Fig. 1b). In contrast, the lysate of cells treated with 5 mM Lys-Pmp showed a signal with the same mass and retention time as the Pmp standard (Fig. 1b). Importantly, the peak was absent in the last wash sample, indicating that the observed peak is not due to contamination by extracellular material. The area-under-the-curve (AUC) for lysate from cells supplemented with 5 mM Lys-Pmp (AUC = 29,551) was comparable to that of a 50 μM Pmp standard (AUC = 27,751). These results confirmed the successful uptake of Lys-Pmp into E. coli cytoplasm, and encouraged us to pursue the genetic incorporation of pTyr and Pmp into proteins.
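As a rough illustration of how the two AUC values compare, the sketch below applies a single-point calibration against the 50 μM Pmp spike. It assumes a linear detector response through the origin and ignores matrix effects and any dilution during lysis, none of which are established by the data above; the function name is hypothetical, and the result describes Pmp in the lysate rather than a true cytosolic concentration.

```python
def single_point_estimate(auc_sample, auc_standard, conc_standard_uM):
    """Scale a known standard concentration by the ratio of peak areas.

    Assumption: detector response is linear and passes through zero.
    """
    return auc_sample / auc_standard * conc_standard_uM


AUC_LYS_PMP_LYSATE = 29_551   # lysate of cells fed 5 mM Lys-Pmp (from the text)
AUC_PMP_STANDARD = 27_751     # 50 uM Pmp spiked into lysate (from the text)
STANDARD_CONC_UM = 50.0

estimate_uM = single_point_estimate(
    AUC_LYS_PMP_LYSATE, AUC_PMP_STANDARD, STANDARD_CONC_UM
)
print(f"Apparent Pmp in lysate: {estimate_uM:.1f} uM")
```

Under these assumptions the apparent Pmp level in the lysate is on the order of the 50 μM standard, which is consistent with the statement that the two peak areas are comparable.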
Genetic incorporation of pTyr and Pmp in E. coli
We next screened three Methanocaldococcus jannaschii (Mj) tyrosyl-tRNA synthetase (MjTyrRS) mutants for their ability to aminoacylate pTyr and Pmp. These synthetases were originally evolved to incorporate noncanonical amino acids (ncAAs) with negatively charged side chains, including CMF (5)18, p-boronophenylalanine (BoroF) (6)31 and sTyr (7)13 (Fig. 1a). To assess the suppression efficiency of these aaRSs with Lys-pTyr and Lys-Pmp, a plasmid encoding a myoglobin (Myo) mutant with a permissive amber codon at Lys99 (K99TAG) was transformed into E. coli DH10B along with the aaRS/tRNACUA pair encoded on a pEVOL vector32. Only cells transformed with the synthetase for CMF (CMFRS) (Y32S, L65A, F108K, Q109H, D158G, L162K) afforded full-length protein after nickel–nitrilotriacetic acid (Ni-NTA) purification in the presence of the ncAAs. Yields were 61 and 60 mg L−1 in Terrific Broth (TB) in the presence of 2 mM Lys-pTyr and Lys-Pmp, respectively (Fig. 2a); only background tyrosine incorporation into protein (1.4 mg L−1) was observed in the presence of CMFRS and 2 mM pTyr, 2 mM Pmp, or the absence of ncAA (Fig. 2a). Mutant proteins were analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) (Supplementary Fig. 1) and electrospray ionization mass spectrometry (ESI-MS) to confirm the incorporation of pTyr or Pmp. Both Myo-K99Pmp and Myo-K99pTyr migrated as a single band at ~18 kDa on an SDS-PAGE gel. However, only Myo-K99Pmp showed the expected mass consistent with its amino-acid sequence (calculated 18,466 Da, observed 18,466 Da) (Fig. 2b); no background incorporation was observed. In the case of Myo-K99pTyr, only a signal with a mass of 18,386 Da, corresponding to the dephosphorylated product Myo-K99Tyr (calculated 18,388 Da) (Supplementary Fig. 2a), was identified, presumably due to posttranslational dephosphorylation of pTyr by endogenous phosphatases in E. coli.
Indeed, when 1 mM sodium orthovanadate (a relatively nonspecific phosphatase inhibitor) was added during protein expression, a mass signal corresponding to Myo-K99pTyr (observed 18,465 Da, calculated 18,468 Da) was observed in addition to the signal corresponding to the dephosphorylated Myo-K99Tyr (calculated 18,388 Da, observed 18,387 Da). The apparent ratio of Myo-K99pTyr to Myo-K99Tyr was 1:10 (Supplementary Fig. 2b). Finally, site-specific incorporation of Pmp was further confirmed by GluC protease digestion and liquid chromatography–tandem mass spectrometry (LC-MS/MS) analysis of the Pmp-containing peptide (H2N-LKPLAQSHATKHPmpIPIKYLE-COOH; Fig. 2c). Collectively, these data demonstrate that CMFRS can be readily used to charge pTyr or Pmp onto Mj-tRNACUATyr, and the pro-peptide strategy with Lys-pTyr or Lys-Pmp generates sufficient quantities of pTyr or Pmp in the cytosol to allow efficient incorporation of these ncAAs into myoglobin.
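The calculated masses quoted above can be cross-checked from elemental composition alone. The sketch below is an illustrative consistency check, not the authors' calculation: it starts from the calculated mass of Myo-K99Tyr given in the text, adds HPO3 to obtain the phosphotyrosine variant, and then swaps the bridging oxygen for a CH2 group to obtain the Pmp variant. Standard average atomic masses are used, and the helper name is hypothetical.

```python
# Standard average atomic masses (Da), rounded to three decimals.
AVERAGE_MASS = {"H": 1.008, "C": 12.011, "O": 15.999, "P": 30.974}


def formula_mass(composition):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(AVERAGE_MASS[element] * count for element, count in composition.items())


MYO_K99TYR_CALC = 18_388.0  # calculated mass of Myo-K99Tyr quoted in the text

# Phosphorylation of the tyrosine hydroxyl adds HPO3.
phospho_shift = formula_mass({"H": 1, "P": 1, "O": 3})

# Pmp differs from pTyr by replacing the bridging O with CH2.
pmp_vs_ptyr_shift = formula_mass({"C": 1, "H": 2}) - formula_mass({"O": 1})

myo_ptyr = MYO_K99TYR_CALC + phospho_shift
myo_pmp = myo_ptyr + pmp_vs_ptyr_shift

print(f"Tyr -> pTyr shift: {phospho_shift:+.2f} Da, Myo-K99pTyr ~ {myo_ptyr:.0f} Da")
print(f"pTyr -> Pmp shift: {pmp_vs_ptyr_shift:+.2f} Da, Myo-K99Pmp ~ {myo_pmp:.0f} Da")
```

The roughly 80 Da gap between the two species is what allows the phosphorylated and dephosphorylated myoglobins to be distinguished in the deconvoluted spectra, whereas the pTyr and Pmp variants differ by only about 2 Da.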
Figure 2. Incorporation and characterization of variant myoglobins containing different ncAAs.

(a) SDS-PAGE analysis of Myo-K99TAG expressed with CMFRS in the presence or absence of various ncAAs or dipeptides and stained with Coomassie blue (full gel image in Supplementary Fig. 1). (b) Deconvoluted mass spectrum of Myo-K99Pmp; inset, mass spectrum before deconvolution. (c) MS/MS fragmentation of GluC-digested peptides derived from Myo-K99Pmp. The spectrum confirmed that Pmp had been incorporated at the desired site.
Analysis of recombinant Pmp-containing proteins
Next, we used this methodology to generate site-specifically phosphorylated proteins for which the corresponding kinase is unknown and to evaluate the effects of these modifications on protein–protein interactions. Phosphotyrosine sites Tyr30 and Tyr52 have been discovered in the Abelson murine leukemia viral oncogene homolog 1 (ABL1) SH3 domain. While Tyr52 can be phosphorylated by commonly used kinases, no kinase has been identified that selectively phosphorylates Tyr3033. Native chemical ligation (NCL) methods have been used to synthesize this domain with pTyr at positions 30 and 52; however, the low efficiency and the complexity of synthesis and purification steps afforded very small quantities of protein. To recombinantly express ABL1 variants containing Pmp, we inserted the CMFRS gene into the pUltra vector, which was reported to exhibit a higher suppression activity34 (Supplementary Table 1). The genes for the ABL1 SH3 domain with an amber codon at position 30 or 52 were synthesized and inserted into the pET-T5 vector, where their expression was controlled by a T5 promoter. After co-transformation of DH10B E. coli cells with both plasmids, Abl1-SH3 variants were expressed in liquid TB medium supplemented with 2 mM Lys-Pmp. Full-length Abl1-Pmp30 and Abl1-Pmp52 were purified using Ni-NTA chromatography in yields of 0.75 and 1.0 mg L−1, respectively (Fig. 3a). For comparison, wild-type Abl1 (Abl1-WT) was similarly expressed in the absence of ncAA with a purified yield of 7.0 mg L−1. The resulting Abl1 mutants were characterized by SDS-PAGE analysis, where they migrated as a single band. Liquid chromatography–quadrupole time-of-flight mass spectrometry (LC/QTOF-MS) analysis showed the correct masses for Abl1-WT (calculated: 8,129 Da, observed: 8,129 Da) as well as for Abl1-Pmp30 (calculated: 8,207 Da, observed: 8,207 Da) and Abl1-Pmp52 (calculated: 8,207 Da, observed: 8,207 Da) (Supplementary Fig. 3).
Finally, the site-specific incorporation of Pmp at position 52 was further confirmed by trypsin digestion and LC-MS/MS analysis of the Pmp-containing peptides (H2N-NGQGWVPSNPmpITPVNSVDHHHHHH-COOH) (Supplementary Fig. 4).
Figure 3. Expression and characterization of human Abl1 and its Pmp-containing variants in E. coli.

(a) SDS-PAGE analysis of Ni-NTA-purified human Abl1 mutants. Abl1-WT (lane 2) was expressed in the absence of ncAA. Abl1-Pmp30 (lane 3) and Abl1-Pmp52 (lane 4) were expressed in the presence of 2 mM Lys-Pmp. (b) Binding affinities of Abl1-WT (black), Abl1-Pmp30 (blue), and Abl1-Pmp52 (orange) to their native substrate 3BP2 peptide were determined by fluorescence polarization. Assays were performed in triplicate and error bars represent the standard deviation. The numbering of tyrosine residues reflects their position in the SH3 domain as in the literature.
We next determined the binding affinities of the mutants to the 3BP2 peptide, a native ligand from the adaptor protein 3BP235, using a reported fluorescence polarization assay33. Abl1-Pmp52 exhibited an equilibrium dissociation constant (KD) of 5.0 ± 0.8 μM, which is identical to that of Abl1-WT (5.0 ± 0.8 μM); Abl1-Pmp30 exhibited a somewhat reduced binding affinity (KD = 10.6 ± 2.2 μM) (Fig. 3b). These KD values are in agreement with previous measurements using chemically synthesized peptides containing pTyr (13 μM for Abl1-pTyr30 and 4 μM for Abl1-pTyr52)33. These experiments demonstrate that Pmp can be incorporated in response to the TAG codon and act as a pTyr mimic to probe the roles of protein phosphorylation.
Molecular basis for ncAA incorporation
CMFRS exhibits a unique polyspecificity towards CMF, pTyr and Pmp, which are all ncAAs with negatively charged side chains36. This prompted us to investigate the molecular basis for this specificity by X-ray crystallography; the structure of CMFRS was solved to a resolution of 3.0 Å, with an Rcryst of 0.20 and Rfree of 0.27 (Fig. 4a, Supplementary Table 2). Although crystals were also obtained in the presence of Pmp, no electron density for the substrate could be observed. The overall structure of CMFRS (Y32S, L65A, F108K, Q109H, D158G, L162K) aligns well with the wild-type MjTyrRS apo structure (PDB entry 1U7D), with a root-mean-square deviation (rmsd) of 0.6 Å in Cα. However, substantial conformational changes were observed between CMFRS and the wild-type MjTyrRS apo synthetase for helix α8, with an rmsd of 3.34 Å in Cα for residues 144–164. The D158G mutation acts as a helix-breaker for α8 and alters residues N157–Y162 from a helical structure to a more flexible loop conformation (Fig. 4b, Supplementary Fig. 5). The B-factors for Cα of residues 157–162 range from 66 to 105 Å2, in contrast to 52 Å2 for surrounding residues (Supplementary Fig. 6). The glycine at this position has a ϕ of 55.5° and a ψ of 82.0°, which are prohibited for any other canonical amino acid. In the MjTyrRS–tyrosine co-crystal structure, the helical structure of N157–L162 is part of the substrate binding pocket and D158 forms a hydrogen bond with the hydroxyl group of the substrate tyrosine. It is likely that the D158G mutation and the rearrangement of N157–L162 open up the pocket to accommodate the phosphate and carboxyl groups of the ncAAs. The Y32S and L65A mutations on β2 and β3 likely create additional space on the opposite side of the substrate binding pocket. The positively charged F108K, Q109H, and L162K mutations may form salt bridges with the side chains of negatively charged amino acid substrates, which may explain the selectivity of CMFRS for this series of ncAAs (Fig. 4c).
Figure 4. X-ray structure of CMFRS.

(a) Superposition of apo wild-type MjTyrRS (orange, PDB: 1U7D) and CMFRS (blue, PDB: 5U36). Secondary structures α6–α8, β2 and β3 are labeled accordingly; (b) Comparison of α6–α8 of CMFRS (blue) and apo wild-type MjTyrRS (orange) which shows the conformational changes in α8 induced by the D158G mutation. (c) Superposition of the active sites of apo wild-type MjTyrRS (orange, PDB: 1U7D) and CMFRS (blue); the side chains of mutated residues are shown as sticks. The electron densities of K108 and H109 were disordered and therefore not defined. Tyrosine, the substrate of wild-type MjTyrRS (grey) from the wild-type MjTyrRS–tyrosine complex (PDB: 1J1U), is shown by superposition to illustrate the spatial relationship between the mutant sites and the substrate.
Based on this structural analysis, we hypothesized that CMFRS could potentially aminoacylate other ncAAs with charged side chains. To test this notion, we evaluated BoroF (6) and sTyr (7) by expressing full-length Myo-K99TAG in the presence of 2 mM ncAA. After expression and purification, Myo-K99 variants containing BoroF, sTyr and CMF were produced in yields of 70, 61, and 30 mg L−1, respectively (Fig. 2b). Mass spectrometric analysis further confirmed the incorporation of sTyr (calculated 18,468 Da, observed 18,468 Da), BoroF (calculated 18,417 Da, observed 18,416 Da), and CMF (calculated 18,431 Da, observed 18,431 Da) in the protein (Supplementary Fig. 7). These results suggest that CMFRS may be a useful polyspecific aaRS for charged amino acids and a valuable starting point for the evolution of aaRSs for other amino acids with charged side chains, such as reactive pTyr analogs (e.g., phosphonodifluoromethyl phenylalanine), which could be useful tools for activity-based protein profiling37.
DISCUSSION
We have reported the site-specific incorporation of pTyr and its nonhydrolyzable analog Pmp into recombinantly expressed proteins in E. coli. Key to this advance is a pro-peptide strategy that increases cellular uptake of ncAAs, and the use of the polyspecific synthetase, CMFRS. It is worth noting that, unlike Pmp or pTyr, other ncAAs with charged side chains (i.e., sTyr, BoroF, and CMF) do not require a pro-peptide strategy to get into cells. This may be because their side chains carry less negative charge than those of pTyr or Pmp, allowing more efficient passive diffusion through the cell membrane, or because they share transporters with other amino acids or small molecules, as demonstrated for other ncAAs38. It has also been reported that tRNAs carrying negatively charged amino acids are poor substrates for elongation factor (EF)-Tu. For example, the incorporation of phosphoserine (pSer) was only observed in the presence of an engineered EF-Tu39. However, the wild-type EF-Tu in E. coli seems adequate for the incorporation of Pmp with good efficiency, although it cannot be ruled out that EF-Tu engineering may further improve efficiency. Efforts to increase the cellular availability of pTyr by deleting the endogenous phosphatases in E. coli40 gave relatively low yields of protein containing pTyr. Nevertheless, it is very likely that the combination of these strategies could lead to increased yields of variant proteins.
This method enables the generation of homogeneous proteins containing nonhydrolyzable pTyr analogs at different tyrosine phosphorylation sites in a controlled fashion, thereby facilitating studies of the function of natural phosphotyrosine posttranslational modifications in a variety of protein sizes. In a proof-of-concept study, Pmp was site-specifically incorporated into Abl1 at different tyrosine sites to afford the nonhydrolyzable analogues of the tyrosine-phosphorylated Abl1. The binding affinities of Abl1-Pmp30 and Abl1-Pmp52 were comparable to their pTyr counterparts. However, it should be noted that in some cases, 5-fold lower affinity has been observed for Pmp-containing peptides in comparison with the corresponding pTyr-containing peptides, likely due to the higher pKa2 of the phosphonate group (7.1 vs 5.7 for phosphate) and the loss of a hydrogen bond between the phenolic oxygen atom and the binding partner.
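To make the pKa2 argument concrete, the sketch below uses the Henderson–Hasselbalch relation to estimate what fraction of each group carries two negative charges. The pKa2 values are those quoted above; the pH of 7.4 is an assumption chosen for illustration, and the model considers only the second ionization step (the first ionization is taken to be complete). The function name is hypothetical.

```python
def dianion_fraction(pka2, ph):
    """Fraction of the doubly deprotonated form from Henderson-Hasselbalch.

    Only the monoanion/dianion equilibrium is modeled.
    """
    return 1.0 / (1.0 + 10.0 ** (pka2 - ph))


ASSUMED_PH = 7.4          # illustrative value, not taken from the text
PKA2_PHOSPHONATE = 7.1    # Pmp, as quoted in the text
PKA2_PHOSPHATE = 5.7      # pTyr, as quoted in the text

pmp_dianion = dianion_fraction(PKA2_PHOSPHONATE, ASSUMED_PH)
ptyr_dianion = dianion_fraction(PKA2_PHOSPHATE, ASSUMED_PH)

print(f"Pmp dianion fraction at pH {ASSUMED_PH}:  {pmp_dianion:.2f}")
print(f"pTyr dianion fraction at pH {ASSUMED_PH}: {ptyr_dianion:.2f}")
```

At the assumed pH, nearly all of the phosphate is dianionic while roughly a third of the phosphonate remains singly charged, which is one plausible contribution to the weaker binding sometimes seen for Pmp-containing peptides.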
In addition, this strategy allows construction of libraries of phosphorylated peptides or proteins for phage display, which were previously inaccessible by other approaches. Moreover, this method could potentially be expanded to encode a set of pTyr-containing proteins or even a set of pTyr- and pSer-containing proteins when coupled with genetic incorporation of pSer in a desired pattern. The crystal structure of CMFRS sheds light on the molecular basis of the polyspecificity towards ncAAs with charged side chains; this will enable the development of MjTyrRS mutants to incorporate other valuable charged or hydrophilic ncAAs. Finally, we are attempting to adapt a similar strategy to eukaryotic cells to allow the controlled expression of tyrosine-phosphorylated proteins in real time.
ONLINE METHODS
Synthesis of dipeptides
Chemical synthesis of dipeptides is described in the Supplementary Note.
General information
E. coli DH10B was used for general cloning and propagation. DPBS and Tris/borate/EDTA buffers were obtained from Cellgro. LB medium and LB agar were purchased from Fisher Scientific. Isopropyl-β-thiogalactoside (IPTG) was purchased from Anatrace, and 4–12% (wt/vol) Bis-Tris gels for SDS-PAGE were purchased from Invitrogen. Q5 DNA polymerase kit, Q5 Site-Directed Mutagenesis Kit, restriction enzymes and T4 DNA ligase were obtained from New England Biolabs, and oligonucleotides were purchased from Integrated DNA Technologies (San Diego, CA). Plasmid DNA preparation was carried out with the QIAprep Spin Miniprep Kit (Qiagen). Unless otherwise mentioned, all other chemicals were purchased from Sigma-Aldrich and used without further purification. Protein concentrations were measured using a Coomassie Plus (Bradford) Protein Assay kit from Pierce. All peptides were purchased from Innopep (San Diego, CA).
High-resolution mass spectrometry of all protein samples was carried out on an Agilent 6520 accurate-mass quadrupole-time-of-flight (QTOF) instrument equipped with reverse phase liquid chromatography and an electrospray ionization (ESI) source. Samples in PBS were injected (10 μL) at a concentration of 0.2 mg/mL and separated on a 150 mm reverse phase C8 wide pore column heated to 70 °C to improve peak resolution. Proteins were eluted in a gradient of H2O + 0.1% formic acid (solvent A) and acetonitrile + 0.1% formic acid (solvent B) using the following method: 5% B for 2 min, 5–60% B for 10 min, 60–80% B for 1 min, followed by a wash (95% B) and re-equilibration (5% B) phase. ESI source settings were 350 °C, 10 L/min, 40 psig nebulizer nitrogen gas, 200 V fragmentor, and 4,500 V capillary. Figures were created by extraction of the total ion count across the entire area of protein elution (roughly 8–10 min) and deconvolution of charge envelopes using Agilent Qualitative Analysis software with BioConfirm.
Amino acid uptake assay
An overnight culture of DH10B E. coli in GMML medium was diluted 1:100 in GMML and was grown to saturation (~18 hr). 1 mL stocks of ncAAs (25 mM in GMML) were added to 4 mL diluted cultures at an OD600 of 0.8 to give a final concentration of 5 mM. 1 mL of p-acetyl-phenylalanine (pAcF) was used as a positive control for the method and 1 mL of GMML as a negative control. The mixtures were incubated at 37 °C, 250 rpm for 2 hr. The cells were pelleted by centrifugation at 13,000 × g for 5 min. The media supernatants were saved and the pellets were washed four times with 1 mL ice-cold GMML lacking heavy metals, thiamine and biotin (GMML minus) at 4 °C to minimize leaching of ncAAs from cells. The last washes were saved and the cell pellets were resuspended in 150 μL lysis buffer (1 mg/mL lysozyme, 5 μg/mL DNase I and 10 μg/mL RNase in water) and incubated at 37 °C, 250 rpm for 1 hr before being centrifuged at 13,000 × g for 20 min at 4 °C. The cell lysates were filtered and subjected to liquid chromatography–mass spectrometry (LC-MS) analysis together with the saved media and last wash.
The LC-MS analysis was performed on an Agilent 1100 Series LC/MSD instrument. The chromatographic peak corresponding to the mass of each ncAA was extracted and processed using Agilent LC/MSD ChemStation (revision B.03.02).
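The final ncAA concentration stated above follows from a simple dilution calculation. The sketch below reproduces it with the volumes and stock concentration given in the protocol; the function name is hypothetical, and volumes are assumed to be additive.

```python
def final_concentration_mM(stock_mM, stock_volume_mL, culture_volume_mL):
    """Concentration after adding a stock to a culture, assuming additive volumes."""
    total_volume_mL = stock_volume_mL + culture_volume_mL
    return stock_mM * stock_volume_mL / total_volume_mL


# 1 mL of 25 mM ncAA stock added to 4 mL of culture, as described in the text.
ncaa_final_mM = final_concentration_mM(stock_mM=25.0, stock_volume_mL=1.0, culture_volume_mL=4.0)
print(f"Final ncAA concentration: {ncaa_final_mM:.1f} mM")
```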
Myoglobin expression and purification
An overnight culture of DH10B E. coli co-transformed with the appropriate pEVOL plasmid (pEVOL-CMF, pEVOL-SY or pEVOL-BoroF) and pBAD-Myo(K99TAG) in 2YT medium at 37 °C with chloramphenicol (50 μg/mL) and ampicillin (100 μg/mL) was diluted 1:50 in Terrific Broth (TB) supplemented with chloramphenicol (50 μg/mL) and ampicillin (100 μg/mL) in the presence or absence of ncAAs (the final concentration was 1 mM for sTyr and CMF, and 2 mM for Pmp, pTyr and their corresponding dipeptides). The cultures were grown at 37 °C to an OD600 of 0.8, at which point arabinose was added to a final concentration of 0.2% to induce protein expression. The cultures were grown at 30 °C for 18 hr and were pelleted and lysed with BugBuster (Novagen), and the proteins were purified on Ni-NTA spin columns (Qiagen) according to the manufacturer’s protocol.
Construction of pUltra-CMF and pET-T5-ABL1 and its mutants
The CMFRS gene without a stop codon was amplified from pEVOL-CMFRS18 with primers PMP001 and PMP002 (Supplementary Table 1), and inserted into the pUltra-MjTyrRS vector34 digested with NotI by Gibson assembly to afford pUltra-CMF. Two additional amino acids, Thr and Glu, were inserted at the Abl1 N-terminus between the first Met and second N to increase its stability in E. coli during expression, while a His6 tag was added to the C-terminus for affinity purification. The gene was synthesized by IDT, amplified with primers PMP003 and PMP004, and inserted into a pET-T5 backbone amplified with primers PMP005 and PMP006 using Gibson assembly to provide pET-T5-ABL1-wt. Site-directed mutagenesis of this plasmid with primers PMP007/PMP008 and PMP009/PMP010 using the Q5 Site-Directed Mutagenesis Kit (NEB) according to the manufacturer’s protocol resulted in pET-T5-ABL1-Y30TAG and pET-T5-ABL1-Y52TAG, which were used to express Abl1 mutants containing Pmp (Supplementary Table 1).
Construction of CMFRS expression plasmid
The CMFRS gene without a stop codon was amplified from pEVOL-CMFRS18 with primers PMP011 and PMP012 (Supplementary Table 1), and inserted into pET-22b(+) (Novagen) using NdeI and XhoI restriction sites to provide pET-CMFRS with a six-histidine tag engineered at the C-terminus of CMFRS. The plasmid was electroporated into E. coli BL21(DE3) competent cells.
CMFRS expression and purification
The transformed E. coli BL21(DE3) was plated on an LB plate containing ampicillin (100 μg/ml) and allowed to grow overnight at 37 °C. A single colony was picked to inoculate 50 ml LB medium and cultured overnight. The 50 ml bacterial culture was then centrifuged and the bacteria were resuspended and transferred into 1 L fresh LB medium. All LB media used contained ampicillin (100 μg/ml). The cells were induced with 1 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) when the culture reached an OD600 of 0.6–0.8. After induction, cultivation was continued for 3.5 h at 37 °C. The cells were harvested by centrifugation at 6,000 g at 4 °C for 15 min. The cell pellets obtained were stored at −20 °C for subsequent use. The purification was completed using a Ni–NTA affinity column (Qiagen) and an ÄKTA purification system (GE Healthcare). The cell pellets were resuspended in 30 ml lysis buffer (20 mM Tris–HCl pH 8.0, 500 mM NaCl, 10 mM imidazole) with 0.1 mM PMSF and the cells were lysed by ultrasonication. The lysate was centrifuged at 24,000 g at 4 °C for 40 min. The supernatant was loaded directly onto a Ni–NTA column, which was pre-equilibrated with lysis buffer. The column was washed with wash buffer (20 mM Tris–HCl pH 8.0, 500 mM NaCl, 20 mM imidazole). CMFRS was eluted with elution buffer (20 mM Tris–HCl pH 8.0, 500 mM NaCl, 250 mM imidazole). The eluate was concentrated by ultrafiltration (Millipore) at 4 °C and the buffer was changed to 20 mM Tris-HCl, pH 8.0. The CMFRS solution was then loaded onto a Mono Q 5/50 GL column (GE Healthcare) for ion exchange purification. A gradient of 20 mM Tris-HCl, 1 M NaCl was run at a flow rate of 0.5 ml min−1, and all peak fractions were collected. All fractions were analyzed by SDS–PAGE. CMFRS eluted at NaCl concentrations of 50–200 mM. CMFRS from the ion exchange purification was concentrated and exchanged into 20 mM Tris-HCl, pH 8.0, 50 mM NaCl, 10 mM Pmp or sTyr, 5 mM DTT. The final CMFRS concentration was 15 mg/ml; protein concentrations were determined with a NanoDrop 2000c.
Crystallization of CMFRS
The crystallization experiments were conducted using the hanging-drop vapor-diffusion method at 289 K. The crystallization conditions were optimized by varying the type and concentration of precipitants (PEGs and glycerol), the pH, the types of additives (Hampton Additive kit) and the protein concentration. After several rounds of optimization, crystals of CMFRS were obtained by mixing equal volumes of protein solution (32 mg/mL) and crystallization solution containing 15–25% PEG 3350, 3–7% glycerol, 10 mM Pmp or sTyr, 5 mM DTT, 50 mM sodium cacodylate trihydrate pH 6.0–7.0 and equilibrating over 0.5 ml reservoir solution (15–25% PEG 3350, 50 mM sodium cacodylate trihydrate pH 6.0–7.0) at 277 K for about a week. Prior to data collection, CMFRS crystals were transferred to cryoprotectant solution containing 15% ethylene glycol, 16% PEG 3350, 10 mM Pmp or sTyr, 5 mM DTT, 50 mM sodium cacodylate trihydrate pH 6.0–7.0 for about 2 min; they were then immediately flash-frozen in a liquid-nitrogen stream on cryoloops.
Abl1 mutant expression and purification
For wild-type Abl1, an overnight culture of DH10B E. coli transformed with pET-T5-ABL1-wt in 2YT media with ampicillin (100 μg/ml) was diluted 100-fold into 200 mL TB media supplemented with ampicillin (100 μg/ml) at 37 °C. For Abl1 containing an amber codon, an overnight culture of DH10B E. coli co-transformed with pUltra-CMFRS and the appropriate plasmid (pET-T5-ABL1-Y30TAG or -Y52TAG) in 2YT media with ampicillin (100 μg/ml) and spectinomycin (50 μg/ml) was diluted 100-fold into 200 mL TB media supplemented with ampicillin (100 μg/ml) and spectinomycin (50 μg/ml). The cells were grown for 3–5 hr until the OD600 reached 0.6, at which point IPTG was added to a final concentration of 1 mM to induce protein expression. The cells were grown for an additional 30 hr at 30 °C before harvesting by centrifugation at 6,000 g for 10 min. The cell pellet was lysed by sonication and the resulting cell lysate was clarified by centrifugation at 15,500 g for 35 min at 4 °C. Abl1 wild-type and mutants were purified on Ni-NTA resin (Qiagen) following the manufacturer’s instructions.
Data collection, structure determination, and refinement
Diffraction data for CMFRS were collected at a wavelength of 1.03320 Å on beamline 23-ID-B at the APS41 at 100 K, and HKL2000 (HKL Research) was used for the processing, reduction and scaling of the diffraction data.
The data collection statistics of CMFRS crystals are shown in Supplementary Table 2. The crystal diffracted to 3.0 Å resolution. The diffraction data were indexed in space group P21 with unit-cell parameters for the native crystal of a = 58.6 Å, b = 130.6 Å, c = 60.0 Å, α = 90°, β = 117.7° and γ = 90°. Two molecules were found in the asymmetric unit with a solvent content of 57.0%.
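The reported solvent content can be sanity-checked from the unit-cell parameters with a Matthews-coefficient estimate. The sketch below is an illustration under stated assumptions rather than the authors' calculation: the molecular mass of the His-tagged CMFRS monomer is not given in the text, so a value of about 35.5 kDa is assumed here, and the solvent fraction uses the common approximation 1 − 1.23/VM. Space group P21 has two asymmetric units per unit cell. The function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit-cell volume (cubic angstroms) for a monoclinic cell (alpha = gamma = 90)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_estimate(cell_volume, monomer_mass_da, asu_per_cell, molecules_per_asu):
    """Return (Matthews coefficient in A^3/Da, approximate solvent fraction)."""
    vm = cell_volume / (monomer_mass_da * asu_per_cell * molecules_per_asu)
    solvent_fraction = 1.0 - 1.23 / vm  # standard approximation for proteins
    return vm, solvent_fraction


ASSUMED_MONOMER_MASS_DA = 35_500.0  # assumption: not stated in the text

volume = monoclinic_cell_volume(a=58.6, b=130.6, c=60.0, beta_deg=117.7)
vm, solvent = matthews_estimate(
    volume, ASSUMED_MONOMER_MASS_DA, asu_per_cell=2, molecules_per_asu=2
)

print(f"Cell volume: {volume:.0f} A^3")
print(f"Matthews coefficient: {vm:.2f} A^3/Da, solvent content ~ {solvent * 100:.0f}%")
```

With two molecules per asymmetric unit and the assumed monomer mass, the estimate falls in the high-50% range, in line with the solvent content reported above; a different monomer mass would shift the figure accordingly.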
The first density map was obtained by molecular replacement using PDB 4PBR as a model and the program Phaser42 in CCP4. A preliminary model with about 60% completeness was well fitted to the electron density. The rest of the residues were manually modeled using Coot43. The refinement was carried out using Refmac44,45 in the CCP4 suite. The statistics of data collection and refinement are summarized in Supplementary Table 2. Ramachandran statistics were 94.8% favored and 1.4% outliers. All structural pictures were rendered in PyMOL (The PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC).
Fluorescence Polarization Binding Assay
The fluorescence polarization assays were performed in PerkinElmer black 96-well flat-bottom plates, and the fluorescence polarization was measured on a SpectraMax M5 (Molecular Devices) plate reader. To each well, FAM-labeled peptide in PBS was added to a final concentration of 5 nM, and the recombinant Abl1-wt or mutants were added in two-fold dilutions starting from a concentration of 20 mM. PBS was used to bring each well to 100 μL. The plate was incubated for 15 min, after which the polarization values were determined in millipolarization units (mP) with an excitation wavelength of 485 nm and an emission wavelength of 535 nm. The acquired data were analyzed using Prism (GraphPad Software, Inc., San Diego, CA, USA), and the KD values were determined by nonlinear fitting.
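The kind of nonlinear fit described above can be sketched in a few lines. The text does not state which binding model was used in Prism, so the one-site hyperbolic model here is an assumption; it also assumes that the labeled peptide concentration is far below KD, so that ligand depletion can be ignored. The function names, the grid-search strategy, and the synthetic data (arbitrary concentration units, noise-free) are all hypothetical and serve only to illustrate the procedure.

```python
def dilution_series(top, n_points, factor=2.0):
    """Concentrations for a serial dilution starting at `top`."""
    return [top / factor**i for i in range(n_points)]

def one_site(conc, kd, mp_free, mp_span):
    """Assumed one-site model: mP = mP_free + mP_span * [P] / (KD + [P])."""
    return mp_free + mp_span * conc / (kd + conc)

def fit_kd(concs, mps, kd_grid):
    """Scan KD over a grid; for each KD the model is linear in
    (mp_free, mp_span), so those are solved by ordinary least squares.
    Returns the (kd, mp_free, mp_span) with the smallest squared error."""
    best = None
    n = len(concs)
    for kd in kd_grid:
        x = [c / (kd + c) for c in concs]  # fraction bound at this KD
        mean_x, mean_y = sum(x) / n, sum(mps) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, mps))
        span = sxy / sxx
        free = mean_y - span * mean_x
        sse = sum((free + span * xi - yi) ** 2 for xi, yi in zip(x, mps))
        if best is None or sse < best[0]:
            best = (sse, kd, free, span)
    return best[1], best[2], best[3]

# Synthetic, noise-free example data (made-up values, arbitrary units)
concs = dilution_series(top=20.0, n_points=12)
mps = [one_site(c, kd=1.0, mp_free=50.0, mp_span=150.0) for c in concs]

# Log-spaced KD candidates from 1e-3 to 1e2
kd_grid = [10 ** (e / 100) for e in range(-300, 201)]
kd_fit, free_fit, span_fit = fit_kd(concs, mps, kd_grid)
print(f"KD = {kd_fit:.3g}, mP_free = {free_fit:.1f}, mP_span = {span_fit:.1f}")
```

For real data, a dedicated least-squares optimizer and a model that accounts for ligand depletion may be more appropriate, particularly when KD approaches the peptide concentration.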
Data availability
Coordinates and structure factors for CMFRS have been deposited in the Protein Data Bank under accession code 5U36. All other data that support the findings of this study are in the published article (and its supplementary information files) or are available from the corresponding author upon reasonable request.
Acknowledgments
The authors acknowledge Kristen Williams for assistance with manuscript preparation. X-ray diffraction data were collected at the Advanced Photon Source (APS) beamline 23ID-B. Use of the Advanced Photon Source for data collection was supported by the DOE, Basic Energy Sciences, Office of Science, under contract no. DE-AC02-06CH11357. GM/CA CAT has been funded in whole or in part with federal funds from NCI (grant Y1-CO-1020) and NIGMS (grant Y1-GM-1104). The NIH and DOE funders at the beamlines had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. This work was supported by NIH Grant 5R01 GM062159-14 (to P.G.S.). This is manuscript 29424 of The Scripps Research Institute.
Footnotes
Author Contributions
X.L., P.G.S., and F.W. designed research. X.L., G.F., and R.E.W. performed protein expression, purification, and crystallization. X.L., R.E.W., C.Z., R.L., W.X., C.H., and P.-Y.Y. performed chemical synthesis. X.L., T.L., J.D., M.K., and Y.Z. performed the cloning and screening of synthetases and the expression of target proteins. X.Z. performed X-ray diffraction experiments. X.L., G.F., X.Z., X.Lyu., I.A.W., and F.W. performed crystallographic analysis and data deposition. X.L., H.G., and A.Y. performed the fluorescence polarization assay. X.L., T.L., W.X., P.G.S., and F.W. analyzed data. X.L., S.A.R., P.G.S., and F.W. wrote the paper.
Competing Financial Interests
The authors declare no competing financial interests.
References
- 1. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S. The protein kinase complement of the human genome. Science. 2002;298:1912–1934. doi: 10.1126/science.1075762.
- 2. Pawson T. Specificity in signal transduction: From phosphotyrosine-SH2 domain interactions to complex cellular systems. Cell. 2004;116:191–203. doi: 10.1016/s0092-8674(03)01077-8.
- 3. Hunter T, Cooper JA. Protein-tyrosine kinases. Annu Rev Biochem. 1985;54:897–930. doi: 10.1146/annurev.bi.54.070185.004341.
- 4. Alonso A, et al. Protein tyrosine phosphatases in the human genome. Cell. 2004;117:699–711. doi: 10.1016/j.cell.2004.05.018.
- 5. Eckhart W, Hutchinson MA, Hunter T. An activity phosphorylating tyrosine in polyoma T antigen immunoprecipitates. Cell. 1979;18:925–933. doi: 10.1016/0092-8674(79)90205-8.
- 6. Tailor P, Gilman J, Williams S, Couture C, Mustelin T. Regulation of the low molecular weight phosphotyrosine phosphatase by phosphorylation at tyrosines 131 and 132. J Biol Chem. 1997;272:5371–5374. doi: 10.1074/jbc.272.9.5371.
- 7. Feng GS, Hui CC, Pawson T. SH2-containing phosphotyrosine phosphatase as a target of protein-tyrosine kinases. Science. 1993;259:1607–1611. doi: 10.1126/science.8096088.
- 8. Tarrant MK, et al. Regulation of CK2 by phosphorylation and O-GlcNAcylation revealed by semisynthesis. Nat Chem Biol. 2012;8:262–269. doi: 10.1038/nchembio.771.
- 9. Chen Z, Cole PA. Synthetic approaches to protein phosphorylation. Curr Opin Chem Biol. 2015;28:115–122. doi: 10.1016/j.cbpa.2015.07.001.
- 10. Serwa R, et al. Chemoselective Staudinger-Phosphite Reaction of Azides for the Phosphorylation of Proteins. Angew Chem Int Ed. 2009;48:8234–8239. doi: 10.1002/anie.200902118.
- 11. Olsen JV, et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell. 2006;127:635–648. doi: 10.1016/j.cell.2006.09.026.
- 12. Arslan T, Mamaev SV, Mamaeva NV, Hecht SM. Structurally modified firefly luciferase, effects of amino acid substitution at position 286. J Am Chem Soc. 1997;119:10877–10887.
- 13. Liu CC, Schultz PG. Recombinant expression of selectively sulfated proteins in Escherichia coli. Nat Biotechnol. 2006;24:1436–1440. doi: 10.1038/nbt1254.
- 14. Liu CC, et al. Protein evolution with an expanded genetic code. Proc Natl Acad Sci U S A. 2008;105:17688–17693. doi: 10.1073/pnas.0809543105.
- 15. Liu CC, Choe H, Farzan M, Smider VV, Schultz PG. Mutagenesis and Evolution of Sulfated Antibodies Using an Expanded Genetic Code. Biochemistry. 2009;48:8891–8898. doi: 10.1021/bi9011429.
- 16. Liu CC, Cellitti SE, Geierstanger BH, Schultz PG. Efficient expression of tyrosine-sulfated proteins in E-coli using an expanded genetic code. Nat Protoc. 2009;4:1784–1789. doi: 10.1038/nprot.2009.188.
- 17. Liu CC, Brustad E, Liu W, Schultz PG. Crystal structure of a biosynthetic sulfo-hirudin complexed to thrombin. J Am Chem Soc. 2007;129:10648–10649. doi: 10.1021/ja0735002.
- 18. Xie J, Supekova L, Schultz PG. A genetically encoded metabolically stable analogue of phosphotyrosine in Escherichia coli. ACS Chem Biol. 2007;2:474–478. doi: 10.1021/cb700083w.
- 19. Rust HL, et al. Using Unnatural Amino Acid Mutagenesis To Probe the Regulation of PRMT1. ACS Chem Biol. 2014;9:649–655. doi: 10.1021/cb400859z.
- 20. Guerra-Castellano A, et al. Mimicking Tyrosine Phosphorylation in Human Cytochrome c by the Evolved tRNA Synthetase Technique. Chem Eur J. 2015;21:15004–15012. doi: 10.1002/chem.201502019.
- 21. Zheng YT, Lv XX, Wang JY. A genetically encoded sulfotyrosine for VHR function research. Protein Cell. 2013;4:731–734. doi: 10.1007/s13238-013-3907-y.
- 22. Lu W, Shen K, Cole PA. Chemical dissection of the effects of tyrosine phosphorylation of SHP-2. Biochemistry. 2003;42:5461–5468. doi: 10.1021/bi0340144.
- 23. Zhang ZS, Shen K, Lu W, Cole PA. The role of C-terminal tyrosine phosphorylation in the regulation of SHP-1 explored via expressed protein ligation. J Biol Chem. 2003;278:4668–4674. doi: 10.1074/jbc.M210028200.
- 24. Lu W, Gong DQ, Bar-Sagi D, Cole PA. Site-specific incorporation of a phosphotyrosine mimetic reveals a role for tyrosine phosphorylation of SHP-2 in cell signaling. Mol Cell. 2001;8:759–769. doi: 10.1016/s1097-2765(01)00369-0.
- 25. Leive L. A Nonspecific Increase in Permeability in Escherichia coli Produced by EDTA. Proc Natl Acad Sci U S A. 1965;53:745–750. doi: 10.1073/pnas.53.4.745.
- 26. Fickel TE, Gilvarg C. Transport of Impermeant Substances in E. coli by Way of Oligopeptide Permease. Nat New Biol. 1973;241:161–163. doi: 10.1038/newbio241161a0.
- 27. Ames BN, Ferroluz G, Young JD, Tsuchiya D, Lecocq J. Illicit transport: the oligopeptide permease. Proc Natl Acad Sci U S A. 1973;70:456–458. doi: 10.1073/pnas.70.2.456.
- 28. Parrish AR, et al. Expanding the Genetic Code of Caenorhabditis elegans Using Bacterial Aminoacyl-tRNA Synthetase/tRNA Pairs. ACS Chem Biol. 2012;7:1292–1302. doi: 10.1021/cb200542j.
- 29. Payne JW, Gilvarg C. The Role of the Terminal Carboxyl Group in Peptide Transport in Escherichia coli. J Biol Chem. 1968;243:335–340.
- 30. Verkamp E, Backman VM, Bjornsson JM, Soll D, Eggertsson G. The periplasmic dipeptide permease system transports 5-aminolevulinic acid in Escherichia coli. J Bacteriol. 1993;175:1452–1456. doi: 10.1128/jb.175.5.1452-1456.1993.
- 31. Brustad E, et al. A Genetically Encoded Boronate-Containing Amino Acid. Angew Chem Int Ed. 2008;47:8220–8223. doi: 10.1002/anie.200803240.
- 32. Young TS, Ahmad I, Yin JA, Schultz PG. An Enhanced System for Unnatural Amino Acid Mutagenesis in E. coli. J Mol Biol. 2010;395:361–374. doi: 10.1016/j.jmb.2009.10.030.
- 33. Zitterbart R, Seitz O. Parallel Chemical Protein Synthesis on a Surface Enables the Rapid Analysis of the Phosphoregulation of SH3 Domains. Angew Chem Int Ed. 2016;55:7252–7256. doi: 10.1002/anie.201601843.
- 34. Chatterjee A, Sun SB, Furman JL, Xiao H, Schultz PG. A Versatile Platform for Single- and Multiple-Unnatural Amino Acid Mutagenesis in Escherichia coli. Biochemistry. 2013;52:1828–1837. doi: 10.1021/bi4000244.
- 35. Viguera AR, Arrondo JLR, Musacchio A, Saraste M, Serrano L. Characterization of the Interaction of Natural Proline-Rich Peptides with Five Different SH3 Domains. Biochemistry. 1994;33:10925–10933. doi: 10.1021/bi00202a011.
- 36. Young DD, et al. An Evolved Aminoacyl-tRNA Synthetase with Atypical Polysubstrate Specificity. Biochemistry. 2011;50:1894–1900. doi: 10.1021/bi101929e.
- 37. Lu CHS, Liu K, Tan LP, Yao SQ. Current Chemical Biology Tools for Studying Protein Phosphorylation and Dephosphorylation. Chem Eur J. 2012;18:28–39. doi: 10.1002/chem.201103206.
- 38. Liu DR, Schultz PG. Progress toward the evolution of an organism with an expanded genetic code. Proc Natl Acad Sci U S A. 1999;96:4780–4785. doi: 10.1073/pnas.96.9.4780.
- 39. Park HS, et al. Expanding the Genetic Code of Escherichia coli with Phosphoserine. Science. 2011;333:1151–1154. doi: 10.1126/science.1207203.
- 40. Fan CG, Ip K, Soll D. Expanding the genetic code of Escherichia coli with phosphotyrosine. FEBS Lett. 2016;590:3040–3047. doi: 10.1002/1873-3468.12333.
- 41. Otwinowski Z, Minor W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 1997;276:307–326. doi: 10.1016/S0076-6879(97)76066-X.
- 42. McCoy AJ, et al. Phaser crystallographic software. J Appl Crystallogr. 2007;40:658–674. doi: 10.1107/S0021889807021206.
- 43. Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr Sect D Biol Crystallogr. 2004;60:2126–2132. doi: 10.1107/S0907444904019158.
- 44. Vagin AA, et al. REFMAC5 dictionary: organization of prior chemical knowledge and guidelines for its use. Acta Crystallogr Sect D Biol Crystallogr. 2004;60:2184–2195. doi: 10.1107/S0907444904023510.
- 45. Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr Sect D Biol Crystallogr. 1997;53:240–255. doi: 10.1107/S0907444996012255.
