The synthesis of chemically-modified proteins is a valuable tool for the study of their function.[1] One of the most enabling protein synthesis technologies is native chemical ligation (NCL), in which a peptide thioester is reacted with a peptide bearing an N-terminal Cys to form a native amide bond.[2] In many cases, one of these two fragments can be expressed in E. coli and then ligated to a smaller synthetic peptide so that a large semi-synthetic protein can be made with a minimum of peptide synthesis. This sort of expressed protein ligation (EPL) can be implemented in two ways.[3] For a synthetic C-terminus, the N-terminal region can be expressed as a fusion to an intein and then converted to a thioester for ligation. Likewise, a protein can be expressed with an N-terminal Cys and ligated to a synthetic thioester. EPL is an extremely efficient way of making large quantities of synthetic proteins. However, there are limitations to the method, most notably, the need for Cys at the site of ligation.
Since many proteins do not have Cys residues at convenient locations or lack Cys entirely, several research groups have developed methods for “erasing” the ligation site. The most common strategies are alkylation of Cys to make Lys or Glu analogs and desulfurization to convert Cys to Ala.[4] Danishefsky and others have developed strategies using synthetic Cys analogs that are converted to other amino acids such as Leu, Phe, Val, Thr, and Lys by desulfurization.[5] Dawson and Muir have used nitrobenzyl Cys surrogates that can be removed by photolysis after ligation.[6] Raines and Bertozzi have demonstrated “traceless” ligations based on Staudinger reductions and Bode has developed a novel ligation using ketoacid/alkoxyamine partners.[7] Homocysteine (Hcs) has been used for ligation and then converted to Met by selective methylation.[8] While chemically elegant, most of these strategies are still limited in that they require the ligation handles to be incorporated synthetically. Therefore, these handles can be placed in a synthetic peptide to be ligated to an intein-derived thioester, but cannot be applied to an expressed C-terminal fragment for ligation to a synthetic thioester.
One way of functionalizing protein N-termini makes use of the aminoacyl transferase (AaT) enzyme from E. coli, which transfers Phe, Leu, or Met from an aminoacyl tRNA (Zaa-tRNA) to the N-terminus of a protein.[9,10] Our laboratory has recently shown that AaT does not require full-length tRNAs and can use aminoacyl adenosine (Zaa-A) substrates to transfer a variety of amino acids.[11] Previous work from Tirrell and Sisido demonstrated that AaT could transfer unnatural amino acids from Zaa-tRNAs produced either through semi-synthesis or by the activity of a modified aminoacyl tRNA synthetase (RS).[12] These two strategies are complementary. Our Zaa-A donor substrates are easily synthesized and allow us to explore the inherent substrate specificity of AaT without the necessity of identifying a novel RS. Once a useful substrate is identified, higher yields can be achieved with the catalytic regeneration of Zaa-tRNA by an appropriate RS. We wished to determine whether AaT could be used to functionalize proteins with erasable Cys analogs for ligation. Varshavsky and others have shown that AaT recognizes N-terminal Arg or Lys residues regardless of the adjacent sequence.[10,13] Therefore, we expected that any transferable analogs we identified would be applicable to a wide variety of protein targets. Here, we show that AaT can deliver disulfide-protected Hcs to the N-terminus of a protein under mild conditions. After reduction, Hcs can be used in a ligation reaction and then converted to Met by alkylation (Figure 1). We demonstrate the utility of this approach by synthesizing α-synuclein, a protein whose aggregation contributes to Parkinson’s disease pathology.[14]
Figure 1.

Expressed Protein Ligation at Methionine. An expressed protein bearing an N-terminal Arg or Lys (green) is functionalized with a protected Hcs analog from an aminoacyl adenosine donor, under catalysis by AaT. A synthetic peptide thioester (red) is ligated to the modified protein under reducing conditions that allow deprotection of Hcs. After ligation, Hcs is converted to Met.
Previous studies had shown that Cys-tRNA is not a substrate for AaT and our own preliminary experiments showed that Hcs was not transferred (data not shown).[13,15] Therefore, we prepared Hcs analogs masked as aliphatic or aromatic disulfides in order to mimic the endogenous AaT substrates Met, Leu, and Phe. Disulfide protecting groups (MeS, i-PrS, t-BuS, and PhS) were used so that they could be removed in situ under the reducing conditions of the subsequent NCL reaction.
AaT donor molecules were synthesized using a simple protocol (Scheme 1) beginning with Boc protection of oxidized Hcs dimer to give 1, followed by disulfide exchange to form protected Hcs analogs 2a-d. These were activated as cyanomethyl esters 3a-d, allowing for acylation of the 2′ and 3′ hydroxyl groups of 5′-O-dimethoxytrityladenosine, (DMT)-A, without amidation of its unprotected exocyclic amine. TFA treatment cleaved both the DMT and the Boc groups to give Zaa-A donor molecules 5a-d. Overall yields of the DMT-protected adenosyl analogs (4a-d) were ~40% from (Hcs)2. However, several byproducts of the TFA deprotection reaction were observed, requiring HPLC purification of 5a-d.
Once the Zaa-A donor molecules were purified, they were tested in transfer reactions to a reporter peptide LysAlaAcm, where Acm is 7-aminocoumarin. We compared yields of ZaaLysAlaAcm by HPLC analysis after four-hour reactions at pH 8.0 and 37 °C (Scheme 2). Under these conditions, transfer of Phe is quantitative and Met proceeds in 75% yield. We first tested the transfer of disulfide-protected compounds using thiomethyl Cys (Csm), a Met isostere, prepared through a route similar to Scheme 1. Csm was transferred in 21% yield. While this was lower than Met, it established that we could transfer disulfide-protected analogs without substantial side reactions. Hcs donors 5a-d were tested in a similar fashion. Hcm was transferred efficiently from 5a, with somewhat lower yields for Hcp and Hcb. No transfer of Hcf was observed from donor 5d. In order to maintain the disulfide-protected analogs during the transfer reaction, we eliminated β-mercaptoethanol (BME) from the AaT purification process and the reaction buffer. This increased the yields of HcmLysAlaAcm (6) to 37%. Once Hcm was identified as an optimal substrate, we assessed the possibility of generating Hcm-tRNA in situ using an RS. Free Hcm was synthesized by TFA deprotection of Boc-Hcm. Wild-type E. coli MetRS and an L13G mutant (Met*RS) previously used by the Tirrell group were tested with Hcm, ATP, and E. coli total tRNA.[16] Met*RS was more active than MetRS, giving full transfer at ≤ 0.1 mg/mL loading. This is consistent with Met*RS structural analysis, where longer sidechains are accommodated by the expanded active site.[16]
Scheme 2.

Amino Acid Transfer to Reporter Peptide. Reagents and conditions: a) 1 mM Zaa-A, 0.1 mg/mL AaT; or b) 1 mM Zaa, 2 mg/mL E. coli total tRNA, 0.1 mg/mL Met*RS, 0.1 mg/mL AaT; with 100 μM LysAlaAcm in 50 mM HEPES, 150 mM KCl, 10 mM MgCl2, pH 8.0, 37 °C, 4 h. Yields determined by HPLC analysis.
Scheme 1.

Synthesis of Protected Homocysteine Donor Molecules. Reagents and conditions: a) (Boc)2O, DIPEA, THF; b) RSH, NaOH, I2, 1:1 THF/H2O; c) ClCH2CN, DIPEA, THF; d) (DMT)-A, DIPEA, NBu4Ac, THF; e) TFA, TIPSH, THF. 2a, 3a, 4a, 5a R = Me; 2b, 3b, 4b, 5b R = i-Pr; 2c, 3c, 4c, 5c R = t-Bu; 2d, 3d, 4d, 5d R = Ph. Yields reported in Supporting Information. Boc = tert-butyloxycarbonyl, DIPEA = N,N-diisopropylethylamine, THF = tetrahydrofuran, (DMT)-A = 5′-O-dimethoxytrityl adenosine, NBu4Ac = N,N,N,N-tetrabutylammonium acetate, TFA = trifluoroacetic acid, TIPSH = triisopropylsilane.
We then took HcmLysAlaAcm (6) on to a model ligation reaction with Ac-MetAspValPhe-SR (7). HPLC and MALDI MS analyses showed that disulfide cleavage was completed almost immediately and that the subsequent ligation reached 100% completion after 24 h. After ligation, the product peptide (8a) was treated with 1000 equiv MeI at pH 8.6 to form 8b. After 5 min, 89% conversion of Hcs5 to Met5 was observed by HPLC and MALDI MS analyses (Supporting Information). Unreacted 8a (11%) could be removed after disulfide formation. Oxidation of Met1 was observed (4%), but no undesired alkylation of Met or Lys was seen. This is consistent with previous uses of Hcs as a ligation handle, where subsequent alkylation is surprisingly selective, given that the pKa of Hcs (8.9) is close to the pKa of Lys (9.5).[17]
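To make the pKa comparison above concrete, the following minimal sketch applies the Henderson-Hasselbalch relationship to the two pKa values quoted in the text at the alkylation pH of 8.6. The function and constant names are illustrative and are not taken from the original work; the calculation considers only the population of the deprotonated form and ignores differences in intrinsic nucleophilicity.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an acid present as its conjugate base at a given pH
    (Henderson-Hasselbalch relationship)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Values quoted in the text; names are illustrative.
PH_ALKYLATION = 8.6
PKA_HCS = 8.9  # Hcs thiol
PKA_LYS = 9.5  # Lys amine, as stated in the text

hcs_thiolate = fraction_deprotonated(PKA_HCS, PH_ALKYLATION)
lys_free_amine = fraction_deprotonated(PKA_LYS, PH_ALKYLATION)

print(f"Hcs thiolate fraction:   {hcs_thiolate:.2f}")
print(f"Lys free amine fraction: {lys_free_amine:.2f}")
print(f"Ratio (Hcs/Lys):         {hcs_thiolate / lys_free_amine:.1f}")
```

Under these assumptions roughly a third of the Hcs thiol and about a tenth of the Lys amine are deprotonated, so the reactive populations differ by only about threefold. That modest difference is why the observed selectivity is notable.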
After successful completion of the model peptide ligation, we wished to test the Hcs transfer/ligation strategy with a full-sized protein. One protein that we have previously studied which has the requisite MetLys motif is α-synuclein (αS). Several recent studies have shown that the N-terminal sequence of αS can have important consequences for its folding and self-association.[18] However, these studies profoundly disagree on what those consequences are. Therefore, the preparation of N-terminally-modified αS could be useful for addressing these questions and would serve as a valuable demonstration of our method.
The αS NCL protocol is shown in Figure 2. We chose Met5Lys6 as the point of disconnection and prepared an αS6-140 plasmid with an N-terminal His10 tag and a Factor Xa proteolysis site. The precursor protein, HisTag-αS6-140 (9), was isolated by Ni-affinity chromatography and cleaved with Factor Xa to give αS6-140 (10). We incubated 10 with AaT, Met*RS, tRNA, Hcm, and ATP. Transfer of Hcm was complete after 1 h, within the limits of quantitation by MALDI MS (Supporting Information). Complete transfer is extremely valuable for large proteins, since separation of an Hcm-modified protein such as 11 from unmodified 10 is not generally feasible. Modified αS 11 was isolated from AaT, Met*RS, and tRNA by boiling and FPLC purification. Then 11 was incubated with thioester AcMcmMetAspValPhe-SR (S13), which corresponds to αS1-4 labelled at the N-terminus with 7-methoxycoumarinylalanine (Mcm). Overnight incubation at 37 °C gave 12a in high yield from 11. No unligated starting materials were observed, though some oxidized 12a was observed (Supporting Information). Purified 12a was quantitatively converted to 13a by treatment with aqueous MeI, as determined by analyses using Ellman’s reagent and MALDI MS (Supporting Information). In general, we expect that the NCL reaction will not be 100% efficient, but we do not view this as a severe limitation, provided that the NCL product can be purified away from the starting materials and methylation is quantitative and selective.
Figure 2.

α-Synuclein Model Ligation. Functionalization of αS6-140 by a) cleavage of His tag at Factor Xa site, b) attachment of Hcm by AaT-catalyzed modification, c) ligation of an N-terminal thioester peptide (7 or S13), and d) conversion of Hcs to Met by methylation. 12a, 13a Xxx = AcMcm; 12b, 13b Xxx = Ac. Top Right: Gel image showing product of each step, WT indicates full-length αS. Middle Right: MALDI MS analysis of full length Ac-αS before (12b) and after (13b) methylation. Bottom Right: Trypsinized fragment corresponding to Ac-αS1-6 (812.4 m/z) confirms successful methylation. Asterisk indicates the expected mass of unmethylated Ac-αS1-6Hcs5.
A similar ligation was carried out with Ac-MetAspValPhe-SR (7), yielding N-Ac-αS after methylation (13b). The ligation product 13b was subjected to trypsin digest and MALDI MS analysis to confirm that methylation occurred with high yield and few side reactions. Observation of the Ac-MetAspValPheMetLys fragment (Ac-αS1-6) with only trace Ac-MetAspValPheHcsLys contamination confirmed that methylation proceeded in > 95% yield within the limitations of quantitation by MS. Furthermore, MALDI MS analysis of the rest of the tryptic fragments demonstrated that there were no substantial alkylation side reactions, even on His50 in the αS46-58 fragment (Supporting Information). Such selectivity is rewarding, but probably not completely general; for any novel protein, similar analysis by digestion and MS should be performed. It should also be noted that protein targets containing unprotected Cys residues would be methylated on the free thiols. However, more conventional NCL strategies using ligation at Cys are probably viable in those cases.
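The diagnostic tryptic fragment masses can be checked with a short calculation. The sketch below sums standard monoisotopic residue masses to estimate the singly protonated m/z of Ac-MetAspValPheMetLys and of its unmethylated Hcs5 counterpart. The lowercase code `h` for homocysteine and all function and constant names are hypothetical conveniences for this illustration, not notation from the original work.

```python
# Standard monoisotopic residue masses in Da.
# "h" is a hypothetical one-letter code for homocysteine (Met residue minus CH2).
RESIDUE_MASS = {
    "M": 131.04049,
    "D": 115.02694,
    "V": 99.06841,
    "F": 147.06841,
    "K": 128.09496,
    "h": 117.02484,
}
WATER = 18.01056    # H2O added for the free termini
ACETYL = 42.01057   # C2H2O added by N-terminal acetylation
PROTON = 1.00728    # charge carrier for [M+H]+


def mh_plus(sequence: str, n_acetyl: bool = False) -> float:
    """Monoisotopic [M+H]+ of a linear peptide, optionally N-acetylated."""
    mass = sum(RESIDUE_MASS[res] for res in sequence) + WATER
    if n_acetyl:
        mass += ACETYL
    return mass + PROTON


methylated = mh_plus("MDVFMK", n_acetyl=True)    # Ac-alphaS(1-6), Met5
unmethylated = mh_plus("MDVFhK", n_acetyl=True)  # Ac-alphaS(1-6), Hcs5

print(f"Met5 fragment [M+H]+: {methylated:.2f}")
print(f"Hcs5 fragment [M+H]+: {unmethylated:.2f}")
print(f"Mass shift on methylation: {methylated - unmethylated:.2f}")
```

The value computed for the methylated fragment agrees with the 812.4 m/z reported in Figure 2, and the unmethylated species is expected about 14 Da lower, which is where the asterisk in the figure would fall.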
Application of our system to targets other than αS may require mild denaturation of the protein in order to access the N-terminal amino acids. Therefore, we have tested AaT activity in the presence of low concentrations of denaturants and detergents. We find that AaT activity has a Gdn•HCl IC50 of 0.4 M and that full activity could be maintained in 5% v/v Triton X-100. While the N-termini of some proteins may be buried in important structural interactions, these concentrations should allow one to partially unfold many proteins to access the terminus for Hcm transfer and ligation.
While our experiments begin to remove the sequence limitations for EPL, they certainly do not eliminate them entirely. Essentially, one can now use XxxMetArg or XxxMetLys as a disconnection point in a retrosynthetic analysis of a potential protein target. The utility of our approach depends on how frequently this motif occurs. To address this, we analyzed the protein sequences in the PDB and found that out of the 60,325 proteins surveyed, 5,405 possible N-terminal ligation sites (< 40 amino acids from the N-terminus) were identified in 5,156 unique proteins. These numbers increased to 31,259 ligation sites and 20,701 proteins when we considered proteins with the ligation motif in the center of the protein where one might use multiple ligations to synthesize the full-length protein. We chose to restrict our search to the PDB since these sequences represent proteins with some degree of prior characterization typical of a protein target chosen for synthesis via NCL.
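A motif search of this kind can be sketched in a few lines. The code below is not the script used for the PDB survey described above; it is a minimal illustration that reports the position of every Met in an XxxMetArg or XxxMetLys motif, with an optional cutoff on distance from the N-terminus. The function name, the 1-based numbering of the Met residue, and the interpretation of the 40-residue window as "Met position below 40" are all assumptions.

```python
import re
from typing import List, Optional

# Zero-width lookahead so that overlapping motifs are all reported.
# "." requires a residue (Xxx) before Met; [RK] is the AaT-recognized residue.
LIGATION_MOTIF = re.compile(r"(?=.M[RK])")


def ligation_sites(sequence: str, n_terminal_window: Optional[int] = None) -> List[int]:
    """Return 1-based positions of Met residues in Xxx-Met-Arg/Lys motifs.

    If n_terminal_window is given, only Met positions below that value
    are kept (an assumed reading of "< 40 amino acids from the N-terminus").
    """
    # match.start() is the 0-based index of Xxx, so Met is at start + 2 (1-based).
    positions = [m.start() + 2 for m in LIGATION_MOTIF.finditer(sequence.upper())]
    if n_terminal_window is not None:
        positions = [p for p in positions if p < n_terminal_window]
    return positions


# The first six residues of alpha-synuclein as given in the text (Met5Lys6).
print(ligation_sites("MDVFMK", n_terminal_window=40))
```

For the αS N-terminal sequence this reports position 5, matching the Met5Lys6 disconnection used in this work. An initiator Met followed directly by Lys or Arg is not reported, since there is no preceding residue to carry the thioester.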
While it is clear that many proteins could be potential synthesis targets using AaT, we wish to eventually make our methods as broadly useful as possible. Therefore, it is worth noting that there exist other transferases with different N-terminal specificity, such as the BpT transferase from V. vulnificus, which transfers Leu to an N-terminal Asp or Glu.[19] Our preliminary experiments indicate that BpT can transfer Leu from small molecule donors in vitro, but our current yields are too low (< 10% after 4 h) to be useful. In situ aminoacylations using Hcp, Hcb, or Hcm with BpT and either Met*RS or E. coli LeuRS have not produced significant transfer yields. Efforts are underway to improve transfer of Hcs analogs. We are also working to mutate AaT to alter the substrate specificity for both Zaa and the N-terminal residue.
Prior to our work, the only way to avoid Cys at a ligation site in the expressed protein fragment during EPL was to desulfurize to form Ala or mask Cys by forming a non-native amino acid. Here, we have shown that a protein N-terminus can be functionalized with Hcs under mild conditions. After ligation, Hcs can be efficiently and selectively converted to Met, erasing the ligation site. Our work begins to remove the N-terminal Cys requirement for ligation by allowing one to use the MetArg or MetLys motif as a point of disconnection in protein semi-synthesis.
Scheme 3.

Ligation and Alkylation of Hcs-Functionalized Peptide. Reagents and conditions: a) 1.0 mM 6, 3.3 mM 7, 100 mM sodium phosphate dibasic, 20 mM TCEP, 0.5% PhSH, 0.5% BnSH, pH 7.2, room temperature, 24 h, 100 %; b) 50 mM MeI, 100 mM sodium bicarbonate, pH 8.6, room temperature, 5 min, 89%.
Acknowledgments
[**] This work was supported by funding from the University of Pennsylvania and the Searle Scholars Program (10-SSP-214 to EJP). TT was supported by a fellowship from the Japan Society for the Promotion of Science (JSPS). We thank Rakesh Kohli for assistance with HRMS (NIH RR-023444) and MALDI-MS (supported by NSF MRI-0820996).
Footnotes
Supporting information for this article is available on the WWW under http://www.angewandte.org or from the author.
References
- [1] Hackenberger CPR, Schwarzer D. Angew. Chem. Int. Ed. 2008;47:10030–10074. doi: 10.1002/anie.200801313.
- [2] Kent SBH. Chem. Soc. Rev. 2009;38:338–351. doi: 10.1039/b700141j.
- [3] Muralidharan V, Muir TW. Nat. Methods. 2006;3:429–438. doi: 10.1038/nmeth886.
- [4] a) Okazaki K, Yamada H, Imoto T. Anal. Biochem. 1985;149:516–520. doi: 10.1016/0003-2697(85)90607-4; b) Yan LZ, Dawson PE. J. Am. Chem. Soc. 2001;123:526–533. doi: 10.1021/ja003265m; c) Hopkins CE, Hernandez G, Lee JP, Tolan DR. Archives Biochem. Biophys. 2005;443:1–10. doi: 10.1016/j.abb.2005.08.020.
- [5] a) Crich D, Banerjee A. J. Am. Chem. Soc. 2007;129:10064–10065. doi: 10.1021/ja072804l; b) Chen J, Wan Q, Yuan Y, Zhu JL, Danishefsky SJ. Angew. Chem. Int. Ed. 2008;47:8521–8524. doi: 10.1002/anie.200803523; c) Haase C, Rohde H, Seitz O. Angew. Chem. Int. Ed. 2008;47:6807–6810. doi: 10.1002/anie.200801590; d) Yang RL, Pasunooti KK, Li FP, Liu XW, Liu CF. J. Am. Chem. Soc. 2009;131:13592–13593. doi: 10.1021/ja905491p; e) Chen J, Wang P, Zhu JL, Wan Q, Danishefsky SJ. Tetrahedron. 2010;66:2277–2283. doi: 10.1016/j.tet.2010.01.067; f) Harpaz Z, Siman P, Kumar KSA, Brik A. ChemBioChem. 2010;11:1232–1235. doi: 10.1002/cbic.201000168; g) Townsend SD, Tan ZP, Dong SW, Shang SY, Brailsford JA, Danishefsky SJ. J. Am. Chem. Soc. 2012;134:3912–3916. doi: 10.1021/ja212182q.
- [6] a) Offer J, Boddy CNC, Dawson PE. J. Am. Chem. Soc. 2002;124:4642–4646. doi: 10.1021/ja016731w; b) Chatterjee C, McGinty RK, Pellois JP, Muir TW. Angew. Chem. Int. Ed. 2007;46:2814–2818. doi: 10.1002/anie.200605155.
- [7] a) Saxon E, Armstrong JI, Bertozzi CR. Org. Lett. 2000;2:2141–2143. doi: 10.1021/ol006054v; b) Nilsson BL, Kiessling LL, Raines RT. Org. Lett. 2001;3:9–12. doi: 10.1021/ol006739v; c) Bode JW, Fox RM, Baucom KD. Angew. Chem. Int. Ed. 2006;45:1248–1252. doi: 10.1002/anie.200503991.
- [8] a) Tam JP, Yu QT. Biopolymers. 1998;46:319–327. doi: 10.1002/(SICI)1097-0282(19981015)46:5<319::AID-BIP3>3.0.CO;2-S; b) Pachamuthu K, Schmidt RR. Synlett. 2003:659–662; c) Saporito A, Marasco D, Chambery A, Botti P, Monti SM, Pedone C, Ruvo M. Biopolymers. 2006;83:508–518. doi: 10.1002/bip.20582; d) Aussedat B, Fasching B, Johnston E, Sane N, Nagorny P, Danishefsky SJ. J. Am. Chem. Soc. 2012;134:3532–3541. doi: 10.1021/ja2111459.
- [9] Kaji A, Novelli GD, Kaji H. Biochem. Biophys. Res. Commun. 1963;10:406–409. doi: 10.1016/0006-291x(63)90546-1.
- [10] Varshavsky A. Nat. Struct. Mol. Biol. 2008;15:1238–1240. doi: 10.1038/nsmb1208-1238.
- [11] Wagner AM, Fegley MW, Warner JB, Grindley CLJ, Marotta NP, Petersson EJ. J. Am. Chem. Soc. 2011;133:15139–15147. doi: 10.1021/ja2055098.
- [12] a) Taki M, Sisido M. Biopolymers. 2007;88:263–271. doi: 10.1002/bip.20678; b) Connor RE, Piatkov K, Varshavsky A, Tirrell DA. ChemBioChem. 2008;9:366–369. doi: 10.1002/cbic.200700605.
- [13] Leibowitz MJ. J. Biol. Chem. 1971;246:5207–5212.
- [14] Auluck PK, Caraveo G, Lindquist S. Ann. Rev. Cell. Dev. Biol. 2010;26:211–233. doi: 10.1146/annurev.cellbio.042308.113313.
- [15] Scarpulla RC, Deutch CE, Soffer RL. Biochem. Biophys. Res. Commun. 1976;71:584–589. doi: 10.1016/0006-291x(76)90827-5.
- [16] Link AJ, Vink MKS, Agard NJ, Prescher JA, Bertozzi CR, Tirrell DA. Proc. Natl. Acad. Sci. USA. 2006;103:10180–10185. doi: 10.1073/pnas.0601167103.
- [17] Trujillo M, Radi R. Archives Biochem. Biophys. 2002;397:91–98. doi: 10.1006/abbi.2001.2619.
- [18] a) Vamvaca K, Volles MJ, Lansbury PT. J. Mol. Biol. 2009;389:413–424. doi: 10.1016/j.jmb.2009.03.021; b) Bartels T, Ahlstrom LS, Leftin A, Kamp F, Haass C, Brown MF, Beyer K. Biophys. J. 2010;99:2116–2124. doi: 10.1016/j.bpj.2010.06.035; c) Bartels T, Choi JG, Selkoe DJ. Nature. 2011;477:107–110. doi: 10.1038/nature10324; d) Vamvaca K, Lansbury PT, Stefanis L. J. Neurochem. 2011;119:389–397. doi: 10.1111/j.1471-4159.2011.07431.x; e) Fauvet B, Fares MB, Samuel F, Dikiy I, Tandon A, Eliezer D, Lashuel HA. J. Biol. Chem. 2012. doi: 10.1074/jbc.M112.383711.
- [19] Graciet E, Hu RG, Piatkov K, Rhee JH, Schwarz EM, Varshavsky A. Proc. Natl. Acad. Sci. USA. 2006;103:3078–3083. doi: 10.1073/pnas.0511224103.
