Abstract
As important epigenetic marks, lysine methylations play critical roles in the regulation of both chromatin and non-chromatin proteins. There are three levels of lysine methylation (mono-, di-, and trimethylation), each of which has turned out to be biologically distinct. Multiple chemical biology methods have been developed for the biochemical characterization of proteins with lysine methylation. This concept article highlights these developments and their applications in the epigenetic investigation of protein functions.
Keywords: Posttranslational Lysine Methylation, Lysine Monomethylation, Lysine Dimethylation, Histone, Chromatin
In the past two decades, a growing number of posttranslational modifications (PTMs) have been identified on both cytoplasmic and nuclear proteins. Known as “epigenetic marks”, these PTMs are critical to both gene expression and metabolic regulation; nevertheless, the detailed molecular mechanisms of many PTMs remain elusive. Among PTMs, lysine methylation is widespread in chromatin and non-chromatin proteins.[1] There are three levels of lysine methylation, namely mono-, di-, and trimethylation, which give rise to three types of posttranslationally methylated lysine, Kme1, Kme2, and Kme3, respectively, in target proteins (Figure 1). As critical epigenetic marks, all three levels of lysine methylation contribute to chromatin regulation, which involves multiple lysine methylation writer, eraser, and reader proteins that affect DNA replication, repair, and gene transcription.[1c] Beyond chromatin, lysine methylation has also been discovered in a number of nuclear and cytoplasmic proteins such as p53 and NF-κB.[2] Lysine methylation of these proteins directly influences signal transduction cascades, leading to unique modes of transcriptional and metabolic regulation. Although mounting cell biology evidence has demonstrated that lysine methylation is crucial for eukaryotes, understanding of how it regulates different cellular processes at the molecular level has lagged well behind the related cell biology studies because proteins with site-specific lysine methylation are difficult to access. Traditional biological techniques are in general not applicable to the site-selective installation of lysine methylation in target proteins. Purifying selectively methylated proteins from their cellular hosts is typically not practical because these proteins are heterogeneous and usually contain other modifications.
Nor is it practical to use lysine methylation writer proteins, namely histone lysine methyltransferases,[3] to methylate protein substrates directly at particular lysine sites. Many histone lysine methyltransferases are notoriously promiscuous, so using them for selective lysine methylation is difficult. In addition, a complete picture of which histone lysine methyltransferases methylate which lysine sites in particular proteins has not yet been drawn. Another challenge lies in the three levels of lysine methylation. All three levels appear to serve distinct biological roles, yet most histone lysine methyltransferases catalyze methylation at two or three levels, and it is difficult to stop a histone lysine methyltransferase-catalyzed reaction at a particular stage. Thus, there is clearly a gap in the synthesis of proteins with site-specific lysine methylation. To fill this gap, other methods are needed.
Figure 1. Lysine and its three levels of methylation.
The proteins that undergo the most extensive lysine methylation are undoubtedly histones.[1c, 4] Given their small size of only 100 to 130 residues, synthetic and semi-synthetic methods have been developed for the synthesis of histones with site-specific lysine methylation. These methods are mostly based on native chemical ligation and its derivative, expressed protein ligation.[5] The requirement of a cysteine for the ligation process usually introduces a mutation into histones, which, except for H3, are naturally devoid of cysteine residues. Fortunately, histones can be easily unfolded and refolded and tolerate the Raney nickel-catalyzed desulfurization reaction.[6] Therefore, the cysteine preinstalled in a histone for native chemical ligation can be converted to a native alanine. Because the side chain of Kme3 requires no protection during synthesis, histones with lysine trimethylation are the ones usually synthesized. The resulting trimethylated histones have been applied in a number of chromatin-related studies.[7] Although powerful, approaches based on native chemical ligation or expressed protein ligation are in general useful mainly for the installation of lysine methylation near the N- and C-termini of histones. Applying them to install lysine methylation in the middle of histones is difficult. Histones with lysine mono- and dimethylation have also rarely been synthesized in this way. In addition, applying these methods to larger proteins, especially the majority of transcription factors and cytosolic proteins, appears to be challenging.
For more straightforward access to histones with lysine methylation, Shokat and coworkers developed a methyllysine analog (MLA) installation method, shown in Figure 2A.[8] It exploits both the fact that most histones are devoid of cysteine and the high nucleophilicity of the cysteine thiolate. Analogs of all three methyllysines (mono, di, and tri) have been successfully installed in histones in this way. Given its simplicity, this MLA installation method has been used to synthesize histones with MLAs for probing the functions of eraser proteins, namely histone lysine demethylases, and of methyllysine recognition domains in reader proteins.[9] Although useful, MLAs are structurally different from methyllysines, which may alter binding interactions and catalysis. A recent study showed that several methyllysine recognition domains have notably decreased affinities for MLAs relative to their native methyllysine counterparts.[9e] Besides the direct conversion of cysteine to MLAs, methods that generate dehydroalanine in histones and then react it with different N-methylated cysteamine derivatives have also been developed for the synthesis of histones with MLAs (Figure 2B). To install dehydroalanine, an alkylated selenocysteine or a phosphoserine encoded by an amber codon can be selectively incorporated into a histone during translation and then chemically converted to dehydroalanine;[10] a cysteine in a histone can also be selectively alkylated to form a sulfonium that then undergoes elimination to generate dehydroalanine.[11] The resulting MLAs are racemic. Both their mimicking nature and the loss of chirality raise concerns about using these MLAs. Building on the selective dehydroalanine installation technique, Park and coworkers recently developed a Zn-Cu-catalyzed coupling reaction that fuses alkyl groups directly to dehydroalanine, leading to the installation of Kme3 in histones.[10c] Except for its racemic nature, Kme3 installed in histones in this way is identical to the native one.
A similar approach can be applied to the installation of racemic Kme1 and Kme2 in histones. All reactions shown in Figure 2 work well for histones but not for other proteins that may contain essential cysteine residues or be sensitive to reaction conditions involving peroxides, high temperature, or radicals.
Figure 2. (A) A cysteine-based methyllysine analog (MLA) installation approach for the synthesis of histones with three MLAs; (B) dehydroalanine-based approaches to install racemic MLAs and methyllysines.
To develop approaches that can be generally applied to the synthesis of proteins with site-specific and chirally pure lysine monomethylation, several groups have worked out alternative strategies based on amber suppression-based noncanonical amino acid mutagenesis. All these strategies exploit the pyrrolysine (Pyl) incorporation machinery that exists naturally in certain methanogenic archaea and bacteria. In these organisms, Pyl is a naturally occurring 22nd amino acid encoded by the amber UAG codon.[12] Its incorporation during translation is mediated by pyrrolysyl-tRNA synthetase (PylRS) and its cognate amber-suppressing tRNAPyl.[13] A number of research groups have shown that engineered Pyl incorporation systems can be transferred to naive organisms such as E. coli to encode a large variety of noncanonical amino acids with the amber codon.[14] Among these noncanonical amino acids, several are Kme1 precursors (Figure 3) whose incorporation into proteins, followed by chemical or photochemical conversion, provides access to proteins with site-specific and chirally pure lysine monomethylation.[15] Except for the one with Boc protection, all Kme1 precursors can be deprotected under mild conditions compatible with proteins and are therefore suitable for the synthesis of non-histone proteins with authentic lysine monomethylation.
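Site-specific encoding of this kind starts from a gene in which the codon at the target position has been replaced by the amber codon (TAG in the coding DNA strand, UAG in the mRNA). As a minimal sketch of that bookkeeping, the Python snippet below swaps a chosen codon for TAG. The function name `introduce_amber_codon` and the toy open reading frame are illustrative assumptions and do not come from the cited work; a real design would also need to settle numbering conventions, such as whether the initiator methionine is counted.

```python
AMBER_CODON = "TAG"  # DNA form of the amber stop codon (UAG in mRNA)


def introduce_amber_codon(coding_dna: str, residue_number: int) -> str:
    """Return the coding sequence with one codon replaced by TAG.

    coding_dna     -- in-frame coding strand, starting at the first codon
    residue_number -- 1-based index of the codon to replace (hypothetical
                      convention: the first codon of coding_dna is residue 1)
    """
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(seq) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError("residue_number is outside the coding sequence")
    start = 3 * (residue_number - 1)
    return seq[:start] + AMBER_CODON + seq[start + 3:]


# Toy open reading frame (not a real gene): Met-Ala-Lys-Gly-stop
toy_orf = "ATGGCTAAAGGTTAA"
mutant = introduce_amber_codon(toy_orf, 3)  # replace the Lys codon (AAA)
print(mutant)  # ATGGCTTAGGGTTAA
```

In an expression host that carries an engineered PylRS and tRNAPyl pair, the TAG codon placed this way would be read as the supplied noncanonical amino acid rather than as a stop signal.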
Figure 3. Kme1 precursors that have been genetically incorporated into histones and their deprotection to afford Kme1.
Although successful in the synthesis of proteins with authentic lysine monomethylation, a similar noncanonical amino acid mutagenesis-based approach cannot be applied to the synthesis of proteins with lysine dimethylation. The side-chain nitrogen of Kme2 already bears three alkyl groups, so adding a further protecting group that can also be easily removed is difficult. Based on our own experience, the Pyl incorporation system cannot be evolved to target Kme2 for its direct incorporation at an amber codon during protein translation. Chin and coworkers previously described a noncanonical amino acid mutagenesis-based multi-step strategy for the synthesis of histones with lysine dimethylation. This strategy involves the incorporation of a protected lysine at a designated site in a histone, global protection of all free amines in the histone, removal of the protecting group from the protected lysine to release the designated lysine for reductive alkylation with formaldehyde to form Kme2, and final removal of the global protecting group from all other amines to afford a histone with site-specific and chirally pure lysine dimethylation.[16] Although the design is elegant and the approach can be applied to histones, which tolerate very harsh conditions, it cannot be generally used for the synthesis of proteins with authentic lysine dimethylation, especially transcription factors and cytosolic proteins that readily denature in organic solvents and aggregate when global protection and deprotection are applied. A more general and broadly applicable approach for the installation of authentic lysine dimethylation is clearly necessary.
When we initially designed methods for the installation of lysine dimethylation in proteins, we specifically kept in mind that all conditions for chemical conversion had to be protein compatible. For simplicity, we sought to directly incorporate allysine (Figure 4), which undergoes selective reductive amination with dimethylamine in the presence of sodium cyanoborohydride under physiological conditions, that is, in aqueous solution at room temperature and neutral pH. Allysine occurs naturally in elastin and collagen as a posttranslationally modified noncanonical amino acid,[17] which indicates that it is quite stable in a protein context. Unfortunately, we were not able to use an engineered Pyl incorporation system for the direct incorporation of allysine. Adding allysine to the medium significantly inhibits the growth of E. coli, presumably because of the toxicity of the side-chain aldehyde, which may crosslink proteins and DNA. To mask this toxic effect, two kinds of protected precursors were initially designed. One is based on the enol tautomer of the aldehyde and the other on an enamine that decomposes in water to form the aldehyde. We chose to focus on protected enamines given their similarity to most lysine derivatives that have been genetically encoded using engineered Pyl incorporation systems. Three protecting groups, shown in Figure 4A, were tested. We finally focused on the precursor with para-azidoCBZ protection (AcdK) because it is more stable than the other two, which tend to decompose easily. The deprotection of para-azidoCBZ from AcdK to recover the enamine and then allysine can be performed via Staudinger reduction with tris(2-carboxyethyl)phosphine (TCEP), which is commonly used to preserve proteins in their native states and is therefore protein compatible (Figure 4B). For the incorporation of AcdK, a corresponding mutant Pyl incorporation system was engineered and successfully used in E. coli to express proteins incorporated with AcdK.[18] Using the resulting AcdK-containing proteins, we tested our proposed deprotection and reductive amination for the installation of authentic lysine dimethylation. As expected, both reactions are efficient and compatible with preserving proteins in their native states. Proteins with authentic lysine dimethylation were finally confirmed by both mass spectrometry and Western blot analysis. One bonus of the method is its ready adaptability to the synthesis of proteins with lysine monomethylation, which can be achieved by simply replacing dimethylamine with monomethylamine in the reductive amination step. Another key aspect of this technique is its general applicability. Using the technique, we demonstrated the synthesis of histone H3K4me2 (H3 with dimethylation at K4) as a representative chromatin protein, superfolder green fluorescent protein with Kme2 at position 134 as a representative cytosolic protein, and p53K372me2 (p53 with dimethylation at K372) as a representative transcription factor. Using p53K372me2, we further confirmed that dimethylation at K372 of p53 recruits the histone acetyltransferase Tip60 for improved acetylation at K120. This cross-regulation of modifications at two p53 lysines has been suspected but lacked biochemical evidence.[19] As far as we know, we are the first to provide direct evidence to support it. Recently, a large variety of cytosolic and nuclear non-histone proteins have been revealed to undergo lysine mono- and dimethylation.[20] Direct biochemical studies of lysine methylation, especially dimethylation, in these proteins have long been hindered by the unavailability of techniques for their synthesis. With this obstacle resolved, we expect broad adoption of our technique for the functional annotation of lysine mono- and dimethylation in these proteins. New avenues of research can be foreseen.
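Because reductive amination of allysine changes the elemental composition of the target residue, the outcome of a mass spectrometry check can be anticipated with simple arithmetic. The Python sketch below computes monoisotopic residue masses from elemental formulas and reports the expected shifts. The residue formulas are standard neutral-residue compositions (amino acid minus water) entered here for illustration, the helper names are hypothetical, and the AcdK precursor itself is left out because its exact composition is not given in this text.

```python
# Monoisotopic masses of the most abundant isotopes, in Da
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Neutral residue compositions (free amino acid minus H2O); illustrative inputs
RESIDUES = {
    "Lys": {"C": 6, "H": 12, "N": 2, "O": 1},
    "allysine": {"C": 6, "H": 9, "N": 1, "O": 2},  # side-chain aldehyde
    "Kme1": {"C": 7, "H": 14, "N": 2, "O": 1},
    "Kme2": {"C": 8, "H": 16, "N": 2, "O": 1},
}


def residue_mass(name: str) -> float:
    """Monoisotopic mass of a residue from its elemental composition."""
    return sum(MONOISOTOPIC[el] * n for el, n in RESIDUES[name].items())


def mass_shift(start: str, end: str) -> float:
    """Mass change expected when residue `start` is converted to `end`."""
    return residue_mass(end) - residue_mass(start)


for product in ("Kme1", "Kme2"):
    # Shift across the reductive amination step (allysine -> methyllysine)
    print(f"allysine -> {product}: {mass_shift('allysine', product):+.4f} Da")
    # Shift relative to the unmodified protein carrying lysine at that site
    print(f"Lys      -> {product}: {mass_shift('Lys', product):+.4f} Da")
```

Relative to unmodified lysine, each added methyl group contributes one CH2 unit (about 14.016 Da), so Kme1 and Kme2 are separated by that increment. For intact proteins, average rather than monoisotopic masses are usually compared, so these values are most directly useful at the peptide level.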
Figure 4. (A) The design of three Kme2 precursors; (B) the genetic incorporation of AcdK followed by deprotection with TCEP and then reductive amination to afford Kme2 or Kme1 in a protein.
With general methods for the synthesis of proteins with lysine mono- and dimethylation available, the next technical barrier to overcome is the development of generally applicable methods for the synthesis of native proteins with authentic lysine trimethylation. This is an ongoing research effort in our group, and we hope to report a breakthrough in the near future.
In summary, multiple methods have been developed for the synthesis of proteins with lysine methylation. Most of these methods are designed for the synthesis of histones with lysine methylation, and their general applicability is limited. Using the noncanonical amino acid mutagenesis technique, we and others have shown that native proteins with lysine monomethylation are readily accessible through the genetic incorporation of Kme1 precursors followed by protein-compatible deprotection. Recently we have also devised a noncanonical amino acid mutagenesis-based method for the synthesis of native proteins with authentic lysine dimethylation. One key aspect of these techniques is their general applicability. They can be readily applied to synthesize histones and cytoplasmic and nuclear non-histone proteins with lysine mono- and dimethylation for functional characterization. We expect that these techniques will open new avenues of research and significantly advance our understanding of the epigenetic regulation of proteins by lysine methylation.
Acknowledgments
Research in the Liu group is supported by the National Institutes of Health (grants R01GM121584 and R01CA161158), the National Science Foundation (grant CHE-1148684), and the Welch Foundation (grant A-1715).
References
- 1. a) Lee DY, Teyssier C, Strahl BD, Stallcup MR. Endocrine Reviews. 2005;26:147–170. doi: 10.1210/er.2004-0008; b) Cao XJ, Arnaudo AM, Garcia BA. Epigenetics. 2013;8:477–485. doi: 10.4161/epi.24547; c) Zhang Y, Reinberg D. Genes Dev. 2001;15:2343–2360. doi: 10.1101/gad.927301.
- 2. a) Chuikov S, Kurash JK, Wilson JR, Xiao B, Justin N, Ivanov GS, McKinney K, Tempst P, Prives C, Gamblin SJ, Barlev NA, Reinberg D. Nature. 2004;432:353–360. doi: 10.1038/nature03117; b) Kurash JK, Lei H, Shen Q, Marston WL, Granda BW, Fan H, Wall D, Li E, Gaudet F. Mol Cell. 2008;29:392–400. doi: 10.1016/j.molcel.2007.12.025; c) Yang XD, Huang B, Li M, Lamb A, Kelleher NL, Chen LF. EMBO J. 2009;28:1055–1066. doi: 10.1038/emboj.2009.55.
- 3. Boriack-Sjodin PA, Swinger KK. Biochemistry. 2015. doi: 10.1021/acs.biochem.5b01129.
- 4. Zhang T, Cooper S, Brockdorff N. EMBO Rep. 2015. doi: 10.15252/embr.201540945.
- 5. a) Dawson PE, Muir TW, Clark-Lewis I, Kent SB. Science. 1994;266:776–779. doi: 10.1126/science.7973629; b) Muir TW, Sondhi D, Cole PA. Proc Natl Acad Sci U S A. 1998;95:6705–6710. doi: 10.1073/pnas.95.12.6705.
- 6. He S, Bauman D, Davis JS, Loyola A, Nishioka K, Gronlund JL, Reinberg D, Meng F, Kelleher N, McCafferty DG. Proc Natl Acad Sci U S A. 2003;100:12033–12038. doi: 10.1073/pnas.2035256100.
- 7. a) Kawakami T, Akai Y, Fujimoto H, Kita C, Aoki Y, Konishi T, Waseda M, Takemura L, Aimoto S. Bulletin of the Chemical Society of Japan. 2013;86:690–697; b) Li J, Li Y, He Q, Li Y, Li H, Liu L. Organic & biomolecular chemistry. 2014;12:5435–5441. doi: 10.1039/c4ob00715h; c) Nguyen UT, Bittova L, Müller MM, Fierz B, David Y, Houck-Loomis B, Feng V, Dann GP, Muir TW. Nature methods. 2014;11:834–840. doi: 10.1038/nmeth.3022.
- 8. a) Simon MD, Chu F, Racki LR, de la Cruz CC, Burlingame AL, Panning B, Narlikar GJ, Shokat KM. Cell. 2007;128:1003–1012. doi: 10.1016/j.cell.2006.12.041; b) Lu X, Simon MD, Chodaparambil JV, Hansen JC, Shokat KM, Luger K. Nat Struct Mol Biol. 2008;15:1122–1124. doi: 10.1038/nsmb.1489; c) Huang R, Holbert MA, Tarrant MK, Curtet S, Colquhoun DR, Dancy BM, Dancy BC, Hwang Y, Tang Y, Meeth K, Marmorstein R, Cole RN, Khochbin S, Cole PA. J Am Chem Soc. 2010;132:9986–9987. doi: 10.1021/ja103954u.
- 9. a) Kim SA, Chatterjee N, Jennings MJ, Bartholomew B, Tan S. Nucleic Acids Res. 2015;43:4868–4880. doi: 10.1093/nar/gkv388; b) Pack LR, Yamamoto KR, Fujimori DG. J Biol Chem. 2016;291:6060–6070. doi: 10.1074/jbc.M115.696864; c) Torres IO, Kuchenbecker KM, Nnadi CI, Fletterick RJ, Kelly MJ, Fujimori DG. Nat Commun. 2015;6:6204. doi: 10.1038/ncomms7204; d) Canzio D, Chang EY, Shankar S, Kuchenbecker KM, Simon MD, Madhani HD, Narlikar GJ, Al-Sady B. Mol Cell. 2011;41:67–81. doi: 10.1016/j.molcel.2010.12.016; e) Seeliger D, Soeroes S, Klingberg R, Schwarzer D, Grubmuller H, Fischle W. ACS Chem Biol. 2012;7:150–154. doi: 10.1021/cb200363r; f) Wilkinson AW, Gozani O. Biochim Biophys Acta. 2014;1839:669–675. doi: 10.1016/j.bbagrm.2014.01.007.
- 10. a) Wang J, Schiller SM, Schultz PG. Angew Chem Int Ed Engl. 2007;46:6849–6851. doi: 10.1002/anie.200702305; b) Wang ZU, Wang YS, Pai PJ, Russell WK, Russell DH, Liu WR. Biochemistry. 2012;51:5232–5234. doi: 10.1021/bi300535a; c) Yang A, Ha S, Ahn J, Kim R, Kim S, Lee Y, Kim J, Soll D, Lee HY, Park HS. Science. 2016;354:623–626. doi: 10.1126/science.aah4428.
- 11. Chalker JM, Gunnoo SB, Boutureira O, Gerstberger SC, Fernandez-Gonzalez M, Bernardes GJL, Griffin L, Hailu H, Schofield CJ, Davis BG. Chemical Science. 2011;2:1666–1676.
- 12. Srinivasan G, James CM, Krzycki JA. Science. 2002;296:1459–1462. doi: 10.1126/science.1069588.
- 13. Polycarpo C, Ambrogelly A, Berube A, Winbush SM, McCloskey JA, Crain PF, Wood JL, Soll D. Proc Natl Acad Sci U S A. 2004;101:12450–12454. doi: 10.1073/pnas.0405362101.
- 14. a) Neumann H, Peak-Chew SY, Chin JW. Nat Chem Biol. 2008;4:232–234. doi: 10.1038/nchembio.73; b) Bianco A, Townsley FM, Greiss S, Lang K, Chin JW. Nat Chem Biol. 2012;8:748–750. doi: 10.1038/nchembio.1043; c) Chen PR, Groff D, Guo J, Ou W, Cellitti S, Geierstanger BH, Schultz PG. Angew Chem Int Ed Engl. 2009;48:4052–4055. doi: 10.1002/anie.200900683; d) Zhang M, Lin S, Song X, Liu J, Fu Y, Ge X, Fu X, Chang Z, Chen PR. Nat Chem Biol. 2011;7:671–677. doi: 10.1038/nchembio.644; e) Huang Y, Wan W, Russell WK, Pai PJ, Wang Z, Russell DH, Liu W. Bioorg Med Chem Lett. 2010;20:878–880. doi: 10.1016/j.bmcl.2009.12.077; f) Wan W, Tharp JM, Liu WR. Biochim Biophys Acta. 2014;1844:1059–1070. doi: 10.1016/j.bbapap.2014.03.002.
- 15. a) Nguyen DP, Garcia Alai MM, Kapadnis PB, Neumann H, Chin JW. J Am Chem Soc. 2009;131:14194–14195. doi: 10.1021/ja906603s; b) Wang YS, Wu B, Wang Z, Huang Y, Wan W, Russell WK, Pai PJ, Moe YN, Russell DH, Liu WR. Mol Biosyst. 2010;6:1557–1560. doi: 10.1039/c002155e; c) Ai HW, Lee JW, Schultz PG. Chem Commun (Camb). 2010;46:5506–5508. doi: 10.1039/c0cc00108b.
- 16. Nguyen DP, GarciaAlai MM, Virdee S, Chin JW. Chem Biol. 2010;17:1072–1076. doi: 10.1016/j.chembiol.2010.07.013.
- 17. Yamauchi M, Sricholpech M. Essays Biochem. 2012;52:113–133. doi: 10.1042/bse0520113.
- 18. Wang ZA, Zeng Y, Kurra Y, Wang X, Tharp JM, Vatansever EC, Hsu WW, Dai S, Fang X, Liu WR. Angew Chem Int Ed Engl. 2017;56:212–216. doi: 10.1002/anie.201609452.
- 19. Tang Y, Luo J, Zhang W, Gu W. Mol Cell. 2006;24:827–839. doi: 10.1016/j.molcel.2006.11.021.
- 20. Islam K, Chen Y, Wu H, Bothwell IR, Blum GJ, Zeng H, Dong A, Zheng W, Min J, Deng H, Luo M. Proc Natl Acad Sci U S A. 2013;110:16778–16783. doi: 10.1073/pnas.1216365110.
