Author manuscript; available in PMC: 2019 Sep 24.
Published in final edited form as: Angew Chem Int Ed Engl. 2018 Sep 5;57(39):12702–12706. doi: 10.1002/anie.201806197

Building and Breaking Bonds via a Compact S-propargyl-cysteine to Chemically Control Enzymes and Modify Proteins

Jun Liu [a], Rujin Cheng [b], Haifan Wu [a], Shanshan Li [a],[c], Peng G Wang [c], William F DeGrado [a], Sharon Rozovsky [b], Lei Wang [a]
PMCID: PMC6169525  NIHMSID: NIHMS988494  PMID: 30118570

Abstract

Analogous to reversible post-translational protein modifications, the ability to attach and subsequently remove modifications on proteins would be valuable for protein and biological research. Although bioorthogonal functionalities have been developed to conjugate or cleave protein modifications, they are introduced into proteins on separate residues and often with bulky side chains, limiting their use to one type of control and primarily protein surface. Here we achieved dual control on one residue by genetically encoding S-propargyl-cysteine (SprC), which has bioorthogonal alkyne and propargyl groups in a compact structure, permitting usage in protein interior in addition to surface. We demonstrated its incorporation at the dimer interface of glutathione transferase for in vivo crosslinking via thiol-yne click chemistry, and at the active site of human rhinovirus 3C protease for masking and then turning on enzyme activity via Pd-cleavage of SprC into Cys. In addition, we installed biotin onto EGFP via Sonogashira coupling of SprC and then tracelessly removed it via Pd cleavage. SprC is small in size, commercially available, nontoxic, and allows for bond building and breaking on a single residue. Genetically encoded SprC will be valuable for chemically controlling proteins with an essential Cys and for reversible protein modifications.

Keywords: propargyl cysteine, palladium-mediated cleavage, reversible protein modification, thiol-yne, Sonogashira coupling

Graphical Abstract

S-propargyl-cysteine (SprC) was genetically encoded into proteins at the interface to crosslink proteins and at active site to mask an enzyme, which was chemically activated by Pd(TPPTS)4. In addition, molecular modifications were installed onto proteins via Sonogashira cross coupling on SprC by Pd(NO3)2, and then tracelessly removed through Pd(TPPTS)4.


Protein functions are enriched and regulated by various post-translational modifications, which are installed onto and removed from proteins reversibly. Such reversibility is achieved through enzymes with opposing activities in vivo. For protein research and engineering, molecular modifications can be attached onto proteins using bioorthogonal ligations, and a pre-installed modification can also be removed through light or chemical cleavage.[1] In particular, genetically encoding unnatural amino acids (Uaas) has enabled the incorporation of various bioorthogonal functionalities into proteins, which have been separately harnessed for ligation or for decaging.[2] Yet it has been challenging to integrate the installation and subsequent removal on the same residue. In addition, available Uaas bearing bioorthogonal functionalities are often significantly longer or bulkier than canonical amino acids.[2d] Although generally adequate for modifications on the protein surface, they may introduce undesired perturbations in the protein interior, at active sites, and at protein-protein interfaces. To exploit bioorthogonal functionalities at these positions for modulating protein activities or probing protein-protein interactions, it is desirable to incorporate bioorthogonal Uaas with side chains close in size to those of natural amino acids, which are less likely to interfere with protein folding, stability, and expression.

Here we genetically encoded S-propargyl-cysteine (SprC), which has bioorthogonal functionalities in a compact structure comparable to canonical amino acids. The presence of alkyne and propargyl in this Uaa enabled the reversible installation and removal of modifications on proteins via a single residue. Its compact size also enabled its incorporation into protein interfaces for chemical coupling, and into enzyme active sites for chemically turning on enzyme activity. We expect that the genetically encoded SprC will be valuable for chemically controlling various enzymes with an essential Cys for gain-of-function studies, and for reversible protein modifications in applications such as protein enrichment, purification, and analysis.

To achieve reversible protein modification via a Uaa, the Uaa must contain a bioorthogonal functional group for protein coupling and a linkage that is stable in the coupling step but selectively cleavable afterwards. The alkyne group has been used for bioorthogonal copper-catalyzed azide-alkyne cycloaddition (CuAAC) and Sonogashira coupling.[3] Although pyrrolysine- and phenylalanine-derivatized alkyne Uaas have been genetically encoded,[4] they are used only to attach labels, which cannot then be removed. They have bulky side chains and are generally incorporated onto protein surfaces. For bond cleavage, we sought cleavage mediated by chemicals. Compared with photocaged Uaas, chemically protected Uaas are more stable and have the potential to be used in tissues and animals, since chemical penetration into tissue is superior to that of light.[5] Recently, palladium (Pd) catalysts have been applied for efficient O-C and S-C bond cleavage.[5-6] We therefore decided to incorporate SprC (Fig. 1a), which provides a terminal alkyne for bioorthogonal conjugation and a propargyl group potentially susceptible to Pd cleavage, all in a compact structure.

Figure 1. Genetic incorporation of SprC into proteins and its conversion into Cys.

a) Scheme of SprC incorporation and conversion to Cys by Pd-mediated cleavage. b) SDS-PAGE analysis of SprC incorporation into Trx at site 73. c) Mass spectrum of intact Trx(73SprC). d) Mass spectrum of Trx(73SprC) after Pd cleavage. * indicates a gain of +16 Da for oxidation. e) Thiol-Michael addition reaction of AMS with Trx(73SprC) before and after Pd cleavage. f) Tandem MS of trypsin digested Trx(73SprC).

To genetically incorporate SprC, we evolved a tRNAPyl-synthetase specific for it. We first tested a C348W/W417S mutant of Methanosarcina mazei PylRS, which was reported to incorporate S-allyl cysteine, a Uaa structurally similar to SprC,[6c] and a transplanted C313W/W382S mutant of Methanosarcina barkeri PylRS (MbPylRS). However, neither incorporated SprC. We then generated a library to randomize these two sites and subjected it to selections for SprC specificity as previously described.[7] All hits identified were found to bear the same mutations (C313W/W382T). We further randomized active site residues L270, Y271, and L274 (Fig. S1 and Supplementary results), and all successful hits contained the same amino acid residues as C313W/W382T despite codon changes at other sites. We thus named MbPylRS(C313W/W382T) as SprCRS.

We examined the ability of tRNAPyl/SprCRS to incorporate SprC into several proteins in E. coli. When tRNAPyl/SprCRS were co-expressed with the EGFP(182TAG) gene, full-length EGFP was produced in the presence of SprC, but was barely detectable in its absence (Fig. S2). SprC could be added to the media at concentrations as high as 10 mM without toxicity, and the EGFP yield increased with SprC concentration in the media (Fig. S2). We also tested incorporating SprC into human thioredoxin (Trx) by co-expressing the Trx(73TAG) gene with tRNAPyl/SprCRS. Full-length Trx was purified only in the presence of SprC, and the intact protein was analyzed by ESI-MS (Fig. 1c). A peak observed at 12722 Da corresponds to Trx lacking the initiator Met and containing SprC at site 73 (expected: 12722 Da). No peaks corresponding to misincorporation of other amino acids at site 73 were identified, indicating high fidelity of SprC incorporation. This high fidelity was further demonstrated by incorporating SprC into a third protein, ubiquitin, followed by ESI-MS analysis (Fig. S3). In addition, we sequenced the Trx(73SprC) protein with tandem MS. A series of b and y ions unambiguously indicates that SprC was incorporated at the TAG-specified position 73 (Fig. 1f, Fig. S4). These results indicate that the tRNAPyl/SprCRS pair incorporated SprC into proteins in E. coli with high specificity.

We then tested whether the incorporated SprC could be chemically converted into Cys by cleaving the propargyl group with Pd. Trx(73SprC) protein was treated with Pd(TPPTS)4 and analyzed by ESI-MS (Fig. 1d). A peak was observed at 12684 Da, which corresponds to Trx(73Cys) without initiator Met (expected: 12684 Da), while the peak corresponding to Trx(73SprC) disappeared, signifying that SprC was converted into Cys with high efficiency. In addition, the reactivity of the liberated Cys residue on Trx was demonstrated by a thiol-Michael addition reaction with the alkylating reagent 4-acetamido-4’-maleimidylstilbene-2,2’-disulfonic acid (AMS). AMS reacted with the regenerated Cys of Pd-treated Trx(73SprC), causing a band shift of 0.5 kDa on SDS-PAGE (Fig. 1e).
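The mass change expected for this conversion can be checked with simple arithmetic. The sketch below is illustrative and not part of the original work: it assumes rounded standard average atomic masses and models the cleavage as replacing the S-propargyl group (C3H3) with a hydrogen, a net loss of C3H2. The function and variable names are hypothetical; the two protein masses are those reported above for Trx lacking the initiator Met.

```python
# Rounded standard average atomic masses in Da (an assumption of this sketch).
AVG_MASS = {"C": 12.011, "H": 1.008}

def propargyl_loss_da():
    """Net mass lost when S-propargyl becomes S-H: C3H3 removed, H added (net C3H2)."""
    return 3 * AVG_MASS["C"] + 2 * AVG_MASS["H"]

# Masses reported in the text for Trx lacking the initiator Met (Da).
trx_sprc = 12722  # Trx(73SprC)
trx_cys = 12684   # Trx(73Cys) after Pd(TPPTS)4 treatment

reported_shift = trx_sprc - trx_cys
computed_shift = propargyl_loss_da()
print(f"computed loss: {computed_shift:.2f} Da, reported shift: {reported_shift} Da")
```

The computed loss of about 38 Da matches the difference between the two reported peaks, which is consistent with removal of the propargyl group rather than any other change to the protein.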

Activating enzymes with chemicals provides a gain-of-function approach to probe the sufficiency of a target enzyme for a cellular phenotype.[8] Since SprC could be efficiently converted into Cys through Pd-mediated deprotection under mild conditions, we explored controlling the catalytic activity of a cysteine protease with SprC. Photocaged Cys have been genetically encoded to control protein activities,[9] yet they are usually bulkier than canonical amino acids and may not fit well into the active site. The size of SprC is comparable to that of canonical amino acids and should minimize potential perturbation. The human rhinovirus-14 3C (HRV 3C) protease bears a catalytic Cys147 in the active site (Fig. 2a),[10] and is responsible for viral protein maturation and cold infection.[11] Due to its robust activity in diverse conditions, it is also used to remove tags in protein affinity purification and engineering.[12] We codon-optimized the HRV 3C protease gene for E. coli expression and introduced the TAG codon at Cys147 for SprC incorporation. The purified HRV 3C(147SprC) protease was more stable than the WT HRV 3C protease at room temperature (Fig. S5), suggesting that SprC substitution for Cys did not negatively affect the protease.

Figure 2. Turn on HRV 3C protease activity through Pd-mediated conversion of SprC to Cys.

a) Scheme showing the chemical activation strategy. SprC was incorporated at Cys147 in the active site of HRV 3C protease (PDB: 2IN2). The other two active site residues, His41 and Glu72, are shown as sticks. b) SDS-PAGE analysis of HRV 3C protease activity. Substrate Ub-linker-GyrA contained a cleavage site for HRV 3C in the linker. c) ESI-MS confirming incorporation of SprC into HRV 3C protease at site 147 with high specificity. The expected [M+H]+ and [M-Met+H]+ for HRV 3C(147SprC) are 21308 Da and 21177 Da; observed 21307 Da and 21175 Da, respectively. d) ESI-MS of Pd-treated HRV 3C(147SprC) confirming conversion of SprC into Cys with high efficiency. The expected [M+H]+ and [M-Met+H]+ for HRV 3C(147Cys) are 21270 Da and 21139 Da; observed 21270 Da and 21139 Da, respectively.

To measure HRV 3C protease activity, we prepared a substrate by introducing the protease cleavage site and a Gly6 linker between ubiquitin and GyrA intein for recognition by HRV 3C (Supplementary results). After SprC incorporation, the resultant HRV 3C(147SprC) protease was unable to cleave the substrate Ub-GyrA (Fig. 2b), indicating that the protease activity was masked by SprC. Upon addition of Pd in situ, the substrate was completely cleaved into two fragments, indicating that the HRV 3C protease activity was activated (Fig. 2b). We also analyzed the HRV 3C(147SprC) protease before and after Pd treatment using MS, which confirmed SprC incorporation at site 147 and subsequent deprotection to Cys (Fig. 2c, 2d), confirming that protease activation was due to chemical conversion of SprC to Cys. These results show that SprC could mask and then liberate Cys through chemical deprotection to activate the cysteine protease HRV 3C.
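The masses listed in the Figure 2 caption can be cross-checked in two ways: the gap between the full-length and Met-lacking forms should equal one methionine residue, and the observed peaks should lie close to the expected values. The sketch below is illustrative only; it assumes rounded standard average atomic masses and a methionine residue composition of C5H9NOS, and the helper names are hypothetical.

```python
# Rounded standard average atomic masses in Da (an assumption of this sketch).
AVG_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def composition_mass(formula):
    """Sum average masses for a composition given as {element: count}."""
    return sum(AVG_MASS[el] * n for el, n in formula.items())

def ppm_error(observed, expected):
    """Signed deviation of an observed mass from its expected value, in ppm."""
    return (observed - expected) / expected * 1e6

# Methionine residue (Met minus water), C5H9NOS: the mass lost with the initiator Met.
MET_RESIDUE = composition_mass({"C": 5, "H": 9, "N": 1, "O": 1, "S": 1})

# (expected, observed) masses in Da, copied from the Figure 2 caption.
peaks = {
    "147SprC, full length": (21308, 21307),
    "147SprC, without Met": (21177, 21175),
    "147Cys, full length": (21270, 21270),
    "147Cys, without Met": (21139, 21139),
}

print(f"Met residue: {MET_RESIDUE:.2f} Da")
print("SprC Met gap:", peaks["147SprC, full length"][0] - peaks["147SprC, without Met"][0])
print("Cys Met gap:", peaks["147Cys, full length"][0] - peaks["147Cys, without Met"][0])
for name, (expected, observed) in peaks.items():
    print(f"{name}: {ppm_error(observed, expected):+.0f} ppm")
```

Both Met gaps come out at 131 Da, in line with the computed residue mass, and the largest deviation between observed and expected masses is 2 Da.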

In addition to bond cleavage, SprC could also be harnessed for building bonds via its bioorthogonal alkyne group. Thiol-ene or thiol-yne click chemistry has wide applications in hydrogel preparation, coating of macromolecules on surfaces, polymerization, and protein labeling.[13] Since SprC is compact and thus less likely to perturb protein folding and interaction, we explored its use at a protein interface to crosslink proteins via the thiol-yne reaction. E. coli glutathione transferase (GST) forms a dimer with a large dimer interface. We introduced SprC at position 103, which is buried in the interface, and examined whether it could covalently crosslink a Cys introduced at Lys107, a nearby interface residue on the other monomer (Fig. 3a). After SprC incorporation into GST, host E. coli cells were illuminated with 302 nm light to activate the thiol-yne reaction in vivo. Cell lysates were separated under denaturing conditions and probed by Western blot (Fig. 3b). Covalent GST dimerization was markedly increased for GST(103SprC/107Cys) in comparison to GST(103SprC/107Lys) or GST(103SprC/107Ala), demonstrating that SprC crosslinked with Cys via the thiol-yne reaction in vivo. The GST dimerization was also confirmed by SDS-PAGE and detected by mass spectrometry (Fig. S6). Many radical scavengers in vivo, such as antioxidant enzymes and thiols, compete with the thiol-yne reaction, yet SprC and Cys still crosslinked successfully. It is likely that the close proximity of SprC and Cys and their location at the buried dimer interface facilitated the reaction and excluded competing radical scavengers. Because its compact size permits incorporation into protein interiors and interfaces, SprC will allow the thiol-yne reaction to be used for probing protein interactions in living cells.

Figure 3. SprC enables thiol-yne click chemistry at GST dimer interface in live E. coli cells.

a) Left: E. coli GST dimer (PDB code: 1A0F) showing an extensive dimer interface and two proximal residues 103 and 107 buried in the interface. Right: Scheme showing the cross linking of GST dimer through thiol-yne reaction between SprC103 and Cys107. b) Western blot analysis of covalent cross-linking of GST dimer.

As SprC contains two chemical features for building and breaking bonds respectively, we reasoned that it could afford a “two birds, one stone” route for clicking on and subsequently cleaving off protein modifications. We initially tested this possibility by using CuAAC click chemistry to conjugate the alkyne group of SprC.[3a] As expected, EGFP(182SprC) was efficiently ligated to various azide derivatives, such as PEG-N3, biotin-N3, and azide agarose (Fig. S7, S8, S9a).[14] However, the ligation product, a triazole-linked Cys, could not be efficiently cleaved by Pd(TPPTS)4 (Fig. S9b).

We then considered Sonogashira coupling because it retains the alkyne group after coupling,[15] and we hypothesized that an internal alkyne would still allow Pd-mediated cleavage, as the propargyl group does. Since Sonogashira coupling uses Pd2+ and the reaction time is short,[3b, 4b] we reasoned that using two different Pd catalysts would allow the coupling and cleavage reactions to be controlled sequentially (Fig. 4a). To test this idea, we synthesized biotin-iodobenzene and successfully coupled it to EGFP(182SprC) using Pd(NO3)2 as the catalyst, as demonstrated by Western blot analysis of the reaction product using a biotin-recognizing avidin HRP conjugate (Fig. 4b). The coupling product was also analyzed by ESI-MS, which confirmed the reaction of biotin-iodobenzene with SprC (Fig. 4d). The disappearance of peaks for EGFP(182SprC) suggests that the Sonogashira coupling occurred efficiently (Fig. 4c and 4d). We subsequently treated the coupled product with Pd(TPPTS)4. Western blot analysis showed that biotin was efficiently removed (Fig. 4e); ESI-MS analysis of the reaction indicates that the cleavage generated Cys with high efficiency, as the coupled product could no longer be detected after cleavage (Fig. 4f). Together these results indicate that SprC could be used for facile installation of modifications onto proteins and their subsequent removal under mild conditions compatible with proteins.

Figure 4. Click on and off protein modifications using SprC.

a) SprC in EGFP enables Sonogashira coupling with a biotin label mediated by Pd2+, followed by traceless cleavage using Pd(TPPTS)4. b) SDS-PAGE and Western blot analyses of Sonogashira coupling. EGFP(182SprC) was coupled with biotin-iodobenzene and Pd(NO3)2 for 1 h at RT. After coupling, centrifugation had no effect on samples, suggesting proteins were stable. c) Mass spectrum of EGFP(182SprC). EGFP(182SprC): expected 27911 Da; EGFP(182SprC) without initiator Met: expected 27779 Da. d) Mass spectrum of biotin-iodobenzene labelled EGFP(182SprC). EGFP(182SprC)+label: expected 28387 Da; EGFP(182SprC) without initiator Met+label: expected 28256 Da. e) Biotin-iodobenzene labelled EGFP(182SprC) was treated with Pd(TPPTS)4 for cleavage, and then analyzed with SDS-PAGE and Western blot. f) Mass spectrum of biotin-iodobenzene labelled EGFP(182SprC) followed by Pd(TPPTS)4 treatment. EGFP(182Cys): expected 27873 Da; EGFP(182Cys) without initiator Met: expected 27741 Da.
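The expected masses in this caption can be read as a small mass-shift ledger for the click-on and click-off steps. The sketch below is illustrative only and uses nothing beyond the numbers quoted in the caption; because the structure of the biotin-iodobenzene label is not given here, the label contribution is derived from the quoted values rather than from a formula. The dictionary keys and the helper name are hypothetical.

```python
# Expected masses in Da, copied from the Figure 4 caption.
# Each entry: (with initiator Met, without initiator Met).
masses = {
    "EGFP(182SprC)": (27911, 27779),
    "EGFP(182SprC)+label": (28387, 28256),
    "EGFP(182Cys)": (27873, 27741),
}

def shift(after, before):
    """Mass difference between two species as a (with Met, without Met) tuple."""
    return tuple(a - b for a, b in zip(masses[after], masses[before]))

coupling = shift("EGFP(182SprC)+label", "EGFP(182SprC)")  # gain on Sonogashira coupling
cleavage = shift("EGFP(182Cys)", "EGFP(182SprC)+label")   # change on Pd(TPPTS)4 cleavage
net = shift("EGFP(182Cys)", "EGFP(182SprC)")              # overall SprC -> Cys

print("coupling:", coupling)
print("cleavage:", cleavage)
print("net:", net)
```

The coupling step adds about 476 to 477 Da, the cleavage step removes about 514 to 515 Da, and the net change of 38 Da equals the loss of the propargyl group alone. That balance is what a traceless removal predicts: nothing of the label remains on the regenerated Cys. The 1 Da spread between the two forms reflects rounding in the quoted values.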

In summary, we genetically encoded SprC into proteins and demonstrated the use of its two bioorthogonal functionalities for building and breaking bonds under mild conditions compatible with proteins. In addition to the protein surface, the compact structure of SprC enables its incorporation into the protein interior. We demonstrated its use at protein interfaces for crosslinking proteins through the thiol-yne click reaction, and at the active site of a cysteine protease for enzyme activation through Pd-mediated conversion of SprC into Cys. For many biophysical studies, such as EPR and FRET, it is beneficial to keep the respective spin and fluorescent labels as close as possible to the protein backbone for accurate measurements. SprC will be advantageous for such labeling. In addition, using SprC to mask a critical Cys will ensure that the target protein starts from a completely inactive state, as no Cys will be misincorporated by the tRNAPyl/SprCRS pair, and is then activated by the Pd catalyst, affording an approach to turn on enzymes chemically. Deprotection of chemically caged Lys has been demonstrated in situ with cell-compatible Pd catalysts;[5] SprC deprotection by specific types of Pd in vivo will be explored. This chemical turn-on approach will be valuable for studying the sufficiency of a target protein in cells via gain-of-function, and can be generally applicable to proteins containing a critical Cys residue. Masking and liberating Cys chemically via SprC may also be used to label two Cys residues differently and to protect Cys and selenocysteine in peptide synthesis.[16] Moreover, by harnessing the alkyne and propargyl groups of SprC, we demonstrated the installation and subsequent removal of a modification on the same protein residue, via Sonogashira coupling and Pd-mediated deprotection, respectively. This reversible modification of proteins will be valuable for enrichment, purification, and analysis purposes.
The ability to reversibly regulate a protein-of-interest using chemicals in vivo in the future will allow facile study of protein function and chemically control related biological events. Lastly, SprC is commercially available, generally nontoxic to cells, and has high stability and bioorthogonality in proteins, which will facilitate ready adoption for other applications.

Supplementary Material

Supporting Information

Acknowledgements

S.R. acknowledges the support of NSF (MCB-1616178) and NIH (5 P30 GM110758–02, 5 P20 GM104316 and R01GM121607A); L.W. acknowledges the support of the NIH (R01GM118384, RF1MH114079).

References
