Chloroplast glutamyl peptidase has physiological exo- and endo-glutamyl peptidase activity, and autocatalytic removal of its C-terminal prosequence raises the upper substrate size limit.
Abstract
Chloroplast proteostasis is governed by a network of peptidases. As a part of this network, we show that Arabidopsis (Arabidopsis thaliana) chloroplast glutamyl peptidase (CGEP) is a homo-oligomeric stromal Ser-type (S9D) peptidase with both exo- and endo-peptidase activity. Arabidopsis CGEP null mutant alleles (cgep) had no visible phenotype but showed strong genetic interactions with stromal CLP protease system mutants, resulting in reduced growth. Loss of CGEP upregulated the chloroplast protein chaperone machinery and 70S ribosomal proteins, but other parts of the proteostasis network were unaffected. Both comparative proteomics and mRNA-based coexpression analyses strongly suggested that the function of CGEP is at least partly involved in starch metabolism regulation. Recombinant CGEP degraded peptides and proteins smaller than ∼25 kD. CGEP specifically cleaved substrates on the C-terminal side of Glu irrespective of neighboring residues, as shown using peptide libraries incubated with recombinant CGEP and mass spectrometry. CGEP was shown to undergo autocatalytic C-terminal cleavage at E946, removing 15 residues, both in vitro and in vivo. A conserved motif (A[S/T]GGG[N/G]PE946) immediately upstream of E946 was identified in dicotyledons, but not monocotyledons. Structural modeling suggested that C-terminal processing increases the upper substrate size limit by improving catalytic cavity access. In vivo complementation with catalytically inactive CGEP-S781R or a CGEP variant with an unprocessed C-terminus in a cgep clpr2-1 background was used to demonstrate the physiological importance of both CGEP peptidase activity and its autocatalytic processing. CGEP homologs of photosynthetic and nonphotosynthetic bacteria lack the C-terminal prosequence, suggesting it is a recent functional adaptation in plants.
Organellar peptidases (proteases) play important roles in proteome homeostasis (proteostasis; van Wijk, 2015). Arabidopsis (Arabidopsis thaliana) chloroplasts house ∼90 to 95 proteins directly involved with intrachloroplast protein maturation and proteolysis (Majsec et al., 2017). One of these is chloroplast glutamyl endopeptidase (CGEP; AT2G47390), a large (∼97 kD) Ser peptidase that was assigned in the database MEROPS (https://www.ebi.ac.uk/merops/) to the S9D subfamily within the SC clan of Ser proteases (Rawlings et al., 2014). We became interested in CGEP because maize (Zea mays) CGEP is highly enriched in bundle-sheath–strand chloroplasts as compared to mesophyll chloroplasts (Martin et al., 2007; Friso et al., 2010; Majeran et al., 2010). Furthermore, CGEP is the only known glutamyl-endopeptidase in plants and is evolutionarily unrelated to γ-glutamyl peptidases/transferases (MEROPS clan-family: PA-C26) involved in the glutathione cycle and glutathione conjugates (Bachhawat and Yadav, 2018; Joshi et al., 2019). The other known glutamyl peptidase, also exhibiting minor aspartyl endopeptidase activity, is the small (27 kD) GluC (also named “V8”) from Staphylococcus aureus (Drapeau et al., 1972; Houmard and Drapeau, 1972) in the PA-S1B family, which is unrelated to CGEP. There is no known V8/GluC homolog in Arabidopsis (or eukaryotes in general), which leaves CGEP as the only confirmed glutamyl endopeptidase in plants.
Chloroplast glutamyl peptidase activity was initially detected in stroma of spinach (Spinacia oleracea) chloroplasts and the native mass of the activity was between 350 and 380 kD; however, the protein was not identified (Laing and Christeller, 1997). The enriched stromal fraction was able to very slowly (>1–2 d) cleave insulin (10 kD) on the C-terminal side of Glu, but neither Rubisco, RNase, nor casein were cleaved. The enriched fraction efficiently cleaved the synthetic peptide carbobenzoxy-Leu-Leu-Glu-naphthylamide after Glu (Laing and Christeller, 1997). A glutamyl endopeptidase activity was also isolated from cucumber (Cucumis sativus) leaves, and partial amino acid sequencing of a 97-kD protein suggested that it was a homolog of Arabidopsis CGEP (Yamauchi et al., 2001). Incubation with a small set of synthetic substrates showed preferential cleavage C-terminal of Glu. Finally, a chloroplast stromal fraction from pea (Pisum sativum) was reported to cleave the recombinant N terminus of LHCII after Glu residues and CGEP was identified as the peptidase (Forsberg et al., 2005). However, the CGEP-enriched stromal fraction could not degrade native LHCII. Collectively, these three studies suggest there is an active stromal glutamyl peptidase in chloroplasts, warranting further characterization. So far, no loss-of-function mutants for CGEP have been described and it is not known if CGEP has any genetic interactions with other chloroplast proteases.
This study provides a comprehensive functional analysis of Arabidopsis chloroplast CGEP addressing both in vitro activities and in vivo physiological relevance. Comparative proteomics and genetic interactions show that CGEP is part of the chloroplast proteostasis network. Both proteomics and coexpression analyses suggest that CGEP is directly or indirectly involved in the regulation of starch metabolism. Tandem mass spectrometry (MS/MS)–based analysis of peptidase activity of recombinant CGEP (rCGEP) incubated with large peptide libraries (using the proteomic identification of protease cleavage sites [PICS] technique; Schilling et al., 2011; Biniossek et al., 2016) shows that CGEP specifically cleaves after Glu residues without any observable effects of neighboring residues. CGEP can digest peptides and small proteins, but not larger proteins. Surprisingly, Arabidopsis CGEP autocatalytically cleaves 15 residues from its C terminus both in vitro and in vivo, and based on homology structural modeling and experimentation, this cleavage enhances substrate entry into the catalytic pocket. In vivo complementation demonstrates that both CGEP autocatalytic C-terminal processing and CGEP peptidase activity play functional roles in the chloroplast proteostasis network.
RESULTS
CGEP Is a Highly Conserved Protein in Plants
Phylogeny of 41 CGEP homologs (protease family S9D) across the species tree-of-life showed two major clades: one with photosynthetic eukaryotes, and one with cyanobacteria, proteobacteria (α, β, and γ), and flavobacteria (within the FCB group; Fig. 1A; Supplemental Table S1; Supplemental Dataset S1). Conserved key features include the catalytic triad (Ser-781, Asp-855, and His-889; numbering for Arabidopsis CGEP) and the GGHSYGAF signature sequence for S9D peptidases (Supplemental Fig. S1). We did not identify CGEP homologs in the sequenced gymnosperms (Picea glauca, Picea abies, and Pinus taeda), the glaucophyte Cyanophora paradoxa, or archaea. Moreover, CGEP homologs were not observed in nonphotosynthetic eukaryotes (e.g. yeast [Saccharomyces cerevisiae], Neurospora spp., humans [Homo sapiens], Drosophila spp.). CGEP is encoded by a single gene in most photosynthetic species, including Arabidopsis, but some angiosperms have two homologs (e.g. cotton [Gossypium spp.] and rice [Oryza sativa]; Supplemental Table S1). Sequence alignment of angiosperm (monocot and dicot) CGEP homologs revealed high sequence identity for most of the protein, with the exception of the predicted chloroplast transit peptides (cTPs) and C-terminal region (Supplemental Fig. S1, A and B). Most of the angiosperm CGEP homologs had a predicted cTP, and three homologs (Egr2, Mtr2, and Osa2) lacked the C-terminal portion. The missing regions are likely due to incomplete genome sequencing and/or mistakes in the assembly. The phylogeny suggested that CGEP in photosynthetic eukaryotes originates from an early progenitor of plants and algae, but not directly from endosymbiosis with cyanobacteria.
CGEP Forms Homo-Oligomers in Chloroplast Stroma
CGEP was detected in previous proteomics studies of leaf and chloroplast samples in both maize and Arabidopsis (Huang et al., 2013), as can also be viewed through the Plant Proteome Database (PPDB) at http://ppdb.tc.cornell.edu/. Using a newly generated anti-CGEP polyclonal serum, SDS-PAGE and immuno-detection showed that Arabidopsis CGEP is ∼100 kD, localizes nearly exclusively to chloroplast stroma, and is not present in the chloroplast membrane fraction (Fig. 1B). The native size of CGEP in Arabidopsis was determined by size exclusion chromatography of stromal proteome, followed by SDS-PAGE and immuno-detection (Fig. 1C), as well as native PAGE (Fig. 1D). This showed that native Arabidopsis CGEP forms complexes, consistent with observations for the cucumber homolog (Yamauchi et al., 2001). To determine if CGEP forms stable interactions with other proteins, extensive affinity purifications with anti-CGEP serum using isolated stromal proteome samples from Arabidopsis were carried out (Supplemental Text S1). All of these purifications yielded strong enrichment of CGEP (based on sequence coverage and protein scores), but enrichment analysis did not identify obvious interacting proteins (see Supplemental Text S1). We therefore conclude that Arabidopsis chloroplast CGEP is a soluble stromal protein accumulating as homo-oligomers. Previously, we determined that the relative abundance of CGEP in rosette leaves was in the same range as the CLPR subunits, CLPB3 and cpHSP90 (Zybailov et al., 2008).
In Vivo CGEP Loss-of-Function Mutants and Genetic Interaction of CGEP with the Chloroplast CLP Protease
To determine the in vivo CGEP function, we identified three T-DNA Arabidopsis mutants, namely cgep-1 (SAIL_574_D03), cgep-2 (SALK_066117), and cgep-3 (SAIL_589_G08), from the Arabidopsis Biological Resource Center (Fig. 2, A and B). Reverse transcription PCR (RT-PCR) analysis showed that expression of the CGEP gene was undetectable in all three mutants (Fig. 2C) and, consistently, immunoblotting of total leaf extracts showed a complete loss of CGEP protein accumulation (Fig. 2D). Thus, cgep-1, cgep-2, and cgep-3 were considered null mutants for CGEP. None of these lines showed obvious visible phenotypes as compared to wild type when grown under standard growth-chamber conditions (Fig. 2B) or after high-light or drought-stress conditions (Supplemental Fig. S2, A and B). To further probe for a physiological role of CGEP in growth and development, we crossed the cgep-2 allele with partial loss-of-function mutants in the essential CLP protease system. cgep-2 was crossed with clpr2-1, a partial loss-of-function CLPR2 T-DNA mutant (Rudella et al., 2006), and the double mutant clpt1-2 clpt2-1 (Kim et al., 2015). Homozygous progeny of these crosses was identified in the F2 populations (Fig. 2E; Supplemental Fig. S3). The visible growth phenotypes indicated strong genetic interactions between CGEP and clpr2-1 (Fig. 2E) as well as the clpt1 clpt2 double mutant (Supplemental Fig. S3). Immunoblotting confirmed the complete lack of CGEP in all plants with the cgep-2 allele (Fig. 2F). The rosette diameters and fresh weights of rosette leaves of cgep-2 clpr2-1 and cgep-2 clpt1-1 clpt2-1 plants were significantly smaller than clpr2-1 and clpt1-1 clpt2-1 plants, respectively (Fig. 2G). By contrast, homozygous progeny (F2) of crosses of cgep-2 with var2-1, a partial loss-of-function allele of thylakoid FTSH2 (Chen et al., 2000), did not show an obvious synergistic visible growth or developmental phenotype (Supplemental Fig. S3). 
These results show that CGEP plays a role in the chloroplast proteostasis network, as revealed by its genetic interactions with the CLP protease system.
Proteome Phenotype of cgep-2
To determine if other chloroplast proteases were up- or downregulated in cgep, wild type and cgep were probed by immunoblotting for the abundance of stromal DEG2 (Nishimura et al., 2016) as well as thylakoid-localized FTSH2 and FTSH5 (Nishimura et al., 2016) and stress-responsive SPPA (Lensch et al., 2001; Wetzel et al., 2009). Accumulation levels of these proteases were unchanged in cgep as compared to wild-type plants (Fig. 2H). To more comprehensively understand the physiological response to the loss of CGEP, we compared the isolated chloroplast stromal proteomes of wild type and cgep in three biological replicates using SDS-PAGE followed by in-gel tryptic digests and identification and quantification by MS/MS (Supplemental Fig. S4). We previously successfully applied this workflow (Friso et al., 2011) to characterize other chloroplast protease mutants (Kim et al., 2013, 2015; Nishimura et al., 2013, 2015). Proteins were assigned to subcellular locations and functions based on updated curated information in the PPDB. The proteomics experiment identified 591 proteins of which 553 were localized to plastids (Supplemental Dataset S2). To obtain a general view of the physiological proteome phenotype, we calculated protein investments for key chloroplast functions across three main categories: carbon metabolism; chloroplast biogenesis and proteostasis; and other metabolic pathways (Fig. 3; Supplemental Fig. S4). This demonstrated statistically significant altered investments in protein (un)folding (up 25%), plastid 70S ribosomes (up 53%), starch synthesis (down 39%), and glycolysis (down 32%). The overall investments in other chloroplast metabolic pathways (e.g. Calvin–Benson cycle, N- and S- metabolism, amino acid metabolism) were not significantly affected (Fig. 3A; Supplemental Fig. S4). 
At the individual protein level, 12 proteins, all plastid-localized, showed significant (false discovery rate < 5% and P < 0.01) differential accumulation between wild type and cgep (Fig. 3B). One of these was CGEP, identified with 124 matched MS/MS spectra across the three wild-type samples but never in cgep-2. Strikingly, five of these proteins are directly or indirectly involved in starch metabolism, as will be discussed below. The others are involved in the Calvin–Benson Cycle and photorespiration (SBPase, SFBA-1, and PGP1/2), fatty acid metabolism (acetyl-CoA synthetase), amino acid metabolism (ABERRANT GROWTH AND DEATH 1 and 2), and a protein associated with abiotic stress (UOS1-like [similar to pea UV-B-and-ozone similarly regulated protein]; for full protein names, see legend to Fig. 3B).
Although we detected all major players in the stromal (un)folding machinery including CLPB3, HSP90, cpHSP70-1/2, the group of CPN10/20/60 proteins, and the HSP70 nucleotide exchange factors GRPE-1/2 (Supplemental Dataset S2), the increased protein investment in the (un)folding machinery was mostly due to increased amounts of the abundant CPN60 chaperones and peptidyl-prolyl isomerase ROC4. Within the proteolysis function, we identified 26 stromal proteases and protease chaperones (Supplemental Dataset S2), including the complete CLPPRT core complex (11 proteins) and all three CLP chaperones (C1, C2, and D), dual targeted organellar oligo peptidase (OOP), PREP1, and PREP2, stromal processing peptidase (SPP), DEG2, and several other less studied proteases. None of these proteases showed significant differential accumulation between wild type and cgep, with the exception of CGEP. Furthermore, the abundance ratio for cgep/wild type of the complete stromal CLPPRTCD system (14 proteins) was 1.05, indicating that the CLP system was not up- or downregulated in response to loss of CGEP. The 53% increase in 70S ribosomes was due to an increase in both the 30S (41% up) and 50S (81% up) particles. Other functions within biogenesis and proteostasis, including proteins involved in RNA metabolism or translation, were not significantly affected (Fig. 3A).
Within plastid glycolysis, we detected seven stromal proteins, which were all lower in cgep than in wild type (Supplemental Dataset S2), explaining the significant decrease of roughly 30% in this function. The two most abundant enzymes were the well-characterized Glc-6-P isomerase and plastid phosphoglucomutase (starch free mutant1), driving the conversion of the Calvin cycle intermediate Fru-6-P to Glc-1-P immediately upstream of starch biosynthesis. The decrease of plastid phosphoglucomutase in cgep was statistically significant (Fig. 3B). The proteome analysis identified 21 proteins involved in starch metabolism, of which 11 function in starch synthesis and 10 function in starch degradation (Supplemental Fig. S4D; Supplemental Dataset S2). The starch metabolic enzymes were carefully annotated based on the most recent reviews and experimental studies (Goren et al., 2018; Abt et al., 2020; Abt and Zeeman, 2020; Smith and Zeeman, 2020) and The Arabidopsis Information Resource (TAIR10). Except for isoamylase2 (DBE1), all enzymes involved in starch synthesis were lower in cgep-2, with the reduction in inorganic pyrophosphatase6 and starch branching enzyme class II-3 being statistically significant. Within starch degradation, beta amylase2 and disproportionating enzyme1 were both significantly reduced in cgep-2 (Fig. 3B).
mRNA-Based Coexpression Analysis
In our recent coexpression peptidase network (Majsec et al., 2017), CGEP forms a tight coexpression module (Module VI) with two other peptidases, namely dual-localized plastid/mitochondria OOP and stromal Met aminopeptidase 1B. This module contains 70 nonredundant genes making 81 edges and a relative enrichment in starch metabolism (7% compared with 0.8% for the whole network). However, this was a so-called “forced network” using only the plastid and mitochondrial peptidases and their auxiliary proteins as bait and limiting the number of coexpressors to the top 20. Therefore, we extracted the top-100 (based on mutual-rank [MR] values) coexpressed genes for CGEP from the coexpression database ATTEDII (http://atted.jp/) and evaluated enrichment for subcellular location and functions (Supplemental Dataset S3). This showed that proteins involved in starch metabolism, glycolysis, and the tricarboxylic acid cycle were significantly (hypergeometric test) overrepresented (30%, 11%, and 12%, respectively) when weighing for functional bin size (Supplemental Dataset S3). To better put these data in perspective, we also evaluated the top-100 coexpressors for several other stromal peptidases, namely CLPP5, CLPR1, and DEGP2, as well as the dual-targeted chloroplast/mitochondrial OOP, PREP1, and PREP2 (Supplemental Dataset S3). Also, OOP showed enrichment for starch metabolism (26% of weighted bin distribution) but not glycolysis or the tricarboxylic acid cycle. The coexpression of CGEP with starch metabolism was even more pronounced for the top-50 and top-20 coexpressing genes, increasing enrichment to 52% (top 50) and 63% (top 20), whereas enrichment decreased for OOP to 16% (top 50) and 13% (top 20). A closer look at the CGEP coexpressors shows that the coexpression of CGEP with starch degradation extends into the cytosolic conversion of maltose to Suc, as evidenced by the coexpression of disproportionating enzyme2, heteroglycan phosphorylase2, and Suc P synthase 1F. 
The coexpression pattern strongly suggests that the function of CGEP is at least in part associated with starch metabolism (see “Discussion”).
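As an illustration of the kind of hypergeometric overrepresentation test mentioned above, the following minimal sketch computes an upper-tail probability with the Python standard library. The function name and all numbers are hypothetical and chosen only for illustration; they are not taken from Supplemental Dataset S3, and the weighting for functional bin size used in the study is not reproduced here.

```python
from math import comb

def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Hypothetical numbers (not from the study): a background of 20,000 genes,
# 50 of which belong to one functional bin, and a top-100 coexpressor list
# that contains 6 genes from that bin.
p_value = hypergeom_upper_tail(population=20000, successes=50, draws=100, observed=6)
print(f"enrichment P = {p_value:.3g}")
```

With only 0.25 genes from this bin expected by chance in a list of 100, observing six gives a very small P value; a real analysis would also correct for testing many functional bins.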
CGEP Is an Active Peptidase that Can Degrade Small Proteins and Peptides
To probe the catalytic activity of CGEP, CGEP was expressed as an N-terminal GST-fusion protein in Escherichia coli and then purified over glutathione-agarose beads (rCGEP). As a negative control for CGEP activity, we generated the catalytic-site mutant S781R (rCGEP-S781R). Recombinant proteins were incubated with bovine serum albumin (BSA; 67 kD), β-casein (25 kD), and insulin (10 kD) as substrates. After 6-h incubation, rCGEP partially degraded β-casein (Fig. 4A) and completely degraded insulin (Fig. 4B), but rCGEP-S781R could not degrade either substrate (Fig. 4, A and B). Furthermore, BSA could not be degraded by active rCGEP (Fig. 4C). A recombinant Arabidopsis truncated CGEP consisting of just the C-terminal portion (773–961) including the active site residues (S781, D855, and H889) could not degrade insulin, indicating that the intact protein is required for proteolytic activity (Fig. 4D).
Autocatalytic C-Terminal Cleavage of CGEP In Vivo and In Vitro
We noticed that the proteolytically inactive rCGEP-S781R migrated at a slightly higher mass on the SDS-PAGE gels than rCGEP (Fig. 5A). Comparing the peptide sequence coverage of both active and inactive rCGEP from in-gel trypsin digests and MS/MS analysis, we found a clear difference at the C-terminal portion of CGEP. The most C-terminal–detected tryptic peptide in rCGEP-S781R was K↓V936STGTGGGNPEFGEHEVHSK954 (the arrow indicates the tryptic cleavage site) whereas the most C-terminal peptide in rCGEP was the semitryptic peptide K↓E928GSDADKVSTGTGGGNPE946 (Fig. 5B; Supplemental Table S2). Subsequent inspection of CGEP sequence coverage from in vivo Arabidopsis samples across many previous experiments on a range of wild-type Arabidopsis leaf and chloroplast samples (e.g. Zybailov et al., 2008; Olinares et al., 2010), as viewed through the PPDB, showed that the most C-terminal coverage observed in vivo came from the semitryptic peptide K↓E928GSDADKVSTGTGGGNPE946, the same as the most C-terminal peptide observed for rCGEP (Fig. 5B). This was confirmed with Arabidopsis in vivo samples by MS/MS analysis of CGEP enriched by immunoprecipitation (using CGEP serum) from the chloroplast stromal proteome of wild-type plants (Fig. 5C; Supplemental Fig. S4, A–C; Supplemental Table S3). Inspection of the sequence alignment (Supplemental Fig. S1B) and sequence logo of CGEP homologs in dicotyledons showed a strong conservation of the residues A[S/T]GGGXPE946 (Fig. 5D). The in vitro and in vivo data, together with this sequence conservation, strongly suggest autocatalytic processing C-terminal of E946, hence removing 15 residues (FGEHEVHSKLRRSLL) from the C terminus.
We observed a C-terminal Trp (W907 in Arabidopsis CGEP) conserved across all photosynthetic CGEP homologs (Supplemental Fig. S1, B–D), thus providing a good reference for comparing the C-terminal extension across CGEP homologs (see Supplemental Table S1). Using this reference, the C-terminal extension in Arabidopsis is 54 amino acids long (residues 908–961), with the autocatalytic processing trimming this by 15 residues to 39 residues. The monocots also have C-terminal extensions, but these have diverged from the extension in the dicotyledons (Supplemental Fig. S1B). Green algae generally have shorter C-terminal extensions of between 10 and 30 residues, whereas moss and lycopod have 30 and 36 residues, respectively (Supplemental Fig. S1D); very little conservation is found between them. Finally, cyanobacterial CGEP homologs have only a few residues (9–11 amino acids) beyond this conserved tryptophan and thus lack a comparable prosequence (Supplemental Fig. S1C). We discuss the position of the C-terminal extension further below (“Homology Model of the CGEP Monomer and C-Terminal Cleavage”).
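The extension lengths follow directly from the residue numbers given in the text (W907 as the reference, cleavage after E946, and the removed stretch FGEHEVHSKLRRSLL). The short check below re-derives them; the only assumption is that the precursor ends where the removed stretch ends.

```python
# Residue numbers as reported in the text for Arabidopsis CGEP.
CONSERVED_TRP = 907          # conserved Trp used as the reference point
CLEAVAGE_P1 = 946            # Glu after which autocatalytic cleavage occurs
REMOVED = "FGEHEVHSKLRRSLL"  # stretch released by autocleavage

# Assumption: the precursor ends where the removed stretch ends.
precursor_end = CLEAVAGE_P1 + len(REMOVED)

extension_before = precursor_end - CONSERVED_TRP  # residues beyond W907, unprocessed
extension_after = CLEAVAGE_P1 - CONSERVED_TRP     # residues beyond W907, processed

print(len(REMOVED), precursor_end, extension_before, extension_after)
# prints: 15 961 54 39
```

The derived C-terminal residue number, 961, agrees with the residue range used for the structural model of mature CGEP (residues 63–961).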
CGEP Cleaves C-Terminal of Glu, Independent of Neighboring Residues, as Demonstrated by PICS
Previous reports with a very limited set of synthetic peptides and cleavage sites in a few model substrates (Yamauchi et al., 2001; Forsberg et al., 2005), and our observed C-terminal autocleavage described above, indicate that CGEP can cleave the peptidyl bond immediately C-terminal of Glu residues. These data do not determine if Arabidopsis CGEP can also cleave after other residues or if there are other cleavage determinants, sometimes referred to as “subsite cooperativity” (Ng et al., 2009). We also wanted to better resolve the upper and lower substrate size limits. Therefore, we applied the PICS method or variations thereof (Schilling et al., 2011; Biniossek et al., 2016). For each experiment, we incubated rCGEP and rCGEP-S781R (as negative control) with very large (>100,000) peptide libraries and compared the resulting peptides by MS/MS analysis. Complementary peptide libraries were generated by digesting total soluble Arabidopsis leaf proteomes with the commercial peptidases trypsin (cleaving C-terminal of Lys/Arg), LysC (C-terminal of Lys), or GluC (C-terminal of Glu).
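To make the library design concrete, the sketch below performs a simplified in silico digestion using only the cleavage rules stated above (C-terminal of Lys/Arg for trypsin, Lys for LysC, Glu for GluC). The function name and the toy sequence are hypothetical, and real enzymes show additional behavior, such as missed cleavages, that is not modeled here.

```python
def digest(sequence, p1_residues):
    """Split a sequence after every residue listed in p1_residues."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in p1_residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides

# Cleavage rules as stated in the text (P1 residues only).
SPECIFICITY = {"trypsin": "KR", "LysC": "K", "GluC": "E"}

toy = "MAEKLSRGEWTK"  # hypothetical sequence, for illustration only
for name, residues in SPECIFICITY.items():
    print(name, digest(toy, residues))
# trypsin ['MAEK', 'LSR', 'GEWTK']
# LysC ['MAEK', 'LSRGEWTK']
# GluC ['MAE', 'KLSRGE', 'WTK']
```

In this toy case the trypsin and LysC peptides still contain internal Glu residues, which is why such libraries can report on cleavage after Glu by a test peptidase.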
In the first experiment outlined in Figure 6A, all three types of peptide libraries were first dimethylated to block and chemically mark all primary amino groups (N-termini and Lys side chains) and these libraries were then incubated with active rCGEP or catalytically inactive rCGEP-S781R. The newly rCGEP-generated amino termini, neo-N-termini, were reacted with a cleavable amine cross linker containing a biotin moiety. The biotin tag was then used for affinity enrichment of these neo-N-terminal peptides for subsequent nanoscale liquid chromatography (nano-LC)-MS/MS analysis. For each library, the number of matched MS/MS spectra was summed per unique peptide, and the list was collapsed to identify cleavage events found in the rCGEP-treated but not in the rCGEP-S781R–treated samples. We observed 324 peptides (2,157 MS/MS; trypsin library), 197 peptides (863 MS/MS; LysC library), and 65 peptides (215 MS/MS; GluC library) that were specific to rCGEP (absent in rCGEP-S781R) and observed by at least three MS/MS spectra (Supplemental Dataset S4). Of these specific peptides, 89% (trypsin library), 96% (LysC library), and 82% (GluC library) resulted from cleavage after Glu (Supplemental Dataset S4). Figure 6B visualizes this substrate cleavage preference of CGEP as determined from these digested libraries using sequence logos (https://weblogo.berkeley.edu/logo.cgi) or iceLogos (https://iomics.ugent.be/icelogoserver/) after weighing for the amino acid composition of the proteome. This demonstrates strong enrichment for Glu in the P1 position, reflecting the specificity of CGEP. There was very little sequence preference or avoidance up- or downstream of the P1 residue, except for a weak (10%) enrichment (P = 0.01) for Gly in the P1′ position (Fig. 6B).
The GluC library yielded the fewest CGEP cleavages, which is expected because GluC also cleaves C-terminal of Glu; however, the peptides in the GluC library did include many missed cleavages, hence allowing for additional cleavages by CGEP. The largest peptides specifically generated by rCGEP (absent in rCGEP-S781R) from the trypsin and LysC libraries were two different peptides of 29 residues in length (one observed 16 times with z = 3+, the other observed six times with z = 2+); both of these large peptides were generated by cleavage C-terminal of Glu (Fig. 6C).
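The P1 tally behind such percentages can be sketched as follows: for each neo-N-terminal peptide, the P1 residue is the residue immediately preceding it in the parent sequence. This is a minimal illustration with hypothetical toy sequences and a hypothetical helper name, not the analysis pipeline used in the study.

```python
from collections import Counter

def p1_residue(parent, neo_n_peptide):
    """Return the residue immediately N-terminal of the new N terminus (P1)."""
    start = parent.find(neo_n_peptide)  # first occurrence only
    if start <= 0:  # not found, or the peptide starts at the parent's own N terminus
        return None
    return parent[start - 1]

# Toy (hypothetical) parent peptides paired with neo-N-terminal products.
observations = [
    ("AKLEGSTVR", "GSTVR"),  # cleaved after E
    ("MTEAPLLK", "APLLK"),   # cleaved after E
    ("GDSLEVFK", "VFK"),     # cleaved after E
    ("TTDAGWR", "AGWR"),     # cleaved after D
]

counts = Counter(p1_residue(parent, product) for parent, product in observations)
fraction_glu = counts["E"] / sum(counts.values())
print(counts["E"], counts["D"], fraction_glu)
# prints: 3 1 0.75
```

Comparing such tallies between the active and the catalytically inactive enzyme, as done in the experiments above, is what separates enzyme-specific cleavages from background.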
In a second, complementary experiment summarized in Figure 7A, unmodified peptide libraries (trypsin and GluC) were incubated with CGEP or CGEP-S781R for digestion. Subsequently, CGEP- or CGEP-S781R–digested peptides were dimethylated with CH2O (“light” formaldehyde) or CD2O (“heavy” formaldehyde) followed by mixing in equal proportions to allow for direct comparison by nano-LC-MS/MS. We only considered peptides observed in both replicates (Supplemental Dataset S5). We observed 142 (579 MS/MS) and 79 (293 MS/MS) cleavage events that were specific to rCGEP (observed by at least two MS/MS spectra and absent in CGEP-S781R) in libraries made with trypsin and GluC, respectively. Figure 7B visualizes the substrate cleavage preference of CGEP as determined from these digested libraries using sequence logos or iceLogos after weighing for the amino acid composition of the proteome. Similar to what was indicated in the previous experiment (Fig. 6), this demonstrates the strong enrichment for Glu in the P1 position reflecting the specificity of CGEP. There was very little sequence preference up- or downstream of the P1 residue, showing that CGEP cleaves C-terminal of Glu irrespective of the surrounding residues (Fig. 7B). We then carefully evaluated to what extent the specific rCGEP-generated peptides resulted from N- or C-terminal exo-glutamyl peptidase or endo-glutamyl peptidase activity. Figure 7C shows three examples for each of these activities. We note that exo-peptidases are defined as peptidases that can cleave peptidyl bonds one, two, or three residues away from the N- or C terminus, thus releasing one amino acid, di-, or tripeptides, respectively. We observed examples for all N- and C-terminal exo-glutamyl peptidase activity as illustrated in Figure 7C. Most of the rCGEP activity resulted from endo-peptidase activity, with three examples shown in Figure 7C. 
The largest peptide specifically generated by rCGEP (absent in rCGEP-S781R) from the trypsin library had 43 residues and was generated from a 49-residue, full tryptic peptide, as specifically observed in rCGEP-S781R (Fig. 7C).
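The exo- versus endo-peptidase definition used above (a cleavage one, two, or three residues from either terminus counts as exo-peptidase activity) can be written as a small classifier. This is a sketch with hypothetical names; the last example assumes that the 43-residue product arose from a single cut in the 49-residue tryptic peptide, which the text does not state explicitly.

```python
def classify_cleavage(substrate_length, n_side_length, max_exo=3):
    """Classify one cleavage by the length of the fragment on each side.

    n_side_length is the number of residues N-terminal of the scissile bond.
    """
    if not 0 < n_side_length < substrate_length:
        raise ValueError("the scissile bond must lie inside the substrate")
    c_side_length = substrate_length - n_side_length
    if n_side_length <= max_exo:
        return "N-terminal exo"
    if c_side_length <= max_exo:
        return "C-terminal exo"
    return "endo"

print(classify_cleavage(12, 2))   # dipeptide released from the N terminus
print(classify_cleavage(12, 11))  # single residue released from the C terminus
print(classify_cleavage(49, 6))   # 6 + 43 residues, assuming a single cut
# prints: N-terminal exo, C-terminal exo, endo (one per line)
```

For very short substrates a bond can lie within three residues of both termini; the function then reports the N-terminal label first, which is an arbitrary choice of this sketch.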
CGEP without C-Terminal Autocleavage Is Proteolytically Active but Limited in Substrate Size
We next asked if C-terminal autocleavage of CGEP is required for its proteolytic activity in vitro or in vivo. As described above (“Autocatalytic C-Terminal Cleavage of CGEP In Vivo and In Vitro” section), in vitro and in vivo analysis identified residue E946 as the most C-terminal residue in processed CGEP (Fig. 5, B and C). We generated a CGEP construct, named CGEP-E946A-E949A-E951A (CGEP-C2), in which C-terminal Glu residues E946, E949, and E951 were mutated into Ala residues. In addition, we also created a CGEP construct in which E926 and D931 were changed into Ala residues (CGEP-C1). Together, these two constructs targeted acidic residues in the C-terminal region (Fig. 8A). For in vitro testing, CGEP-C1 and CGEP-C2 were produced with an N-terminal GST fusion for affinity purification. These GST-CGEP fusion proteins were overexpressed in E. coli, affinity-purified, and separated by SDS-PAGE along with GST-CGEP and GST-CGEP-S781R, followed by Coomassie staining or immunoblotting with anti-CGEP serum (Fig. 8B). This showed that GST-CGEP and GST-CGEP-C1 have a lower molecular mass than GST-CGEP-S781R and GST-CGEP-C2, indicative of the lack of C-terminal processing in the latter two proteins. Indeed, MS/MS analysis confirmed the cleavage after E946 in CGEP-C1 (as in CGEP) but not in CGEP-C2 (as in CGEP-S781R; Supplemental Table S2). Previously, we showed that rCGEP can completely degrade insulin (10 kD) and can also degrade β-casein (25 kD), albeit with lower efficiency (Fig. 4, A and B). To determine protease activity of CGEP with an unprocessed C terminus, rGST-CGEP-C2 was incubated for 6 h at 37°C with insulin or casein. This showed that GST-CGEP-C2 completely degrades insulin but not casein (Fig. 8C), suggesting that the unprocessed C terminus limited access to the active site.
Homology Model of the CGEP Monomer and C-Terminal Cleavage
To better understand CGEP, the possible significance of the C-terminal autocleavage, as well as substrate interactions and size limitations, a 3D structural model was constructed based on the mature CGEP (residues 63–961; predicted cTP removed). The best scoring threading template was the crystal structure of the S9B dipeptidyl aminopeptidase IV in the Gram-negative bacterium Stenotrophomonas maltophilia (PDB:2ECF; Nakajima et al., 2008). A sequence alignment with the predicted secondary structures of this bacterial protein and CGEP from Arabidopsis, Brassica rapa, and Populus trichocarpa is shown in Supplemental Figure S5. The 3D homology CGEP model shows a typical β-propeller domain (upper) and α/β-hydrolase domain (lower) containing the catalytic triad of S-D-H residues (Fig. 9A). The active site is partially accessible through a shallow cavity that is ∼15 Å wide and at most 10 Å high, visible from the front view. A very narrow cavity (<4 Å) can also be seen from the top view, extending through the center of the β-propeller domain (Supplemental Fig. S6A); this feature was also noted in other S9 peptidases and is thought to be too narrow for a substrate to pass through without substantial rearrangement of tertiary structure (Tsirigotaki et al., 2017). The structures of CGEP (rainbow colors) and Dipeptidyl Aminopeptidase IV (gray) were overlaid, showing close alignment of α-helices and β-sheets throughout, but little to no alignment around the mouth of the central cavity at the N- and C-terminal regions (Fig. 9B). The reliability confidence score (C-score) of −2.29 for the Arabidopsis model is relatively low, largely due to uncertainty for the N-terminal region. Therefore, we generated a second model (based on the same structure) for an N-terminally truncated CGEP (residues 387–961) that provided a better C-score of −0.93 (Fig. 9, C and D; Supplemental Fig. S6, B and C). 
This truncated CGEP model showed that the C-terminal stretch of 15 residues, which is autocatalytically removed, is in close proximity to the active site, and that its removal may increase accessibility to the active site (Fig. 9E; see “Discussion”).
C-Terminal Autocatalytic Cleavage In Vivo
To verify autocleavage in the C terminus in vivo, we generated four 35S-driven C-terminally tagged CGEP variants and transformed these transgenes into cgep-2. These transgenes were CGEP-STREPII (wild-type–like), CGEP-S781R-STREPII (catalytically inactive), CGEP-C1-STREPII (in which the Glu residue E928 and the Asp residue D931 in the C-terminal region of CGEP were changed into Ala residues), and CGEP-C2-STREPII (in which the Glu residues E946, E949, and E951 were changed into Ala residues; Supplemental Fig. S7). The C-terminal STREPII tag was included for affinity purification and to rapidly determine if the C terminus was cleaved in vivo using immunoblotting. Transformants were identified on medium with glufosinate (BASTA) and by PCR-based genotyping, and RT-PCR confirmed expression of the transgenes (Fig. 10A; Supplemental Fig. S7, B and C). Immunoblotting of stromal proteomes of the confirmed transformants using anti-CGEP– and anti-STREPII–specific antisera showed CGEP in wild type and the two transgenic lines cgep-2/CGEP-STREPII and cgep-2/CGEP-S781R-STREPII, but not in cgep-2, as expected (Fig. 10B; Supplemental Fig. S7D). However, no signals were observed with anti-STREPII serum in wild type, cgep-2, and cgep-2/CGEP-STREPII, whereas a clear signal was observed in cgep-2/CGEP-S781R-STREPII (Fig. 10B; Supplemental Fig. S7D). This is fully consistent with the in vitro autocatalytic C-terminal cleavage by active CGEP, which thus removes the STREPII tag, whereas this tag was not removed by the inactive CGEP-S781R-STREPII peptidase. Protein analysis of leaves from cgep-2/CGEP-C1-STREPII showed that autocatalytic cleavage does not depend on E928 or D931: the STREPII affinity tag was not detectable by immunoblotting, indicating that the tag was still removed in this line (Fig. 10B). By contrast, autocatalytic cleavage was blocked in cgep-2/CGEP-C2-STREPII, as evidenced by the detection of the C-terminal STREPII tag in this line (Fig. 10B; Supplemental Fig. S7E).
Native gel analysis with the various C-terminal mutant lines showed that the C-terminal autocatalytic cleavage is not required for dimerization (Supplemental Fig. S7F). Careful comparative analysis of the growth phenotypes of cgep-2 and the four transgenic lines showed that cgep-2/CGEP-S781R-STREPII and cgep-2/CGEP-C2-STREPII had significantly reduced growth, whereas cgep-2/CGEP-STREPII and cgep-2/CGEP-C1-STREPII were not significantly different from cgep-2 (Fig. 10C). This showed that overexpression of inactive CGEP, or of active CGEP with an unprocessed C terminus, negatively impacts growth.
The Physiological Significance of CGEP and Its C-Terminal Autocleavage Determined by In Vivo Complementation in the cgep clpr2-1 Background
None of the three cgep null mutant alleles have an obvious growth or developmental phenotype under optimal or under various abiotic stress conditions, as was shown above (Fig. 2B; Supplemental Fig. S2). However, cgep alleles do create a phenotype in the clpr2-1 background (Fig. 2, E and G), thus providing an opportunity to further test the physiological significance of CGEP variants.
We therefore crossed the cgep-2/CGEP-STREPII, cgep-2/CGEP-C1-STREPII, and cgep-2/CGEP-C2-STREPII lines with clpr2-1. Homozygous progenies were identified for each of these crosses in the F2/F3 populations and confirmed by immunoblotting with anti-CGEP and anti-STREPII sera (Fig. 11, A and B). These progenies were phenotyped along with clpr2-1 for growth and development (Fig. 11; Supplemental Fig. S8). Importantly, this showed that cgep-2 clpr2/CGEP-STREPII and cgep-2 clpr2/CGEP-C1-STREPII were indistinguishable from the clpr2-1 parent, indicating that expression of catalytically active CGEP capable of autocatalytic C-terminal processing fully complemented the double mutant. However, only partial complementation was obtained for cgep-2 clpr2/CGEP-C2-STREPII (no autocatalytic cleavage) and cgep-2 clpr2/CGEP-S781R-STREPII (no catalytic activity) plants; these plants were smaller or developmentally delayed compared to clpr2-1 plants (Fig. 11, A and C; Supplemental Fig. S9 for additional plant images at an earlier developmental stage). This showed that CGEP does contribute to the chloroplast proteostasis network and that C-terminal autocatalytic processing is physiologically important.
Searching for CGEP Protein Interactors and Possible Trapped Substrates
In an extensive effort to identify CGEP interactors and potential (trapped) substrates, we compared the in vivo protein interactomes of CGEP-STREPII, CGEP-S781R-STREPII, and CGEP-C2-STREPII, either by coimmunoprecipitation with CGEP antiserum or by using the STREPII tag for affinity purification (Supplemental Fig. S10; Supplemental Text S1). Whereas excellent recovery of CGEP was obtained in these experiments, no obvious interactors emerged, suggesting that interactions between substrates and CGEP are weak or that substrates are too small to be identified. Supplemental Text S1 summarizes these experiments.
DISCUSSION
S9D proteases are unique to bacteria, including cyanobacteria, and plastids in photosynthetic eukaryotes. The S9 family of α/β-hydrolases (part of Clan SC) has four subfamilies of Ser-type peptidases: S9A, with cytosolic prolyl oligopeptidases from bacteria, archaea, and eukaryotes; S9B, with acylaminoacyl peptidases only found in eukaryotes; S9C, with membrane-bound dipeptidyl-peptidase IV in both bacteria and eukaryotes; and S9D, to which chloroplast CGEP was assigned (Rawlings et al., 2014). As our phylogenetic analysis also illustrates, S9D peptidases are found in bacteria and in photosynthetic eukaryotes (where they are most likely confined to plastids), but not in other eukaryotes. Plastid CGEP is clearly of bacterial origin, but the phylogenetic analysis does not support a direct endosymbiotic origin from cyanobacteria. S9 peptidases are generally believed to hydrolyze only relatively short peptide substrates, whereas large structured peptides and proteins are not usually cleaved (Rea and Fülöp, 2006). Crystal structures have been solved (although not for plant proteins) for S9A (PDB:1QFS and PDB:1QFM; Fülöp et al., 1998), S9B (PDB:2ECF; Nakajima et al., 2008), and S9C (PDB:5YZM, 6IGQ, and 6IGP; Yadav et al., 2019) members, but not for S9D members. S9 structures show an N-terminal, eight-bladed β-propeller and a C-terminal peptidase unit folded as an α/β/α sandwich that contains the catalytic triad. These two domains form bowl-like structures that together form a large central cavity, restricting access to the catalytic site (Rea and Fülöp, 2006). The catalytic residues are Ser-Asp-His, and conserved motifs around Ser define the A to D subfamilies (GGSXGG, GWSYGG, GGSYGG, and GGHSYGA for S9A to S9D, respectively).
Two other plant S9 peptidases, both in the S9A family, have been studied experimentally, namely the acyl-amino acid-releasing peptidase1 in Arabidopsis (AARE1, AT4G14570; Nakai et al., 2012) and a tyrosyl aminopeptidase in daikon radish (Raphanus sativus; the likely Arabidopsis homolog is AARE2, AT5G36210; Tsuji et al., 2011). Arabidopsis cytosolic Dipeptidyl peptidase IV-like (AT5G24260) is a member of the S9B family, but it has not been studied. There are, so far, no known S9C members in plants. Whereas CGEP is more closely related to AARE1 and AARE2 proteins than to any other Arabidopsis protein, CGEP, AARE1, and AARE2 are only distantly related, with sequence similarity only in the C-terminal peptidase domain. This study shows that CGEP is a stromal oligomeric protein; is physiologically important in stromal proteostasis and starch metabolism; has clear cleavage specificity; has a maximal substrate size limitation, which is influenced by a surprising autocatalytic C-terminal cleavage; and has specific genetic interactions with the stromal CLP protease system, but not with the abundant thylakoid FTSH protease.
CGEP Cleaves C-Terminal of Glu through Endo- and Exo-Peptidase Activity
The PICS analyses presented in this study using rCGEP and CGEP variants demonstrated that CGEP can cleave C-terminal of Glu residues with high fidelity and without any discernable impact of neighboring residues. Furthermore, CGEP efficiently degrades the 10-kD protein bovine insulin and partially degrades 25-kD bovine β-casein, but not larger proteins such as BSA (67 kD). This glutamyl peptidase activity is consistent with observed activity in enriched leaf extracts from spinach, pea, or cucumber leaves using the synthetic peptide carbobenzoxy-L-L-E-naphthylamide (Laing and Christeller, 1997; Yamauchi et al., 2001) or a soluble 7.5-kD recombinant peptide (N terminus of LHCII-1) with multiple Glu residues (Forsberg et al., 2005). This study showed that CGEP has both exo- and endo glutamyl-peptidase activity, i.e. it can cleave C-terminal of Glu residues within any of the three residues of the N- or C-termini (exo-peptidase), as well as within substrates further away from the termini (endo-peptidase). Most peptidases in biology have either exo- or endo-peptidase activity; however, there are exceptions such as human metallo-peptidase angiotensin I-converting enzyme (MA-E M2 family; Naqvi et al., 2005), vertebrate Cys peptidase cathepsin B (Krupa et al., 2002), and plant cathepsin B-like peptidases (Tsuji et al., 2008; Porodko et al., 2018). Cathepsin B homologs have C-terminal dipeptidase activity (releasing dipeptides from the C terminus of substrates) as well as endopeptidase activity; this dual activity is regulated through an occluding loop. Our analysis showed that CGEP can digest proteins by cleavage of the peptide bond immediately C-terminal of Glu, irrespective of the position of the Glu and any neighboring residues, as long as proteins are relatively small.
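The cleavage rule described above can be illustrated with a minimal Python sketch. It assumes that "exo" means the target Glu lies within the first three or last three residues of the substrate, which is one reading of the definition given here; the function names and the example peptide are hypothetical and are not taken from the PICS data.

```python
def glu_cleavage_sites(seq, exo_window=3):
    """List peptide bonds C-terminal of Glu as (1-based Glu position, kind).

    kind is "exo" when the Glu is among the first or last `exo_window`
    residues (an assumed reading of the text), otherwise "endo".
    """
    n = len(seq)
    sites = []
    for i, aa in enumerate(seq):
        if aa != "E" or i == n - 1:
            continue  # a C-terminal Glu has no peptide bond after it
        pos = i + 1  # 1-based position of the Glu
        near_terminus = pos <= exo_window or pos > n - exo_window
        sites.append((pos, "exo" if near_terminus else "endo"))
    return sites


def digest(seq):
    """Return the fragments produced by cleaving after every Glu."""
    fragments, start = [], 0
    for pos, _kind in glu_cleavage_sites(seq):
        fragments.append(seq[start:pos])
        start = pos
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    peptide = "AEGKLMNEPQRSTEA"  # hypothetical substrate
    print(glu_cleavage_sites(peptide))
    print(digest(peptide))
```

Because neighboring residues are ignored entirely, the sketch reflects the observation that only the position of Glu, and not its sequence context, determines where cleavage occurs.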
Autocatalytic Cleavage of the C-Terminal Extension of CGEP and Its Functional Role
This study discovered and clearly documented C-terminal cleavage of CGEP at E946, both in vitro with rCGEP and in vivo in Arabidopsis leaves from observation of endogenous CGEP, as well as from CGEP variants with a C-terminal STREPII tag expressed from stable transgenes. The cleavage removed the C-terminal 15 amino acids. The evidence was based on MS/MS analysis as well as immunoblotting. Cleavage activity of rCGEP incapable of autocatalytic cleavage through a E946A mutation reduced the substrate size limit, as it was not able to cleave β-casein whereas it still cleaved the smaller insulin effectively. This showed that removal of the C-terminal 15 residues extends the substrate size range of CGEP, likely through providing increased access to the catalytic cavity. Indeed, 3D structural modeling based on the S9B dipeptidyl aminopeptidase IV from S. maltophilia (Nakajima et al., 2008) supports this hypothesis, as the C terminus is positioned at the entry to the catalytic cavity. Importantly, the C-terminal cleavage is also physiologically important, as demonstrated by the in vivo complementation assays in the cgep-2 clpr2-1 background. We explored the literature for other examples of C-terminal autocleavage of peptidases and found an example in the unrelated chymotrypsin-like Cys protease 3CL (3CLpro) from the Severe Acute Respiratory Syndrome (SARS) coronavirus (Muramatsu et al., 2016). 3CLpro was shown to autocatalytically (by another copy of 3CLpro) cleave the 10-residue C-terminal prosequence using a noncanonical specificity. A different example is a Leu aminopeptidase from Pseudomonas aeruginosa, which is C-terminally cleaved by other proteases resulting in intramolecular autocatalytic N-terminal cleavage (12 amino acids removed), leading to its activation (Sarnovsky et al., 2009). CGEP is a member of the plant- and bacterial-specific S9D family, and not much is known about this family. 
However, the better studied S9A, S9B, and S9C peptidase families with different peptidase activities (no glutamyl peptidases) have been shown to use sophisticated and diverse mechanisms to regulate access to the buried active site. This includes transient openings, double-gated entry mechanisms, and active site assembly/disassembly, as reviewed in Rea and Fülöp (2006). The C-terminal cleavage identified here for CGEP is a novel mechanism that is likely a relatively new evolutionary diversification, as it seems absent in algae, primitive plants, and (cyano)bacteria.
The Importance of Glu in Plant Cells and the Role of CGEP
The amino acid Glu is found in abundance in the plant cell (∼10% to 40% of the total amino acid pool), likely because of its central role in primary metabolism, including the photorespiratory C2 cycle, tetrapyrrole biosynthesis as a precursor for synthesis of the porphyrin ring, and as substrate for amino-transferases (Forde and Lea, 2007; Hildebrandt et al., 2015). Calculation of amino acid frequencies in the theoretical proteome of Arabidopsis showed that Glu is the third most frequent residue (∼7%) in proteins (L and S are more frequent at ∼9%; Hildebrandt et al., 2015). The abundance and frequency of Glu in the plant cell could explain why plants evolved a specific glutamyl peptidase with both exo- and endo-peptidase activity. The insensitivity of CGEP activity to the neighboring residues of the target Glu further supports a role of CGEP in Glu homeostasis, because it makes CGEP very effective in liberating Glu. The specificity for Glu must lie in its side chain interacting with the specific residues in the substrate cavity of CGEP. High-resolution structure determination of CGEP together with substrate will be needed to understand the residues contributing to the Glu specificity.
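As a minimal sketch of how residue frequencies in a theoretical proteome can be calculated, the Python snippet below counts each residue over a list of protein sequences. The two short sequences are hypothetical placeholders, not Arabidopsis proteins; a real analysis would read the full predicted proteome from a FASTA file.

```python
from collections import Counter


def residue_frequencies(proteins):
    """Return the fraction of each residue over all sequences combined."""
    counts = Counter()
    for seq in proteins:
        counts.update(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


if __name__ == "__main__":
    toy_proteome = ["MEEL", "SELK"]  # hypothetical sequences for illustration
    freqs = residue_frequencies(toy_proteome)
    for aa, frac in sorted(freqs.items(), key=lambda kv: -kv[1]):
        print(f"{aa}: {frac:.3f}")
```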
Physiological Role of CGEP and Position within the Chloroplast Peptidase Network
Chloroplast proteostasis involves a network of chaperones and peptidases (Majsec et al., 2017). The in vivo physiological impacts of several chloroplast peptidases have been characterized and loss-of-function mutants have shown a wide range of phenotypes, from no-visible phenotypes, cotyledon phenotypes, variegated phenotypes, and virescent-leaf phenotypes (often with the strongest lack of chlorophyll in the youngest leaves) through to embryo- or seedling-lethal phenotypes (for review, see Kmiec et al., 2014; Rigas et al., 2014; Adam, 2015; Nishimura and van Wijk, 2015; Xie et al., 2015; Nishimura et al., 2016, 2017; Kato and Sakamoto, 2018). Furthermore, genetic interactions have been observed within members of the same peptidase system, such as the thylakoid FTSH system (Moldavski et al., 2012; Kato and Sakamoto, 2018), the stromal CLP system (Nishimura and van Wijk, 2015), and the thylakoid lumen DEG system (Butenko et al., 2018). However, more recently, genetic interactions have been shown between different chloroplast peptidase systems, such as PREP and OOP (Teixeira et al., 2017), FTSH and CLP (Park and Rodermel, 2004), and VAR2 and EGY1 (Qi et al., 2020). There are various possible explanations for such genetic interactions, including (partial) functional redundancies, protein–protein interactions affecting assembly and stability, or threshold effects in proteostasis. Genetic interactions can also be explained by the concept that degradation of a single protein likely involves a sequence of cleavage events involving the sequential activity of different proteases. For example, large protein substrates will require at least partial, often ATP-dependent, unfolding to initiate degradation and will ultimately require exo-peptidases to generate individual amino acids.
Relatively little is known about substrate selection mechanisms of chloroplast peptidases and possible hierarchies in protein degradation cascades, in part due to the lack of known tagging systems or degrons. Here, we showed that CGEP genetically interacts with the chloroplast stromal CLP protease system, but not with thylakoid-bound FTSH2. When crossed into various clp mutants (clpr2-1 or clpt1 clpt2 double mutant), the cgep-2 null mutation results in loss of plant growth and biomass, demonstrating the physiological significance of CGEP. Complementation experiments with catalytically inactive CGEP or CGEP that lacks C-terminal autocleavage further support the physiological significance of full CGEP activity. The CLP endo-peptidase system has no known upper limit of substrate size, due to the ATP-dependent capacity of its chaperone component to unfold substrates (Nishimura and van Wijk, 2015; Rodriguez-Concepcion et al., 2018). Furthermore, once substrates are unfolded and delivered into the CLP protease cavity, cleavage occurs without site-specificity, releasing peptides in the range of seven to nine amino acids. By contrast, CGEP has an upper substrate size limit of ∼25 kD and, based on synthetic peptides (Laing and Christeller, 1997; Yamauchi et al., 2001) and the PICS experiments shown here, can cleave even short peptides through its exo-peptidase activity. If indeed the CLP and CGEP peptidases contribute to degradation of a shared set of substrates, it seems most likely that CLP acts mostly upstream of CGEP. The comparative proteomics analysis of wild type and cgep-2 showed that loss of CGEP does not result in significant differences in other stromal proteolytic systems, including the CLP system. However, loss of CGEP did result in a significant increase in plastid ribosomal proteins, without affecting the rest of the translational machinery or RNA metabolism.
By contrast, loss of CLP protease capacity results in increases in protein initiation, elongation factors, and RNA metabolism, but not in ribosomal proteins (Kim et al., 2013, 2015). This suggests that CGEP has a unique function within the proteostasis network.
Finally, there were early reports of several endogenous CGEP inhibitors (∼8, 20, and 25 kD) in cucumber leaf extracts, one of which was heat-sensitive (Yamauchi et al., 2001). The reducing agent DTT was reported to activate glutamyl peptidase activity in an enriched leaf fraction (Forsberg et al., 2005). Both chloroplast endogenous CGEP inhibitors and redox regulation of CGEP would be ways to control in vivo CGEP activity, but our extensive CGEP protein interaction studies with in vivo samples did not identify candidate proteinaceous inhibitors, nor did we observe any impact of DTT on the peptidase activity of CGEP in vitro. It thus remains to be determined how CGEP activity is regulated in the chloroplast.
A Role for CGEP in Regulation of Starch Metabolism?
The comparative proteomics and coexpression analyses both suggest that the function of CGEP is directly or indirectly associated with starch metabolism. Indeed, Smith and Zeeman (2020) state that no master regulator of the starch biosynthesis pathway has been found and that it seems likely that there are multiple controls at the transcriptional and post-translational levels, depending on the organ and species concerned. Interestingly, it has been suggested that in specific species and tissues, starch phosphorylase may be under post-translational control through phosphorylation and protein degradation (Yu et al., 2001; Young et al., 2006; Goren et al., 2018), but our data did not indicate that Arabidopsis starch phosphorylase is cleaved by CGEP. A role of CGEP in regulating starch metabolism could involve either direct downregulation of specific starch enzymes through degradation, or N- or C-terminal trimming events that change activity directly or indirectly by affecting protein–protein interactions or subchloroplast localization. Finally, as mentioned in the introductory paragraphs, the CGEP homolog in maize is enriched in bundle-sheath chloroplasts as compared to mesophyll chloroplasts (Friso et al., 2010; Majeran et al., 2010). Due to the distribution of C4 photosynthesis across these two cell types, transient starch is mostly synthesized in the bundle-sheath chloroplast. The preferential accumulation of CGEP in the same cell type as starch metabolism further supports a regulatory role of CGEP in starch metabolism.
CONCLUSIONS AND OUTLOOK
Here we characterized a chloroplast CGEP peptidase and showed it has exo- and endo- glutamyl peptidase activity, through which we identified a novel mechanism to increase substrate accessibility to the active site. Complementation experiments and comparative proteomics confirmed the physiological relevance of CGEP. Structural analysis of CGEP and its substrate selection mechanism are now required for understanding the role of CGEP at the molecular level and the position of CGEP in the chloroplast protease network. An in-depth analysis of starch phenotypes is warranted, given the coexpression of CGEP with starch metabolic enzymes.
MATERIALS AND METHODS
Phylogenetic Analysis
To generate a phylogenetic cladogram, 41 CGEP proteins from 35 species across the tree-of-life were aligned and trimmed using the tool MUSCLE (http://www.ebi.ac.uk/Tools/msa/muscle/). The cladogram was generated as described in Bhuiyan et al. (2016). Untrimmed alignment is provided in Supplemental Dataset S1.
Generation of CGEP Antiserum
The nucleotide sequence encoding amino acids 640 to 782 of CGEP was amplified by PCR (for primers, see Supplemental Table S4). The resulting DNA fragment was ligated into the BamHI and XhoI restriction sites of the pET21a expression vector, which provides a C-terminal His affinity tag. BL21 Escherichia coli cells were transformed with the pET21a vector harboring this truncated CGEP gene, and cells were harvested from liquid culture after addition of 1 mm of isopropyl β-d-1-thiogalactopyranoside and incubation for 3 h at 22°C. The overexpressed proteins were solubilized in 200 mm of NaCl, 50 mm of Tris, and 8 m of urea at pH 8 and purified on a nickel-nitrilotriacetic acid agarose resin matrix. A polyclonal antibody against this truncated CGEP protein was raised in rabbits by injecting purified antigen (Alfa Diagnostic International). Antisera were affinity-purified against the same antigen coupled to a HiTrap N-hydroxysuccinimide ester-activated column (GE Healthcare Life Sciences).
T-DNA Insertion Mutants, Transgenic Plants, Genotyping, and Phenotyping
T-DNA–insertion lines cgep-1 (SAIL_574_D03), cgep-2 (SALK_066117), and cgep-3 (SAIL_589_G08) were obtained from the Arabidopsis Biological Resource Center. T-DNA insertion lines were identified by genotyping and insertion was confirmed by DNA sequencing. The T-DNA insertion lines clpr2-1 and clpt1 clpt2 were described in Rudella et al. (2006) and Kim et al. (2015). Primers for genotyping are listed in Supplemental Table S4. Transcript levels were determined by RT-PCR using RNA collected from homozygous plants as described in Bhuiyan et al. (2016).
To generate complemented transgenic plants, CGEP-STREPII (wild-type form), CGEP-S781R-STREPII (a catalytically inactive form of CGEP mutated at S781), CGEP-C1-STREPII (CGEP-E928A-D931A), and CGEP-C2-STREPII (CGEP-E946A-E949A-E951A) were generated from complementary DNA, with mutations introduced as indicated. Primers used for cloning and mutation introduction are listed in Supplemental Table S4. See also “CGEP Site-Directed Mutagenesis and In Vitro Proteolytic Activity Assays.” A nucleotide sequence encoding the StrepII tag was added in the reverse primer before the stop codon to generate a C-terminal tag for protein detection and affinity purification. PCR products were cloned into the pCR8 topo vector (Invitrogen) and verified by DNA sequencing. Each clone was transferred into a gateway pEARLYGATE100 vector (Arabidopsis Biological Resource Center) by using LR enzymes (Invitrogen). These vectors were transformed into Agrobacterium and subsequently transformed into the cgep-2 null mutant and the double mutant clpr2-1xcgep-2 by the floral-dip method. Transgenic plants were selected on plates containing BASTA (for the pEARLYGATE100 vector). Plants surviving on selective medium were genotyped, and confirmed transgenics were transferred to soil for seed production and generation of homozygous progeny. Plants were grown in a growth chamber with long-day conditions (18-h light/6-h dark) and temperature at 22°C with 130- or 100-μmol photons m−2 s−1 light intensity. For transcript analysis, total RNA was extracted from Arabidopsis leaves using the RNeasy Plant Mini Kit (Qiagen), and RT-PCR was carried out as mentioned above. To determine the statistical significance in growth phenotypes (rosette diameter and rosette weight), two-sample t tests were used.
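The two-sample t test used for the growth phenotypes can be sketched in Python as follows. The text does not state which variant was applied, so the pooled-variance (Student) form shown here is an assumption, and the rosette diameters in the example are hypothetical values. The function returns only the t statistic and degrees of freedom; a P value would be obtained from a t distribution, for example with scipy.stats.ttest_ind.

```python
import math
import statistics


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic and degrees of freedom.

    Assumes equal variances in both groups (Student's t test); this is
    an illustrative choice, not necessarily the variant used in the study.
    """
    n1, n2 = len(a), len(b)
    mean1, mean2 = statistics.mean(a), statistics.mean(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


if __name__ == "__main__":
    # Hypothetical rosette diameters (cm) for two genotypes
    line_a = [6.1, 5.8, 6.4, 6.0, 5.9]
    line_b = [4.9, 5.2, 4.7, 5.0, 5.1]
    t_stat, dof = two_sample_t(line_a, line_b)
    print(f"t = {t_stat:.2f}, df = {dof}")
```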
Pigment Concentrations
Chlorophyll and carotenoid concentrations were determined by absorbance spectrometry after extraction in 80% acetone (Porra et al., 1989).
Chloroplast and Stroma Isolation
Chloroplasts were isolated on Percoll step gradients from mature rosettes of 4-week-old plants as described in Bhuiyan et al. (2015). Stroma was isolated from total chloroplast by centrifugation at 100,000g for 30 min at 4°C. Protein concentrations were determined using the BCA Protein Assay kit (Thermo Fisher Scientific).
Comparative Proteomics and MS
For protein identification and quantification, each gel lane was cut in consecutive gel slices, followed by in-gel digestion using trypsin and subsequent peptide extraction, as described by Friso et al. (2011). Peptide extracts for each gel band were then analyzed by on-line nano-LC-MS/MS using a Linear Trap Quadrupole Orbitrap (Thermo Fisher Scientific). Resulting spectral data were searched against the predicted Arabidopsis proteome (TAIR10), including a small set of typical contaminants and the decoy, as described by Nishimura et al. (2013). Only proteins with three or more matched spectra were considered. Protein abundances were quantified based on normalized adjusted spectral counts, as explained in Friso et al. (2011). Significance analysis to determine differentially accumulated proteins between wild type and cgep-2 was done as described by Kim et al. (2013). MS-derived information, as well as annotation of protein name, location, and function for the identified proteins, can be found in the PPDB (http://ppdb.tc.cornell.edu). The MS data have been deposited to the PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) via the PRIDE partner repository, and are available as PXD017189 in the ProteomeXchange (http://www.proteomexchange.org/).
Coexpression Analysis
Coexpressed genes for CGEP, as well as five additional organellar protease genes (CLPR1, CLPP5, DEG2, PREP1, and PREP2), were downloaded from the plant coexpression database ATTED-II (http://atted.jp/) using the most recent dataset, Ath-m (Obayashi et al., 2018). The top-100 coexpressed genes for each bait, ranked by mutual rank (MR), were used for detailed analysis. Protein function was based on the MapMan annotation system (https://mapman.gabipd.org/mapman) integrated and extensively updated in the Plant Proteome DataBase. The functional enrichment test for the coexpressed genes for each bait was based on the hypergeometric test (Majsec et al., 2017).
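The hypergeometric enrichment test can be sketched in Python as the upper-tail probability of drawing at least k genes from a functional category among n coexpressed genes. The function name and all numbers in the example are hypothetical; the exact background set and any multiple-testing correction used in the study are described in Majsec et al. (2017) and are not reproduced here.

```python
from math import comb


def hypergeom_enrichment_p(population, category, drawn, observed):
    """P(X >= observed) for a hypergeometric draw without replacement.

    population: total number of annotated genes (background)
    category:   number of background genes in the functional category
    drawn:      number of coexpressed genes examined (e.g. the top 100)
    observed:   number of coexpressed genes that fall in the category
    """
    upper = min(category, drawn)
    favorable = sum(
        comb(category, i) * comb(population - category, drawn - i)
        for i in range(observed, upper + 1)
    )
    return favorable / comb(population, drawn)


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 background genes, 50 in a starch-related
    # category, 100 coexpressed genes, 8 of which are in that category.
    p = hypergeom_enrichment_p(20000, 50, 100, 8)
    print(f"enrichment P value = {p:.3e}")
```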
Coimmunoprecipitation, StrepII Affinity Purification, and Gel Filtration
For coimmunoprecipitation with CGEP antiserum, the same protocol was followed as described by Bhuiyan et al. (2015), except that stroma was used in this study instead of thylakoids. StrepII-tagged protein purification was carried out as described by Olinares et al. (2011) and Kim et al. (2015), except that Strep-Tactin (IBA Lifesciences) superflow, high-capacity resin was used in this study and biotin was used as an eluent. Gel filtration of stromal proteome was carried out by fast protein liquid chromatography as described by Nishimura et al. (2015).
Immunoblotting
For immunoblotting, 10 μg of protein (unless otherwise mentioned in the figure legends) was separated by SDS-PAGE, followed by transfer to 0.2-μm nitrocellulose membrane. Proteins were detected by electrochemiluminescence using standard procedures.
CGEP Site-Directed Mutagenesis and In Vitro Proteolytic Activity Assays
Mature Arabidopsis CGEP (starting at amino acid 62; without cTP) was cloned by using forward (AtCGEP-M-FW-BamHI) and reverse (AtCGEP-M-RV-XhoI) primers (primers are listed in Supplemental Table S4). The forward primer contains a BamHI site and the reverse primer contains an XhoI site. The resulting PCR fragment was ligated into a pCR8 topo vector and confirmed by DNA sequencing. A pCR8 vector harboring the CGEP gene was digested with BamHI and XhoI restriction enzymes. The resulting DNA fragment was ligated into the BamHI and XhoI restriction sites of a pGEX vector that has an N-terminal GST tag. Three mutants of AtCGEP, GST-CGEP-S781R, GST-CGEP-C1 (CGEP-E928A-D931A), and GST-CGEP-C2 (CGEP-E946A-E949A-E951A), were constructed by using a PCR method as described by Bhuiyan et al. (2016). For the mutant GST-CGEP-S781R, the C-terminal part of the mature protein was amplified from a pCR8 plasmid harboring the CGEP gene by using specific forward (AtCGEPS781R-FW) and reverse primers (AtCGEP-M-RV-XhoI). Forward primer AtCGEPS781R-FW contains the mutation site TCC (Ser) to CGC (Arg). The N-terminal part of the mature protein was amplified by using specific forward (AtCGEP-M-FW-BamHI) and reverse (AtCGEPS781R-RV) primers. Reverse primer AtCGEPS781R-RV contains the same introduced mutation, TCC (Ser) to CGC (Arg). The two amplified fragments were gel-purified, mixed, and used as a template (1:1) for second-round PCR to amplify the mature protein by using the forward and reverse primers AtCGEP-M-FW and AtCGEP-M-RV, respectively. GST-CGEP-C1 was amplified the same way as GST-CGEP-S781R, except that different primer sets were used to introduce two mutations, from GAA (Glu) to GCA (Ala) and from GAT (Asp) to GCT (Ala). GST-CGEP-C2 was amplified by using AtCGEP-M-FW-BamHI as a forward primer and AtCGEP-XhoI-E946AE949AE951A-RV as a reverse primer. This reverse primer contains three mutation sites: E946A (AGT to ACT), E949A (AGC to ACC), and E951A (AAG to ACG).
The PCR fragments were ligated into a pCR8 topo vector and the mutations were confirmed by DNA sequencing. pCR8 vectors harboring different CGEP mutants were digested with BamHI and XhoI, and the resulting fragments were ligated into the same sites of the pGEX-5 vector to fuse GST to the N terminus of CGEP.
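The codon substitutions introduced by the mutagenic primers (for example TCC to CGC for Ser to Arg) can be checked in silico with a minimal Python sketch such as the one below. The function name and the short coding sequence are hypothetical placeholders and do not represent the CGEP sequence or the primers listed in Supplemental Table S4.

```python
def mutate_codon(cds, residue, new_codon, expected_old=None):
    """Replace the codon for a 1-based residue number in a coding sequence.

    If expected_old is given, the existing codon must match it, which
    guards against off-by-one errors in residue numbering.
    """
    start = 3 * (residue - 1)
    old = cds[start:start + 3]
    if len(old) != 3:
        raise ValueError(f"residue {residue} is outside the coding sequence")
    if expected_old is not None and old != expected_old:
        raise ValueError(f"expected {expected_old} at residue {residue}, found {old}")
    return cds[:start] + new_codon + cds[start + 3:]


if __name__ == "__main__":
    toy_cds = "ATGGCTTCCGGATAA"  # hypothetical: Met-Ala-Ser-Gly-stop
    mutant = mutate_codon(toy_cds, residue=3, new_codon="CGC", expected_old="TCC")
    print(mutant)  # Ser codon at residue 3 replaced by an Arg codon
```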
BL21 E. coli cells were transformed with pGEX vectors harboring various CGEP constructs, and cells were harvested from liquid culture after addition of 1 mm of isopropyl β-d-1-thiogalactopyranoside and incubation for 3 h at 22°C. Overexpressed wild-type and mutant versions of CGEP in E. coli were solubilized in 500 mm of NaCl, 50 mm of Tris, and 10% (v/v) glycerol, at pH 8, and purified on a glutathione resin matrix. The purified protein was dialyzed by using a dialysis cassette (Slide-A-Lyzer; Thermo Fisher Scientific) against a buffer of 100 mm of NaCl, 50 mm of Tris, and 10% (v/v) glycerol. After dialysis, the protein was concentrated by using Microcon Centrifugal Filter units (Millipore). In vitro proteolytic activity assays were performed by incubating recombinant proteins in 100 mm of NaCl, 50 mm of Tris, and 10% (v/v) glycerol with substrate proteins at 37°C. The reaction was stopped by adding 3% SDS (w/v), followed by separation of the protein products with SDS-PAGE and staining with Coomassie Brilliant Blue. Additional reactions for rCGEP were carried out with addition of dithiothreitol (DTT; 5 mm) and various concentrations of NaCl from 0.1 to 0.5 m, but these additions to the reaction mixture did not affect degradation (data not shown).
PICS for Determination of Protease Cleavage Specificity
The PICS procedure was based on the method described in Schilling et al. (2011) and Biniossek et al. (2016). To generate peptide libraries, 1 mg of protein-soluble leaf proteome (in 50 mm of HEPES, 40 μg mL−1 of bestatin, and 10 μg mL−1 of phosphoramidon) was mixed with an equal volume of 8 m of GuHCl to denature proteins. DTT was added to a final concentration of 5 mm and the samples were incubated at 65°C for 1 h. After cooling to room temperature, cysteines were alkylated by addition of 15 mm of iodoacetamide and then incubation for 20 min in darkness. Excess iodoacetamide was quenched with 10 mm of DTT and the sample was gradually 8-fold diluted with 200 mm of HEPES at pH 8. Protein extracts were digested with 20 μg of trypsin, 20 μg of GluC, or 15 μg of LysC per 1 mg of protein at 37°C for 16 h in 1 m of GuHCl and 200 mm of HEPES at pH 8. Any precipitate was removed by centrifugation, and an aliquot of the sample (1 μg) was resolved by SDS-PAGE and silver staining to ensure the protein digestion was complete. The Ser protease inhibitor Pefabloc-SC (Sigma-Aldrich) was added to a final concentration of 5 mm to inactivate the digestion proteases trypsin, LysC, or GluC. The peptide libraries were then acidified with formic acid and desalted using 1-mL Resprep C18 columns (Restek). The acetonitrile in the elution buffer was removed with a SpeedVac (Thermo Fisher Scientific) and the peptides suspended in 50 mm of HEPES and 100 mm of NaCl at pH 8, as detailed in “PICS Experiment 2”. Alternatively, peptides were dimethylated before desalting and carried forward, as detailed in “PICS Experiment 1”.
PICS Experiment 1
After digestion of the proteome (with trypsin, GluC, or LysC), peptides were dimethylated with CD2O and then desalted with 1-mL Resprep C18 columns as described above. Purified dimethylated peptide libraries (120–170 μg) were reacted with 6.5 μg of either rCGEP or catalytically inactive rCGEP-S781R and incubated for 16 h at 37°C. CGEP activity was abolished by heating at 70°C for 10 min. To remove small molecules containing primary amines, samples were again desalted with 1-mL Resprep C18 columns as described above and each sample was suspended in 100 μL of 200-mm HEPES at pH 8. Peptides were then reacted with 5 μL of 10-mm Sulfo-N-hydroxysuccinimide-SS-Biotin (Pierce/Thermo Fisher Scientific; 0.5 mm final concentration) for 2 h at 25°C. One milliliter of Strep-Tactin resin (IBA Lifesciences) was washed 5× with 50 mm of HEPES and 150 mm of NaCl at pH 8 and the resin was split among six tubes. Samples were then added to the resin and incubated for 2 h at 25°C with shaking. Each sample was then transferred to a 0.5-mL Pierce Spin Filter (Thermo Fisher Scientific). The resin was washed 10× with 500 μL of wash buffer, using brief spins in a desktop centrifuge to avoid drying the resin. Three-hundred microliters of 50 mm of HEPES and 20 mm DTT at pH 8 were added and incubated for 10 min at 25°C. Peptides were eluted by centrifugation into a clean tube, followed by elution with an additional 200 μL of the same buffer. Samples were desalted using Resprep C18 columns (Restek) as described above, and were suspended in 30 μL of 2% (v/v) acetonitrile and 2% (v/v) formic acid for LC/MS analysis.
PICS Experiment 2
Fifty microliters of the trypsin and GluC peptide libraries (1 μg μL−1 in 50 mm of HEPES and 100 mm of NaCl at pH 8) were mixed with 5 or 10 μL of rCGEP (1 μg μL−1 in 50 mm of TrisHCl, 100 mm of NaCl at pH 8, and 30% [v/v] glycerol) or rCGEP-S781R, and incubated for 15 h at 37°C. After this incubation, peptides were dimethylated with either light (control: S781R) or heavy (sample: CGEP) formaldehyde. CH2O (light formaldehyde) or CD2O (heavy formaldehyde) at 2 m was added to give a final concentration of 40 mm, followed immediately by addition of 1 m of NaCNBH3 to give a final concentration of 30 mm. The samples were incubated for 2 h at 25°C and then a second aliquot of CH2O or CD2O and NaCNBH3 was added, as above, to give 80- and 60-mm final concentrations, respectively, and the samples were incubated overnight at 25°C. The dimethylation reaction was quenched with Gly at a final concentration of 0.1 m. The sample (rCGEP digest, heavy label) and control (rCGEP-S781R digest, light label) reactions were then mixed in a fresh tube. A 5-μg aliquot was desalted with a C18 ZipTip (EMD Millipore) following the manufacturer’s guidelines and the peptide eluate was brought to dryness with a SpeedVac (Thermo Fisher Scientific). Samples were suspended in 20 μL of 2% (v/v) acetonitrile and 2% (v/v) formic acid for LC/MS analysis.
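The light and heavy dimethyl labels used here can be related to the nominal +28 D and +32 D mass shifts with a short calculation. The following Python sketch is illustrative only and is not part of the published workflow; the function names are hypothetical, and it assumes standard monoisotopic atomic masses and that each labeled primary amine (peptide N terminus or Lys side chain) receives two methyl groups.

```python
# Monoisotopic masses (Da) of the atoms involved in reductive dimethylation.
MASS_C = 12.0
MASS_H = 1.00782503
MASS_D = 2.01410178


def dimethyl_shift(heavy: bool) -> float:
    """Mass added to one primary amine by two methyl groups.

    Each methylation replaces one amine H with CH3 (light, from CH2O) or
    CHD2 (heavy, from CD2O), a net gain of CH2 or CD2 per methyl group.
    """
    hydrogen = MASS_D if heavy else MASS_H
    return 2 * (MASS_C + 2 * hydrogen)


def pair_spacing_mz(n_amines: int, charge: int) -> float:
    """Expected m/z spacing between heavy and light forms of one peptide."""
    delta = dimethyl_shift(heavy=True) - dimethyl_shift(heavy=False)
    return n_amines * delta / charge


light = dimethyl_shift(heavy=False)
heavy = dimethyl_shift(heavy=True)
print(f"light: +{light:.4f} Da, heavy: +{heavy:.4f} Da")

# Hypothetical peptide with a free N terminus and one Lys, observed at 2+.
print(f"pair spacing: {pair_spacing_mz(n_amines=2, charge=2):.4f} m/z")
```

Under these assumptions, a peptide carrying two labeled amines would appear as a heavy/light pair separated by roughly 8.05 Da, which is the spacing a quantification tool would look for when comparing the rCGEP and rCGEP-S781R digests.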
For LC-MS/MS analysis, 6.4 μL of each sample was loaded onto a C18 trapping column and then eluted onto a 15-cm × 75-μm I.D. C18 PepMap column interfaced to a Linear Trap Quadrupole Orbitrap (Thermo Fisher Scientific). A 90-min linear gradient from 3% to 40% solvent B was used to separate the peptides. A typical data-dependent acquisition method was used whereby MS spectra were acquired in the Orbitrap at 100-K resolution followed by five data-dependent MS/MS scans in the ion trap.
Peak lists (mgf files) for database searching were generated from Thermo XCalibur (Thermo Fisher Scientific) raw data files using DTA Supercharge (http://msquant.sourceforge.net/). The peak lists were searched using the tool MASCOT 2.4 (Matrix Science) against TAIR10, appended with all reverse sequences (Decoy) and common contaminants (71,149 sequences and 29,099,754 residues). After an initial database search performed at 30-ppm MS tolerance and 0.8-D MS/MS tolerance, the peak lists were recalibrated as described in Friso et al. (2010). A semispecific enzyme search was then conducted with semiArgC and semiGluC (V8), allowing for three missed cleavages, 6-ppm MS tolerance, and 0.8-D MS/MS tolerance. For PICS Experiment 1, fixed modifications were carbamidomethylation Cys and dimethyl Lys (heavy, +32 D), variable modifications were oxidized Met, pyroGlu N-term Gln, dimethyl N-term (heavy, +32 D), and Thioacyl N-term. For PICS Experiment 2, fixed modifications were carbamidomethylation Cys and dimethyl Lys, and variable modifications were oxidized Met, pyroGlu N-term Gln, acetyl N-term, and dimethyl N-term (light, +28 D or heavy, +32 D). Another search including singly methylated N-term was conducted for select files to detect methylated Nt Pro. The database search results were parsed and sorted in the software Microsoft Excel.
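Two ideas in the search setup above, the precursor mass tolerance in ppm and the appended reverse (decoy) sequences, can be sketched in a few lines of Python. This is a minimal illustration, not the MASCOT implementation; the function names, the `REV_` prefix, and the toy sequence are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed m/z, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 6.0) -> bool:
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def add_reversed_decoys(proteins: dict, prefix: str = "REV_") -> dict:
    """Return the target database with one reversed decoy per target entry."""
    combined = dict(proteins)
    for name, sequence in proteins.items():
        combined[prefix + name] = sequence[::-1]
    return combined


# Toy values: a 0.004 m/z deviation at m/z 800 is about 5 ppm.
print(round(ppm_error(800.004, 800.0), 2))
print(within_tolerance(800.004, 800.0))   # inside a 6-ppm window
print(within_tolerance(800.008, 800.0))   # about 10 ppm, outside the window

# Hypothetical one-protein database.
database = add_reversed_decoys({"protA": "MKTAYE"})
print(database)
```

Matches to the reversed entries give an estimate of how often a random sequence passes the same scoring thresholds, which is why the decoys are searched together with the target sequences.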
Sequence logo and iceLogo plots were generated with the tool iceLogo v.1.2 (http://www.proteomics.be). The complete TAIR10 proteome was used as a background to normalize for natural amino acid abundance in the library.
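The idea of normalizing against a background proteome can be illustrated with a simplified calculation: the frequency of each residue at one position of the aligned cleavage windows is compared with its frequency in the background. The sketch below is not iceLogo's algorithm (which also applies statistical testing); the function name, the toy windows, and the background frequencies are all hypothetical placeholders.

```python
from collections import Counter


def position_enrichment(windows, background, position):
    """Percent difference between observed and background residue frequency.

    windows    -- equal-length sequences aligned around the cleavage site
    background -- dict mapping residue to its fractional background frequency
    position   -- 0-based index of the column to analyze
    """
    column = [window[position] for window in windows]
    counts = Counter(column)
    total = len(column)
    return {
        residue: 100.0 * count / total - 100.0 * background.get(residue, 0.0)
        for residue, count in counts.items()
    }


# Toy cleavage windows; index 1 stands in for the P1 position.
windows = ["AEGS", "LEKT", "VEAD", "GDSL"]
# Placeholder background frequencies, not values from TAIR10.
background = {"E": 0.065, "D": 0.055}

for residue, diff in sorted(position_enrichment(windows, background, 1).items()):
    print(f"{residue}: {diff:+.1f} percentage points vs. background")
```

A strongly positive value for Glu at P1 in such a table would correspond to the kind of enrichment reported for CGEP, whereas residues near zero are present at roughly their natural abundance.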
Structural Model
Three-dimensional protein structural models for CGEP were generated with the tool I-TASSER (Yang et al., 2015) using the mature CGEP sequence (residues 63–961) or CGEP with a truncated N-terminal domain (residues 387–961). The top-scoring I-TASSER model for CGEP (mature) had a C-score of −2.29 (C-scores range from −5 [poorest] to 2 [best]); estimated template modeling score = 0.45 ± 0.14; estimated root mean square deviation (RMSD) = 14.4 ± 3.7 Å. The top-scoring I-TASSER model for CGEP (Nt truncation) had a C-score of −0.93; estimated template modeling score = 0.60 ± 0.14; estimated RMSD = 9.8 ± 4.6 Å. Images were generated with the software PyMOL v.1.7.4 (Schrödinger).
Accession Numbers
Sequence data from this article can be found in the GenBank/EMBL data libraries under accession numbers CGEP (AT2G47390), CLPR2 (AT1G12410), CLPT1 (AT4G25370), CLPT2 (AT4G12060), FTSH2 (AT2G30950), OOP (AT5G65620), PREP1 (AT3G19170), PREP2 (AT1G49630), DEG2 (AT2G47940), CLPR1 (AT1G49970), and CLPP5 (AT1G02560).
Supplemental Data
The following materials are available.
Supplemental Figure S1. CGEP sequence alignment and conservation.
Supplemental Figure S2. Phenotyping of cgep homozygous T-DNA insertion lines cgep-1, cgep-2, and cgep-3.
Supplemental Figure S3. Genetic interactions of CGEP with the CLP and FTSH2 chloroplast proteases.
Supplemental Figure S4. Comparative proteomics of wild type and cgep-2.
Supplemental Figure S5. MS/MS analysis of endogenous Arabidopsis CGEP protein accumulation to determine autocatalytic processing of the CGEP C terminus.
Supplemental Figure S6. A sequence alignment with the predicted secondary structures of S9B dipeptidyl aminopeptidase IV in the Gram-negative bacterium S. maltophilia (PDB:2ECF; Nakajima et al., 2008) and CGEP from the dicotyledons Arabidopsis, B. rapa, and P. trichocarpa.
Supplemental Figure S7. Top views of the Arabidopsis CGEP 3D structural model generated from the I-TASSER server with side views shown in Figure 9.
Supplemental Figure S8. Genotyping and molecular characterization of transgenic CGEP complemented lines.
Supplemental Figure S9. Additional images of seedlings of cgep-2, clpr2-1, cgep-2 clpr2-1, and cgep-2 clpr2-1 complemented lines with the CGEPSTREP variants.
Supplemental Figure S10. Examples of affinity purification of in vivo CGEP and transgenic variants with the objective to identify interacting proteins.
Supplemental Table S1. Distribution of CGEP protein homologs across the species tree-of-life, detailing the 41 proteins for the phylogenetic analysis and the length of their C-terminal extensions.
Supplemental Table S2. Peptides identified in rGST-CGEP fusion variants CGEP, CGEP-S781R, C1, and C2 and their associated MS information.
Supplemental Table S3. Peptides identified in CGEP immunoprecipitated from wild-type Arabidopsis plants and their associated MS information.
Supplemental Table S4. Primers used in this study.
Supplemental Dataset S1. Untrimmed sequence alignment of the 41 CGEP homologs listed in Figure 1A and detailed in Supplemental Table S1.
Supplemental Dataset S2. Comparative proteomics of wild type and cgep-2, with all identified proteins and their annotation, spectral count data, and significance analysis.
Supplemental Dataset S3. Coexpression analysis of CGEP and five other proteases based on the microarray dataset Ath-m.c7-0, and their MR values from the database ATTED-II.
Supplemental Dataset S4. Identified peptides with associated information for PICS Experiment 1.
Supplemental Dataset S5. Identified peptides with associated information for PICS Experiment 2.
Supplemental Text S1. Experiments to determine if CGEP forms stable interactions with other proteins using affinity purification with anti-CGEP serum, or streptavidin resins, using leaf extracts of Arabidopsis transgenic lines expressing either CGEP-STREPII or CGEP-S781R-STREPII.
Footnotes
This research was supported by the National Science Foundation Division of Molecular and Cellular Biosciences (grant nos. 1614629 and 1940961 to K.J.v.W.).
Articles can be viewed without a subscription.
References
- Abt M, Pfister B, Sharma M, Eicke S, Burgy L, Neale I, Seung D, Zeeman SC (2020) STARCH SYNTHASE 5, a noncanonical starch synthase-like protein, promotes starch granule initiation in Arabidopsis. Plant Cell May 29: tpc.00946.2019
- Abt MR, Zeeman SC (2020) Evolutionary innovations in starch metabolism. Curr Opin Plant Biol 55: 109–117
- Adam Z (2015) Plastid intramembrane proteolysis. Biochim Biophys Acta 1847: 910–914
- Bachhawat AK, Yadav S (2018) The glutathione cycle: Glutathione metabolism beyond the γ-glutamyl cycle. IUBMB Life 70: 585–592
- Bhuiyan NH, Friso G, Poliakov A, Ponnala L, van Wijk KJ (2015) MET1 is a thylakoid-associated TPR protein involved in photosystem II supercomplex formation and repair in Arabidopsis. Plant Cell 27: 262–285
- Bhuiyan NH, Friso G, Rowland E, Majsec K, van Wijk KJ (2016) The plastoglobule-localized metallopeptidase PGM48 is a positive regulator of senescence in Arabidopsis thaliana. Plant Cell 28: 3020–3037
- Biniossek ML, Niemer M, Maksimchuk K, Mayer B, Fuchs J, Huesgen PF, McCafferty DG, Turk B, Fritz G, Mayer J, et al. (2016) Identification of protease specificity by combining proteome-derived peptide libraries and quantitative proteomics. Mol Cell Proteomics 15: 2515–2524
- Butenko Y, Lin A, Naveh L, Kupervaser M, Levin Y, Reich Z, Adam Z (2018) Differential roles of the thylakoid lumenal deg protease homologs in chloroplast proteostasis. Plant Physiol 178: 1065–1080
- Chen M, Choi Y, Voytas DF, Rodermel S (2000) Mutations in the Arabidopsis VAR2 locus cause leaf variegation due to the loss of a chloroplast FtsH protease. Plant J 22: 303–313
- Drapeau GR, Boily Y, Houmard J (1972) Purification and properties of an extracellular protease of Staphylococcus aureus. J Biol Chem 247: 6720–6726
- Forde BG, Lea PJ (2007) Glutamate in plants: Metabolism, regulation, and signalling. J Exp Bot 58: 2339–2358
- Forsberg J, Ström J, Kieselbach T, Larsson H, Alexciev K, Engstrom Å, Åkerlund H-E (2005) Protease activities in the chloroplast capable of cleaving an LHCII N-terminal peptide. Physiol Plant 123: 21–29
- Friso G, Majeran W, Huang M, Sun Q, van Wijk KJ (2010) Reconstruction of metabolic pathways, protein expression, and homeostasis machineries across maize bundle sheath and mesophyll chloroplasts: Large-scale quantitative proteomics using the first maize genome assembly. Plant Physiol 152: 1219–1250
- Friso G, Olinares PDB, van Wijk KJ (2011) The workflow for quantitative proteome analysis of chloroplast development and differentiation, chloroplast mutants, and protein interactions by spectral counting. In Jarvis RP, ed, Chloroplast Research in Arabidopsis, Vol 775. Humana Press, New York, pp 265–282
- Fülöp V, Böcskei Z, Polgár L (1998) Prolyl oligopeptidase: An unusual beta-propeller domain regulates proteolysis. Cell 94: 161–170
- Goren A, Ashlock D, Tetlow IJ (2018) Starch formation inside plastids of higher plants. Protoplasma 255: 1855–1876
- Hildebrandt TM, Nunes Nesi A, Araújo WL, Braun HP (2015) Amino acid catabolism in plants. Mol Plant 8: 1563–1579
- Houmard J, Drapeau GR (1972) Staphylococcal protease: A proteolytic enzyme specific for glutamoyl bonds. Proc Natl Acad Sci USA 69: 3506–3509
- Huang M, Friso G, Nishimura K, Qu X, Olinares PD, Majeran W, Sun Q, van Wijk KJ (2013) Construction of plastid reference proteomes for maize and Arabidopsis and evaluation of their orthologous relationships: The concept of orthoproteomics. J Proteome Res 12: 491–504
- Joshi NC, Meyer AJ, Bangash SAK, Zheng ZL, Leustek T (2019) Arabidopsis γ-glutamylcyclotransferase affects glutathione content and root system architecture during sulfur starvation. New Phytol 221: 1387–1397
- Kato Y, Sakamoto W (2018) FtsH protease in the thylakoid membrane: Physiological functions and the regulation of protease activity. Front Plant Sci 9: 855
- Kim J, Kimber MS, Nishimura K, Friso G, Schultz L, Ponnala L, van Wijk KJ (2015) Structures, functions, and interactions of ClpT1 and ClpT2 in the Clp protease system of Arabidopsis chloroplasts. Plant Cell 27: 1477–1496
- Kim J, Olinares PD, Oh SH, Ghisaura S, Poliakov A, Ponnala L, van Wijk KJ (2013) Modified Clp protease complex in the ClpP3 null mutant and consequences for chloroplast development and function in Arabidopsis. Plant Physiol 162: 157–179
- Kmiec B, Teixeira PF, Glaser E (2014) Shredding the signal: Targeting peptide degradation in mitochondria and chloroplasts. Trends Plant Sci 19: 771–778
- Krupa JC, Hasnain S, Nägler DK, Ménard R, Mort JS (2002) S2′ substrate specificity and the role of His110 and His111 in the exopeptidase activity of human cathepsin B. Biochem J 361: 613–619
- Laing WA, Christeller JT (1997) A plant chloroplast glutamyl proteinase. Plant Physiol 114: 715–722
- Lensch M, Herrmann RG, Sokolenko A (2001) Identification and characterization of SppA, a novel light-inducible chloroplast protease complex associated with thylakoid membranes. J Biol Chem 276: 33645–33651
- Majeran W, Friso G, Ponnala L, Connolly B, Huang M, Reidel E, Zhang C, Asakura Y, Bhuiyan NH, Sun Q, et al. (2010) Structural and metabolic transitions of C4 leaf development and differentiation defined by microscopy and quantitative proteomics in maize. Plant Cell 22: 3509–3542
- Majsec K, Bhuiyan NH, Sun Q, Kumari S, Kumar V, Ware D, van Wijk KJ (2017) The plastid and mitochondrial peptidase network in Arabidopsis thaliana: A foundation for testing genetic interactions and functions in organellar proteostasis. Plant Cell 29: 2687–2710
- Martin MN, Saladores PH, Lambert E, Hudson AO, Leustek T (2007) Localization of members of the gamma-glutamyl transpeptidase family identifies sites of glutathione and glutathione S-conjugate hydrolysis. Plant Physiol 144: 1715–1732
- Moldavski O, Levin-Kravets O, Ziv T, Adam Z, Prag G (2012) The hetero-hexameric nature of a chloroplast AAA+ FtsH protease contributes to its thermodynamic stability. PLoS One 7: e36008
- Muramatsu T, Takemoto C, Kim YT, Wang H, Nishii W, Terada T, Shirouzu M, Yokoyama S (2016) SARS-CoV 3CL protease cleaves its C-terminal autoprocessing site by novel subsite cooperativity. Proc Natl Acad Sci USA 113: 12997–13002
- Nakai A, Yamauchi Y, Sumi S, Tanaka K (2012) Role of acylamino acid-releasing enzyme/oxidized protein hydrolase in sustaining homeostasis of the cytoplasmic antioxidative system. Planta 236: 427–436
- Nakajima Y, Ito K, Toshima T, Egawa T, Zheng H, Oyama H, Wu YF, Takahashi E, Kyono K, Yoshimoto T (2008) Dipeptidyl aminopeptidase IV from Stenotrophomonas maltophilia exhibits activity against a substrate containing a 4-hydroxyproline residue. J Bacteriol 190: 7819–7829
- Naqvi N, Liu K, Graham RM, Husain A (2005) Molecular basis of exopeptidase activity in the C-terminal domain of human angiotensin I-converting enzyme: Insights into the origins of its exopeptidase activity. J Biol Chem 280: 6669–6675
- Ng NM, Pike RN, Boyd SE (2009) Subsite cooperativity in protease specificity. Biol Chem 390: 401–407
- Nishimura K, Apitz J, Friso G, Kim J, Ponnala L, Grimm B, van Wijk KJ (2015) Discovery of a unique Clp component, ClpF, in chloroplasts: A proposed binary ClpF-ClpS1 adaptor complex functions in substrate recognition and delivery. Plant Cell 27: 2677–2691
- Nishimura K, Asakura Y, Friso G, Kim J, Oh SH, Rutschow H, Ponnala L, van Wijk KJ (2013) ClpS1 is a conserved substrate selector for the chloroplast Clp protease system in Arabidopsis. Plant Cell 25: 2276–2301
- Nishimura K, Kato Y, Sakamoto W (2016) Chloroplast proteases: Updates on proteolysis within and across suborganellar compartments. Plant Physiol 171: 2280–2293
- Nishimura K, Kato Y, Sakamoto W (2017) Essentials of proteolytic machineries in chloroplasts. Mol Plant 10: 4–19
- Nishimura K, van Wijk KJ (2015) Organization, function and substrates of the essential Clp protease system in plastids. Biochim Biophys Acta 1847: 915–930
- Obayashi T, Aoki Y, Tadaka S, Kagaya Y, Kinoshita K (2018) ATTED-II in 2018: A plant coexpression database based on investigation of the statistical property of the mutual rank index. Plant Cell Physiol 59: 440
- Olinares PD, Kim J, Davis JI, van Wijk KJ (2011) Subunit stoichiometry, evolution, and functional implications of an asymmetric plant plastid ClpP/R protease complex in Arabidopsis. Plant Cell 23: 2348–2361
- Olinares PD, Ponnala L, van Wijk KJ (2010) Megadalton complexes in the chloroplast stroma of Arabidopsis thaliana characterized by size exclusion chromatography, mass spectrometry, and hierarchical clustering. Mol Cell Proteomics 9: 1594–1615
- Park S, Rodermel SR (2004) Mutations in ClpC2/Hsp100 suppress the requirement for FtsH in thylakoid membrane biogenesis. Proc Natl Acad Sci USA 101: 12765–12770
- Porodko A, Cirnski A, Petrov D, Raab T, Paireder M, Mayer B, Maresch D, Nika L, Biniossek ML, Gallois P, et al. (2018) The two cathepsin B-like proteases of Arabidopsis thaliana are closely related enzymes with discrete endopeptidase and carboxydipeptidase activities. Biol Chem 399: 1223–1235
- Porra RJ, Thompson WA, Kriedemann PE (1989) Determination of accurate extinction coefficients and simultaneous equations for assaying chlorophylls a and b extracted with four different solvents: Verification of the concentration of chlorophyll standards by atomic absorption spectroscopy. Biochim Biophys Acta 975: 384–394
- Qi Y, Wang X, Lei P, Li H, Yan L, Zhao J, Meng J, Shao J, An L, Yu F, et al. (2020) The chloroplast metalloproteases VAR2 and EGY1 act synergistically to regulate chloroplast development in Arabidopsis. J Biol Chem 295: 1036–1046
- Rawlings ND, Waller M, Barrett AJ, Bateman A (2014) MEROPS: The database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res 42: D503–D509
- Rea D, Fülöp V (2006) Structure-function properties of prolyl oligopeptidase family enzymes. Cell Biochem Biophys 44: 349–365
- Rigas S, Daras G, Tsitsekian D, Alatzas A, Hatzopoulos P (2014) Evolution and significance of the Lon gene family in Arabidopsis organelle biogenesis and energy metabolism. Front Plant Sci 5: 145
- Rodriguez-Concepcion M, D’Andrea L, Pulido P (2018) Control of plastidial metabolism by the Clp protease complex. J Exp Bot 70: 2049–2058
- Rudella A, Friso G, Alonso JM, Ecker JR, van Wijk KJ (2006) Downregulation of ClpR2 leads to reduced accumulation of the ClpPRS protease complex and defects in chloroplast biogenesis in Arabidopsis. Plant Cell 18: 1704–1721
- Sarnovsky R, Rea J, Makowski M, Hertle R, Kelly C, Antignani A, Pastrana DV, Fitzgerald DJ (2009) Proteolytic cleavage of a C-terminal prosequence, leading to autoprocessing at the N terminus, activates leucine aminopeptidase from Pseudomonas aeruginosa. J Biol Chem 284: 10243–10253
- Schilling O, Huesgen PF, Barré O, Overall CM (2011) Identification and relative quantification of native and proteolytically generated protein C-termini from complex proteomes: C-terminome analysis. Methods Mol Biol 781: 59–69
- Smith AM, Zeeman SC (2020) Starch: A flexible, adaptable carbon store coupled to plant growth. Annu Rev Plant Biol 71: 217–245
- Teixeira PF, Kmiec B, Branca RM, Murcha MW, Byzia A, Ivanova A, Whelan J, Drag M, Lehtiö J, Glaser E (2017) A multi-step peptidolytic cascade for amino acid recovery in chloroplasts. Nat Chem Biol 13: 15–17
- Tsirigotaki A, Van Elzen R, Van Der Veken P, Lambeir A-M, Economou A (2017) Dynamics and ligand-induced conformational changes in human prolyl oligopeptidase analyzed by hydrogen/deuterium exchange mass spectrometry. Sci Rep 7: 2456
- Tsuji A, Fujisawa Y, Mino T, Yuasa K (2011) Identification of a plant aminopeptidase with preference for aromatic amino acid residues as a novel member of the prolyl oligopeptidase family of serine proteases. J Biochem 150: 525–534
- Tsuji A, Kikuchi Y, Ogawa K, Saika H, Yuasa K, Nagahama M (2008) Purification and characterization of cathepsin B-like cysteine protease from cotyledons of daikon radish, Raphanus sativus. FEBS J 275: 5429–5443
- van Wijk KJ (2015) Protein maturation and proteolysis in plant plastids, mitochondria, and peroxisomes. Annu Rev Plant Biol 66: 75–111
- Wetzel CM, Harmacek LD, Yuan LH, Wopereis JL, Chubb R, Turini P (2009) Loss of chloroplast protease SPPA function alters high light acclimation processes in Arabidopsis thaliana L. (Heynh.). J Exp Bot 60: 1715–1727
- Xie Q, Michaeli S, Peled-Zehavi H, Galili G (2015) Chloroplast degradation: One organelle, multiple degradation pathways. Trends Plant Sci 20: 264–265
- Yadav P, Goyal VD, Gaur NK, Kumar A, Gokhale SM, Jamdar SN, Makde RD (2019) Carboxypeptidase in prolyl oligopeptidase family: Unique enzyme activation and substrate-screening mechanisms. J Biol Chem 294: 89–100
- Yamauchi Y, Ejiri Y, Sugimoto T, Sueyoshi K, Oji Y, Tanaka K (2001) A high molecular weight glutamyl endopeptidase and its endogenous inhibitors from cucumber leaves. J Biochem 130: 257–261
- Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y (2015) The I-TASSER suite: Protein structure and function prediction. Nat Methods 12: 7–8
- Young GH, Chen HM, Lin CT, Tseng KC, Wu JS, Juang RH (2006) Site-specific phosphorylation of L-form starch phosphorylase by the protein kinase activity from sweet potato roots. Planta 223: 468–478
- Yu Y, Mu HH, Wasserman BP, Carman GM (2001) Identification of the maize amyloplast stromal 112-kD protein as a plastidic starch phosphorylase. Plant Physiol 125: 351–359
- Zybailov B, Rutschow H, Friso G, Rudella A, Emanuelsson O, Sun Q, van Wijk KJ (2008) Sorting signals, N-terminal modifications and abundance of the chloroplast proteome. PLoS One 3: e1994