Abstract
Proline‐specific endoproteases have been successfully used in, for example, the in‐situ degradation of gluten, the hydrolysis of bitter peptides, the reduction of haze during beer production, and the generation of peptides for mass spectrometry and proteomics applications. Here we present the crystal structure of the extracellular proline‐specific endoprotease from Aspergillus niger (AnPEP), a member of the S28 peptidase family with rarely observed true proline‐specific endoprotease activity. Family S28 proteases have a conventional Ser‐Asp‐His catalytic triad, but their oxyanion‐stabilizing hole shows a glutamic acid, an amino acid not previously observed in this role. Since these enzymes have an acidic pH optimum, the presence of a glutamic acid in the oxyanion hole may confine their activity to an acidic pH. Yet, considering the presence of the conventional catalytic triad, it is remarkable that the A. niger enzyme remains active down to pH 1.5. The determination of the primary cleavage site of cytochrome c along with molecular dynamics‐assisted docking studies indicate that the active site pocket of AnPEP can accommodate a reverse turn of approximately 12 amino acids with proline at the S1 specificity pocket. Comparison with the structures of two S28‐proline‐specific exopeptidases reveals not only a more spacious active site cavity but also the absence of any putative binding sites for amino‐ and carboxyl‐terminal residues as observed in the exopeptidases, explaining AnPEP's observed endoprotease activity.
Keywords: endoprotease, fungal, molecular dynamics assisted docking, proline‐specific, proteomics, serine protease, α/β‐hydrolase fold
1. INTRODUCTION
Proline is the only amino acid in proteins with its side chain covalently bonded to its amino nitrogen, thereby forming a five‐membered pyrrolidine ring structure. Consequently, the peptide group preceding proline lacks the amide hydrogen atom, preventing the amide from acting as a hydrogen bond donor. In addition, the pyrrolidine ring considerably restricts the available conformations of the polypeptide chain around the proline residue, not only by limiting the main chain rotation about its N‐Cα bond, but also, because of steric hindrance, that of its Cα‐C bond and the main chain rotations of the preceding residue (MacArthur & Thornton, 1991). As a result, prolines are rarely observed in α‐helices and β‐strands, but more frequently in turns. Therefore, on average, prolines in proteins are more solvent accessible than would be expected based on their hydrophobic nature (Kay et al., 2000; Williamson, 1994). On the other hand, prolines may decrease the overall susceptibility of a polypeptide to proteolytic digestion, probably by reducing the flexibility of the polypeptide around the proline, which may prohibit local unfolding (Fu et al., 2009; Fu et al., 2010; Gass & Khosla, 2007; Kazlauskas, 2018; Markert et al., 2003). This resistance particularly applies to peptide bonds directly preceding (Xaa‐Pro) and directly following a proline (Pro‐Xaa). Examples of such resistant proline‐rich polypeptides include human elastin and mucins, as well as the gluten fraction of cereals like wheat, barley and rye. Yet, proline‐specific proteases that hydrolyse proline‐containing peptide bonds do exist; they have been found in mammals, plants, fungi, bacteria and, more recently, in insects (Cunningham & O'Connor, 1997; Dunaevsky et al., 2020; Lone et al., 2010; Rosenblum & Kozarich, 2003; Yaron & Naider, 1993).
Some bacterial and fungal proline‐specific proteases have been developed into convenient tools for application in the food industry (Baharin et al., 2022; Mika et al., 2015). Still, most of them have a reaction specificity that limits their application. For example, many of these are exopeptidases that remove only one or two amino acids from either the amino‐ or carboxyl‐terminal end. The amino‐terminal exopeptidases comprise aminopeptidases which cleave H‐Xaa‐|‐Pro bonds, proline aminopeptidases which cleave H‐Pro‐|‐Xaa bonds, and dipeptidyl aminopeptidases which cleave H‐Xaa‐Pro‐|‐Xaa bonds. The Pro‐|‐Xaa‐OH carboxypeptidases remove carboxyl‐terminal amino acids preceded by a proline, but not carboxyl‐terminal prolines. Prolyl endopeptidases on the other hand cleave their substrates away from the substrate termini, but most of the presently characterized prolyl endopeptidases are restricted to cleaving peptides of a maximum length of around 30 residues and are therefore commonly referred to as prolyl oligopeptidases (Camargo et al., 1979; Fülöp et al., 1998; Moriyama et al., 1988; Polgár, 2002; Rea & Fülöp, 2006; Rea & Fülöp, 2011; Szeltner & Polgár, 2008; van Elzen & Lambeir, 2011). True proline‐specific endoprotease activity, not limited by the size of the peptide substrates, appears to be rare. Nevertheless, such a protease has been isolated from a culture filtrate of an Aspergillus niger strain CBS 109712 (Edens, Dekker, et al., 2005; Takahashi, 2013a). Notably, this protease, abbreviated as AnPEP, was found to be active over a broad pH range extending from neutral pH 7 to strongly acidic pH 1.5, preferentially cleaving Pro‐|‐Xaa bonds and, to a lesser extent, Ala‐|‐Xaa bonds, with only a small number of cleavages behind other amino acids (Edens, Dekker, et al., 2005; Samodova et al., 2020; Sebela et al., 2009; Tsiatsiani et al., 2017; van der Laarse et al., 2020; van Schaick et al., 2021). 
It also accepts hydroxyproline at the P1 position (Edens, Dekker, et al., 2005; Samodova et al., 2020). Surprisingly, sequence alignments suggested that despite its endopeptidase activity, AnPEP does not belong to the S9 serine peptidase family containing the prolyl oligopeptidases, but instead to the S28 family, which contains mainly proline‐specific exopeptidases (Rawlings et al., 2016), albeit with low sequence identity. The crystal structures of two exopeptidases within this family, hPRCP and hDPP7, revealed a peptidase domain with a canonical α/β‐hydrolase fold and Ser/Asp/His catalytic triad, while a mainly α‐helical domain caps the active site (Bezerra et al., 2012; Soisson et al., 2010).
The recognition of the unique enzymatic properties of AnPEP has already resulted in a substantial number of applications for the enzyme. For example, AnPEP is applied to degrade proline‐rich peptides present in wheat and barley gluten (Akeroyd et al., 2016; Mitea et al., 2008; Montserrat et al., 2015; Tack et al., 2013; Wei et al., 2020). Since proline‐specific proteases are absent in human gastric and pancreatic secretions, such gluten peptides normally reach the duodenum intact. This can cause severe immunological responses in individuals suffering from celiac disease, an autoimmune disorder estimated to occur in 0.5%–1% of the human population (Gujral et al., 2012). Dietary addition of proline‐specific proteases from microbial or fungal sources helps to prevent such responses by breaking down the immunogenic peptides in the stomach. AnPEP turned out to be particularly suitable for this purpose in view of its acidic pH optimum and resistance to pepsin (Mitea et al., 2008; Montserrat et al., 2015; Tack et al., 2013). Another application of AnPEP is in the brewing industry (Edens, van der Laan, & Craig, 2005; Lopez & Edens, 2005), where it can be added directly to the fermentations to degrade the proline‐rich barley proteinaceous fragments which remain after mashing and cooking. This prevents the formation of colloidal precipitate that otherwise would result in haziness upon storage. Similarly, AnPEP has been used in soy sauce preparation, where it was shown to prevent precipitation of soybean globulin G4 under high‐salt conditions (Shan et al., 2022).
Recently, AnPEP has also been successfully used to digest deuterated proteins in Hydrogen‐Deuterium Exchange Mass Spectrometry, which is routinely used in the pharmaceutical industry to check the structural integrity of proteins (Tsiatsiani et al., 2017). Due to its preference to cleave behind prolines and its activity around pH 2, the enzyme was found to be complementary to the widely used pepsin, especially for analyzing proteins with a high proline content. Furthermore, AnPEP has proven to be a powerful tool for proteomics applications, increasing the sequence coverage when used in combination with trypsin (Samodova et al., 2020). Since many phosphorylation sites in proteins precede a proline, the use of AnPEP also results in a more accurate phosphorylation profiling, revealing several additional phosphorylation sites that are not observed with trypsin (van der Laarse et al., 2020).
To better understand the unique specificity of AnPEP and to extend its applicability even more, better insight into the structure–function relationship of this remarkable enzyme is desired. Very recently, the crystal structure of a highly homologous (96.3% sequence identity) proline‐specific endoprotease (PEP) from a different A. niger strain (ATCC 64974–102810) was published (Miyazono et al., 2022); a relatively wide catalytic pocket compared to the S28 exopeptidases was proposed to facilitate its endo‐specific cleavage of proteinaceous substrates. Here we present the three‐dimensional structure of AnPEP, the proline‐specific endoprotease from A. niger CBS 109712, with various PEG molecules bound in the active site. Furthermore, mass spectrometry was applied to follow the degradation of cytochrome c by AnPEP over time. Based on the observed primary cleavage site and molecular dynamics (MD) assisted docking studies, we propose an explanation for why the degradation of proline‐rich peptides and proteins by AnPEP is not limited by the length of the polypeptide substrate. In addition, we discuss the catalytic mechanism of AnPEP in light of its activity at very low pH, which is quite exceptional for a conventional serine protease.
2. RESULTS
2.1. The structure of AnPEP
At the time when the x‐ray diffraction data were collected, only the 3D structures of a lysosomal prolyl carboxypeptidase (hPRCP; PDB 3N2Z [Soisson et al., 2010]) and a dipeptidyl aminopeptidase DPP7 (hDPP7; PDB entries 3JYH, 3N0T, and 4EBB [Bezerra et al., 2012]) were available. The low sequence identity and variations in the architecture and orientation of the two domains thwarted a straightforward molecular replacement solution. Ultimately, the use of Rosetta structural modeling (mr_rosetta) was instrumental in achieving good enough phases for building the entire AnPEP structure. AnPEP crystallized in two crystal forms, AnPEP‐A and AnPEP‐B, with different space groups (I 2 3 and C 2 2 2₁, respectively). AnPEP‐A crystals contain one protomer in the asymmetric unit, and all amino acid residues of the mature protein (residues 42–526) were visible in electron density (Figure 1a, b). The α/β‐hydrolase domain of AnPEP consists of residues 42–206 and 449–526; although it resembles a prototypic α/β‐hydrolase fold (Ollis et al., 1992), helix αE between β7 and β8 is replaced by a short 3₁₀‐helix. Furthermore, a long insertion between β6 and β7 forms the cap domain of AnPEP (residues 207–448), creating a deep cavity at the interface of the two domains and giving access to the catalytic site. The cap domain consists of an SKS‐subdomain (Ollis et al., 1992; Soisson et al., 2010) followed by a region with mainly α‐helices connected by long loops (the SKS‐extension) and a short two‐stranded parallel β‐sheet. For AnPEP‐B crystals, the asymmetric unit contains two protomers; both are very similar to the one of AnPEP‐A, with root mean square deviations (RMSDs) of 0.36 and 0.38 Å, respectively (for Cα atoms). The packing of protomers in AnPEP‐B crystals is quite different; for example, the cap domain has fewer intermolecular interactions than in AnPEP‐A. According to the PISA server (Krissinel & Henrick, 2007), neither of the crystal forms suggests the formation of oligomers.
FIGURE 1.

The crystal structure of AnPEP. (a) Cartoon representation of the tertiary structure of AnPEP. The α/β‐hydrolase domain (residues 42–206 and 449–526) is shown in blue with the β‐strands in cyan. The cap domain starts with the SKS subdomain (residues 207–334, green) and continues with a more elongated segment (residues 335–448, yellow), the SKS extension. Glycan moieties and the catalytic triad are shown as sticks with green carbon atoms. The N‐ and C‐termini are marked by blue and red spheres, respectively. (b) Schematic diagram of the secondary structure of AnPEP according to DSSP. The residues forming the catalytic triad are indicated in red. The orange cylinders represent 3₁₀ helices, while the π‐helix is marked.
For AnPEP, seven N‐glycosylation sites were predicted, and at each of these (Asn residues 100, 162, 226, 244, 328, 355, 446) additional electron density was observed, which is consistent with previous biochemical studies showing that all sites are occupied by high‐mannose glycans (Sebela et al., 2009; van Schaick et al., 2021). A total of 20 sugar residues could be fitted into the electron density (Figure 1a). Although phosphorylation for some of the N‐glycans has been demonstrated for AnPEP (van Schaick et al., 2021), inspection of the electron density did not show any additional density consistent with an additional phosphate group. The 7 glycosylation sites are distributed evenly over the protein surface, but none of them is close to the active site. Of the 7 cysteine residues in the cap domain, 6 form 3 disulfide bridges (227–295, 336–350, 379–441), each connecting two helical segments. Remarkably, Cys227 forms a disulfide bridge with Cys295 and at the same time is part of the glycosylation motif Asn226‐Cys227‐Ser228. Electron density for the Cys336‐Cys350 in AnPEP‐A suggested that this disulfide bridge may be partly oxidized, but no such evidence was found for AnPEP‐B. The seventh cysteine, Cys493, is in its free thiol form and is located in the α/β‐hydrolase domain, near the catalytic triad. One cis‐proline is observed at position 382, allowing for a quite sharp turn between α‐helix 372–381 and β‐strand 384–386. Both AnPEP structures have Ramachandran outliers, but in all cases the electron density for these residues was unambiguous. In AnPEP‐A the single outlier is Phe384, adjacent to Tyr385 that lines the active site pocket, while its side chain is buried in the protein interior. AnPEP‐B also features this outlier (in both protomers); additional outliers, located far from the active site, are residues Asp266 (in both protomers) and Val402 (in protomer B).
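N‐glycosylation sites are generally predicted from the sequon Asn‐Xaa‐Ser/Thr, where Xaa is any residue except proline, which is why Asn226‐Cys227‐Ser228 qualifies as a glycosylation motif. As a minimal sketch of such a prediction, the Python snippet below scans a sequence for sequons. The function name and the toy peptide are hypothetical illustrations and are not taken from the AnPEP sequence or from the prediction tool used in this study.

```python
import re

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The zero-width lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy peptide (not AnPEP): NCS and NAT match, NPT does not.
toy_peptide = "ANCSGNPTNAT"
print(find_sequons(toy_peptide))  # prints [2, 9]
```

A sequon only marks a potential site; whether it is actually occupied has to be established experimentally, as was done here from the electron density and previous biochemical studies.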
2.2. Comparison with other S28 enzymes
The AnPEP‐A and ‐B crystal structures, determined at pH ≈ 5.0 and ≈6.0, respectively, closely resemble the recently published structure of the highly homologous PEP (PDB: 7WAB); despite an estimated crystallization pH of ≈2.5 for PEP and a different crystal packing, the superposition with AnPEP results in an RMSD of only 0.33 Å (for Cα atoms) with no noticeable differences in the main chain fold. Of the 18 amino acid residues that differ between the enzymes, 15 are outside the active site (SI Figure S1). Thr85, replacing an Asn, is relatively close to the active site serine (their side chain O atoms are 11.5 Å apart), but is buried in the protein interior. The closest and most striking differences are Tyr212 replacing a Phe and Tyr385 replacing a Trp, both at the same side of the central cavity. In addition, compared to the published PEP structure (Miyazono et al., 2022), we observed three additional N‐glycosylation sites (at Asn residues 162, 244, and 446).
Despite the low sequence identity (≈23%) with the human exopeptidases hPRCP and hDPP7 (SI Figure S2), the overall fold of AnPEP does resemble that of these S28 structures (SI Figures S1, S3), although superposition results in high RMSD values (4.3 Å). The sequence identity is lowest in the cap domain (11% with hPRCP; 18% with hDPP7); here, AnPEP features many insertions and deletions in the SKS extension region, resulting in a larger cap domain (242 residues, compared to 214 in hPRCP and 219 in hDPP7). Superposition based on the α/β‐hydrolase domains shows that the AnPEP cap domain has a somewhat different relative position compared to the human peptidases (not shown), while the cap domain of PEP has exactly the same relative position. Currently, it is impossible to conclude whether these are actual structural differences between the enzymes, or whether they reflect movements of the cap domains allowing for the opening and the closing of the active site cavity.
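The residue counts and the identity with PEP quoted in the text can be cross‐checked with simple arithmetic. The Python sketch below uses only numbers given in this article (domain boundaries and the 18 differing residues). Treating sequence identity as the fraction of identical positions over the mature protein length is a simplifying assumption; the published value may have been computed over a different alignment length.

```python
def span(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


mature = span(42, 526)                      # mature AnPEP
hydrolase = span(42, 206) + span(449, 526)  # two segments of the alpha/beta-hydrolase domain
cap = span(207, 448)                        # cap domain (SKS subdomain plus SKS extension)

# Assumption: identity = identical positions / length of the mature protein.
differing = 18
identity = 100.0 * (mature - differing) / mature

print(mature, hydrolase, cap)  # prints 485 243 242
print(round(identity, 1))      # prints 96.3
```

Under this assumption, the 18 substitutions correspond to the 96.3% identity stated for PEP, and the cap domain length matches the 242 residues given above.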
Besides a slightly different orientation of the cap domain, three regions of AnPEP contribute to differences in the shape and accessibility of the central cavity located between the α/β‐hydrolase domain and cap domain (Figure 2). These regions (sites 1–3) have been described for PEP by Miyazono et al. (2022) and are virtually identical in AnPEP, with the exception that in contrast to PEP, residues 92–98 (site 2) in the α/β‐hydrolase domain have well‐defined electron density and do not show very high B‐factors, suggesting this region is not particularly flexible in AnPEP. Sites 1 and 3 reside in the cap domain and correspond to regions where AnPEP has shorter loops than the two human exopeptidases (Figure 2a). First, in hDPP7 and hPRCP, the connection between α‐helices D4 and D5 of the SKS domain (site 1) features a short antiparallel β‐sheet that protrudes into the active site and significantly narrows it down (Figure 2b and c, respectively). In AnPEP, however, this connecting loop is only three residues long (residues 286–288), lacks secondary structure, and does not hinder access to the active site (Figure 2a). Second, at site 3, hDPP7 features two long loops (residues 311–319 and 324–340; Figure 2b) that protrude into the active site and considerably narrow it down. Of the corresponding loops in hPRCP, the second one is shorter (residues 347–352; Figure 2c) but lacked electron density; it is therefore not clear whether this loop would also protrude into the active site as in hDPP7. In AnPEP, however, the loops forming site 3 are absent or shorter, creating more space compared to the exopeptidases. Besides sites 1–3, we noted that in AnPEP α‐helix D4 has a kink (Figure 1b), creating some additional space at the back of the active site. Altogether, the catalytic triad at the bottom of the active site cavity of AnPEP is significantly more accessible than that of hDPP7 and hPRCP, providing more space for binding peptides downstream of the scissile bond.
FIGURE 2.

Comparison of the active site cavities of AnPEP, hPRCP (3N2Z), and hDPP7 (3JYH), using solvent/protein contact surfaces. For AnPEP (panel a), the color scheme of Figure 1 is used with the catalytic domain in blue (42–206, 449–526), the SKS domain in green (207–334) and the SKS extension in yellow (335–448). The surface of the catalytic triad is emphasized in red and the three sites described by Miyazono et al. (2022) are indicated. Panels b and c show hDPP7 and hPRCP, respectively, in the same orientation, showing how insertions in these proteases significantly reduce the access to the active site compared to AnPEP. The missing residues in hPRCP are indicated with a dotted line.
2.3. The catalytic site
The catalytic triad of AnPEP consists of the nucleophile Ser179, the base His491 and the acid Asp458 and resides at the bottom of a deep funnel‐shaped cavity at the interface of the α/β‐hydrolase domain and the cap domain (Figure 3); the cavity is about 18 Å deep, 18 Å long, and 13 Å wide. The predicted catalytic nucleophile Ser179 is part of the “nucleophile elbow” (Nardini & Dijkstra, 1999) between strand β5 and α‐helix C within the sequence motif GGS179YSG. Notably, three PEG molecules were fitted into the additional electron density in the active site of AnPEP‐A, namely a di‐, a tetra‐ and a penta‐ethyleneglycol (EG2, EG4, and EG5, respectively). At one end, the terminal hydroxyl moiety of EG4 is within hydrogen bonding distance of Ser179 Oγ and Glu88 Oε2, while the other end of this EG4 extends along the cleft, towards the cap domain, where a glycerol molecule is also bound. In AnPEP‐B, a tetra‐ethyleneglycol molecule was fitted in the active site of one of the two protomers; it superposes reasonably well with the tetra‐ethyleneglycol of AnPEP‐A, such that the terminal OH near Ser179 is almost at the same position (1 Å shift; not shown).
FIGURE 3.

Stereo figure of the active site of AnPEP (in AnPEP‐A crystals). The protein secondary structure is shown as cartoons, with the α/β‐hydrolase domain in blue and the cap domain in green. The catalytic triad (Ser179, His491, Asp458) and the residues forming the oxyanion hole (Glu88 Oε2, Tyr180 N) are shown as sticks with relevant hydrogen bonds (red dotted lines). The di‐, tetra‐, and penta‐ethyleneglycol are shown in stick representation (yellow carbons). One end of the tetra‐ethyleneglycol (EG4) is hydrogen bonded to Ser179 Oγ, as well as to Glu88 Oε2. Waters are shown as red spheres, including two important water molecules (red labels; W5 and W30).
Next to Ser179, AnPEP features a small pocket surrounded by four aromatic residues (Tyr180, Trp374, Tyr385, and Trp460), a proline (Pro205) and a glutamate (Glu88) (Figure 4); a superposition with PEP showed that Tyr385 replaces a Trp, while Glu88 has a different rotamer conformation. Furthermore, Tyr212 replaces a Phe that is directly behind Trp460. The pocket likely is the S1 pocket, with the side chain of Glu88 and the main chain amide of Tyr180 in a favorable position to form the oxyanion hole. Furthermore, we compared the active site of AnPEP with that of the two S28 exopeptidases, hPRCP and hDPP7 (Figure 4a), and with that of two S9 peptidases, a dipeptidyl aminopeptidase hDPP4 and a porcine prolyl‐oligopeptidase with peptides bound (Figure 4b) (Aertgeerts et al., 2004; Fülöp et al., 2001). SI Table S1 summarizes the residues of the catalytic triad and the oxyanion hole of these S28 and S9 enzymes. The superpositions show that the P1 prolines of the bound peptides and the piperidine ring of a bound inhibitor fit quite well in the postulated S1 pocket of AnPEP. Remarkably, the peptide substrates bound to the human dipeptidyl aminopeptidase hDPP4 and the porcine prolyl oligopeptidase are arranged in the same way as the tetra‐ and pentaethylene glycol molecules present in the active site of AnPEP. Together, these results strongly suggest that the architecture of the AnPEP cleavage site (catalytic triad, S1 pocket and oxyanion hole) is very similar to the known S28 and S9 peptidases.
FIGURE 4.

Catalytic triad, S1 pocket and oxyanion hole in families S28 and S9 proline‐specific proteases. (a) Superposition of S28 proteases: AnPEP (this study; brown sticks), PEP (PDB: 7WAB; orange), hPRCP (PDB: 3N2Z; cyan) and hDPP7 with bound inhibitor Dab‐Pip (L‐2,4‐diaminobutyryl‐piperidinamide) (PDB: 3N0T; pink). The S1 pocket of AnPEP is shown as a surface and is shaped by Tyr180, Pro205, Trp374, Tyr385, and Trp460. Residue labels that are different in PEP are indicated between brackets; in particular, the substitutions Y212F and Y385W line the active site pocket. (b) Superposition of AnPEP (this study, brown) with two S9 proteases: proline‐specific dipeptidyl aminopeptidase hDPP4 (PDB: 1R9N; forest green) and porcine prolyl oligopeptidase pPOP (PDB: 1E8N; purple), which contain respectively the hexapeptides Tyr‐Pro‐Ser‐Lys‐Pro‐Asp and Abz‐Gly‐Phe‐Gly‐Pro‐Phe‐Gly. Residue labels refer to AnPEP (black), hDPP4 (purple), and pPOP (green), respectively.
2.4. Charge‐relay network
The hydroxyl group of Ser179 forms a charge‐relay network with the catalytic base (His491) and catalytic acid (Asp458) (Figure 3). Asp458 Oδ1 is at hydrogen bonding distance of His491 Nδ1, while its Oδ2 accepts a hydrogen bond from Ser203 Oγ and from a water molecule (W30), which is anchored by two other hydrogen bonds provided by Gly455 N and Asn454 Nδ2. The water molecule and the hydrogen bonding pattern are conserved in all known S28 structures.
Superposition with the S9 peptidases (Figure 4b) shows that the carbonyl oxygen of the P1 proline of the bound peptides is located at hydrogen bonding distance of the Tyr180 N and Glu88 O atoms in AnPEP, confirming the previously postulated oxyanion hole (Bezerra et al., 2012). The syn‐lone pair of Glu88 Oε2 is directed towards the putative oxyanion. Thus, to stabilize the oxyanion, Oε2 should be protonated syn with respect to Oε1, which happens to be the most stable conformation for a protonated carboxyl group (Gandour, 1981; Sofronov et al., 2020). The carboxyl Oε1 of Glu88 is anchored by two hydrogen bonds, one donated by its backbone amide proton, the other donated by a water molecule (W5), which in turn is fixed by hydrogen bonds with the carbonyl oxygen of Pro86 and the backbone amide of Ser181 (Figure 3). In PEP, due to a different rotamer conformation, the side chain of Glu88 has fewer stabilizing hydrogen bonding interactions than in AnPEP. The water‐mediated hydrogen bond with Glu88 Oε1 in AnPEP replaces the hydrogen bond provided by an asparagine side chain which is present in hPRCP and hDPP7, while in AnPEP and PEP a glycine (Gly87) is observed at the corresponding position. The asparagine is quite typical for the S28 exopeptidases while glycine seems typical for the endoproteases. Thus, the water molecule (W5) in AnPEP could play a similar role as the asparagine in hPRCP and hDPP7, by anchoring the Oε1 of the glutamic acid so that its Oε2, when protonated, is properly aligned to stabilize the oxyanion upon the formation of the tetrahedral intermediate.
The catalyzed peptidase reaction depends strongly on the protonation states of the involved amino acid residues. The AnPEP x‐ray diffraction data were collected from crystals soaked in cryo‐protecting solutions with pH ≈ 5. Unfortunately, at the present resolution, hydrogen atoms cannot be located by x‐ray analysis. Thus, to infer the putative protonation state of titratable amino acids from the observed interactions between polar atoms in the crystal structure, the “cell neutralization experiment” within YASARA was applied (Krieger et al., 2012). The predicted pKa's for His491 and Glu88 are 9.2 and 4.7, respectively, suggesting a fully protonated His491 and a partially protonated Glu88 at pH ≈ 5, which results in the hydrogen bonding network observed in SI Figure S4A. These predictions contradict what would be expected for a fully active enzyme at pH ≈ 5, namely a deprotonated histidine and a protonated glutamic acid. However, it is not uncommon that pKa shifts for active site residues are not captured well by prediction methods (Alexov et al., 2011; Grimsley et al., 2009; Harris & Turner, 2002). We therefore repeated the neutralization experiment using manually assigned pKa values for His491 and Glu88 which are more in agreement with the observed pH dependence of the activity: a pKa of 4 for His491 and a pKa of 6 for Glu88. The resulting hydrogen bonding network is also compatible with the x‐ray structure (SI Figure S4B). When considering the strength of the hydrogen bond between Ser179 and His491, the x‐ray structure is most compatible with a deprotonated His491 accepting a hydrogen bond from Ser179 (SI Table S2, scenario B), which would allow complete transfer of the proton to His491 upon formation of the tetrahedral intermediate. However, the hydrogen bond between Glu88 and the hydroxyl moiety of EG4 is most compatible with an unprotonated Glu88 accepting the hydrogen bond from the hydroxyl moiety of EG4 (SI Table S2, scenario A). 
Therefore, the neutralization experiments are inconclusive regarding the protonation of the titratable residues in the active site of AnPEP at pH 5.
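To make the link between the pKa values and the protonation states explicit, the Henderson–Hasselbalch relation can be evaluated at pH 5 for both sets of pKa values mentioned in the text. The Python sketch below is only an illustration of that relation for isolated, non‐interacting groups; it is not the YASARA procedure, and the function and dictionary names are hypothetical.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a titratable group that carries its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PH = 5.0
# pKa values as quoted in the text; the labels are illustrative only.
pka_sets = {
    "predicted": {"His491": 9.2, "Glu88": 4.7},
    "manually assigned": {"His491": 4.0, "Glu88": 6.0},
}

for label, pkas in pka_sets.items():
    for residue, pka in pkas.items():
        frac = fraction_protonated(pka, PH)
        print(f"{label:18s} {residue}: {frac:.2f} protonated at pH {PH}")
```

With the predicted values, His491 is essentially fully protonated and Glu88 about one‐third protonated; with the manually assigned values, the situation is reversed (about 9% for His491 and 91% for Glu88), which is the state expected for an active enzyme.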
2.5. Substrate binding
As described above, the catalytic triad of AnPEP (and PEP) is the most accessible of the S28 structures elucidated so far. In contrast, the dipeptidyl aminopeptidases hDPP4 and hDPP7 have a narrower cavity; a short α‐helix extension comprising Glu205 and Glu206 in hDPP4 (Figure 5a), and Asp334, Thr336, and Gly337 in hDPP7 (Figure 5b), anchors the positively charged amino‐terminus of peptide substrates and blocks the binding of amino acids upstream of the P2 residue (Bezerra et al., 2012). AnPEP, PEP, and hPRCP lack these insertions, which allows the binding of longer peptides upstream of the scissile bond (SI Figure S5). Superposition of AnPEP onto hDPP4 and porcine prolyl oligopeptidase, both in complex with substrate peptides, placed AnPEP's Gln280 with its Nε2 at hydrogen bonding distance of the carbonyl oxygen of the substrates' P2 residues (Figure 5a). Because proline lacks the main chain NH group, it was suggested that such a hydrogen bond is essential for the proper orientation of the P1 proline in post‐proline cleaving proteases (Roppongi et al., 2018; Szeltner & Polgár, 2008; van Elzen & Lambeir, 2011). Yet, while Gln280 is conserved in homologous sequences with at least 50% sequence identity to AnPEP, both hPRCP and hDPP7 have a methionine at the corresponding position.
FIGURE 5.

The blockade in hDPP4 and hDPP7 that prevents binding upstream of the P2 residue is absent in AnPEP. (a) hDPP4 (PDB: 1R9N) with the hexapeptide Tyr‐Pro‐Ser‐Lys‐Pro‐Asp (forest green) superposed onto AnPEP (amino acids brown, cartoon coloring as in Figure 1). (b) hDPP7 (PDB: 3N0T) with its inhibitor Dab‐Pip (L‐2,4‐diaminobutyryl‐piperidinamide) (pink) superposed onto AnPEP (amino acids brown, cartoon coloring as in Figure 1). Hydrogen bonds fixing the amino termini of the substrates are shown as red dotted lines. The carbonyl oxygen of the scissile bond is at hydrogen bonding distance of AnPEP's Glu88 Oε2 and Tyr180 N while the Q280 Nε2 is at hydrogen bonding distance of the P2 carbonyl oxygen (red dotted lines).
Downstream of the scissile bond, towards the substrates' carboxyl‐terminus, the superimposed peptides have no specific interactions, although the hydroxyl groups of Tyr94, Tyr97, and Tyr496 might interact with the P1', P2', and P4' carbonyl oxygens, respectively (SI Figure S5). As one would expect for an endoprotease, AnPEP shows neither features that would prevent the binding of peptides beyond the P4' position nor a possible binding site that could accommodate a peptide's carboxyl‐terminus.
2.6. Electrostatic properties
With five aspartates, three glutamates and only one histidine, the active site cavity of AnPEP contains an excess of negative charges at pH values above the pKa's of the acidic amino acids, which usually have a pKa around 4.2. Furthermore, the entrance of the active site cavity is decorated with five aspartate and six glutamate residues while there is only one arginine; the published PEP structure has an extra glutamate (Glu498) but lacks one of the aspartates (Asn334) (SI Figure S1). The Poisson‐Boltzmann electrostatic potential of AnPEP, calculated with pKa's predicted by PROPKA (Olsson et al., 2011), shows a negative electrostatic potential for the whole active site region above pH 5.0 (Figure 6). Below pH 5.0, the active site slowly loses its negative charge; at pH 4.0 the region downstream of the scissile bond becomes positive while the upstream region (starting from S2) is either neutral or still slightly negative. At pH 2.5, the entire active site is positively charged (not shown). In contrast, the active sites of hPRCP and hDPP7 become positively charged already at pH values around 5.5.
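The qualitative trend described above, a negative active site at higher pH that turns positive at low pH, can be illustrated with a crude Henderson–Hasselbalch estimate of the net charge of the cavity residues. The Python sketch below is not the Poisson‐Boltzmann/PROPKA calculation used for Figure 6. It treats the five aspartates, three glutamates and one histidine of the cavity as independent groups, uses the generic acidic pKa of 4.2 mentioned in the text, and assumes a model pKa of 6.5 for histidine (an assumption, not a value from this study). All other charges are ignored, so only the direction of the change is meaningful.

```python
def net_charge(ph, n_acid=8, n_his=1, pka_acid=4.2, pka_his=6.5):
    """Approximate net charge of independent titratable groups at a given pH.

    n_acid   -- number of Asp + Glu residues (5 + 3 in the active site cavity)
    pka_his  -- assumed generic model value, not taken from the article
    """
    deprotonated_acid = 1.0 / (1.0 + 10.0 ** (pka_acid - ph))  # each contributes -1
    protonated_his = 1.0 / (1.0 + 10.0 ** (ph - pka_his))      # each contributes +1
    return n_his * protonated_his - n_acid * deprotonated_acid


for ph in (7.0, 5.0, 4.0, 2.5):
    print(f"pH {ph}: estimated net charge {net_charge(ph):+.2f}")
```

In this simplified model the estimated charge rises steadily as the pH drops and becomes slightly positive at pH 2.5, in line with the behavior of the full electrostatic calculation; the exact pH at which individual regions change sign depends on the environment‐specific pKa's and on residues not included here.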
FIGURE 6.

Poisson‐Boltzmann electrostatic potential as a function of pH for AnPEP, hPRCP, and hDPP4. The electrostatic potential is plotted onto the solvent excluded surface (Connolly surface) with a grid spacing of 0.50 and a range of ±5 k_BT/e_c. Negative potential is shown in red and positive potential in blue. The pKa's of titratable amino acids were calculated using PROPKA3. The active sites show the peptide Abz‐Gly‐Phe‐Gly‐Pro‐Phe‐Gly from porcine prolyl oligopeptidase (PDB: 1E8N) representing residues P4 (top) to P2' (bottom) in stick representation (green carbon atoms). For AnPEP, the glycosylation is also shown in stick representation (green carbon atoms).
2.7. The proteolysis of cytochrome c with AnPEP
To analyze the proteolytic degradation of an intact globular protein by AnPEP over time, horse heart cytochrome c was chosen as a model system. It is a relatively small protein of 104 amino acids, highly soluble and commercially available. It is stable up to boiling temperatures (Keilin & Hartree, 1937), and maintains its active conformation over a wide pH range (pH 2–12) (Goto et al., 1990; Miyashita et al., 2013; Paul, 1948). The 3D structure of horse heart cytochrome c has been determined with both x‐ray diffraction and NMR in a broad variety of conditions (SI Figure S6). Features which might make cytochrome c a less representative model substrate are the presence of a heme group, which is attached covalently to the protein by two thioether bonds involving Cys14 and Cys17, as well as the amino‐terminal acetylation. Horse heart cytochrome c contains four proline residues in the trans conformation (Figure 7). Pro44 and Pro76 are located in the second position of a type II β‐turn and are solvent accessible. Pro30 is located in the heme binding pocket and is completely shielded from the solvent by the heme, while Pro71 is only poorly accessible from the solvent.
FIGURE 7.

Peptide mass spectrometry (MS) intensities showing the progress of cytochrome c digestion by AnPEP as a function of time. The proteolytic digestion of horse heart cytochrome c by AnPEP was followed by UHPLC coupled to MS analysis. The first sample (t = 0) was taken immediately after the addition of AnPEP. Intensities were normalized with respect to the observed maximal intensity (100%). The cytochrome c sequence is shown for reference.
The proteolytic digestion of horse heart cytochrome c by AnPEP was followed in time (Figure 7). After 15 min, the fragments Gly1‐Pro44 and Gly45‐Glu104 were observed, covering the full cytochrome c, thus suggesting that the primary cleavage of cytochrome c occurs behind the solvent accessible Pro44. Fragment Gly77‐Glu104 also accumulated during the first 15 min, either through rapid digestion of fragment Gly45‐Glu104 or through a primary cleavage at Pro76. The first would yield the complementary fragment Gly45‐Pro76, the second the complementary fragment Gly1‐Pro76. Neither fragment was observed after the first 15 min. The early observation of fragment Gly1‐Pro30 could support the rapid digestion of fragment Gly1‐Pro76 but the absolute value of its MS intensity was very low, while fragments Asn31‐Pro44 and Gly45‐Pro76 only appeared later during the digestion. Most likely, the low MS intensity of fragment Gly1‐Pro30 is caused by precipitation due to the covalently linked heme, so that normalization overestimates its initial intensity. Therefore, besides the observed primary cleavage at Pro44, the present data do not exclude primary cleavage events at another proline or maybe even behind an alanine. For instance, in the first 15 min a fragment originating from cleavage behind Ala83 was also formed, most likely derived from the fragment Gly45‐Glu104 or Gly77‐Glu104.
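The fragment bookkeeping in this paragraph can be reproduced with a short Python sketch that lists the prolines of horse heart cytochrome c and the fragments expected from a cleavage behind a chosen set of residues. The sequence was transcribed here for illustration and should be verified against UniProtKB entry P00004 (mature chain) before reuse; the function names are hypothetical, fragments are labeled with one-letter codes, and the heme and the amino‐terminal acetylation are ignored.

```python
# Mature horse heart cytochrome c, 104 residues (transcribed for illustration;
# verify against UniProtKB P00004 before reuse).
CYT_C = (
    "GDVEKGKKIF" "VQKCAQCHTV" "EKGGKHKTGP" "NLHGLFGRKT" "GQAPGFTYTD"
    "ANKNKGITWK" "EETLMEYLEN" "PKKYIPGTKM" "IFAGIKKKTE" "REDLIAYLKK"
    "ATNE"
)


def proline_positions(seq=CYT_C):
    """1-based positions of all proline residues."""
    return [i + 1 for i, aa in enumerate(seq) if aa == "P"]


def cleave_after(positions, length=len(CYT_C)):
    """Fragments (start, end), 1-based inclusive, after cleavage behind each position."""
    cuts = sorted(set(positions))
    starts = [1] + [p + 1 for p in cuts]
    ends = cuts + [length]
    return list(zip(starts, ends))


def label(start, end, seq=CYT_C):
    """One-letter label such as 'G1-P44' for a fragment."""
    return f"{seq[start - 1]}{start}-{seq[end - 1]}{end}"


print("Prolines:", proline_positions())
for cuts in ([44], [76], [44, 76]):
    frags = [label(s, e) for s, e in cleave_after(cuts)]
    print(f"cleavage behind {cuts}: {frags}")
```

A primary cleavage behind Pro44 gives the two complementary fragments that together cover the full sequence, whereas a primary cleavage behind Pro76 would give Gly1‐Pro76 as the partner of Gly77‐Glu104.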
During the proteolytic digestion 94 different peptides were identified (details are given in SI Table S3 and SI Figure S7). For 23 peptide pairs the carboxyl‐terminal residue of one peptide matches with the amino‐terminal residue of another peptide in the sequence, confirming the endopeptidase activity of AnPEP. The other 48 peptides lack such a match and may have been formed by exopeptidase activity. As the digestion progressed and the initial Pro and Ala cleavage sites decreased, peptides with a carboxyl‐terminal residue different from proline or alanine increased. For example, 6 of the 9 available glutamates were observed, 3 of the 5 available asparagine residues and 8 of the 19 available lysine residues. Although AnPEP had been purified, it cannot be excluded that the sample contains minor traces of other A. niger proteases (e.g., aminopeptidases or carboxypeptidases) causing additional cleavages. After 2.5 h of digestion, the length of the peptides varied from 7 to 32 amino acids. The prolines were only observed at the C‐termini of the peptides, except in the case of Pro71, suggesting that cleavage of the peptide bond following Pro71 is less favored by AnPEP. An inventory of missed cleavages behind prolines in proteomics experiments using AnPEP suggested that AnPEP disfavors cleavage behind proline when the P2' position is occupied by a positively charged residue, and that it has a slight preference for a negatively charged residue in the P1' position (Tsiatsiani et al., 2017; van der Laarse et al., 2020). Note that in the sequence of cytochrome c, Pro71 is followed by two lysine residues (Figure 7) which would occupy the P1' and the P2' positions.
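The pairing criterion used above (the carboxyl‐terminal residue of one peptide directly precedes the amino‐terminal residue of another) can be sketched in Python as follows. Peptides are represented as (start, end) positions in the cytochrome c sequence; the example list is hypothetical and is not taken from SI Table S3, and how the authors resolved peptides that could belong to more than one pair is not stated, so the counting here is only one possible reading.

```python
def adjacent_pairs(peptides):
    """Return all pairs (a, b) where peptide a ends directly before peptide b starts."""
    by_start = {}
    for pep in peptides:
        by_start.setdefault(pep[0], []).append(pep)
    pairs = []
    for pep in peptides:
        for partner in by_start.get(pep[1] + 1, []):
            pairs.append((pep, partner))
    return pairs


def unmatched(peptides):
    """Peptides that take part in no adjacent pair (kept in input order)."""
    in_pair = set()
    for a, b in adjacent_pairs(peptides):
        in_pair.add(a)
        in_pair.add(b)
    return [pep for pep in peptides if pep not in in_pair]


# Hypothetical peptide list, for illustration only.
example = [(1, 44), (45, 104), (77, 104), (31, 44)]
print("pairs:", adjacent_pairs(example))
print("unmatched:", unmatched(example))

# Bookkeeping from the text: 94 peptides, 23 pairs of two peptides each.
print("peptides without a partner:", 94 - 2 * 23)
```

With 23 pairs of two distinct peptides each, 46 of the 94 peptides have a partner, which leaves the 48 unmatched peptides mentioned in the text.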
2.8. Modeling the possible interaction between AnPEP and cytochrome c
As suggested earlier, the hexapeptide bound to porcine brain prolyl oligopeptidase (PDB: 1E8N [Fülöp et al., 2001]) may be used as a template for modeling the binding of proteinaceous substrates to AnPEP. To get an impression of how AnPEP could bind cytochrome c with its primary cleavage site (behind Pro44) correctly positioned relative to the catalytic residues of AnPEP, the backbone atoms of cytochrome c residues 41–45 (Gly‐Gln‐Ala‐Pro‐Gly) were superimposed onto the corresponding atoms of the hexapeptide, resulting in an RMSD of 0.9 Å (SI Figure S8A). Apart from residues P4–P5', which fit quite well into the active site cavity of AnPEP, the rest of the cytochrome c molecule shows severe overlap with AnPEP, especially with residues 335–420 of the SKS extension. Therefore, to have the scissile bond correctly positioned for cleavage, either a major rearrangement of the cap domain, in particular of the SKS extension (335–448), and/or a partial unfolding of cytochrome c would be required. Regarding the latter option, it is noteworthy that both the x‐ray structures and the NMR solution structures of cytochrome c show that the region around Pro44 exhibits a relatively high mobility with respect to the rest of the protein substrate (SI Figure S6). However, the degree of unfolding required to correctly fit the scissile bond into the active site of AnPEP without any overlap has never been observed, neither in the native structures nor in the molten globule structure. The closest distance of the scissile bond to its ideal position for attack without any overlap between AnPEP and cytochrome c atoms is about 14.5 Å (SI Figure S8B).
Although cytochrome c has been reported to be stable down to pH 2 (Goto et al., 1990; Keilin & Hartree, 1937; Miyashita et al., 2013), it cannot be excluded that under the conditions used in the proteolysis experiment (100 mM NH4‐formate, pH 4.0) some local unfolding has occurred, enabling AnPEP to bind the loop containing the scissile bond at Pro44. For example, the type II β‐turn containing Pro44 may become accessible by partial unfolding of the region Gly34–Gly41 in cytochrome c. As the timescale of such an event is on the order of microseconds to milliseconds, such unfolding events would be hard to simulate by conventional MD. Therefore, to enable such a possible scenario within reasonable simulation time, the modeled complex between AnPEP and cytochrome c (SI Figure S8B) was subjected to a 50 ns MD run applying additional force to position Pro44 close enough to the catalytic serine to allow the nucleophilic attack on the carbonyl carbon of the scissile bond. While the additional forces carry the risk that the final result may not correctly represent what actually happens during the proteolysis, the simulation does show that a complete reverse turn with a proline at the S1 position fits in the active site cavity without the need for major conformational changes to widen the active site. The targeted distance of 3.5 Å for Ser179Oγ–Pro44C was observed after 20 ns of simulation. During the next 30 ns the target distance Ser179Oγ–Pro44C fluctuated around an average of 4.6 Å with a standard deviation of 0.4 Å. The simulation indicated that an accessible reverse turn of at least 12 residues with proline in the seventh position is required to put the scissile bond in a favorable position for nucleophilic attack by Ser179 (Figure 8). In the case of cytochrome c, a substantial local unfolding of the substrate is needed to make the loop accessible for cleavage by AnPEP, even though Pro44 is already quite accessible in the native cytochrome c.
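The form of the additional force is not specified in the text. As a minimal sketch of one common choice, the Python code below implements a harmonic distance restraint between two atoms (standing in for Ser179 Oγ and Pro44 C) and summarizes a distance series by its mean and standard deviation. The force constant, the coordinates, and the distance values are hypothetical and serve only to show the bookkeeping; they are not the settings or results of the simulation described here.

```python
import math
import statistics


def harmonic_restraint(a, b, target=3.5, k=10.0):
    """Energy and force on atom a for E = 0.5 * k * (d - target)**2.

    a, b   : (x, y, z) coordinates in angstrom
    target : restraint distance in angstrom (3.5 A as in the text)
    k      : assumed force constant (arbitrary units), not from the paper
    """
    d = math.dist(a, b)
    energy = 0.5 * k * (d - target) ** 2
    # Force on a is -dE/da; it points toward b whenever d exceeds the target.
    scale = -k * (d - target) / d
    force_on_a = tuple(scale * (ai - bi) for ai, bi in zip(a, b))
    return energy, force_on_a


def summarize(distances):
    """Mean and sample standard deviation of a distance time series."""
    return statistics.mean(distances), statistics.stdev(distances)


# Hypothetical starting geometry: the two atoms are 14.5 A apart.
energy, force = harmonic_restraint((0.0, 0.0, 0.0), (14.5, 0.0, 0.0))
print(f"restraint energy {energy:.1f}, force on first atom {force}")

# Hypothetical distance samples from the restrained part of a trajectory.
mean_d, sd_d = summarize([4.2, 4.6, 5.0, 4.4, 4.8])
print(f"mean distance {mean_d:.2f} A, standard deviation {sd_d:.2f} A")
```

A restraint of this kind pulls the two atoms together only while they are farther apart than the target distance, which is why the sampled distance can settle at an average somewhat above the target when other interactions oppose the approach.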
Currently, there are no indications that the domains forming the active site of AnPEP tend to be mobile allowing conformational changes that increase the access to the active site. First, the AnPEP molecules in the two observed crystal packings are perfectly superimposable despite a quite different crystal packing. Second, the B‐factor distributions of the crystal structures do not indicate that the cap domain, or parts of the cap domain, are more mobile than the catalytic domain. The same applies to the crystal structures of PEP, hPRCP and hDPP7 (SI Table S4). Third, the crystallization conditions and the crystal packing of PEP differ from AnPEP, while the structures match within 0.325 Å (RMSD). Clearly, the active site cavity of AnPEP as observed in the present X‐ray structure provides enough space to accommodate a proteinaceous loop, although the exact size of the reverse turn that is required may depend on the proteinaceous substrate.
FIGURE 8.

Model of the AnPEP–cytochrome c complex. Cytochrome c proline 44 was pulled into the S1 pocket of AnPEP during a steered MD run of 50 ns. The inset zooms into the active site showing hydrogen bonds as red dotted lines and distance Ser179 Oγ‐Pro44 C as black dotted line. Color coding of AnPEP is the same as in Figure 1, with the S1 pocket shown in red; cytochrome c is shown in brown (cartoon/sticks) and white surface.
3. DISCUSSION
The crystal structure of AnPEP presented here is very similar to the crystal structure of the PEP endoprotease from A. niger that was very recently published by Miyazono et al. (2022). This is not surprising in view of the high degree of amino acid sequence identity of 96% between the two proteins. AnPEP and PEP are both serine proteases that have been classified in serine peptidase family S28 and that both preferentially cleave peptide bonds after proline. Two other S28 family members have been structurally characterized, hPRCP (Bezerra et al., 2012; Soisson et al., 2010) and hDPP7 (Bezerra et al., 2012). hPRCP is a carboxypeptidase that removes single amino acids from the carboxyl‐terminal end of proteins with a proline at the penultimate position (Pro‐X) (Kumamoto et al., 1981; Odya et al., 1978), while hDPP7 is a dipeptidyl aminopeptidase that removes X‐Pro dipeptides from the amino‐terminal end of proteins. However, what makes AnPEP and PEP true endoproteases, while hPRCP and hDPP7 function as exopeptidases, has remained elusive. Our results on AnPEP now give clues to answer this question.
AnPEP and PEP have a broad and deep funnel‐like active site cavity with the catalytic residues (Ser179, His491, Asp458 of AnPEP) located at the bottom at about 18 Å from the bulk solvent. Near the catalytic triad, the active site cavity is much more spacious than the narrow substrate‐binding regions of the two exoproteases, particularly downstream of the scissile bond (Figure 2). This suggests that, rather than binding the terminus of a single peptide chain as do hPRCP and hDPP7, AnPEP and PEP may accommodate a complete reverse turn with a Pro residue at the tip. Support for this comes from our proteolytic digestion experiments of horse heart cytochrome c, which indicate that Pro44 of cytochrome c is the site of primary cleavage by AnPEP. Pro44 is the second residue of a type II β‐turn at the tip of a reverse turn of 12–14 residues (Figure 8). Modeling showed that this reverse turn can be fully accommodated in the active site region of AnPEP, such that Pro44 and the scissile peptide bond are productively bound near the catalytic residues, and the other 10–12 residues of the reverse turn interact with the wall of the active site funnel and each other (SI Figure S8). This is in contrast to the situation in other endoproteases like subtilisin and chymotrypsin, where only one extended peptide chain is bound near the active site. The results of the cytochrome c digestion experiments with AnPEP confirm its endo‐specific activity, and are in agreement with earlier studies describing the use of the enzyme in proteomics applications (Samodova et al., 2020; Tsiatsiani et al., 2017; van der Laarse et al., 2020).
While the crystal structures of AnPEP and PEP support their endopeptidase specificity, it proves more challenging to explain their low pH optimum/activity. AnPEP is still active around pH 1.5, in particular with proteinaceous substrates (Samodova et al., 2020; Tsiatsiani et al., 2017), while for the exopeptidases hPRCP and hDPP7 activity has been reported down to pH 4 with an optimum around pH 5 (Leiting et al., 2003; Odya et al., 1978). According to the generally accepted catalytic mechanism of serine proteases, the catalytic histidine needs to be deprotonated to accept a proton from the catalytic serine upon formation of the tetrahedral intermediate. The pKa of histidine residues in proteins is usually ≈6.5; therefore, it needs to be significantly lowered to make the S28 enzymes catalyze their reaction. In general, this can be achieved by specific hydrogen bond interactions, positive electrostatic potential and/or high hydrophobicity of the surrounding residues, disfavoring protonation (Connelly & McIntosh, 1998; Edgcomb & Murphy, 2002; Plesniak et al., 1996), but our combined results do not provide consistent answers in this regard. For example, a conserved tryptophan residue (Trp460 in AnPEP) (Figure 3) provides local hydrophobicity that may help to lower the pKa of the catalytic histidine. In addition, a nearby His/Arg dyad in hPRCP and hDPP7 has been proposed to have the same effect (Soisson et al., 2010); however, this dyad is replaced by Cys492/Tyr496 in PEP and AnPEP, suggesting it is unlikely that these residues play such a role. Analysis of the electrostatic surface potential is also not consistent with a decreased His pKa, since it was found to be negative around the active site (Figure 6), increasing the pKa of His491 to 6.7 (PROPKA [Olsson et al., 2011]) or even 9.2 (YASARA [Krieger et al., 2012]) instead of decreasing it.
Because empirical methods like PROPKA and YASARA are known to perform less well in predicting pKa's for amino acids at the active site, a method based on constant pH molecular dynamics (CpHMD) was also applied. Unlike empirical methods, CpHMD derives the pKa from titration curves where the ratio of the protonated and deprotonated states is estimated by MD simulations. In contrast to the empirical methods, CpHMD predicts a decreased pKa for His491, although the downward shift is small (5.9; SI Figure S9). Evaluation of the possible hydrogen bond networks compatible with the AnPEP crystal structure (pH 5) also favors a lower pKa, with a non‐protonated His491 accepting a hydrogen bond from Ser179, although it should be kept in mind that the results may be influenced by the fact that the catalytic triad is shielded from the solvent by the bound PEG molecules. For example, binding of the EG4 molecule in the active site of AnPEP may explain why it lacks a water close to the catalytic Ser179 observed in the PEP structure (W709), although it remains to be elucidated whether this water molecule plays a role in the catalytic mechanism. Otherwise, despite the less acidic pH of the AnPEP crystals (≈5.0) compared to that of the PEP crystals (≈2.5), the catalytic triad residues perfectly superimpose. Together, even with the crystal structures of AnPEP and PEP available, it remains puzzling how these enzymes achieve a much more acidic pH‐optimum than the conventional serine proteases with a conserved Ser‐His‐Asp catalytic triad.
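The size of the problem can be made explicit with a small Python sketch based on the Henderson-Hasselbalch relation. It computes the fraction of unprotonated histidine at pH 1.5 for the pKa values mentioned in the text (the typical value of 6.5 and the predictions of 6.7, 9.2 and 5.9) and the pKa that would be needed to reach a given unprotonated fraction. The target fraction of 10% is an arbitrary illustration, not a value from this study, and treating His491 as an isolated, independently titrating group is a simplifying assumption.

```python
import math


def fraction_unprotonated(ph, pka):
    """Fraction of a histidine side chain in the neutral (unprotonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def required_pka(ph, fraction):
    """pKa at which the given unprotonated fraction is reached at this pH."""
    return ph - math.log10(fraction / (1.0 - fraction))


PH_LOW = 1.5  # lowest pH at which AnPEP activity is reported in the text

for source, pka in (("typical His", 6.5), ("PROPKA", 6.7),
                    ("YASARA", 9.2), ("CpHMD", 5.9)):
    f = fraction_unprotonated(PH_LOW, pka)
    print(f"{source:12s} pKa {pka:.1f}: unprotonated fraction {f:.1e}")

# Arbitrary illustrative target: 10% unprotonated His at pH 1.5.
print(f"pKa needed for 10% unprotonated: {required_pka(PH_LOW, 0.10):.2f}")
```

Under these assumptions, even the lowest predicted pKa leaves only a very small fraction of His491 unprotonated at pH 1.5, which illustrates why a shift of several pH units would be needed.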
Although we have concluded earlier that the His/Arg pair near the catalytic triad of the two exopeptidases is likely not responsible for their acidic pH optimum, the His/Arg pair is strictly conserved in hPRCP and hDPP7 and their homologues with at least 50% sequence identity. Therefore, we propose instead that the His/Arg pair of the exopeptidases is involved in binding the carboxyl‐terminal carboxyl group of proteinaceous substrates, a binding site that has not yet been resolved in their structures. This would explain why the His/Arg pair is missing in true endopeptidases like AnPEP and PEP and their homologues, why hPRCP specifically exhibits carboxypeptidase activity, and why hDPP7 prefers to cleave tripeptides with proline at the second position (Kozarich, 2010; Maes et al., 2005; Maes et al., 2007; Waumans et al., 2015).
Besides the increase of the nucleophilicity of the serine hydroxyl group by the charge relay network in the catalytic triad, another crucial requirement for the reaction catalyzed by proteases/peptidases is the stabilization of the negatively charged tetrahedral intermediate by the oxyanion hole. In serine carboxyl peptidases (Oda, 2012; Takahashi, 2013b) and in the aspartic proteases (Coates et al., 2006; Das et al., 2010) an aspartic acid is believed to protonate the carbonyl oxygen of the scissile bond during formation of the tetrahedral intermediate, and to act as a general base in the subsequent formation of the acyl‐enzyme. This second charge relay system (in addition to the one involving the catalytic triad), is also believed to contribute to the acidic pH‐optima of the enzymes. In AnPEP, the corresponding residue in the oxyanion hole is a glutamic acid (Glu88), which likely can fulfill the same role in oxyanion stabilization. To our knowledge, the presence of a Glu (instead of Asp) residue is a distinctive feature of S28 proteases; more than 97% of the sequences within this family have a glutamic acid at the position corresponding to Glu88 in AnPEP, while the few characterized ones are all active in the acidic pH range. In support of the oxyanion hole forming role, the observed negative electrostatic potential around Glu88 (Figure 6) likely disfavors its deprotonation and increases its pKa from the commonly observed value of 4.5 (Harris & Turner, 2002) to higher ones. Indeed, the pKa values predicted by PROPKA and CpHMD are 6.2 and 7.7, respectively; this is also in agreement with the observed decline of AnPEP activity above pH 6 (Edens, Dekker, et al., 2005; Sebela et al., 2009; Tsiatsiani et al., 2017) and suggests that Glu88 forms a functional oxyanion hole throughout the pH range where AnPEP is active. 
It remains to be seen whether, at low pH, the full charge relay system of AnPEP, involving a somewhat unconventional oxyanion hole combined with a conventional catalytic triad, still functions in the same way as in conventional serine proteases.
Regarding substrate specificity of AnPEP, it has been reported that the proteome coverage strongly depends on whether the proteolytic digestion is performed at pH 2.0 or at pH 5.5 (Tsiatsiani et al., 2017). The marked increase in the number of peptides with a carboxyl‐terminal proline relative to other C‐termini at pH 2.0 suggests that the specificity for proline increases at lower pH (Samodova et al., 2020; Tsiatsiani et al., 2017; van der Laarse et al., 2020). Due to the hydrophobic nature of the S1 pocket, the binding of proline itself is unlikely to be very dependent on pH. However, pH does affect the electrostatic potential around the active site of AnPEP, from negative at pH 5.5 to positive at pH 2.0 (Figure 6), and this will affect the binding of proteinaceous substrates containing proline residues, not least because their electrostatic potential and conformational stability will also change. Lowering the pH thus seems to either increase the availability of prolines, promote the binding of proline to the S1 pocket, or demote the binding of non‐proline residues. The charge reversal could also cause the dip in the pH activity profile of AnPEP around pH 4.0; availability of additional cleavage sites in protein substrates at low pH could explain an increase of the overall activity after an initial decline (Tsiatsiani et al., 2017).
The cytochrome c digestion experiment shows that AnPEP efficiently degrades native proteins, starting at proline residues, followed by alanine residues and, when these sites get exhausted, cleavage occurs also behind other amino acids. Although the first cleavage occurs at one of the most accessible prolines in cytochrome c, it was not possible to place this proline in the S1 pocket of AnPEP without serious overlap between enzyme and substrate. MD simulation showed that a partial unfolding of cytochrome c is required to dislodge a loop of at least 12 residues, placing the P1 proline in its binding pocket (Figure 8). With such a scenario, further widening of the active site by conformational changes is not really required to explain the endo‐protease specificity of AnPEP.
This differs from what is observed for the prolyl oligopeptidases; their catalytic sites are also deeply buried and difficult to approach from the bulk solvent (Rea & Fülöp, 2011; Szeltner & Polgár, 2008; van Elzen & Lambeir, 2011). In contrast to the cap domain of AnPEP, the so‐called β‐propeller domain of the prolyl‐oligopeptidases completely occludes a large cavity at the domain interface and must rotate away from the catalytic domain to allow entry of the substrate, a move that disrupts the catalytic triad. To restore catalytic activity, the β‐propeller domain must rotate back enclosing the oligopeptide substrate. As a consequence, the proteolytic activity of prolyl oligopeptidase is limited to oligopeptides that fit the internal cavity after closure of the β‐propeller domain (Li et al., 2010; Rea & Fülöp, 2011). In the current 3D structure of AnPEP the catalytic site is intact, and the active site does not seem to limit the size of proteinaceous substrates, albeit there are certain requirements with respect to the accessibility of the proline. In this regard, one might wonder whether AnPEP could even promote the local unfolding of proline‐containing coils.
4. CONCLUSION
The overall structure of the endoprotease AnPEP is similar to the structures of the two proline‐specific exopeptidases hPRCP and hDPP7, belonging to the serine protease family S28. Its more spacious active site cavity and the absence of anchoring points for amino‐ or carboxyl‐terminal residues as observed in the exopeptidases explain its endoprotease activity, while structural features determining its proline specificity are maintained. However, for the peptide cleavage to occur, a freely accessible protein substrate loop is required to reach the proline‐binding pocket and the catalytic triad located at the bottom of a deep funnel‐like cavity. Our results show that the active site funnel of AnPEP can contain an oligopeptide fragment consisting of an incoming and an outgoing strand (totaling 12–14 amino acids), connected by a Pro‐containing reverse turn at its tip, with proline positioned in the S1 specificity pocket, requiring only limited conformational changes.
Finally, the pH dependence of AnPEP remains intriguing, especially regarding its proteolytic activity under highly acidic conditions. Compared to the conventional serine proteases like trypsin and subtilisin, the most striking difference in S28 peptidases is the stabilization of the oxyanion by a glutamic acid side chain (Glu88 in AnPEP). This glutamic acid may function as a general acid catalyst to protonate the carbonyl oxygen of the scissile bond upon formation of the tetrahedral intermediate, and as a general base upon subsequent formation of the acyl‐enzyme intermediate. Deprotonation of this glutamic acid would explain the loss of activity at neutral pH; however, the proteolytic activity under highly acidic conditions is less well understood. In conventional serine proteases, activity requires the catalytic histidine (His491 in AnPEP) to be unprotonated. Notably, AnPEP is still active down to pH 1.5, requiring a downward pKa shift of several pH units for the His491 side chain, but different computational approaches (PROPKA, YASARA, CpHMD) did not predict such a large shift. Although His491 is in a more hydrophobic environment and more shielded from solvent than in, for example, trypsin or subtilisin, modeling of the putative hydrogen bonding networks suggests that the current crystal structure at pH ≈ 5 would be compatible with an unprotonated His491 but does not provide any clues for such a major perturbation of the pKa. Further insight into pKa's of His491 and Glu88 might be provided by NMR spectroscopy approaches that enable the monitoring of the protonation behavior of ionizable groups in a more direct way, while site directed mutagenesis could shed light on the residues which are instrumental for the exceptional pH dependence of AnPEP.
5. MATERIALS AND METHODS
5.1. Crystallization and data collection
The A. niger acid prolyl endoprotease (AnPEP) was produced with an A. niger strain overexpressing the protease. The purification of AnPEP and the confirmation of its activity were performed as described earlier (Edens, Dekker, et al., 2005). The protease is secreted into the fermentation medium and contains 485 amino acids after removal of the signal sequence (22 amino acids) and the pro‐sequence (19 amino acids). SDS gel electrophoresis revealed a molecular weight of around 66 kDa. Isoelectric focusing of the purified enzyme showed an isoelectric point around pH 4.2.
AnPEP was crystallized from various conditions, most of them containing PEG as the main precipitating agent. Data collected from two crystal forms (hereafter named AnPEP‐A and AnPEP‐B), both obtained in hanging drop vapor diffusion experiments set up at 293 K, were used to determine the crystal structure. AnPEP‐A crystals were grown using a protein solution containing 12.5 mg/mL AnPEP in 20 mM HAc/NaAc buffer, pH 5.0, 50 mM NaCl; reservoirs contained 26%–29% (w/v) PEG 3350, 0.1 M sodium citrate buffer, pH 5.0. For the drops, 2.5 μL protein solution was mixed with 2.0 μL 16% (w/v) PEG 3350, 0.1 M ammonium tartrate, pH 7.2. This somewhat unusual setup (with different components in the drop and reservoir) arose from combining and optimizing the results from commercial screens such as PEG/Ion2 and JCSG+ (Hampton Research, Aliso Viejo, CA, USA). Before data collection, crystals were soaked for 3 min in a stabilizing solution containing 0.5 M KBr (final pH ≈ 5.0) and subsequently cryo‐protected for 20 s using 15% (v/v) glycerol and 0.3 M KBr. A multi‐wavelength anomalous diffraction (MAD) data set was collected at 100 K at the European Synchrotron Radiation Facility (ESRF, Grenoble, France), beamline ID14‐4, at three different wavelengths (peak, inflection point, remote) selected after a fluorescence scan of the crystal. Analysis of the diffraction data with SHELX (Sheldrick, 2010) indicated that the anomalous signal was too weak to be useful; therefore, only the data collected at the peak wavelength (0.9195 Å) were used (as a native data set) for structure determination.
AnPEP‐B crystals were obtained using 35 mg/mL protein in 20 mM HAc/NaAc, pH 5.0, 50 mM NaCl; drops contained 1.2 μL of this protein solution mixed with 0.8 μL of a reservoir solution containing 28%–32% (w/v) PEG 3000, 0.1 M sodium citrate buffer, pH 5.25 (final measured pH ≈ 6.0). Crystals were cryoprotected by including 15% (v/v) glycerol. A data set was collected at 100 K at beamline ID23‐2 of the ESRF.
5.2. Structure determination
Data sets of AnPEP‐A and AnPEP‐B were processed with iMosflm (Battye et al., 2011) and XDS (Kabsch, 2010), respectively, and analyzed by SCALA (Evans, 2006) or AIMLESS (Evans & Murshudov, 2013); further details and processing statistics are listed in Table 1. Despite extensive search model editing (domain separation, loop trimming, etc.) using available structural homologues, molecular replacement with PHASER (McCoy et al., 2007) was unsuccessful; a plausible solution was only achieved with the Phenix MR‐Rosetta protocol (DiMaio et al., 2013; DiMaio, Leaver‐Fay, et al., 2011; DiMaio, Terwilliger, et al., 2011) and AnPEP‐A data, yielding an initial model that contained about 70% of the residues. This partial model could be extended and completed using the AnPEP‐B data, combining Phenix Phase & Build (Adams et al., 2010), Buccaneer (Cowtan, 2006) and REFMAC5 restrained refinement protocols (Murshudov et al., 1997) iterated with manual rebuilding in COOT (Emsley et al., 2010) at a resolution of 2.69 Å. Finally, via molecular replacement, the partial AnPEP‐A structure could be completed at 2.42 Å resolution. The models for both crystal forms were validated using MolProbity (Chen et al., 2010) and PDB‐REDO (Joosten et al., 2012).
TABLE 1.
Crystallographic data.
| | AnPEP‐A | AnPEP‐B |
|---|---|---|
| PDB entry | 8B57 | 8BBX |
| Data collection | ||
| Wavelength (Å) | 0.9195 | 0.8726 |
| Space group | I 2 3 | C 2 2 21 |
| No. of molecules per AU | 1 | 2 |
| Cell dimensions a, b, c (Å) | 162.0, 162.0, 162.0 | 102.2, 156.2, 193.4 |
| Resolution (Å) | 66.24–2.42 (2.55–2.42) | 49.4–2.69 (2.79–2.69) |
| R merge a | 0.108 (0.636) | 0.136 (0.593) |
| <I/σ> a | 14.5 (3.7) | 4.9 (1.4) |
| Completeness (%) a | 100.0 (99.9) | 99.4 (95.3) |
| Redundancy a | 9.2 (9.1) | 4.0 (3.6) |
| Refinement | ||
| Resolution (Å) | 66.24–2.42 (2.55–2.42) | 49.39–2.70 (2.76–2.70) |
| Unique observations a | 27,027 (3897) | 42,704 (2144) |
| R/R free | 0.172/0.209 | 0.204/0.255 |
| Number of atoms | ||
| Protein | 3845 | 7876 |
| Carbohydrate/waters | 253/109 | 217/16 |
| Ligands | EG2, EG4, EG5, glycerol | EG4 |
| B‐factors | ||
| Protein (Å2) | 22.9 | 43.6 |
| Waters (Å2) | 35.0 | 32.0 |
| Carbohydrate (Å2) | 53.9 | 64.1 |
| Ligands (Å2) | 72.4 | 60.5 |
| Root mean square deviations | ||
| Bond lengths (Å) | 0.010 | 0.010 |
| Bond angles (°) | 1.40 | 1.16 |
| Ramachandran | ||
| Favored (%) | 96.1 | 96.1 |
| Allowed (%) | 3.5 | 3.4 |
| Outliers (%) | 0.2 | 0.5 |
a Values in parentheses are for the highest resolution shell.
5.3. The proteolytic digestion of cytochrome c with AnPEP
A 1 mg/mL horse heart cytochrome c solution was prepared by weighing the appropriate amount of ≥95% cytochrome c (cat. no. C2506, Sigma–Aldrich, Taufkirchen, Germany, UniProtKB entry P00004) and dissolving it in LC–MS grade water (VWR, Radnor, PA, USA). Ammonium formate buffer (100 mM) was prepared by dissolving 315 mg ≥99% ammonium formate (cat. no. 516961, Sigma–Aldrich, Taufkirchen, Germany) in 50 mL LC–MS grade water. The pH of the solution was adjusted to pH 4.0 using 98%–100% formic acid (Merck, Darmstadt, Germany). A 0.1 mg/mL incubation solution was prepared by adding 100 μL cytochrome c (1 mg/mL) to 900 μL ammonium formate (100 mM, pH 4.0). Shortly before use, AnPEP (batch 813475101, DSM, Netherlands) was diluted 1000‐fold in LC–MS grade water. Of this diluted enzyme solution, 2 μL was added to 1 mL cytochrome c incubation solution in an injection vial. The mixture was mixed thoroughly for 20 s and incubation was performed in an autosampler at 37°C. Every 15 min, 10 μL of the mixture was injected using a Dionex Ultimate 3000 UHPLC connected to an LTQ‐Orbitrap Fusion mass spectrometer (Thermo Fisher, Waltham, MA, USA). Intact cytochrome c and incubation products were separated on an Atlantis T3 Column, 100 Å, 3 μm, 2.1 mm × 100 mm (Waters, USA), with the column oven set to 50°C. Mobile phase A contained 0.1% formic acid in UHPLC–MS grade water (VWR, Radnor, PA, USA) and mobile phase B contained 0.1% formic acid in UHPLC–MS grade acetonitrile (VWR, Radnor, PA, USA). A gradient of 1%–50% B over 10 min was applied with a flow rate of 400 μL/min, followed by column washing at 70% B and column re‐equilibration at 1% B. The total runtime was 15 min.
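The quantities in this protocol can be checked with a few lines of Python. The sketch below recomputes the mass of ammonium formate needed for the buffer and the concentrations after mixing; the molar mass of ammonium formate (63.06 g/mol) is a standard value that is not given in the text, the function names are hypothetical, and volumes are assumed to be additive.

```python
AMMONIUM_FORMATE_MW = 63.06  # g/mol for NH4HCO2 (standard value, not from the text)


def solute_mass_mg(conc_mM, volume_mL, mw_g_per_mol):
    """Mass in mg: mM * mL gives micromoles, micromoles * g/mol gives micrograms."""
    return conc_mM * volume_mL * mw_g_per_mol / 1000.0


def diluted_conc(stock_conc, stock_volume, total_volume):
    """Concentration after dilution, assuming additive volumes."""
    return stock_conc * stock_volume / total_volume


# 100 mM ammonium formate in 50 mL
print(f"ammonium formate: {solute_mass_mg(100, 50, AMMONIUM_FORMATE_MW):.1f} mg")

# 100 uL cytochrome c (1 mg/mL) + 900 uL buffer (100 mM)
print(f"cytochrome c: {diluted_conc(1.0, 100, 1000):.2f} mg/mL")
print(f"buffer after mixing: {diluted_conc(100, 900, 1000):.0f} mM")

# 1000-fold pre-diluted enzyme, 2 uL added to 1 mL (1002 uL total)
overall_dilution = 1000 * (1002 / 2)
print(f"overall enzyme dilution: about {overall_dilution:,.0f}-fold")

# Amount of cytochrome c per 10 uL injection (enzyme volume neglected)
print(f"cytochrome c per injection: {diluted_conc(1.0, 100, 1000) * 10:.1f} ug")
```

The calculation reproduces the 315 mg of ammonium formate stated above and shows that mixing with the cytochrome c stock lowers the buffer concentration in the incubation to about 90 mM.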
Data‐dependent acquisition (DDA) was performed with a resolution setting at 120,000 within the 400–1600 m/z range and a maximum injection time of 100 ms in the Orbitrap, followed by higher‐energy collisional dissociation (HCD) MS/MS on the most abundant precursors using a maximum cycle time of 1 s (top speed method) detected in the ion trap using rapid scan rate. Isolation of the parent ions was performed in the quadrupole, and a collision energy of 27% and a fixed first mass of 120 m/z with a maximum injection time of 75 ms were used. The minimum intensity threshold for MS/MS was 1000 counts and peptide species with 1 and >8 charges were excluded. A list of all possible incubation products from cytochrome c with at least 5 amino acids was generated manually. All exact masses of the 1, 2, or 3 times charged species with a mass window of 10 ppm were extracted from the raw data using the program Pinpoint (Thermo Fisher, Waltham, MA, USA) and manually curated. Due to the nature and conditions of the mass spectrometry (MS) experiments, a quantitative comparison of different peptides is not possible. In order to follow individual peptides in time, it was assumed that the ionization of each peptide was constant during the digestion, although the changing composition of the reaction mixture during proteolysis might affect the ionization to a certain extent.
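The extraction of exact masses for the 1, 2, and 3 times charged species can be illustrated with the following Python sketch, which computes the monoisotopic m/z values of an unmodified peptide and a ppm window around each value. This is not the Pinpoint workflow itself. The residue masses are standard monoisotopic values; the 10 ppm window is treated here as ±10 ppm, which is an assumption, and the heme and the amino‐terminal acetylation present in some cytochrome c fragments are not included. The example peptide is the fragment Asn31‐Pro44 mentioned in the Results.

```python
# Standard monoisotopic residue masses (Da) of the 20 amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da, added once per peptide for the free termini
PROTON = 1.007276   # Da


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def mz(peptide, charge):
    """m/z of the [M + zH]z+ ion."""
    return (monoisotopic_mass(peptide) + charge * PROTON) / charge


def ppm_window(target_mz, ppm=10.0):
    """Lower and upper m/z limits; +/- ppm around the target is an assumption."""
    delta = target_mz * ppm * 1e-6
    return target_mz - delta, target_mz + delta


peptide = "NLHGLFGRKTGQAP"  # Asn31-Pro44 of horse heart cytochrome c
for z in (1, 2, 3):
    low, high = ppm_window(mz(peptide, z))
    print(f"z={z}: m/z {mz(peptide, z):.4f} (window {low:.4f} to {high:.4f})")
```

Extracted ion chromatograms for each window, summed over the charge states, would then give the intensity traces that are followed over time.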
5.4. Molecular modeling
The program PyMOL (The PyMOL Molecular Graphics System, Version 2.4.0 Schrödinger, LLC) was used for visualization and preparation of the figures showing the structural details (DeLano, 2020; DeLano & Bromberg, 2004); all figures were made using the AnPEP‐A structure. Unless indicated otherwise, superpositions in PyMOL were done with “CEalign,” which performs a sequence‐independent, structure‐based dynamic programming alignment (Shindyalov & Bourne, 1998). RMSDs between the structures were calculated with pair_fit, including the common residues for each pair as depicted in the multiple sequence alignment in SI Figure S2. Global sequence alignments and corresponding sequence identity calculations were done with NEEDLE (Needleman & Wunsch, 1970). The secondary structure was defined using DSSP (Kabsch & Sander, 1983). Electrostatics calculations were performed within PyMOL using the Adaptive Poisson‐Boltzmann Solver (APBS) and PDB2PQR software (Jurrus et al., 2018; Unni et al., 2011). The required pKa values for titratable amino acids were calculated with PROPKA 3.0 revision 182 (Olsson et al., 2011).
Energy minimizations were done with YASARA (Krieger & Vriend, n.d.; Krieger & Vriend, 2014; Krieger & Vriend, 2015). The AMBER14 force field was used (Maier et al., 2015), applying a 10 Å force cut‐off and the Particle Mesh Ewald algorithm (Essmann et al., 1995) to treat long‐range electrostatic interactions. After a short steepest descent minimization to remove conformational stress, the molecules were subjected to simulated annealing (timestep 2 fs, atom velocities scaled down by 0.9 every 10th step) until the energy improved by less than 0.05 kJ/mol per atom during 200 steps.
Constant pH MD simulations (CpHMD) were run to estimate the pKa of all titratable residues of AnPEP (Mongan et al., 2004). The initial protein structure was AnPEP‐A, with all non‐protein residues removed before carrying out the simulations. MD simulations were run using AmberTools22 (Case et al., 2022) with the MPI implementation of sander. Topologies were prepared with the tLEaP module using the ff99SB force field. Simulations were run for 1.0 ns at constant pH, from pH 0 to 7 in increments of 1. Trajectories were analyzed with cpptraj of AmberTools. The fraction of deprotonated species was used to fit a titration curve, from which a pKa was obtained.
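The text does not specify how the titration curve was fitted. As one possible illustration, the sketch below fits a Hill‐type titration curve, f = 1 / (1 + 10^(n(pKa − pH))), by linear regression on the logit of the deprotonated fraction. The fitting procedure, the function names and the synthetic input data are all assumptions made for this example; they are not taken from the CpHMD analysis itself.

```python
import math


def hill_fraction(pH, pKa, n=1.0):
    """Deprotonated fraction predicted by a Hill-type titration curve."""
    return 1.0 / (1.0 + 10.0 ** (n * (pKa - pH)))


def fit_pka(pH_values, fractions, eps=1e-6):
    """Estimate (pKa, Hill coefficient) from deprotonated fractions.

    Uses the linearized form log10(f / (1 - f)) = n * pH - n * pKa.
    Points with f outside (eps, 1 - eps) are skipped because the logit diverges.
    """
    xs, ys = [], []
    for pH, f in zip(pH_values, fractions):
        if eps < f < 1.0 - eps:
            xs.append(float(pH))
            ys.append(math.log10(f / (1.0 - f)))
    if len(xs) < 2:
        raise ValueError("need at least two partially titrated points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # Hill coefficient n
    intercept = mean_y - slope * mean_x    # equals -n * pKa
    return -intercept / slope, slope


if __name__ == "__main__":
    # Synthetic fractions at pH 0..7 for a hypothetical residue with pKa 4.2.
    pH_grid = list(range(8))
    synthetic = [hill_fraction(p, 4.2) for p in pH_grid]
    pKa, n = fit_pka(pH_grid, synthetic)
    print(f"pKa = {pKa:.2f}, Hill coefficient = {n:.2f}")
```

With real trajectories the fractions are noisy and may sit at exactly 0 or 1 for several pH values, so a nonlinear least‐squares fit on the untransformed fractions may be the more robust choice.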
To establish the protonation state of titratable amino acids in X‐ray structures at a given pH and to optimize the hydrogen bonding network, the “Cell neutralization and pKa prediction” experiment within YASARA was used (Krieger et al., 2012). The experiment adapts the pKa values of titratable residues depending on their local environment and accordingly determines their protonation state at a given pH. Next, the hydrogen bonding network was optimized while the heavy atoms were kept fixed. Finally, a short MD simulation was run for the solvent using the AMBER14 force field. NaCl ions were added at a physiological concentration of 0.9%, with an excess of either Na or Cl ions to neutralize the cell. The cut‐off was 10.5 Å for van der Waals forces. No cut‐off was applied to the electrostatic forces, and the Particle Mesh Ewald algorithm was employed. The hydrogen bond strength ($S_{\mathrm{HB}}$) was calculated from the distance between acceptor and hydrogen ($d_{\mathrm{AH}}$, in Å) and from two scaling factors, one depending on the angle at the hydrogen atom ($X_{\mathrm{DHA}}$) and one depending on the angle at the acceptor atom ($X_{\mathrm{HAX}}$), assuming 25 kJ/mol for an optimal hydrogen bond: $S_{\mathrm{HB}} = 25 \times 2 \times [2.6 - \max(d_{\mathrm{AH}}, 2.1)] \times X_{\mathrm{DHA}} \times X_{\mathrm{HAX}}$ (Krieger et al., 2012). Putative hydrogen bonds with a calculated strength of less than 6.25 kJ/mol were omitted.
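The hydrogen bond strength expression given above is simple enough to evaluate directly. The sketch below implements it as written, taking the two angle‐dependent scaling factors as inputs between 0 and 1 (how YASARA derives them from the angles is not described in the text and is not reproduced here). The clamp to zero beyond 2.6 Å and the function names are assumptions added for this illustration.

```python
OPTIMAL_HBOND_KJ_PER_MOL = 25.0  # energy assumed for an optimal hydrogen bond
MIN_REPORTED_KJ_PER_MOL = 6.25   # weaker putative bonds were omitted


def hbond_strength(d_ah, scale_dha, scale_hax):
    """Hydrogen bond strength in kJ/mol.

    d_ah      : acceptor-hydrogen distance in Angstrom
    scale_dha : scaling factor (0..1) for the angle at the hydrogen atom
    scale_hax : scaling factor (0..1) for the angle at the acceptor atom
    """
    distance_term = 2.0 * (2.6 - max(d_ah, 2.1))  # 1.0 at <= 2.1 A, 0.0 at 2.6 A
    distance_term = max(distance_term, 0.0)       # added guard: no negative energies
    return OPTIMAL_HBOND_KJ_PER_MOL * distance_term * scale_dha * scale_hax


def is_reported(d_ah, scale_dha, scale_hax):
    """True if the bond reaches the 6.25 kJ/mol reporting threshold."""
    return hbond_strength(d_ah, scale_dha, scale_hax) >= MIN_REPORTED_KJ_PER_MOL


if __name__ == "__main__":
    for d in (1.9, 2.1, 2.35, 2.5, 2.7):
        s = hbond_strength(d, 1.0, 1.0)
        print(f"d_AH = {d:.2f} A -> {s:5.2f} kJ/mol, reported: {is_reported(d, 1.0, 1.0)}")
```

With ideal angles, the 6.25 kJ/mol threshold corresponds to an acceptor–hydrogen distance of about 2.48 Å; less favourable angles lower the distance at which a bond is still reported.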
Molecular dynamics simulations were done within YASARA (Essmann et al., 1995; Krieger & Vriend, 2015; Maier et al., 2015). The setup included an optimization of the hydrogen bonding network to increase the solute stability, and a pKa prediction to fine‐tune the protonation states of protein residues at the chosen pH of 4.0 (Krieger et al., 2012), the pH of the cytochrome c digestion experiment. Dimensions of the rectangular simulation cell were 74.6 × 80.8 × 92.3 Å. NaCl was added at a physiological concentration of 0.9%, with an excess of either Na or Cl to neutralize the cell. The final system consisted of 57,098 atoms (AnPEP contributed 7850 atoms, cytochrome c contributed 1704 atoms, 15,795 water molecules, 33 sodium ions and 52 chloride ions). After steepest descent and simulated annealing minimizations to remove clashes, the simulations were run using the AMBER14 force field for the solute, GAFF2 (Wang et al., 2004) and AM1BCC (Jakalian et al., 2002) for ligands and TIP3P for water. The cut‐off was 8 Å for van der Waals forces; no cut‐off was applied to electrostatic forces (using the Particle Mesh Ewald algorithm [Krieger & Vriend, n.d.]). The equations of motion were integrated with multiple timesteps of 1.25 fs for bonded interactions and 2.5 fs for non‐bonded interactions at a temperature of 298 K and a pressure of 1 atm (NPT ensemble). The starting complex between AnPEP and cytochrome c was modeled manually by putting the P1 proline with its scissile bond into the proper position for nucleophilic attack by the catalytic serine of AnPEP; then the cytochrome c molecule was moved outward until there was no van der Waals overlap with AnPEP (SI Figure S8B). Next, the MD simulation was run until the RMSD of the complex had stabilized.
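As a rough plausibility check on the ion counts, the number of NaCl ion pairs expected in the simulation cell can be estimated from the cell dimensions. The sketch below assumes that 0.9% means 0.9% (w/v), that the whole cell volume counts as solvent, and a molar mass of 58.44 g/mol for NaCl. None of these assumptions is stated in the text, YASARA's own ion placement procedure may count differently, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23     # 1/mol
NACL_G_PER_MOL = 58.44       # standard molar mass, assumed here
LITRE_PER_CUBIC_ANGSTROM = 1e-27


def nacl_ion_pairs(cell_dims_angstrom, percent_wv=0.9):
    """Approximate number of NaCl ion pairs in a rectangular cell."""
    a, b, c = cell_dims_angstrom
    volume_litre = a * b * c * LITRE_PER_CUBIC_ANGSTROM
    molarity = percent_wv * 10.0 / NACL_G_PER_MOL  # % (w/v) -> g/L -> mol/L
    return molarity * AVOGADRO * volume_litre


pairs = nacl_ion_pairs((74.6, 80.8, 92.3))

# Excess of chloride over sodium reported for the final system.
chloride_excess = 52 - 33

print(f"about {pairs:.1f} ion pairs; chloride excess = {chloride_excess}")
```

Under these assumptions the estimate is about 52 ion pairs, close to the 52 chloride ions reported. If no other charged species are present, the excess of 19 chloride ions would correspond to a net solute charge of +19 at pH 4.0.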
To pull the loop containing the P1 proline (Pro44) back into the active site of AnPEP, harmonic restraints were added to the force field after 5 ns with the “AddSpring” command, which sets an additional spring stretching force constant (SFC). The length of the spring (LEN) was measured every 100 ps and gradually reduced by resetting it to 0.95 × LENobserved until the desired spring length was obtained. SFC was set to 25 N/m. When LENobserved < LENtargeted, the additional force was switched off by setting SFC = 0. The following distances were subjected to an additional force (desired target distances in parentheses): Ser179 Oγ – Pro44 C (3.5 Å), Glu88 Oε2H – Pro44 O (2.2 Å), Tyr180 OH – Pro44 O (2.2 Å), Pro205 Cγ – Pro44 Cβ (5 Å), and Pro205 Cγ – Pro44 Cγ (6 Å), where Pro44 belongs to cytochrome c and the other residues are part of AnPEP. To avoid distortion of the catalytic residues due to the pulling force, a restraint was also applied to the intra‐molecular distances Ser179 OγH – His491 Nε (2.0 Å) and Glu88 Oε1 – Glu88 NH (2.0 Å). Simulation snapshots were saved every 100 ps. To follow the simulation, snapshots were superposed onto the starting complex using all heavy protein atoms and the RMSD was plotted against the simulation time (SI Figure S10).
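The restraint schedule described above amounts to a simple update rule applied every 100 ps. The sketch below expresses that rule in plain Python and does not call YASARA. The function names are hypothetical, and the `response` parameter is a made‐up toy model of how far the observed distance relaxes toward the spring length between checks; it stands in for the actual MD and has no basis in the text.

```python
SFC_ON = 25.0   # N/m, spring stretching force constant while pulling
SHRINK = 0.95   # new spring length as a fraction of the observed length


def next_restraint(len_observed, len_target):
    """Return (spring_length, force_constant) for the next 100 ps interval.

    Once the observed distance is below the target, the spring is switched
    off (force constant 0) and no spring length is needed.
    """
    if len_observed < len_target:
        return None, 0.0
    return SHRINK * len_observed, SFC_ON


def run_schedule(len_start, len_target, response=0.5, max_checks=1000):
    """Apply the update rule until the spring is switched off.

    Returns a list of (observed_length, spring_length, force_constant),
    one entry per 100 ps check.
    """
    observed = len_start
    history = []
    for _ in range(max_checks):
        spring_len, sfc = next_restraint(observed, len_target)
        history.append((observed, spring_len, sfc))
        if sfc == 0.0:
            break
        # Toy stand-in for MD: move part of the way toward the spring length.
        observed += response * (spring_len - observed)
    return history


if __name__ == "__main__":
    # Hypothetical start distance of 10 A pulled toward the 3.5 A target
    # listed for Ser179 O-gamma to Pro44 C.
    steps = run_schedule(10.0, 3.5)
    print(f"{len(steps)} checks, final distance {steps[-1][0]:.2f} A")
```

Because each reset shortens the spring by only 5% of the current distance, the pull is gradual, which presumably limits distortion of the substrate loop while it is drawn into the active site.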
To describe the binding of peptides, the Schechter and Berger notation was used, where “P” refers to an amino acid position in the peptide while “S” refers to a binding site on the protease (Schechter & Berger, 1967).
AUTHOR CONTRIBUTIONS
Tjaard Pijning: Data curation; software; validation; writing – original draft; writing – review and editing; formal analysis; visualization; investigation; methodology; project administration. Andreja Vujičić‐Žagar: Investigation; data curation; formal analysis; visualization. Jan‐Metske van der Laan: Methodology; software; validation; investigation; data curation; formal analysis; visualization; writing – original draft; writing – review and editing; conceptualization; supervision. René M. de Jong: Conceptualization; methodology; software; validation; investigation; data curation; formal analysis; visualization; writing – review and editing; supervision; project administration. Carlos Ramirez‐Palacios: Writing – original draft; software; investigation. Andre Vente: Data curation; investigation; resources; formal analysis. Luppo Edens: Writing – original draft; conceptualization. Bauke W. Dijkstra: Writing – review and editing; formal analysis; investigation; validation.
CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest. Dr. René M. de Jong is affiliated with DSM‐Firmenich, a global company active in Nutrition, Health, Beauty & Bioscience.
Supporting information
Data S1. Supporting Information.
Figure S1. Structural differences between AnPEP (this work) and PEP.
Figure S2. Structure based sequence alignment of AnPEP, hPRCP, and hDPP7.
Figure S3. Comparison of AnPEP (left, this work), hPRCP (middle, PDB: 3N2Z), and hDPP7 (right, PDB: 4EBB) structures.
Figure S4. Possible hydrogen bonding networks involving His491, Ser179, EG4, and Glu88 at pH = 5, as determined by Glu88/His491 protonation.
Figure S5. Differences between the endo‐protease AnPEP and exoproteases hPRCP and hDPP7.
Figure S6. Structural variation observed for 3D structures of horse and bovine heart cytochrome c.
Figure S7. Progress curves for the formation and degradation of peptides upon the hydrolysis by AnPEP.
Figure S8. Modeling of the primary cleavage site of cytochrome c into the active site of AnPEP.
Figure S9. Constant pH molecular dynamics (CpHMD) graphs of titratable residues in AnPEP.
Figure S10. Root mean square differences (RMSD) during MD simulation of the AnPEP‐cytochrome c complex.
Table S1. The catalytic triad and the oxyanion hole of AnPEP and the corresponding residues in hPRCP, hDPP7, hDPP4, and porcine prolyl oligopeptidase (pPOP).
Table S2. pKa dependent hydrogen bonds between His491, Ser179, EG4, and Glu88 at pH = 5.
Table S3. The peptides identified with MS during the incubation of cytochrome c with AnPEP.
Table S4. Average B‐factors (Cα atoms) for the hydrolase and cap domains of S28 crystal structures.
ACKNOWLEDGMENTS
We thank the beamline staff of the European Synchrotron Radiation Facility (ESRF) for assistance in collecting X‐ray diffraction data.
Pijning T, Vujičić‐Žagar A, van der Laan J‐M, de Jong RM, Ramirez‐Palacios C, Vente A, et al. Structural and time‐resolved mechanistic investigations of protein hydrolysis by the acidic proline‐specific endoprotease from Aspergillus niger. Protein Science. 2024;33(1):e4856. 10.1002/pro.4856
Review Editor: Lynn Kamerlin
REFERENCES
- Adams PD, Afonine PV, Bunkóczi G, Chen VB, Davis IW, Echols N, et al. PHENIX: a comprehensive python‐based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr. 2010;66:213–221.
- Aertgeerts K, Ye S, Tennant MG, Kraus ML, Rogers J, Sang B, et al. Crystal structure of human dipeptidyl peptidase IV in complex with a decapeptide reveals details on substrate specificity and tetrahedral intermediate formation. Protein Sci. 2004;13:412–421.
- Akeroyd M, van Zandycke S, den Hartog J, Mutsaers J, Edens L, van den Berg M, et al. AN‐PEP, proline‐specific endopeptidase, degrades all known immunostimulatory gluten peptides in beer made from barley malt. J Am Soc Brew Chem. 2016;74:91–99.
- Alexov E, Mehler EL, Baker N, Baptista AM, Huang Y, Milletti F, et al. Progress in the prediction of pKa values in proteins. Proteins. 2011;79:3260–3275.
- Baharin A, Ting T, Goh H. Post‐proline cleaving enzymes (PPCEs): classification, structure, molecular properties, and applications. Plants (Basel). 2022;11:1330.
- Battye TGG, Kontogiannis L, Johnson O, Powell HR, Leslie AGW. iMOSFLM: a new graphical interface for diffraction‐image processing with MOSFLM. Acta Crystallogr D Biol Crystallogr. 2011;67:271–281.
- Bezerra GA, Dobrovetsky E, Dong A, Seitova A, Crombett L, Shewchuk LM, et al. Structures of human DPP7 reveal the molecular basis of specific inhibition and the architectural diversity of proline‐specific peptidases. PloS One. 2012;7:e43019.
- Camargo AC, Caldo H, Reis ML. Susceptibility of a peptide derived from bradykinin to hydrolysis by brain endo‐oligopeptidases and pancreatic proteinases. J Biol Chem. 1979;254:5304–5307.
- Case DA, Aktulga HM, Belfon K, Ben‐Shalom IY, Berryman JT, Brozell SR, et al. Amber 2022. San Francisco, CA: University of California; 2022.
- Chen VB, Arendall WB, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, et al. MolProbity: all‐atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr. 2010;66:12–21.
- Coates L, Erskine PT, Mall S, Gill R, Wood SP, Myles DAA, et al. X‐ray, neutron and NMR studies of the catalytic mechanism of aspartic proteinases. Eur Biophys J. 2006;35:559–566.
- Connelly GP, McIntosh LP. Characterization of a buried neutral histidine in Bacillus circulans xylanase: internal dynamics and interaction with a bound water molecule. Biochemistry. 1998;37:1810–1818.
- Cowtan K. The buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallogr D Biol Crystallogr. 2006;62:1002–1011.
- Cunningham DF, O'Connor B. Proline specific peptidases. Biochim Biophys Acta. 1997;1343:160–186.
- Das A, Mahale S, Prashar V, Bihani S, Ferrer J, Hosur MV. X‐ray snapshot of HIV‐1 protease in action: observation of tetrahedral intermediate and short ionic hydrogen bond SIHB with catalytic aspartate. J Am Chem Soc. 2010;132:6366–6373.
- DeLano WL. The PyMOL Molecular Graphics System, Version 2.3. 2020.
- DeLano WL, Bromberg S. PyMOL User's guide. 2004.
- DiMaio F, Echols N, Headd JJ, Terwilliger TC, Adams PD, Baker D. Improved low‐resolution crystallographic refinement with Phenix and Rosetta. Nat Methods. 2013;10:1102–1104.
- DiMaio F, Leaver‐Fay A, Bradley P, Baker D, André I. Modeling symmetric macromolecular structures in Rosetta3. PloS One. 2011;6:e20450.
- DiMaio F, Terwilliger TC, Read RJ, Wlodawer A, Oberdorfer G, Wagner U, et al. Improved molecular replacement by density‐ and energy‐guided protein structure optimization. Nature. 2011;473:540–543.
- Dunaevsky YE, Tereshchenkova VF, Oppert B, Belozersky MA, Filippova IY, Elpidina EN. Human proline specific peptidases: a comprehensive analysis. Biochim Biophys Acta Gen Subj. 2020;1864:129636.
- Edens L, Dekker P, van der Hoeven R, Deen F, de Roos A, Floris R. Extracellular prolyl endoprotease from Aspergillus niger and its use in the debittering of protein hydrolysates. J Agric Food Chem. 2005;53:7950–7957.
- Edens L, van der Laan JM, Craig HD. Chill haze formation: mechanisms and prevention by proline‐specific proteases. Brauwelt Int. 2005;3:157–161.
- Edgcomb SP, Murphy KP. Variability in the pKa of histidine side‐chains correlates with burial within proteins. Proteins. 2002;49:1–6.
- Emsley P, Lohkamp B, Scott WG, Cowtan K. Features and development of coot. Acta Crystallogr D Biol Crystallogr. 2010;66:486–501.
- Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG. A smooth particle mesh Ewald method. J Chem Phys. 1995;103:8577–8593.
- Evans P. Scaling and assessment of data quality. Acta Crystallogr D Biol Crystallogr. 2006;62:72–82.
- Evans PR, Murshudov GN. How good are my data and what is the resolution? Acta Crystallogr D Biol Crystallogr. 2013;69:1204–1214.
- Fu H, Grimsley G, Scholtz JM, Pace CN. Increasing protein stability: importance of DeltaC (p) and the denatured state. Protein Sci. 2010;19:1044–1052.
- Fu H, Grimsley GR, Razvi A, Scholtz JM, Pace CN. Increasing protein stability by improving beta‐turns. Proteins. 2009;77:491–498.
- Fülöp V, Böcskei Z, Polgár L. Prolyl oligopeptidase: an unusual beta‐propeller domain regulates proteolysis. Cell. 1998;94:161–170.
- Fülöp V, Szeltner Z, Renner V, Polgár L. Structures of prolyl oligopeptidase substrate/inhibitor complexes. Use of inhibitor binding for titration of the catalytic histidine residue. J Biol Chem. 2001;276:1262–1266.
- Gandour RD. On the importance of orientation in general base catalysis by carboxylate. Bioorg Chem. 1981;10:169–176.
- Gass J, Khosla C. Prolyl endopeptidases. Cell Mol Life Sci. 2007;64:345–355.
- Goto Y, Calciano LJ, Fink AL. Acid‐induced folding of proteins. Proc Natl Acad Sci USA. 1990;87:573–577.
- Grimsley GR, Scholtz JM, Pace CN. A summary of the measured pK values of the ionizable groups in folded proteins. Protein Sci. 2009;18:247–251.
- Gujral N, Freeman HJ, Thomson ABR. Celiac disease: prevalence, diagnosis, pathogenesis and treatment. World J Gastroenterol. 2012;18:6036–6059.
- Harris TK, Turner GJ. Structural basis of perturbed pKa values of catalytic groups in enzyme active sites. IUBMB Life. 2002;53:85–98.
- Jakalian A, Jack DB, Bayly CI. Fast, efficient generation of high‐quality atomic charges. AM1‐BCC model: II. Parameterization and validation. J Comput Chem. 2002;23:1623–1641.
- Joosten RP, Joosten K, Murshudov GN, Perrakis A. PDB_REDO: constructive validation, more than just looking for errors. Acta Crystallogr D Biol Crystallogr. 2012;68:484–496.
- Jurrus E, Engel D, Star K, Monson K, Brandi J, Felberg LE, et al. Improvements to the APBS biomolecular solvation software suite. Protein Sci. 2018;27:112–128.
- Kabsch W. XDS. Acta Crystallogr D Biol Crystallogr. 2010;66:125–132.
- Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen‐bonded and geometrical features. Biopolymers. 1983;22:2577–2637.
- Kay BK, Williamson MP, Sudol M. The importance of being proline: the interaction of proline‐rich motifs in signaling proteins with their cognate domains. FASEB J. 2000;14:231–241.
- Kazlauskas R. Engineering more stable proteins. Chem Soc Rev. 2018;47:9026–9045.
- Keilin D, Hartree EF. Preparation of pure cytochrome c from heart muscle and some of its properties. Proc R Soc Lond Ser B Bio Sci. 1937;122:298–308.
- Kozarich JW. S28 peptidases: lessons from a seemingly ‘dysfunctional’ family of two. BMC Biol. 2010;8:87.
- Krieger E, Dunbrack RL, Hooft RWW, Krieger B. Assignment of protonation states in proteins and ligands: combining pKa prediction with hydrogen bonding network optimization. Methods Mol Biol. 2012;819:405–421.
- Krieger E, Vriend G. YASARA view–molecular graphics for all devices–from smartphones to workstations. Bioinformatics. 2014;30:2981–2982.
- Krieger E, Vriend G. New ways to boost molecular dynamics simulations. J Comput Chem. 2015;36:996–1007.
- Krieger E, Vriend G, Spronk C. YASARA–Yet another scientific artificial reality application.
- Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J Mol Biol. 2007;372:774–797.
- Kumamoto K, Stewart TA, Johnson AR, Erdös EG. Prolylcarboxypeptidase (angiotensinase C) in human lung and cultured cells. J Clin Invest. 1981;67:210–215.
- Leiting B, Pryor KD, Wu JK, Marsilio F, Patel RA, Craik CS, et al. Catalytic properties and inhibition of proline‐specific dipeptidyl peptidases II, IV and VII. Biochem J. 2003;371:525–532.
- Li M, Chen C, Davies DR, Chiu TK. Induced‐fit mechanism for prolyl endopeptidase. J Biol Chem. 2010;285:21487–21495.
- Lone AM, Nolte WM, Tinoco AD, Saghatelian A. Peptidomics of the prolyl peptidases. AAPS J. 2010;12:483–491.
- Lopez M, Edens L. Effective prevention of chill‐haze in beer using an acid proline‐specific endoprotease from Aspergillus niger. J Agric Food Chem. 2005;53:7944–7949.
- MacArthur MW, Thornton JM. Influence of proline residues on protein conformation. J Mol Biol. 1991;218:397–412.
- Maes M, Lambeir A, Gilany K, Senten K, van der Veken P, Leiting B, et al. Kinetic investigation of human dipeptidyl peptidase II (DPPII)‐mediated hydrolysis of dipeptide derivatives and its identification as quiescent cell proline dipeptidase (QPP)/dipeptidyl peptidase 7 (DPP7). Biochem J. 2005;386:315–324.
- Maes M, Scharpé S, de Meester I. Dipeptidyl peptidase II (DPPII), a review. Clin Chim Acta. 2007;380:31–49.
- Maier JA, Martinez C, Kasavajhala K, Wickstrom L, Hauser KE, Simmerling C. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J Chem Theory Comput. 2015;11:3696–3713.
- Markert Y, Köditz J, Ulbrich‐Hofmann R, Arnold U. Proline versus charge concept for protein stabilization against proteolytic attack. Protein Eng. 2003;16:1041–1046.
- McCoy AJ, Grosse‐Kunstleve RW, Adams PD, Winn MD, Storoni LC, Read RJ. Phaser crystallographic software. J Appl Cryst. 2007;40:658–674.
- Mika N, Zorn H, Rühl M. Prolyl‐specific peptidases for applications in food protein hydrolysis. Appl Microbiol Biotechnol. 2015;99:7837–7846.
- Mitea C, Havenaar R, Drijfhout JW, Edens L, Dekking L, Koning F. Efficient degradation of gluten by a prolyl endoprotease in a gastrointestinal model: implications for coeliac disease. Gut. 2008;57:25–32.
- Miyashita Y, Wazawa T, Mogami G, Takahashi S, Sambongi Y, Suzuki M. Hydration‐state change of horse heart cytochrome c corresponding to trifluoroacetic‐acid‐induced unfolding. Biophys J. 2013;104:163–172.
- Miyazono K, Kubota K, Takahashi K, Tanokura M. Crystal structure and substrate recognition mechanism of the prolyl endoprotease PEP from Aspergillus niger. Biochem Biophys Res Commun. 2022;591:76–81.
- Mongan J, Case DA, McCammon JA. Constant pH molecular dynamics in generalized born implicit solvent. J Comput Chem. 2004;25(16):2038–2048.
- Montserrat V, Bruins MJ, Edens L, Koning F. Influence of dietary components on Aspergillus niger prolyl endoprotease mediated gluten degradation. Food Chem. 2015;174:440–445.
- Moriyama A, Nakanishi M, Sasaki M. Porcine muscle prolyl endopeptidase and its endogenous substrates. J Biochem. 1988;104:112–117.
- Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum‐likelihood method. Acta Crystallogr D Biol Crystallogr. 1997;53:240–255.
- Nardini M, Dijkstra BW. Alpha/beta hydrolase fold enzymes: the family keeps growing. Curr Opin Struct Biol. 1999;9:732–737.
- Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970;48:443–453.
- Oda K. New families of carboxyl peptidases: serine‐carboxyl peptidases and glutamic peptidases. J Biochem. 2012;151:13–25.
- Odya CE, Marinkovic DV, Hammon KJ, Stewart TA, Erdös EG. Purification and properties of prolylcarboxypeptidase (angiotensinase C) from human kidney. J Biol Chem. 1978;253:5927–5931.
- Ollis DL, Cheah E, Cygler M, Dijkstra B, Frolow F, Franken SM, et al. The alpha/beta hydrolase fold. Protein Eng. 1992;5:197–211.
- Olsson MHM, Søndergaard CR, Rostkowski M, Jensen JH. PROPKA3: consistent treatment of internal and surface residues in empirical pKa predictions. J Chem Theory Comput. 2011;7:525–537.
- Paul K. Stability of cytochrome c at extreme pH values. Acta Chem Scand. 1948;2:430–439.
- Plesniak LA, Connelly GP, Wakarchuk WW, McIntosh LP. Characterization of a buried neutral histidine residue in Bacillus circulans xylanase: NMR assignments, pH titration, and hydrogen exchange. Protein Sci. 1996;5:2319–2328.
- Polgár L. The prolyl oligopeptidase family. Cell Mol Life Sci. 2002;59:349–362.
- Rawlings ND, Barrett AJ, Finn R. Twenty years of the MEROPS database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 2016;44:343–350.
- Rea D, Fülöp V. Structure‐function properties of prolyl oligopeptidase family enzymes. Cell Biochem Biophys. 2006;44:349–365.
- Rea D, Fülöp V. Prolyl oligopeptidase structure and dynamics. CNS Neurol Disord Drug Targets. 2011;10:306–310.
- Roppongi S, Suzuki Y, Tateoka C, Fujimoto M, Morisawa S, Iizuka I, et al. Crystal structures of a bacterial dipeptidyl peptidase IV reveal a novel substrate recognition mechanism distinct from that of mammalian orthologues. Sci Rep. 2018;8:2714.
- Rosenblum JS, Kozarich JW. Prolyl peptidases: a serine protease subfamily with high potential for drug discovery. Curr Opin Chem Biol. 2003;7:496–504.
- Samodova D, Hosfield CM, Cramer CN, Giuli MV, Cappellini E, Franciosa G, et al. ProAlanase is an effective alternative to trypsin for proteomics applications and disulfide bond mapping. Mol Cell Proteom. 2020;19:2139–2157.
- Schechter I, Berger A. On the size of the active site in proteases. I Papain. Biochem Biophys Res Commun. 1967;27:157–162.
- Sebela M, Rehulka P, Kábrt J, Rehulková H, Ozdian T, Raus M, et al. Identification of N‐glycosylation in prolyl endoprotease from Aspergillus niger and evaluation of the enzyme for its possible application in proteomics. J Mass Spectrom. 2009;44:1587–1595.
- Shan P, Ho C, Zhang L, Gao X, Lin H, Xu T, et al. Degradation mechanism of soybean protein B3 subunit catalyzed by prolyl endopeptidase from Aspergillus niger during soy sauce fermentation. J Agric Food Chem. 2022;70:5869–5878.
- Sheldrick GM. Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Crystallogr D Biol Crystallogr. 2010;66:479–485.
- Shindyalov IN, Bourne PE. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng. 1998;11:739–747.
- Sofronov OO, Giubertoni G, de Alba P, Ortíz A, Ensing B, Bakker HJ. Peptide side‐COOH groups have two distinct conformations under biorelevant conditions. J Phys Chem Lett. 2020;11:3466–3472.
- Soisson SM, Patel SB, Abeywickrema PD, Byrne NJ, Diehl RE, Hall DL, et al. Structural definition and substrate specificity of the S28 protease family: the crystal structure of human prolylcarboxypeptidase. BMC Struct Biol. 2010;10:16.
- Szeltner Z, Polgár L. Structure, function and biological relevance of prolyl oligopeptidase. Curr Protein Pept Sci. 2008;9:96–107.
- Tack GJ, Van De Water JMW, Bruins MJ, Kooy‐Winkelaar EMC, van Bergen J, Bonnet P, et al. Consumption of gluten with gluten‐degrading enzyme by celiac patients: a pilot‐study. World J Gastroenterol. 2013;19:5837–5847.
- Takahashi K. Chapter 762: Acid prolyl endopeptidase. In: Barrett AJ, Rawlings ND, Woessner JF, editors. Handbook of Proteolytic Enzymes. Elsevier, Amsterdam; 2013a. p. 3446–3447.
- Takahashi K. Structure and function studies on enzymes with a catalytic carboxyl group (s): from ribonuclease T1 to carboxyl peptidases. Proc Jpn Acad Ser B Phys Biol Sci. 2013b;89:201–225.
- Tsiatsiani L, Akeroyd M, Olsthoorn M, Heck AJR. Aspergillus niger prolyl endoprotease for hydrogen‐deuterium exchange mass spectrometry and protein structural studies. Anal Chem. 2017;89:7966–7973.
- Unni S, Huang Y, Hanson RM, Tobias M, Krishnan S, Li WW, et al. Web servers and services for electrostatics calculations with APBS and PDB2PQR. J Comput Chem. 2011;32:1488–1491.
- van der Laarse SAM, van Gelder CAGH, Bern M, Akeroyd M, Olsthoorn MMA, Heck AJR. Targeting proline in (phospho)proteomics. FEBS J. 2020;287:2979–2997.
- van Elzen R, Lambeir A. Structure and function relationship in prolyl oligopeptidase. CNS Neurol Disord Drug Targets. 2011;10:297–305.
- van Schaick G, Domínguez‐Vega E, Gstöttner C, van den Berg‐Verleg JH, Schouten O, Akeroyd M, et al. Native structural and functional proteoform characterization of the prolyl‐alanyl‐specific endoprotease EndoPro from Aspergillus niger. J Proteome Res. 2021;20:4875–4885.
- Wang J, Wolf RM, Caldwell JW, Kollman PA, Case DA. Development and testing of a general amber force field. J Comput Chem. 2004;25:1157–1174.
- Waumans Y, Baerts L, Kehoe K, Lambeir A, de Meester I. The dipeptidyl peptidase family, prolyl oligopeptidase, and prolyl carboxypeptidase in the immune system and inflammatory disease, including atherosclerosis. Front Immunol. 2015;6:387.
- Wei G, Helmerhorst EJ, Darwish G, Blumenkranz G, Schuppan D. Gluten degrading enzymes for treatment of celiac disease. Nutrients. 2020;12:2095.
- Williamson MP. The structure and function of proline‐rich regions in proteins. Biochem J. 1994;297(2):249–260.
- Yaron A, Naider F. Proline‐dependent structural and biological properties of peptides and proteins. Crit Rev Biochem Mol Biol. 1993;28:31–81.