 One enzyme, one substrate, but two different reaction mechanisms: HIV-1 protease follows different reaction mechanisms depending on its instantaneous conformation.
Abstract
The role of conformational diversity in enzyme catalysis has been a matter of analysis in recent studies. Pre-organization of the active site has been pointed out as the major source for enzymes' catalytic power. Following this line of thought, it is becoming clear that specific, instantaneous, non-rare enzyme conformations that make the active site perfectly pre-organized for the reaction lead to the lowest activation barriers that mostly contribute to the macroscopically observed reaction rate. The present work is focused on exploring the relationship between structure and catalysis in HIV-1 protease (PR) with an adiabatic mapping method, starting from different initial structures, collected from a classical MD simulation. The first, rate-limiting step of the HIV-1 PR catalytic mechanism was studied with the ONIOM QM/MM methodology (B3LYP/6-31G(d):ff99SB), with activation and reaction energies calculated at the M06-2X/6-311++G(2d,2p):ff99SB level of theory, in 19 different enzyme:substrate conformations. The results showed that the instantaneous enzyme conformations have two independent consequences on the enzyme's chemistry: they influence the barrier height, something also observed in the past in other enzymes, and they also influence the specific reaction pathway, which is something unusual and unexpected, challenging the “one enzyme–one substrate–one reaction mechanism” paradigm. Two different reaction mechanisms, with similar reactant probabilities and barrier heights, lead to the same gem-diol intermediate. Subtle nanosecond-timescale rearrangements in the active site hydrogen bonding network were shown to determine which reaction the enzyme follows. We named this phenomenon chemical disorder. The results make us realize the unexpected mechanistic consequences of conformational diversity in enzymatic reactivity.
Introduction
As is the case with very large molecules, enzymes have a very large number of internal degrees of freedom. Their structures fluctuate significantly over time at physiological temperature, resulting in many interchanging conformations.1–3 There are many enzyme conformations, well populated, which can be the starting point of the corresponding catalytic reactions. In the same way, these reactant conformations may lead to transition states and products with different geometries, or different energetics for similar transition states and products. Due to the time scale of typical bond-breaking and bond-forming processes (femtoseconds), enzyme conformations remain mostly unchanged during the event of barrier crossing. The experimental reaction rate is usually measured on macroscopic amounts of proteins and over macroscopic time scales, and reflects a weighted average over all these possible barriers arising from different instantaneous reactant conformations. Single-molecule kinetic measurements confirmed this picture.4 As these experiments measure chemical turnovers, the diversity of the observed activation barriers was limited to those that are crossed during the time of the experiment. It is tempting to speculate that, in these experiments, instantaneous barriers that lead to turnovers slower than the experimental timescale will always pass unnoticed, as the enzyme will change to lower-barrier conformations before a barrier crossing can be observed. Enzymatic studies using computational methods may easily be performed on single structures to measure barriers that are insurmountable in the time scale of experiments and clearly isolate this effect from the macroscopic average, something that experimentally is very difficult to do.
The influence of enzyme conformations on reactivity
Active site pre-organization has emerged as the most relevant factor behind the origin of an enzyme's catalytic power.5,6 Chemical reactions in enzymes only occur on physiologically useful timescales when the active site residues are in a suitably pre-organized position/orientation to promote the reaction; otherwise the energetic barriers will be very high, and the reaction will not take place. Enzymes are very proficient in keeping the residues in an orientation that specifically stabilizes the transition structures, as they fold in a way to lock them in those orientations. However, the position of the key residues and, in particular, the distances between the residues and the reacting substrate atoms still fluctuate significantly in the ps–ns timescale, and even at larger timescales due to greater global enzyme movements, caused by thermal motion. This strong dependence of the reactivity on specific enzyme conformations has been demonstrated by different computational studies, in which the activation barriers were calculated for several different initial conformations of the same enzyme.7–16 For example, in ketosteroid isomerase, the barrier changes by about 20 kcal mol–1 due to a structural variation in the active site, which determines its closure and, consequently, the progress of the catalyzed reaction.12 In fluoroacetate dehalogenase, twenty snapshots in three different systems were used, and it was found that the barrier of each reaction varies by up to 15 kcal mol–1. These energetic fluctuations were associated with structural parameters of the active site.13 In the reaction catalyzed by P450, variations up to 17 kcal mol–1 were also found.14 Our previous study on α-amylase showed that the position of a buried water molecule strongly influences the activation barrier for this enzyme. Values from 9.3 to 28.6 kcal mol–1 were calculated.
The fluctuations in the barrier were caused, to a very significant extent, by the fluctuations in the position and orientation of this water molecule and of the two active-site reactive residues.7 These are only a few of a number of studies in which this phenomenon was studied in detail. Combining all these barriers into a single, observed one is a matter of intense research.8,17–19 The main difficulty lies in the extensive sampling that is needed to obtain good statistical convergence in the barrier height and to determine the contribution of each of the individual barriers to the final observed barrier. The thermodynamic ensembles generated by molecular mechanics force field based molecular dynamics (MM-MD) simulations from which the structures are extracted can be well balanced, but there is no guarantee that this good balance will be retained when the ensemble is reduced to a few tens of snapshots and the MM force field is transformed into a QM/MM Hamiltonian. However, the objective of these studies is not to calculate an accurate value for the “macroscopic barrier” but instead to shed light on the fine structural requirements for a reaction to take place with a low activation barrier. Other methods, such as QM/MM MD, implicitly deal with and average these instantaneous barrier fluctuations.
However, in these cases, the Hamiltonian used to describe the catalytic region is usually simplified, and the size of the QM region is reduced, in order to calculate the energies of very many structures generated by these methods.20–25 Recent state-of-the-art studies have already calculated potentials of mean force with high-level hybrid–meta Hamiltonians, even though the size of the QM region is still reduced and the MD time, despite being remarkable, is still smaller than the timescale of many enzyme structural fluctuations, due to insurmountable CPU limitations.26 In these excellent studies, the effects of conformational diversity are accounted for implicitly, through the sampling made in each point along the reaction coordinate, and thus not used to gain understanding of the nature of the fluctuations in the interactions that stabilize the transition structures, which is the major purpose of this study and other studies of this kind.
In the present work, we have studied the first, rate-limiting step of the catalytic mechanism of HIV-1 PR (Fig. 1). This reaction step was found to take place through either the One Asp mechanism (Fig. 1, top) or the Two Asp mechanism (Fig. 1, bottom), depending on the reactant's conformation. We have thus tried to identify correlations between enzyme:substrate structural fluctuations and the reaction mechanism (and reaction rate), to understand the factors that led to this mechanistic divergence.
Fig. 1. The first step of the One Asp (top) and Two Asp (bottom) catalytic mechanisms on the carbonyl carbon of the substrate scissile bond, forming a tetrahedral intermediate. In the One Asp mechanism, the same aspartate residue (AspB) deprotonates the hydrolytic water molecule and protonates the peptide oxygen – one carboxylate oxygen acts as an acid and the other as a base. In the Two Asp mechanism, which is most widely accepted, one of the two active site aspartates (AspB) protonates the oxyanion generated by the water attack, while the other (AspA) deprotonates the nucleophilic water. Conventionally, AspB is the residue that is considered to be protonated – the two chains are equivalent.
Earlier studies27–29 systematically favored the Two Asp mechanism, in which the Asp25 from one chain acts as a base and the Asp25 from the other chain acts as an acid, over the One Asp mechanism, in which a single Asp25 takes on both roles. Here we investigate the extent to which the preference for the Two Asp mechanism could have been caused by the insufficient conformational space explored in these earlier studies.
To do so, one of the possible approaches is to sample several reaction paths with QM(DFT)/MM methods, using different initial structures of the enzyme:substrate complex gathered from a long MM-MD simulation, which can span much larger timescales than a QM/MM MD. In that sense, ONIOM QM/MM calculations were performed on HIV-1 PR, starting from 19 different initial structures taken from a 130 ns MM-MD simulation, as well as the X-ray structure. This system was selected for two main reasons: (a) the size of this enzyme is small, and its structure is relatively simple and well known, composed of two identical amino acid chains, with 99 residues in each one; (b) its catalytic mechanism is well studied, both experimentally and theoretically, and can easily be transposed to similar, also well-studied, aspartic proteases.8,15,29
The results were analyzed focusing on the understanding of the relationship between reaction pathways and activation energies obtained from each reactant conformation, and specific interactions that occur in that same conformation. We related the obtained activation barriers with structural parameters of each conformation, analyzing in particular the main enzyme–substrate distances of the active site. The results showed that the fine-level structural organization of the active site hydrogen bonding network is determinant to define the pathway of the chemical reaction. Different conformations of the active site led not only to different barriers but also to two different reaction paths with comparable barrier heights. This kind of mechanistic divergence took place at the nanosecond timescale in which the conformations were sampled.
Methods
The computational protocol used in this work was very similar to the one used in one of our previous studies:30 we started by modeling the enzyme–substrate complex using the 4HVP PDB structure,31 added hydrogen atoms to the structure with the software xleap32 (standard protonation states were predicted by PropKa,33 except for Asp25B, which is known to be neutral at physiological pH), inserted the system into a pre-equilibrated rectangular water box whose faces were at least 12 Å away from any protein atom (details in the ESI†), equilibrated the system with a short MM-MD heating run, from 0 to 300 K in 40 ps, and subsequently performed a 130 ns MM-MD simulation in the NPT ensemble to sample the conformational space of the system. All the modeling and MD details can be found in the ESI.† Next, we studied the first step of the HIV-1 protease catalytic mechanism using a QM/MM methodology, starting from 30 different structures, collected from the latter MM-MD simulation. Of these, 19 could be characterized with full convergence criteria, and on these we performed a structural analysis of the active site residues, correlating their structural fluctuations with the obtained activation energies. Different snapshots from the MM-MD simulation were taken based on the simulation time. We started by selecting some structures from the initial nanoseconds of the MM-MD simulation. However, the results associated with these structures were discarded due to very high barriers. We associated these results with an incorrect position of the catalytic water molecule (not present in any X-ray structure), modeled in the active site by us. Taking this into account, we decided to select well-equilibrated snapshots at regular intervals (1 ns) subsequent to the first 100 ns of MM-MD simulation.
Fig. S1† shows the distribution of energies of the reactant state in the MM-MD simulation (grey bars). The purple line indicates how many structures from each MM-MD energy range were extracted, as a full protein:substrate plus water shell model, and changed to the QM/MM level, to calculate the reaction mechanism and activation barriers. As can be seen, the structures taken from the MM-MD simulation belong to well-populated areas of the MM-MD ensemble and follow the MM-MD distribution reasonably well, even though direct comparison should be made with caution due to the change in the system's Hamiltonian. Both distributions do not need to be “superimposable”, but only “reasonably similar”. In fact, the reaction rate depends linearly on the conformation populations but exponentially on the activation free energy and, therefore, differences between the distributions only have an impact on the reaction rate if they amount to orders of magnitude.
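The linear versus exponential dependence mentioned above can be illustrated with a short sketch. It assumes a simple Arrhenius/Eyring-type exponential dependence of the rate on the barrier and a temperature of 300 K (the MM-MD temperature); the function names are hypothetical and the snippet is only an illustration, not part of the protocol used in this work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def rate_factor_from_barrier_drop(delta_kcal: float, temperature: float = 300.0) -> float:
    """Factor by which the rate grows when the barrier drops by delta_kcal.

    Assumes rate ~ exp(-barrier / RT) with an unchanged prefactor.
    """
    return math.exp(delta_kcal / (R_KCAL * temperature))


def rate_factor_from_population(p_new: float, p_old: float) -> float:
    """Factor by which the rate grows when a conformation's population changes.

    Assumes the rate contribution is simply proportional to the population.
    """
    return p_new / p_old


if __name__ == "__main__":
    # A barrier drop of 1.4 kcal/mol versus a doubling of the population.
    print(rate_factor_from_barrier_drop(1.4))
    print(rate_factor_from_population(0.2, 0.1))
```

Under these assumptions, lowering a barrier by about 1.4 kcal mol–1 changes the rate by roughly one order of magnitude, whereas obtaining the same effect through the populations would require a tenfold change in the weight of the conformation.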
Similar QM/MM models, applying an ONIOM scheme as implemented in the Gaussian 09 software package,34 were defined for each enzyme:substrate complex. All the prepared systems contained a total of 6232 atoms with 90 atoms in the QM layer and the remainder of the system in the MM layer. The QM layer contained the two catalytic aspartates, the nucleophilic water molecule, seventeen atoms of the substrate, two other water molecules and some residues around the groups that actively participate in the reaction (Ala28B, Gly27B, Thr26B, Gly27A, Ala28A, and the carbonyl group of Thr26A). A cap of 1000 water molecules (∼3 Å around the protein) was kept in the model. The water molecules were frozen during all the calculations, except the ones present in the high layer. The freezing scheme was shown to be adequate in an earlier study.30 The interaction between the layers was treated with the electrostatic embedding scheme. The QM layer was optimized with the density functional B3LYP35 and the 6-31G(d) basis set. Hydrogen atoms were used as link atoms where QM covalent bonds were truncated. The reaction path was studied in the same manner for all models, using the same reaction coordinate (the distance between the oxygen of the nucleophilic water molecule and the carbonyl carbon of the scissile peptide bond of the substrate) for an initial exploration. The structures with the highest energy in the performed scans were used as starting guesses for a full optimization of the transition structure geometry. Nuclear vibrational frequencies were determined to confirm the nature of the stationary points (absence of imaginary frequencies in minima and one imaginary frequency in each transition state). Zero-point energies were computed at the B3LYP/6-31G(d) level of theory,36–38 using the harmonic oscillator/rigid rotor formalism.39,40 Intrinsic reaction coordinate (IRC) calculations were performed to obtain reactant, transition state and product structures in the same relative minimum.
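The step of taking the highest-energy structure of a scan as the starting guess for the transition structure can be sketched as follows. The function name, the data layout and the scan values are hypothetical; the actual scans were run with Gaussian 09, and this snippet only illustrates the selection logic.

```python
from typing import Sequence, Tuple


def ts_guess_from_scan(
    coordinates: Sequence[float], energies: Sequence[float]
) -> Tuple[int, float, float]:
    """Return (index, coordinate, energy) of the highest-energy scan point.

    coordinates: values of the scanned reaction coordinate (e.g. the
                 water O to carbonyl C distance, in angstrom).
    energies:    relative energies of the scan points, in any consistent unit.
    """
    if len(coordinates) != len(energies) or len(energies) == 0:
        raise ValueError("coordinates and energies must be non-empty and of equal length")
    index = max(range(len(energies)), key=lambda i: energies[i])
    return index, coordinates[index], energies[index]


if __name__ == "__main__":
    # Hypothetical scan: the distance shrinks while the energy rises and falls.
    scan_d = [2.8, 2.4, 2.0, 1.6]
    scan_e = [0.0, 5.0, 18.0, 9.0]
    print(ts_guess_from_scan(scan_d, scan_e))
```

If the maximum falls on the first or last scan point, the scan has not bracketed a barrier, and the guess would need to be treated with caution before attempting a full transition structure optimization.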
Single point energy calculations were performed using the M06-2X density functional and a higher basis set (6-311++G(2d,2p)). This theoretical level was used because we have seen in earlier benchmarks that it provides excellent results for main group chemistry reactions in enzymes.41–44 The final results were represented as QM/MM energies plus ZPE. All the calculations were performed using the ONIOM scheme45 as implemented in the Gaussian 09 software package.34
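The way the reported barriers combine single-point energies with zero-point corrections can be written as a minimal sketch. It assumes that all inputs are given in hartree and that the ZPE comes from the lower-level frequency calculation; the function name and the numbers in the example are hypothetical.

```python
HARTREE_TO_KCAL = 627.5095  # conversion factor, kcal mol^-1 per hartree


def zpe_corrected_barrier(
    e_reactant: float, e_ts: float, zpe_reactant: float, zpe_ts: float
) -> float:
    """Activation barrier (dE + dZPE) in kcal/mol from energies in hartree.

    e_reactant, e_ts:     single-point QM/MM energies of reactant and TS.
    zpe_reactant, zpe_ts: zero-point energies of reactant and TS.
    """
    delta_e = e_ts - e_reactant
    delta_zpe = zpe_ts - zpe_reactant
    return (delta_e + delta_zpe) * HARTREE_TO_KCAL


if __name__ == "__main__":
    # Hypothetical values chosen only to show the arithmetic.
    print(zpe_corrected_barrier(-100.000, -99.970, 0.050, 0.047))
```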
Moving from the MM-MD simulation to the QM/MM studies implies methodological differences that are important to have in mind. First of all, the molecular model is different. While in an MM-MD simulation the system is studied as periodic, with explicit solvent, in QM/MM calculations a single protein:substrate system in a small cap of constrained water molecules is used. The Hamiltonian, which is used to treat the system, is also different in both methodologies. These factors, together with the (still) limited QM/MM sampling, make it difficult to assign a rigorous relative weight to each of the calculated barriers. Despite these limitations, the methodology used here is obviously adequate to help us understand the influence of enzyme thermal conformational fluctuations on the chemical pathway and the activation barrier.
Results and discussion
The MM-MD performed in this work generated an isothermal–isobaric ensemble distribution of different microstates, in the reactant state. We studied the barriers of a significant number of structures and used them as a “reduced ensemble” of initial structures to study the rate-limiting step of the reaction (Fig. 1). Enzyme conformational rearrangements occurring on a larger time scale than that covered by the MM-MD simulation (130 ns) were not explored. This is a general limitation of MM-MD simulations, due to the large timescale of enzyme motions, which easily goes beyond milliseconds. A more thorough exploration of the conformational space of the active site requires advanced sampling techniques such as parallel tempering and related methods46,47 or collective coordinate based sampling methods.48 Out of the 30 selected initial structures, only 19 were used in our analysis. These were the ones for which the calculations and analysis could be carried out rigorously. The remaining 11 had to be excluded due to optimization problems or difficulties in characterizing the stationary points (reactants or transition states). The transition state for this step is particularly difficult to optimize, due to the intricate network of hydrogen bonds; despite having the active site residues and substrate properly positioned to react, some specific structures are so difficult to optimize that computationally it becomes more economical to start with a large number of structures and afterwards just discard the most problematic ones.
Spread of the activation barriers and chemical disorder
The results showed activation barriers (ΔE + ΔZPE) ranging from 17.3 to 32.2 kcal mol–1 at the M06-2X/6-311++G(2d,2p):ff99SB level of theory (Fig. 2), covering a span of 15 kcal mol–1. More importantly, two different reaction mechanisms (One Asp mechanism and Two Asp mechanism) were observed, the occurrence of which depended on the specific reactant structure (Fig. 3). The activation barriers changed widely in the nanoseconds time scale, as did the reaction mechanisms. The energy barrier, obtained using the structure taken after 120 ns of MM-MD simulation, was the lowest among all our measurements, and corresponds to 17.3 kcal mol–1. Just 1 ns later, the activation barrier was above 30 kcal mol–1.
Fig. 2. Activation barriers for the snapshots selected from the MM-MD simulation. These barriers corresponded to zero-point corrected total energies, calculated at the M06-2X/6-311++G(2d,2p):ff99SB level of theory. The purple and cyan points correspond to structures that react through the One Asp mechanism and the Two Asp mechanism, respectively. The lowest activation barrier (17.3 kcal mol–1) was obtained starting from the structure taken after 120 ns of MM-MD simulation. The dashed line provides a visual guidance for the chronological order of the barriers and does not correspond to an interpolation of the energies between them.
Fig. 3. (a) Reactant state from the structure taken after 120 ns of MM-MD simulation, which is associated with the lowest energetic barrier. Only the QM layer was represented for simplicity. Important active site distances (explained in the main text above) are highlighted. (b) 2D representation of the relevant active-site interactions.
Small changes in the position/orientation of the active site residues, as well as small movements of the catalytic water molecule, seem to justify, to a great extent, the propensity for each of the two different mechanisms. The conformational fluctuations do not correspond to changes in global folding (which take place in much larger timescales), but instead to much more subtle changes (mostly hydrogen bonding distances) that, despite being small, can modify the very important chemical interactions between the substrates and the active site, leading to a change in the reaction pathway.
The turnover of HIV-1 PR takes place in seconds, a timescale at least nine orders of magnitude slower than the fluctuations of the barrier. This means that the experimentally observed kinetics could be a consequence of the overcoming of a few low barriers that occur at well-defined conformations and generate perfectly pre-organized active site conformations. Despite the limited sampling achieved here, due to the high-level theoretical methods and large QM regions employed, the high frequency at which these low-barrier structures appear (7 out of 19, with barriers smaller than 22 kcal mol–1) is more than enough to overcome the very high Boltzmann penalties associated with the more frequent, high-barrier structures, and thus the former should determine the reaction kinetics. In this regard, it is important to keep in mind that the turnover rate depends linearly on the frequency of reactive conformations and exponentially on the barrier heights, meaning that the turnover rate is much more sensitive to the latter than to the former. The conclusion is consistent with other previous studies, where low-energy barriers were not found in the majority of the explored conformations but still in a very reasonable number of cases.7,8,49 In analogy to the concept of “static/dynamic/instantaneous disorder” coined in the past for the dispersion in kcat arising from folding fluctuations at several timescales,50–53 we refer to the phenomenon seen herein as “chemical disorder”, with the adjective “instantaneous” due to the nanosecond timescale in which it takes place.
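How a few low barriers can dominate the observed rate can be sketched by weighting each conformation with its population and an exponential barrier factor. The sketch assumes equal populations unless stated otherwise, a temperature of 300 K, and an identical prefactor for all conformations; the function name is hypothetical. The example uses only the five barrier values quoted explicitly in the text, not the full set of 19.

```python
import math
from typing import List, Optional, Sequence

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def rate_contributions(
    barriers_kcal: Sequence[float],
    populations: Optional[Sequence[float]] = None,
    temperature: float = 300.0,
) -> List[float]:
    """Fraction of the total rate carried by each conformation.

    Each weight is p_i * exp(-E_i / RT); barriers are shifted by the
    minimum before exponentiation to keep the numbers well behaved.
    """
    if populations is None:
        populations = [1.0] * len(barriers_kcal)
    if len(populations) != len(barriers_kcal) or len(barriers_kcal) == 0:
        raise ValueError("barriers and populations must be non-empty and of equal length")
    rt = R_KCAL * temperature
    e_min = min(barriers_kcal)
    weights = [p * math.exp(-(e - e_min) / rt) for p, e in zip(populations, barriers_kcal)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Barriers (kcal/mol) quoted in the text, equal populations assumed.
    quoted = [17.3, 17.8, 18.0, 31.5, 32.2]
    for barrier, fraction in zip(quoted, rate_contributions(quoted)):
        print(f"{barrier:5.1f} kcal/mol -> {fraction:.3e}")
```

With these assumptions, the three conformations with barriers near 17 to 18 kcal mol–1 account for essentially all of the rate, while those above 30 kcal mol–1 contribute a negligible fraction.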
Two different reaction mechanisms
The Two Asp mechanism is well described and widely accepted in the literature, for HIV-1 PR and other similar aspartic proteases.27,28,54,55 Considering that the Asp25 that is known to be protonated is the one from chain B (conventionally since the chains are equivalent), the mechanism is characterized by a nucleophilic attack of the water molecule present between both catalytic Asp25 residues, on the carbonyl carbon of the scissile bond, while it loses a proton to Asp25A. During the same reaction step the carboxylate of Asp25B protonates the carbonyl oxygen of the peptide bond. In the One Asp mechanism, the Asp25A does not participate directly in the reaction, even though it is still fundamental because it raises the pKa of Asp25B, making it neutral at the beginning of the reaction. In this mechanism, the unprotonated oxygen of Asp25B abstracts the water proton when the water molecule attacks the carbonyl carbon, while the Asp25B acidic proton is transferred to the carbonyl oxygen. These two mechanisms present a similar chemistry, the main difference being whether a single Asp residue acts as an acid and a base in the same step, or if the two functions are divided by two equivalent Asp residues, and whether a negative or a neutral Asp acts as a base. The fact that their chemistry is similar may be the reason why the enzyme was found to stabilize both transition states to a comparable extent.
Equivalent low barriers were found in both mechanisms, provided that the conformation of the active site is adequate. For example, the structure taken after 109 ns of MM-MD simulation is associated with a high barrier of 31.5 kcal mol–1, which means that this structure is not adequate to initiate the reaction. However, just 1 ns later (110 ns), small fluctuations in the enzyme and substrate structure enable the reaction to take place (activation barrier of 17.8 kcal mol–1). For these two structures, the reaction occurs via the Two Asp mechanism. The structure found just one nanosecond later (111 ns) reacted, in turn, via the One Asp mechanism with a favorable activation barrier of 18.0 kcal mol–1. A structural analysis comparing both mechanisms and correlating the enzyme–substrate structure with them, and with the activation barriers, was performed and is detailed in the next section.
Why two mechanisms?
Fig. 3 presents the QM layer used in the QM/MM calculations, highlighting the most relevant distances for the reaction, which we analyzed to understand the origin of the propensity for each of the two mechanisms. These specific distances were chosen because they correspond to all the bond-breaking/bond-forming distances and first-shell interactions with reacting atoms. For simplicity, we only represent the QM layer from the structure that reacted with the smallest barrier (120 ns – 17.3 kcal mol–1). Six distances were selected for analysis, which encompass interactions between reacting atoms (d1 to d4) and fundamental hydrogen bonds that tune the pKa of the reacting groups (d5 and d6, which tune the pKa of Asp25A). d1 is the distance between the oxygen of the catalytic water and the carbonyl carbon of the substrate scissile bond (Met201); d2 is the distance between the acidic hydrogen atom from the Asp25B carboxylic group and the oxygen atom from the carbonyl group of Met201 that will be protonated; d3 corresponds to the smallest distance between a (basic) oxygen from the Asp25A carboxylic group and a proton from the catalytic water molecule; d4 corresponds to the distance between the non-protonated oxygen from the carboxylic group of Asp25B and the catalytic water molecule; d5 is the distance between the hydroxyl proton of Thr26B and the carboxyl oxygen from the Asp25A and, finally, d6 is the hydrogen bond distance between a carboxyl oxygen atom from Asp25A and a buried water molecule present at the active site. These structural parameters were evaluated in the optimized reactant structures (after IRC calculations) and in the optimized transition structures.
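The distances defined above are plain interatomic distances, with d3 taken as the shortest among several oxygen/proton pairs. A minimal sketch of how such values could be obtained from Cartesian coordinates is shown below. The coordinates are hypothetical placeholders, not taken from any structure in this work, and the function names are assumptions.

```python
import math
from itertools import product
from typing import Iterable, Sequence


def pair_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two atoms given as (x, y, z) in angstrom."""
    return math.dist(a, b)


def shortest_distance(
    group_a: Iterable[Sequence[float]], group_b: Iterable[Sequence[float]]
) -> float:
    """Shortest distance between any atom of group_a and any atom of group_b.

    Mirrors the definition of d3: the smallest distance between a
    carboxylate oxygen of Asp25A and a proton of the catalytic water.
    """
    return min(math.dist(a, b) for a, b in product(group_a, group_b))


if __name__ == "__main__":
    # Hypothetical coordinates (angstrom), for illustration only.
    asp25a_oxygens = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    water_protons = [(0.0, 0.0, 1.8), (5.0, 0.0, 0.0)]
    print(shortest_distance(asp25a_oxygens, water_protons))
```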
Table S1† summarizes the results, indicating the reaction mechanism that corresponds to each reactant structure, and the obtained activation barriers, as well as the d1–d6 values. The obtained barriers ranged from 17.3 kcal mol–1 to 32.2 kcal mol–1. This range was similar for both mechanisms: 17.3–31.5 kcal mol–1 for the Two Asp mechanism, and 18.0–32.2 kcal mol–1 for the One Asp mechanism. The barriers for the two mechanisms can be considered as equivalent within the accuracy of the method.
It is evident that the “instantaneous” propensity of each of the two mechanisms will depend on the geometry of the whole protein system and solvent at that specific moment. The question that arises is whether some (few) of the interactions of the whole system have a very preponderant role in determining the mechanistic route. It is expectable that the local interactions around the reactive atoms represent most of the determining effect, but it remains to be known if these are dominant enough to allow explaining the mechanistic route just by themselves. After all, it is not easy to reduce a whole protein, having many thousand degrees of freedom, plus the solvent, to only six active-site degrees of freedom, and still explain the observed effects just based on the latter.
Therefore, to understand the origin of the propensity for the two reaction mechanisms, we investigated if any of the individual key distances d1–d6 for the reaction correlated with the tendency for a specific mechanism. However, no correlation was found. Additionally, all the individual distances of each structure were represented against the respective barrier. The results showed that there is no evident correlation between the individual distances and the energetic barriers (see Fig. S2 and S3 in the ESI†).
The instantaneous active site hydrogen-bonding network determines the chemical mechanism
The previous results show that the propensity to follow a given reaction mechanism is too complex to be captured by a single internal degree of freedom, a single chemical interaction. Instead, this tendency seems to depend on the overall pre-organization of the whole active site. To test if this is the case, we hypothesized what would be the chemical interactions that would drive the reaction through each of the two mechanisms, basing ourselves on the principles of chemical reactivity and transition state stabilization.
It is expectable from the point of view of chemical reactivity that the pKa of Asp25A is relevant in this regard, as in the Two Asp mechanism it should act as a base but in the One Asp mechanism it should not. Therefore, low pKa values should deviate the chemical flow from the Two Asp mechanism, due to a less competent Asp25A base. It is also evident that the strength of the two hydrogen bonds established between the Asp25A carboxylate and the Thr26B and the structural water molecule will be relevant to tune the Asp25A pKa – the shorter the hydrogen bonds, the lower the pKa.
A second aspect that will determine the reaction pathway is how close a basic Asp25 is from the hydrolytic water, and how far the competing Asp25 is from the same water. For example, the One Asp mechanism will be promoted when Asp25B is close to the hydrolytic water (distance d4), making easier the water deprotonation by this residue and, similarly, when the competing Asp25A base and the hydrolytic water are far apart (distance d3), as the water deprotonation by the competing Asp25A becomes increasingly difficult.
Looking at the active site interactions shown in Fig. 3, there do not seem to exist any more differentiating interactions from the point of view of chemical reactivity. The remaining interactions are related to the attack of the hydrolytic water on the peptide bond (d1) and protonation of the peptide oxygen (d2). As these reactions take place in both mechanisms, they are not expected to exert a discriminatory effect that selectively drives the system through one chemical pathway over the other.
The effect of the active site hydrogen bonding on the Asp25A pKa can be summarized by the variable d5 + d6 (the smaller the sum, the stronger the hydrogen bonds, and the lower the pKa will be); the difference between the proximity of the two competing Asp25 bases and the hydrolytic water can be interpreted by analyzing the variable d4 – d3 (the smaller this value is, the closer Asp25B is to the water molecule in relation to Asp25A). Overall, the collective variable dcol = d4 + d5 + d6 – d3 expresses the two conditions together; therefore, it is expected that small dcol values represent an active site pre-organized to drive the reaction through the One Asp mechanism, and large dcol values an active site pre-organized to drive the reaction through the Two Asp mechanism.
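The collective variable can be computed directly from the six distances. The sketch below uses a hypothetical container class and hypothetical distance values. It deliberately does not encode any numerical threshold separating the two mechanisms, since none is stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveSiteDistances:
    """The six active-site distances d1 to d6, in angstrom."""

    d1: float  # water O to scissile carbonyl C
    d2: float  # Asp25B acidic H to carbonyl O
    d3: float  # Asp25A carboxylate O to water H (shortest)
    d4: float  # Asp25B non-protonated O to catalytic water
    d5: float  # Thr26B hydroxyl H to Asp25A carboxylate O
    d6: float  # Asp25A carboxylate O to buried water

    def d_col(self) -> float:
        """Collective variable dcol = d4 + d5 + d6 - d3."""
        return self.d4 + self.d5 + self.d6 - self.d3


if __name__ == "__main__":
    # Hypothetical distances, for illustration only.
    snapshot = ActiveSiteDistances(d1=2.9, d2=1.8, d3=2.5, d4=1.7, d5=1.9, d6=1.8)
    print(f"dcol = {snapshot.d_col():.2f}")
```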
To confirm that this is the case, we plotted dcol against the barrier height (Fig. 4). A very clear distinction between the two mechanisms emerges, with the One Asp mechanism dominant at low dcol values and the Two Asp mechanism dominant at high dcol values, as anticipated. This confirms that the specific interactions pointed out above play a prominent role in determining the reaction pathway. As these distances fluctuate due to thermal motion, the chemical pathway changes as a consequence, bringing nanosecond-timescale “instantaneous” chemical disorder to the system.
Fig. 4. Activation barrier, chemical pathway and dcol (d4 + d5 + d6 – d3). A very clear correlation between the value of dcol and the reaction pathway is visible. The One Asp mechanism is correlated with small values of d5 + d6 (stronger hydrogen bonding to Asp25A and, consequently, lower Asp25A pKa) as well as small values of d4 and large values of d3, promoting water deprotonation by Asp25B over the competing Asp25A. The opposite tendency is observed for the Two Asp mechanism. In the three exceptional cases where these conditions were not fulfilled, the barriers were far too high for either of the two mechanisms to take place.
The origin of the barrier fluctuations in each reaction mechanism
Besides the existence of chemical disorder, leading to two different chemical pathways, it is also interesting to analyze the barrier fluctuations within each pathway. In both mechanisms, low activation barriers are expected to be promoted by reactant conformations having short distances between the atoms that will react. These are the distance between the hydrolytic water and the peptide carbon atom (d1) and the distance between the substrate's carbonyl oxygen and the Asp25B that will protonate it (d2).
Additionally, for the One Asp mechanism, the hydrogen bond distance between the hydrolytic water and Asp25B (d4) is important for the barrier, as Asp25B has to deprotonate this water. Short hydrogen bonds will be associated with smaller barriers. Larger distances between Asp25A and the hydrolytic water (large d3) will also promote small barriers in this mechanism, by reducing the electrostatic repulsion between the negative Asp25A and the hydroxide ion that is formed in the transition state. These proposals, based on chemical principles of reactivity, can be tested by correlating the variable dcol(One Asp) = d1 + d2 + d4 – d3 (which summarizes all the hypotheses) with the activation barrier heights (Fig. 5a). As expected, the correlation is very clear, confirming that the described network of hydrogen bonds has a relevant role in defining the barrier height.
Fig. 5. Correlation between the activation barriers and the collective variables dcol(One Asp) = d1 + d2 + d4 – d3, for the One Asp mechanism (a), and dcol(Two Asp) = d1 + d2 + d3 – d5 – d6, for the Two Asp mechanism (b). A correlation can be seen in (a), with the barrier growing as dcol(One Asp) grows, as expected, almost without exception. In (b), there is a clear tendency to find high barriers when dcol(Two Asp) is above ∼1.5 Å, but additional factors not included in dcol(Two Asp) may be relevant at smaller values.
For the Two Asp mechanism, short distances between the hydrolytic water and the Asp25A base (small d3) are expected to promote water deprotonation by Asp25A; longer hydrogen bonds to Asp25A formed by the catalytic water and by Thr26B (large d5 and d6, respectively) should also be important to promote this mechanism, as they increase the Asp25A pKa, making it a better base. These criteria can be tested by correlating dcol(Two Asp) = d1 + d2 + d3 – d5 – d6 with the barrier (Fig. 5b). In this case, the correlation is not as clear as in the previous ones, in particular for small dcol(Two Asp) values, but it is still quite clear that values larger than ∼1.5 Å for this collective variable lead to high activation barriers, hindering the progress of the reaction. The correlation seems to indicate that other geometric aspects, not incorporated into the already complex collective variable, may also make a substantial contribution at small dcol(Two Asp) values.
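The two mechanism-specific collective variables, d1 + d2 + d4 – d3 for the One Asp mechanism and d1 + d2 + d3 – d5 – d6 for the Two Asp mechanism, and their correlation with the barriers can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the distances and the barrier values below are hypothetical placeholders (not results from this work), and the Pearson coefficient is our illustrative choice of correlation measure, not one reported in the text.

```python
import math
from typing import Mapping, Sequence


def d_col_one_asp(d: Mapping[str, float]) -> float:
    """Collective variable for the One Asp mechanism: d1 + d2 + d4 - d3."""
    return d["d1"] + d["d2"] + d["d4"] - d["d3"]


def d_col_two_asp(d: Mapping[str, float]) -> float:
    """Collective variable for the Two Asp mechanism: d1 + d2 + d3 - d5 - d6."""
    return d["d1"] + d["d2"] + d["d3"] - d["d5"] - d["d6"]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical One Asp reactant geometries (angstrom) and barriers (kcal/mol).
# These numbers are invented for illustration and are NOT data from the study.
one_asp_snapshots = [
    ({"d1": 2.8, "d2": 1.7, "d3": 3.6, "d4": 1.6}, 15.0),
    ({"d1": 3.0, "d2": 1.8, "d3": 3.4, "d4": 1.7}, 18.0),
    ({"d1": 3.2, "d2": 1.9, "d3": 3.2, "d4": 1.8}, 21.5),
    ({"d1": 3.4, "d2": 2.0, "d3": 3.0, "d4": 2.0}, 26.0),
]

cv_values = [d_col_one_asp(geom) for geom, _ in one_asp_snapshots]
barriers = [barrier for _, barrier in one_asp_snapshots]
r_one_asp = pearson_r(cv_values, barriers)

# A single hypothetical Two Asp geometry, to show the second variable.
two_asp_example = {"d1": 3.0, "d2": 1.8, "d3": 1.9, "d5": 2.4, "d6": 2.5}
cv_two_asp = d_col_two_asp(two_asp_example)

print(f"Pearson r (One Asp, hypothetical data) = {r_one_asp:.3f}")
print(f"Two Asp collective variable (hypothetical geometry) = {cv_two_asp:.2f} A")
```

With real data, the same two functions would be applied to the reactant geometries of each mechanism separately, since the two variables encode different chemical hypotheses.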
We note that it is tempting to perform a multiple linear regression of the d1–d6 distances (or of many other possible distances) against the activation barriers, or for the discrimination between mechanisms. A recent study by Lodola et al.56 has shown that interesting information can be extracted from the reactant conformation using principal component analysis, partial least squares regression analysis, and multiple linear regression analysis. We avoided doing so here because the number of independent observables was not large enough for automated methods to extract better information than that obtained by an expert analysis based on chemical reactivity principles. The number of independent measurements (19) and the number of variables to be fitted (6) might easily lead to overfitting, in particular when the 19 barriers are split between two mechanisms (11 for one and 8 for the other). The purpose of relating collective variables to the barrier height or reaction mechanism was to demonstrate that a very significant fraction of the origin of the fluctuations, and of the preference between mechanisms, was related to these key interatomic interactions (which make sense from a chemical point of view) and to their thermal fluctuations, and not to reproduce the barriers or the choice between mechanisms through a fitted function. In summary, our purpose was to achieve “chemical understanding” and to use the collective variables to test the chemical insights, and not the contrary.
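The overfitting concern can be made concrete with a simple count of residual degrees of freedom. The sketch below assumes an ordinary least-squares fit with one coefficient per distance (d1 to d6) plus an intercept; this model form and the helper name `residual_dof` are our illustrative assumptions, not a fit performed in this work.

```python
def residual_dof(n_observations: int, n_predictors: int, intercept: bool = True) -> int:
    """Residual degrees of freedom of a linear least-squares fit."""
    n_parameters = n_predictors + (1 if intercept else 0)
    return n_observations - n_parameters


N_PREDICTORS = 6  # the six distances d1..d6

# Sample sizes taken from the text: 19 barriers in total, split 11 / 8.
for label, n_obs in (
    ("all barriers", 19),
    ("mechanism with 11 barriers", 11),
    ("mechanism with 8 barriers", 8),
):
    dof = residual_dof(n_obs, N_PREDICTORS)
    print(f"{label}: {n_obs} observations -> {dof} residual degrees of freedom")
```

Under these assumptions, the per-mechanism fits would retain only 4 and 1 residual degrees of freedom, which illustrates why a fitted function could reproduce the barriers almost by construction.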
Finally, it is important to discuss the use of the X-ray conformation (the most abundant conformation) for the determination of the chemical mechanism and barrier height. A very large body of studies19,57–63 shows that a correct X-ray conformation almost always provides barrier heights in agreement with experiments, and chemical mechanisms that are widely accepted to also be correct. This is particularly true for X-ray structures co-crystallized with a substrate/transition state analogue. Here, the use of X-ray structures is futile, as the catalytic water molecule is never present in the experimental structures, and it is its exact position that determines the outcome of the reaction mechanism and barrier height. Therefore, the very act of modeling this missing molecule (with all the involved assumptions and ambiguities) would mostly determine the result of the calculation. As such, the best solution here was to embark on a multi-PES study.
Conclusions
The aim of this work was to understand the effect of conformational fluctuations on the reaction mechanism followed by HIV-1 PR. A total of 19 different structures, collected from a long classical MM-MD simulation, were used as initial reactants to study the reaction path of the first step of this mechanism with QM/MM calculations.
The results showed that the different conformations of the enzyme in the reactant state lead to two different reaction mechanisms. We studied the reasons behind this effect, which we have named chemical disorder. We found that fast rearrangements in the hydrogen bonding network of the active site push the chemical reaction one way or the other at this mechanistic bifurcation. As these interactions fluctuate at the nanosecond timescale due to thermal motions, the most favorable mechanism also changes according to the instantaneous enzyme conformation, bringing nanosecond-timescale “chemical disorder” to this amazing enzyme, a phenomenon very rare in enzymatic systems that challenges the “one enzyme–one substrate–one reaction mechanism” paradigm.
The competent activation barriers for both mechanisms were similar, as well as the populations for the reactants that drive the mechanism through each of the pathways.
The One Asp mechanism was considered as less probable in the past, due to the large barriers found by other authors.8 The Two Asp mechanism is well described and accepted in previous studies of HIV-1 PR and other proteases. The One Asp mechanism was found to take place when Asp25A is well stabilized through hydrogen bonds with a structural water molecule and Thr26B, lowering its pKa and making it a weaker base. The relative proximity between the competing Asp25A and Asp25B bases and the hydrolytic water proton, which fluctuates due to thermal motion, also influences which of them will deprotonate it, and thus the outcome in terms of the reaction pathway. The role of this structural water molecule was not documented before, even though its prevalence during the MM-MD simulation is consistent with its relevance to the HIV-1 PR mechanism. The results provide a very interesting glimpse into the intricacies of an apparently well-known enzyme, revealing the underlying richness of HIV-1 PR chemistry. The extent to which chemical disorder is prevalent in the general enzymatic world is yet to be known.
Conflicts of interest
There are no conflicts to declare.
Acknowledgments
The work was supported by UID/MULTI/04378/2019 with funding from FCT/MCTES through national funds. The authors acknowledge financing from Fundação para a Ciência e Tecnologia through the project PTDC/QUI-QFI/28714/2017. A. R. Calixto would like to thank FCT for her PhD Fellowship SFRH/BD/95962/2013.
Footnotes
†Electronic supplementary information (ESI) available: Methodology used in the initial model and MM-MD simulations; Fig. S1 – energy distribution of the NPT ensemble generated by the MM-MD calculations and of the microstates extracted for the QM/MM calculations; Fig. S2 – correlations between six selected active site distances from reactant structures and the corresponding activation barriers; Fig. S3 – correlations between six selected active site distances from transition state structures and the corresponding activation barriers (PDF). Geometries of the optimized stationary points (ZIP file). See DOI: 10.1039/c9sc01464k
References
- Henzler-Wildman K., Kern D. Nature. 2007;450(7172):964–972. doi: 10.1038/nature06522.
- Hanoian P., Liu C. T., Hammes-Schiffer S., Benkovic S. Acc. Chem. Res. 2015;48(2):482–489. doi: 10.1021/ar500390e.
- Palmer III A. G. Acc. Chem. Res. 2015;48(2):457–465. doi: 10.1021/ar500340a.
- Min W., English B. P., Luo G., Cherayil B. J., Kou S. C., Xie X. S. Acc. Chem. Res. 2005;38(12):923–931. doi: 10.1021/ar040133f.
- Warshel A. J. Biol. Chem. 1998;273(42):27035–27038. doi: 10.1074/jbc.273.42.27035.
- Warshel A., Sharma P. K., Kato M., Xiang Y., Liu H., Olsson M. H. Chem. Rev. 2006;106(8):3210–3235. doi: 10.1021/cr0503106.
- Santos-Martins D., Calixto A. R., Fernandes P. A., Ramos M. J. ACS Catal. 2018;8(5):4055–4063.
- Ribeiro A. N. J., Santos-Martins D., Russo N., Ramos M. J., Fernandes P. A. ACS Catal. 2015;5(9):5617–5626.
- Mata R. A., Werner H. J., Thiel S., Thiel W. J. Chem. Phys. 2008;128(2):025104. doi: 10.1063/1.2823055.
- Lonsdale R., Hoyle S., Grey D. T., Ridder L., Mulholland A. J. Biochemistry. 2012;51(8):1774–1786. doi: 10.1021/bi201722j.
- Li Y., Zhang R., Du L., Zhang Q., Wang W. Int. J. Mol. Sci. 2016;17(8):1372. doi: 10.3390/ijms17081372.
- van der Kamp M. W., Chaudret R., Mulholland A. J. FEBS J. 2013;280(13):3120–3131. doi: 10.1111/febs.12158.
- Li Y., Zhang R., Du L., Zhang Q., Wang W. Catal. Sci. Technol. 2016;6(1):73–80.
- Schöneboom J. C., Lin H., Reuter N., Thiel W., Cohen S., Ogliaro F., Shaik S. J. Am. Chem. Soc. 2002;124(27):8142–8151. doi: 10.1021/ja026279w.
- Piana S., Carloni P. Proteins. 2000;39(1):26–36. doi: 10.1002/(sici)1097-0134(20000401)39:1<26::aid-prot3>3.0.co;2-n.
- Saura P., Kaganer I., Heydeck D., Lluch J. M., Kuhn H., Gonzalez-Lafont A. Chem.–Eur. J. 2018;24(4):962–973. doi: 10.1002/chem.201704672.
- Cooper A. M., Kästner J. ChemPhysChem. 2014;15(15):3264–3269. doi: 10.1002/cphc.201402382.
- Lonsdale R., Harvey J. N., Mulholland A. J. J. Phys. Chem. B. 2009;114(2):1156–1162. doi: 10.1021/jp910127j.
- Sousa S. F., Ribeiro A. J., Neves R. P., Brás N. F., Cerqueira N. M., Fernandes P. A., Ramos M. J. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2017;7(2):e1281.
- Wong K.-Y., Gao J. Biochemistry. 2007;46(46):13352–13369. doi: 10.1021/bi700460c.
- Repič M., Vianello R., Purg M., Duarte F., Bauer P., Kamerlin S. C., Mavri J. Proteins. 2014;82(12):3347–3355. doi: 10.1002/prot.24690.
- Maršavelski A., Petrović D. a., Bauer P., Vianello R., Kamerlin S. C. L. ACS Omega. 2018;3(4):3665–3674. doi: 10.1021/acsomega.8b00346.
- Barrozo A., Liao Q., Esguerra M., Marloie G., Florián J., Williams N. H., Kamerlin S. C. L. Org. Biomol. Chem. 2018;16(12):2060–2073. doi: 10.1039/c8ob00312b.
- Hu P., Zhang Y. J. Am. Chem. Soc. 2006;128(4):1272–1278. doi: 10.1021/ja056153+.
- Reis M., Alves C. N., Lameira J., Tuñón I., Martí S., Moliner V. Phys. Chem. Chem. Phys. 2013;15(11):3772–3785. doi: 10.1039/c3cp43968b.
- Prieß M., Göddeke H., Groenhof G., Schäfer L. V. ACS Cent. Sci. 2018;4(10):1334–1343. doi: 10.1021/acscentsci.8b00369.
- Brik A., Wong C.-H. Org. Biomol. Chem. 2003;1(1):5–14. doi: 10.1039/b208248a.
- Piana S., Bucher D., Carloni P., Rothlisberger U. J. Phys. Chem. B. 2004;108(30):11139–11149.
- Piana S., Carloni P., Parrinello M. J. Mol. Biol. 2002;319(2):567–583. doi: 10.1016/S0022-2836(02)00301-7.
- Calixto A. R., Ramos M. J., Fernandes P. A. J. Chem. Theory Comput. 2017;13(11):5486–5495. doi: 10.1021/acs.jctc.7b00768.
- Miller M., Schneider J., Sathyanarayana B. K., Toth M. V., Marshall G. R., Clawson L., Selk L., Kent S., Wlodawer A. Science. 1989;246(4934):1149–1152. doi: 10.1126/science.2686029.
- Case D., Darden T., Cheatham III T., Simmerling C., Wang J., Duke R., Luo R., Walker R., Zhang W. and Merz K., AMBER 12, University of California, San Francisco, 2012, vol. 2010, pp. 1–826.
- Olsson M. H., Søndergaard C. R., Rostkowski M., Jensen J. H. J. Chem. Theory Comput. 2011;7(2):525–537. doi: 10.1021/ct100578z.
- Frisch M., Trucks G., Schlegel H., Scuseria G., Robb M., Cheeseman J., Scalmani G., Barone V., Mennucci B. and Petersson G., Gaussian 09, Revision D. 01, Gaussian. Inc., Wallingford, CT, 2009.
- Becke A. D. J. Chem. Phys. 1993;98(7):5648–5652.
- Lee C., Yang W., Parr R. G. Phys. Rev. B: Condens. Matter Mater. Phys. 1988;37(2):785. doi: 10.1103/physrevb.37.785.
- Raghavachari K. Theor. Chem. Acc. 2000;103(3–4):361–363.
- Dill J. D., Pople J. A. J. Chem. Phys. 1975;62(7):2921–2923.
- Ochterski J. W., Thermochemistry in Gaussian, Gaussian Inc, 2000, pp. 1–19.
- McQuarrie D. A. and Simon J. D., Molecular Thermodynamics, University Science Books, Sausalito, CA, 1999, vol. 63.
- Ribeiro A. J., Ramos M. J., Fernandes P. A. J. Chem. Theory Comput. 2010;6(8):2281–2292. doi: 10.1021/ct900649e.
- Brás N. F., Perez M. A., Fernandes P. A., Silva P. J., Ramos M. J. J. Chem. Theory Comput. 2011;7(12):3898–3908. doi: 10.1021/ct200309v.
- Neves R. P., Fernandes P. A., Varandas A. N. J., Ramos M. J. J. Chem. Theory Comput. 2014;10(11):4842–4856. doi: 10.1021/ct500840f.
- Pereira A. T., Ribeiro A. J., Fernandes P. A., Ramos M. J. Int. J. Quantum Chem. 2017;117(18):e25409.
- Vreven T., Byun K. S., Komáromi I., Dapprich S., Montgomery Jr J. A., Morokuma K., Frisch M. J. J. Chem. Theory Comput. 2006;2(3):815–826. doi: 10.1021/ct050289g.
- Sugita Y., Okamoto Y. Chem. Phys. Lett. 1999;314(1–2):141–151.
- Liu P., Kim B., Friesner R. A., Berne B. J. Proc. Natl. Acad. Sci. U. S. A. 2005;102(39):13749–13754. doi: 10.1073/pnas.0506346102.
- Awasthi S., Nair N. N. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2019;9(3):e1398.
- Oliveira E. F., Cerqueira N. M., Ramos M. J., Fernandes P. A. Catal. Sci. Technol. 2016;6(19):7172–7185.
- Lu H. P., Xun L., Xie X. S. Science. 1998;282(5395):1877–1882. doi: 10.1126/science.282.5395.1877.
- Xue Q., Yeung E. S. Nature. 1995;373(6516):681–683. doi: 10.1038/373681a0.
- Smiley R. D., Hammes G. G. Chem. Rev. 2006;106(8):3080–3094. doi: 10.1021/cr0502955.
- English B. P., Min W., Van Oijen A. M., Lee K. T., Luo G., Sun H., Cherayil B. J., Kou S., Xie X. S. Nat. Chem. Biol. 2006;2(2):87–94. doi: 10.1038/nchembio759.
- Calixto A. R., Bras N. F., Fernandes P. A., Ramos M. J. ACS Catal. 2014;4(11):3869–3876.
- Brás N. F., Ramos M. J., Fernandes P. A. Phys. Chem. Chem. Phys. 2012;14(36):12605–12613. doi: 10.1039/c2cp41422h.
- Lodola A., Sirirak J., Fey N., Rivara S., Mor M., Mulholland A. J. J. Chem. Theory Comput. 2010;6(9):2948–2960. doi: 10.1021/ct100264j.
- Cerqueira N., Fernandes P., Ramos M. J. ChemPhysChem. 2018;19(6):669–689. doi: 10.1002/cphc.201700339.
- Ramos M. J., Fernandes P. A. Acc. Chem. Res. 2008;41(6):689–698. doi: 10.1021/ar7001045.
- Sousa S. F., Fernandes P. A., Ramos M. J. Phys. Chem. Chem. Phys. 2012;14(36):12431–12441. doi: 10.1039/c2cp41180f.
- Himo F. J. Am. Chem. Soc. 2017;139(20):6780–6786. doi: 10.1021/jacs.7b02671.
- Siegbahn P. E., Himo F. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2011;1(3):323–336. doi: 10.1002/wcms.44.
- Senn H. M., Thiel W. Angew. Chem., Int. Ed. 2009;48(7):1198–1229. doi: 10.1002/anie.200802019.
- Llano J., Gauld J. W. Quantum Biochem. 2010:643–666.