Significance
The dissociation of water is arguably the most fundamental chemical reaction occurring in the aqueous phase. Although the splitting of a water molecule occurs very seldom, the reaction is of major importance in many areas of chemistry and biology. Direct experimental probing of the event is still impossible, and simulating it accurately by computer is challenging. Here, we achieved the latter via specialized rare-event algorithms, estimating rates of dissociation in agreement with indirect experimental measurements. Even more interestingly, through a rigorous analysis of our results we identified anomalies in the water structure that act as initiators of the reaction, a finding that suggests paradigms for steering and catalyzing chemical reactions.
Keywords: autoionization, water, path sampling, machine learning, ab initio molecular dynamics
Abstract
The pH of liquid water is determined by the infrequent process in which water molecules split into short-lived hydroxide and hydronium ions. This reaction is difficult to probe experimentally and challenging to simulate. One of the open questions is whether the local water structure around a slightly stretched OH bond is actually initiating the eventual breakage of this bond or whether this event is driven by a global ordering that involves many water molecules far away from the reaction center. Here, we investigated the self-ionization of water at room temperature by rare-event ab initio molecular dynamics and obtained autoionization rates and activation energies in good agreement with experiments. Based on the analysis of thousands of molecular trajectories, we identified a couple of local order parameters and show that if a bond stretch occurs when all these parameters are around their ideal range, the chance for the first dissociation step (double-proton jump) increases from to . Understanding these initiation triggers might ultimately allow the steering of chemical reactions.
Among all possible chemical reactions that occur in water, the most fundamental is the water dissociation reaction (1), which is of major importance in many areas of chemistry and biology (2). Water plays an important role as a universal solvent for a wide variety of chemical processes and can act both as an acid and as a base. In aqueous solution, water will self-ionize and form hydroxide (OH−) and hydronium (H3O+) ions which take on Eigen- or Zundel-like structures (2–6). Experiments show that the mean lifetime for an individual molecule before undergoing autoionization is about h (7, 8).
The autoionization event has not been directly probed by experiments and the dissociation rate is obtained using the water dissociation equilibrium constant and the rate for the much faster recombination reaction (7, 8). The experimental challenges make the autoionization event a pertinent target for computer simulations for which previous constrained ab initio simulations have given important information about the mechanism (9–11). However, the use of constraints leads to a loss of the spontaneous dynamics of the system and the selection of a reaction coordinate that accurately measures the progress of the reaction is challenging. These limitations can be avoided by path-sampling methods such as transition path sampling (TPS) (12) or replica exchange transition interface sampling (RETIS) (13, 14) which are specifically designed for sampling rare events without altering the dynamics while less influenced by the choice of the order parameter (15). Geissler et al. (16) applied TPS with ab initio molecular dynamics (MD) to simulate just uncorrelated autoionization events and demonstrated that the mechanism involves transfer of protons along a hydrogen bond wire with concomitant breaking of the wire. In their work, local solvent properties (e.g., ion coordination numbers and the presence of specific hydrogen bonds) were used to interpret the destabilization that leads to ionization. The absence of clear visually observable correlations led to the conclusion that the destabilization is caused by rare electric-field fluctuations which arise primarily from long-range electrostatic interactions and thus that local order parameters are not suitable to describe the event. Hassanali et al. (17) studied the reverse recombination reaction (i.e., neutralization of ionized water molecules) with standard ab initio MD and reported that this event takes place by a collective compression of the water wire bridging the ions, followed by a triple concerted proton jump. 
The OH− ion which is neutralized remains in a hypercoordinated state and Hassanali et al. (17) hypothesized that it could serve, together with the compression of the wire, as a nucleation site for autoionization. This view opposes the statement of Geissler et al. (16) that the dissociation event is primarily triggered by nonlocal structural fluctuations. We note that concerted proton transfers and collective compression of water wires have also been observed for the recombination of a weak base in water (18).
Both of these studies give important information about the autoionization mechanism, although they do not unambiguously reveal the conditions that need to accompany a bond stretch fluctuation to initiate the reaction. In this work, we aim to tackle this ambiguity and quantitatively identify initiation conditions for water autoionization. Simulating only the dissociation events may not be sufficient, as the apparent initiation conditions observed in trajectories that lead to dissociation may also be present in trajectories that have an initial bond stretch but still fail to dissociate. Nonreactive or “almost reactive” trajectories therefore also contain important information, as they allow identification of the initiation conditions that really matter: those that discriminate between reactive and nonreactive trajectories. To collect this information, we applied the RETIS method and harvested reactive and nonreactive trajectories, which we analyzed using the recently developed predictive power method (19) and used to build a predictive machine-learning model (20). This allowed us to quantitatively examine the importance of local order parameters and initiation conditions for water autoionization. Based on this analysis we identify important initiation triggers and calculate the full rate of dissociation.
Results and Discussion
The autoionization event was investigated using ab initio RETIS simulations as described in Materials and Methods. For the RETIS simulations, we used a relatively simple geometric distance order parameter, λ, as illustrated in Fig. 1: When the system consists of only H2O species, λ is the largest covalent O–H bond distance, and when the system contains OH− and H3O+ species, λ is taken as the shortest distance between the oxygen in OH− and the hydrogen atoms in H3O+. In the following, we refer to the oxygen atom used for the order parameter as Oλ. The type of species (OH−, H2O, or H3O+) was identified by allocating to each hydrogen a single bond connecting it to the closest oxygen. Note that the definition of the order parameter does not require a threshold for defining a chemical bond nor does it constrain the order parameter to specific water molecules for the duration of the simulation. This means that we compute the rate of dissociation of any water molecule in the system instead of a single targeted O–H bond or water molecule.
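The order-parameter definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `order_parameter` is hypothetical, coordinates are plain Cartesian tuples in Å, and periodic images are ignored, whereas the actual simulations use periodic boundaries.

```python
import math

def order_parameter(oxygens, hydrogens):
    """Return (value, oxygen_index) of the distance order parameter.

    Assumption: coordinates are (x, y, z) tuples in angstrom and
    periodic boundary conditions are NOT applied in this sketch.
    """
    # Allocate to each hydrogen a single bond to its closest oxygen.
    owner = [min(range(len(oxygens)), key=lambda i: math.dist(h, oxygens[i]))
             for h in hydrogens]
    counts = [owner.count(i) for i in range(len(oxygens))]
    hydroxides = [i for i, c in enumerate(counts) if c == 1]
    hydroniums = [i for i, c in enumerate(counts) if c == 3]

    if hydroxides and hydroniums:
        # Ionized system: shortest distance between an OH- oxygen and
        # any hydrogen currently assigned to an H3O+ oxygen.
        return min((math.dist(oxygens[i], h), i)
                   for i in hydroxides
                   for h, o in zip(hydrogens, owner) if o in hydroniums)

    # Only H2O species: largest covalent O-H distance.
    return max((math.dist(h, oxygens[o]), o)
               for h, o in zip(hydrogens, owner))
```

Because species are identified only through the closest-oxygen assignment, no bond-length cutoff enters the sketch, and the oxygen returned can change from frame to frame.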
From our RETIS simulations, the water dissociation rate constant, $k$, can be obtained as the product of a flux, $f_A$, and a (conditional) probability, $\mathcal{P}_A(\lambda_B|\lambda_A)$:
$k = f_A \, \mathcal{P}_A(\lambda_B|\lambda_A)$ [1]
Here, $\lambda_A$ and $\lambda_B$ are interfaces defining the initial ($A$) and final ($B$) states and $\mathcal{P}_A(\lambda_B|\lambda_A)$ is the probability of reaching the final state before (possibly) reentering the initial state, given that the initial interface has been crossed. The flux, $f_A$, is a measure of the frequency of crossings with $\lambda_A$. Since we consider the dissociation of any water molecule in our system, we have normalized by the number of water molecules present. Typically, for rare events, the crossing probability is very small and in practice, $\mathcal{P}_A(\lambda_B|\lambda_A)$ is calculated by first positioning several more interfaces between the initial and the final state. The overall crossing probability is then obtained as a product of several (history-dependent) conditional probabilities (14). The conditional probabilities are calculated in a separate path ensemble simulation where the path ensemble $[i^+]$ defines the collection of paths crossing $\lambda_i$. The number and location of the interfaces alter the efficiency of the method, but not the results.
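As a minimal sketch of how the rate is assembled from these ingredients, the overall crossing probability can be formed as the product of the per-ensemble conditional probabilities and then multiplied by the flux. The function name and arguments are hypothetical, and the numbers in the usage note are placeholders rather than simulation results.

```python
from math import prod

def rate_constant(flux, conditional_probs, n_molecules=1):
    """Rate constant per molecule from a flux and interface probabilities.

    flux              -- crossing frequency through the first interface
    conditional_probs -- one conditional crossing probability per path
                         ensemble, ordered from the first interface onward
    n_molecules       -- number of molecules that can react (normalization)
    """
    # Overall crossing probability: product of the conditional ones.
    crossing_probability = prod(conditional_probs)
    return flux * crossing_probability / n_molecules
```

For example, `rate_constant(2.0, [0.5, 0.1], n_molecules=4)` evaluates 2.0 × 0.05 / 4. Adding or moving interfaces changes the individual factors but, for converged sampling, not their product.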
In the present case, we placed the final interface beyond the maximum distance obtainable in our system. All trajectories were thus propagated until the system contained only H2O species again. Separated ions may still recombine fast (within a few femtoseconds) even if the separation is large (16) and this observation was confirmed in our analysis (SI Appendix, Fig. S1). To better identify and distinguish the metastable ionized states, we used path reweighting (21) to project the crossing probability on an alternative order parameter, , which equals the trajectory length (in femtoseconds).
In Fig. 1, we show the calculated crossing probability from our simulations as a function of the order parameter. In principle, there are two potential mechanisms which lead to an increase of the reaction coordinate after the first proton jump. The ionic species can separate further by another proton jump, the so-called Grotthuss mechanism, reassigning the H3O+ or OH− ion to another oxygen and causing a sudden discontinuous increase in the reaction coordinate. A second possible mechanism keeps the first ionic species intact and lets them move away from each other by diffusion, yielding a more gradual increase of the reaction coordinate. Based on the completely flat intermediate plateau region between Å and Å, we can conclude that only the first mechanism is effective. For Å, we consider as the order parameter and we have used a threshold of ps as a criterion to identify a stable dissociation event. This choice is rather arbitrary since there is not a clear separation of timescales for the reverse recombination reaction which would result in another flat plateau region of the crossing probability. With a threshold of ps, the crossing probability is . Combined with the initial flux, calculated to be fs−1 in our simulations, the resulting dissociation constant is s−1. An alternative rate constant not requiring any time threshold can be defined by counting the trajectories that undergo a hydrogen swap; i.e., in the last frame some of the water molecules have swapped their protons. The rationale behind this definition is that the proton swap must imply a significant reorganization of the hydrogen bond network so that the reverse reaction can be considered as an independent recombination reaction. Vice versa, the forward reaction has established a quasi-stable state since it is not followed up by a correlated reverse reaction. This definition yields a rate of s−1.
Comparing with experimentally determined dissociation constants at 25 °C [ s−1 (7) and s−1 (8)] we overestimate the rate constant by a factor (although the simulated rate will drop and get closer to the experimental rate if a larger threshold is chosen). Considering all factors that play a role in the accuracy (statistical error, functional, small system size, purely classical treatment of protons, the time threshold value), the deviation from experiments is satisfactory and comparable to that of other density functional theory studies. Depending on the functionals considered in the ab initio calculation, energy barriers may be in error by – kJ/mol (22), which at room temperature would already correspond to a factor – difference between experimental and theoretical rate constants. Still, density functional theory generally manages to reproduce trends and mechanistic information in reasonable agreement with experiments (23).
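The link between a barrier error and a rate-constant factor follows from an Arrhenius-type dependence, where an error ΔE in the barrier multiplies the rate by exp(ΔE/RT). The short sketch below makes that arithmetic explicit; the function name is hypothetical and the 10 kJ/mol input in the usage note is an illustrative value, not a number taken from the text.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ / (mol K)

def rate_factor(barrier_error_kj_mol, temperature_k=298.15):
    """Multiplicative change in an Arrhenius rate for a given barrier error."""
    return math.exp(barrier_error_kj_mol / (GAS_CONSTANT * temperature_k))
```

With these assumptions, `rate_factor(10.0)` gives a factor of roughly 56 at 25 °C, which shows how a modest error in the barrier already shifts the rate by more than an order of magnitude.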
We also calculated the average energy of the generated trajectories as a function of the order parameter (Fig. 1). The energy is expected to converge to the activation energy, as can be derived from the temperature derivative of the rate constant (24, 25). We note that this activation energy gives a more direct comparison with experiments than free energy barriers, which depend on the choice of order parameter. The activation energy obtained from the average energy of the accepted paths is approximately kcal/mol. For comparison, an Arrhenius plot of the experimental data of Natzle and Moore (8) results in an activation energy of approximately kcal/mol while Eigen and De Maeyer (7) reported an activation energy between kcal/mol and kcal/mol. The deviation from our result is lower than the typical error margin mentioned above, and the fact that the experimental activation barriers are lower than our simulation result, despite having lower rate constants, is rather remarkable. Since experimental data on this topic are at least three decades old, we hope that our finding will encourage future experimental investigations on the dissociation reaction.
Path sampling methods generate reactive (and nonreactive) trajectories which can be used to discover possible mechanisms and initiation conditions. To characterize these conditions, we considered additional collective variables, which we label . In principle, these s can be functions of all positions and momenta in the system, and they do not necessarily have simple physical interpretations. Since the ability to form hydrogen bonds is one of the characteristic features of water (26) and since previous computational studies have demonstrated the relevance of the hydrogen bond wire connecting the ionized species (16, 17), we have focused on a set of relatively simple collective variables which quantify the hydrogen bond network and the distortion from tetrahedral geometry.
The first collective variable we consider is the length of the hydrogen bond wire bridging the nascent ion species. Our aim is to predict the outcome of initiated trajectories and in particular the initiation conditions for reactive events. Thus, we cannot define the hydrogen bond wires as connecting the ionic species, since this is one of the outcomes we wish to predict. For a single trajectory, we define the hydrogen bond wire as the shortest wire containing the Oλ species and other water species at the first point in time when is greater than a given threshold value, Å. Typically, this threshold is reached within – fs in our trajectories. This defines a wire containing water species whose length, , is obtained as the sum of the O–O distances of consecutive members.
In addition, we have considered the following four collective variables which describe the local structure surrounding the Oλ species: (i) The number of hydrogen bonds accepted, , and (ii) donated, , by the water species containing Oλ; (iii) the tetrahedral order parameter, , obtained using the angles defined by Oλ and its four nearest oxygen atoms (27, 28) (by the definition for a perfect tetrahedral structure and otherwise); and (iv) an angle order parameter, , defined as the smallest of the cosine of the two internal angles in the wire. We refer to Materials and Methods for additional information on these collective variables.
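A tetrahedral order parameter of this kind can be sketched as follows. The form used here, 1 − (3/8) Σ (cos ψ + 1/3)² summed over the six angles between the four nearest neighbors, is the commonly used orientational definition and is assumed rather than taken from the text; the function name is hypothetical and periodic images are ignored.

```python
import math
from itertools import combinations

def tetrahedral_order(center, neighbors):
    """Orientational tetrahedral order for a center and its 4 nearest oxygens.

    Assumed form: 1 - 3/8 * sum over neighbor pairs of (cos(psi) + 1/3)^2,
    where psi is the angle neighbor-center-neighbor.
    """
    unit_vectors = []
    for n in neighbors:
        d = [a - b for a, b in zip(n, center)]
        norm = math.sqrt(sum(x * x for x in d))
        unit_vectors.append([x / norm for x in d])

    total = 0.0
    for u, v in combinations(unit_vectors, 2):
        cos_psi = sum(a * b for a, b in zip(u, v))
        total += (cos_psi + 1.0 / 3.0) ** 2
    return 1.0 - 0.375 * total
```

Under this assumed form, a perfect tetrahedron gives exactly 1 because every angle has cosine −1/3, while a square-planar arrangement of the four neighbors gives 0.5.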
After defining the extra s, we analyzed the trajectories using the predictive power method (19). This method begins by classifying the trajectories as reactive or nonreactive based on two thresholds and defined such that . A trajectory is considered reactive if it reaches the specified ; otherwise it is considered nonreactive. At the first crossing point with , we record the s and form two distributions using the reactive/nonreactive classification: , the fraction of -passing trajectories that cross at a point and reach , and , the fraction of -passing trajectories that cross at a point but fail to reach . These two distributions give information on the relation between the additional order parameters and the reactivity. For instance, if , it could be that is inaccessible, but if we can cross at , the trajectory will be reactive. To quantify the importance of the different s, we calculate the predictive ability, , defined as (19)
[2] |
such that . If the collective variables do not correlate with reactivity, the lower limit is attained but if the s are relevant for the reaction, . We use the ratio to measure how much the predictive ability is increased when considering the extra s, compared with using the crossing probability alone. Note that the definition in Eq. 2 shows that if the overlap of the two distributions is small, then the predictive ability increases.
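To make the overlap argument concrete, the sketch below evaluates one binned estimator that has the stated limits: it reduces to the plain crossing probability when the collective variables carry no information and reaches 1 when the reactive and nonreactive distributions do not overlap. The specific form, the sum of r²/(r + u) over bins divided by the sum of r, is an assumption chosen to match those limits and should be checked against ref. 19. The function name is hypothetical and all trajectories are given equal weight.

```python
from collections import Counter

def predictive_ability(bin_labels, reactive_flags):
    """Binned predictive ability from first-crossing data.

    bin_labels     -- hashable bin label of the collective variables at the
                      first crossing of the lower threshold, one per trajectory
    reactive_flags -- True if that trajectory reaches the upper threshold
    Assumed estimator: sum_b r_b^2 / (r_b + u_b) divided by sum_b r_b.
    Raises ZeroDivisionError if no trajectory is reactive.
    """
    reactive, unreactive = Counter(), Counter()
    for label, is_reactive in zip(bin_labels, reactive_flags):
        (reactive if is_reactive else unreactive)[label] += 1

    total_reactive = sum(reactive.values())
    weighted = sum(r * r / (r + unreactive[label])
                   for label, r in reactive.items())
    return weighted / total_reactive
```

As the number of collective variables grows, more bins contain only one kind of trajectory by chance, which pushes this estimate toward 1 for purely statistical reasons. That is why the bin size and the number of variables need care.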
We first investigated the lengths of hydrogen bond wires containing three, four, and five water molecules. Comparing the predictive abilities for these collective variables (respectively, , , ) we find that and are more correlated with reactivity and that is more relevant for larger (SI Appendix, Fig. S2). Thus, in the following we focus on wires containing four water molecules. For the water wires, we observe that when the ionic species are separated by at least two water molecules, the ionic state survives for a longer time compared with cases where they are separated by just one water molecule. This implies that (at least) three proton transfer events have occurred. We monitored the distances of the initially covalent O–H bonds and show these for the first (), the second (), and the third () transferred proton in Fig. 2. As can be expected from the Grotthuss mechanism (29, 30), the initial autoionization event is followed by several proton transfers in which the ionic species separate along the wire. Fig. 2 shows that this can happen in both a concerted and a stepwise way: The transfer of the first and the second proton occurs almost exclusively in a concerted way, while the transfer of the third proton (if it occurs) can happen stepwise or concertedly. This is also reflected in the waiting time between these events (SI Appendix, Fig. S3): The waiting time distribution between the second and the third proton transfer is broader compared with the first and the second transfer. To investigate the stability of the wires, we also calculated the hydrogen bond wire in time-reversed trajectories (SI Appendix, Fig. S4). We find that trajectories are indeed starting and ending with a contracted wire ( Å) as reported by Hassanali et al. (17), but at the end these wires do not necessarily contain the same oxygen atoms. 
This might occur due to an actual breakage of the hydrogen bond wire or by a lesser disruption (for example, by a shift of the selection of four consecutive oxygens within a five-membered wire). The majority of the longer trajectories reform via another wire, but there are still a significant number of long trajectories ( ps) for which the recombination is exactly the same as the dissociation path. This contradicts the hypothesis (16) that a breakage of the wire is a necessary condition to reach a metastable state. Also, visual inspection shows that relatively long trajectories exist in which the hydrogen bond wire remains intact except for some very short on/off fluctuations in the hydrogen bonds. We find the abovementioned hypothesis therefore difficult to defend. Conversely, we can also examine whether an actual breakage always leads to a long-lived metastable state. For this we adopt again the assumption that all trajectories with a hydrogen swap necessarily imply an indisputable breakage of the hydrogen bond wire. SI Appendix, Fig. S5, shows that trajectories with a proton swap are on average longer, but can still be relatively short ( fs).
Comparing the additional collective variables (SI Appendix, Figs. S6 and S7), we find that is less relevant than the other variables and we do not consider it further. The other collective variables are more correlated with reactivity and in Fig. 3A, we show the predictive ability for some of their combinations. In Fig. 3B we show as function of Å for Å compared with the crossing probability using several combinations of the collective variables. Fig. 3 A and B shows that we can increase the predictive ability by a factor compared with the crossing probability. We note that since the crossing probability is small in this case, with a , we cannot perfectly predict the outcome. This indicates that there are other collective variables important for the description, possibly even nonlocal ones, as suggested by Geissler et al. (16). Also, we stress that here we are focusing only on the first concerted-jump step of the reaction in which the order parameter increases from Å to Å. As is clear from Fig. 1C the vast majority of trajectories reaching Å will not lead to long-lived metastable states. Predicting this from the very first snapshot seems still a step too far since it depends on collisions between water molecules after many MD steps far away from the initially stretched OH bond.
Inspecting the initiation conditions in more detail, we investigate the reactive and nonreactive distributions and in Fig. 4 for Å and Å. Here, we examine all dissociation events, even the ones that recombine quickly and show the distributions for in Fig. 4A [see SI Appendix, Fig. S8 for the distributions for and ]. Along the coordinate we observe a clear separation of the two distributions which indicates that trajectories crossing Å have a larger probability of being reactive for shorter wires (smaller ). This supports the hypothesis of a “compressed” wire as an important condition for autoionization, as first suggested by Hassanali et al. (17). Along the coordinate we observe a higher probability of reactivity for wires in which Oλ is hypercoordinated, which was also proposed by Hassanali et al. (17). Still, the chance of not being reactive is larger at any point in Fig. 4A [ would not be visible if it had not been normalized]. For example, if (i) and at the same time , the probability for a reactive event is , which is small but still a factor 58 larger than the chance to be reactive from a random point at . In a more extreme case, if (ii) and simultaneously , the chance increases to . The predictive ability provides a weighted average of these chances in which the weights are proportional to the relevance (19); since of all reactive trajectories, 45% cross in region i and only 0.6% in region ii, the latter will have 75 times lower weight.
If we consider the coordinate, we observe that is shifted toward lower values compared with , which indicates that a distortion from a tetrahedral arrangement around the dissociating water species may also initiate the event. This finding is somewhat surprising as in some other aqueous phase chemical reactions the opposite effect was found (31). Similar conclusions can be drawn for the distribution of . Here, there is a peak along the coordinate for the reactive distribution closer to a linear arrangement of the water molecules. In Fig. 4B we show a representative snapshot, obtained early (after fs) in a reactive trajectory. Overall the results shown in Fig. 3 report that a compression of the water wire (measured by ) and hypercoordination (measured by ) or distortion (measured by and ) are necessary initiation conditions for autoionization. However, these are not sufficient conditions as shown by the values of in Fig. 3B: Still of the trajectories starting off within the ideal parameter range fail to establish a concerted proton jump.
Machine learning (ML) applied to path-sampling data (33, 34) is a promising approach to find important collective variables that can easily be missed by human intuition. To explore this possibility, we built ML models for predicting the outcome of trajectories given the state of the water system early in the trajectories. We focus on the same range as in the predictive power analysis and we use the state of the system, when Å is first attained, to predict the outcome. We used several ML techniques in which every odd path ensemble was included in the calibration and the even path ensembles were used for the test set. An alternative split in which the data within each path ensemble were evenly divided in two gave similar results. Moreover, as heavily skewed distributions are difficult to treat with ML, we further omitted the reweighting of the datasets with the statistical weights of the corresponding path ensembles. However, we applied the ML techniques as a qualitative approach to find new parameters that could be tested quantitatively within the predictive power method (19).
In addition, to avoid a potential risk of overinterpretation we opted to restrict the complexity of the ML decision process and imposed a maximum of four order parameters when computing . For instance, excellent predictive performances () were obtained using the ensemble-based gradient-boosting machines (35, 36). However, the interpretation of the model is problematic since an ensemble of – deep decision trees (added in a sequence) is used. Although the performance is improved, the chance of overfitting with accidental correlations increases. We have therefore restricted ourselves to single-tree decision models based on classification and regression trees (CART) (20). The restriction to four order parameters for the function is based on similar reasons. Adding more parameters gives sparser matrices representing the reactive/nonreactive distributions, and, as a result, numerical integration for computing the overlap between these distributions becomes very sensitive to the bin size and could underestimate the overlap because bins are left empty by insufficient statistics.
We considered collective variables consisting of oxygen–oxygen distances; oxygen–hydrogen distances for initially bound water molecules; all angles formed by Oλ and its four closest oxygen neighbors; and the Steinhardt order parameters of orders , , and (32) (see Materials and Methods for more details). In addition, the order parameters already considered were added. Fig. 5A shows the resulting decision tree. Remarkably, of all of the input parameters, the parameter is both on top of the decision tree and the most important variable as measured by the reduction in the classification error attributed to each variable at each split in the decision tree (20) (SI Appendix, Fig. S9). Also the tetrahedral ordering and the number of accepted hydrogen bonds appear in the decision tree. To describe the first effect, the ML approach prioritized the Steinhardt order parameter above the similar parameter previously used by us. Some distances that also appear in the decision tree like , the distance between Oλ and its 25th closest oxygen, are most likely due to accidental correlations caused by the limited size of the dataset. This is verified by inspecting the importance of this variable: does not appear among the most important variables (SI Appendix, Fig. S9), and, in fact, other similar variables (e.g., ) are ranked higher, albeit with low importance. A more important and intuitively sound parameter that is suggested by the ML approach is , the OH distance between the oxygen closest to Oλ and its hydrogen with the largest intramolecular bond. Recomputing the predictive ability using parameters from the ML tree (Fig. 5B) did not yield higher performances than the combination , , , and , but should be conceived as equally good, considering statistical uncertainties.
Conclusions
We investigated the autoionization of water at room temperature, using an unconstrained ab initio rare-event simulation method. Our simulations sample reactive events that happen on the timescale of minutes and we demonstrated that autoionization can be initiated by the hypercoordination of a stretched OH bond and the compression of a hydrogen bond wire as suggested by Hassanali et al. (17). However, these are not sufficient conditions. Only when the wire is strongly compressed ( Å) and the stretched OH bond accepts four hydrogen bonds does the reaction probability become significant (0.15), but only 0.6% of the reactive trajectories start off with such extreme conditions. The vast majority of reactive paths start with milder initial values for these two parameters. In this region of parameter space the reaction probability is largely enhanced compared with an arbitrary case, but still extremely small. Hence, the reaction takes place when additional structural parameters have values inside the right range. We identified additional structural parameters which correspond to the alignment of the hydrogen bond wire and the distortion from a tetrahedral arrangement. Hence, we showed that the local order parameters can be used to predict the self-ionization event, although it requires a combination of several conditions.
Due to the multiple correlated factors that influence the water autoionization, we combined our analysis method with ML techniques which identified additional parameters not considered before, in particular the O–H stretch of the oxygen closest to Oλ. Even though the ML result did not outperform the predictive ability reached by human effort based on intuition, visual inspection of many molecular movies, and intensive trial-and-error approaches, the ML approach found all previously identified parameters very efficiently and, in addition, revealed some equally important parameters that were overlooked. We therefore believe that ML applied to path sampling has great potential, especially since data limitations will become less of an issue in the future due to the expected further increase of high-performance computing, a better parallelization scheme for sampling path ensembles with unequal trajectory lengths, and the use of more efficient Monte Carlo (MC) path-generating moves (37). It would therefore be promising to apply the same method to other aqueous-phase chemistry studies which so far have mainly been based on biased dynamics (31, 38).
The fundamental understanding of reaction triggers that can be gathered by this approach could open up avenues for practical applications. For instance, even if not all identified parameters correlating with reactivity necessarily imply causation, it is plausible that an intelligent manipulation of their equilibrium distribution via external electric fields (39) or inclusion of additives might lead to catalytic ways to steer reactions and in particular water dissociation.
Materials and Methods
Simulation Methods.
The MD simulations required by the RETIS algorithm (14) were performed with the Born–Oppenheimer MD capabilities of the CP2K program package (40). We used the Becke–Lee–Yang–Parr (BLYP) functional with a DZVP-MOLOPT (41) basis set and a plane-wave cutoff of Ry. The BLYP functional gives a reasonable description of the structure and dynamics of liquid water (42, 43) and the absence of dispersion corrections (44) is likely of minor importance for ion–water interactions where the dominant interactions are mainly electrostatic. However, we note that the BLYP functional is known to give an overstructured description of liquid water with a low diffusion coefficient (45). Previous studies on the recombination mechanism for water (17, 46) and for weak bases in water (18) have, however, found that the collective compression of the hydrogen bond wire and the motion of the protons are reproduced with different choices of the functional and basis set.
The initial system consisted of water molecules placed in a cubic simulation box of Å3. All MD simulations were carried out under constant energy (microcanonical) dynamics, with a time step of fs and periodic boundaries.
The transition region was divided into path ensembles by positioning RETIS interfaces at {, , , , , , , , , , , , , , , , , } Å. In addition, a final interface was placed at such that all trajectories were propagated until they reached the pure water state again. After generating an initial path for each path ensemble (this was done by repeatedly modifying the momenta of the particles and evolving the system forward in time until valid paths were obtained) the RETIS algorithm attempts to either swap paths between different path ensembles or generate new trajectories by the so-called shooting or the time-reversal move. In our simulations the probability of performing a swapping move was set to while the probabilities of the two other moves were both set to . New velocities for the shooting move were drawn from a Maxwell–Boltzmann distribution corresponding to an average temperature of K.
We performed , MC moves for each path ensemble, using the RETIS algorithm. This generated between and , distinct trajectories in each path ensemble. The length of the trajectories ranged from fs to fs and we disregarded the first trajectories in our analysis.
Analysis of Trajectories.
Crossing probabilities along the reaction coordinate were computed by matching the results of the different path ensembles. Projection of the crossing probability along was obtained using the reweighting scheme of Rogal et al. (21) for the path ensembles in the transition interface sampling framework.
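In interface-based path sampling, the overall crossing probability is commonly obtained as the product of the conditional crossing probabilities of successive path ensembles, and the rate constant as the initial flux times that product. The sketch below illustrates only this bookkeeping; the function names and the numerical values are hypothetical and are not results from this study.

```python
def overall_crossing_probability(conditional_probs):
    """Return the running products of conditional crossing probabilities.

    conditional_probs[i] is the probability of reaching interface i+1
    given that interface i was crossed. Element i of the result is the
    probability of reaching interface i+1 from the first interface.
    """
    cumulative = []
    running = 1.0
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        running *= p
        cumulative.append(running)
    return cumulative


def rate_constant(flux, conditional_probs):
    """Rate = initial flux through the first interface x total crossing probability."""
    return flux * overall_crossing_probability(conditional_probs)[-1]


# Hypothetical local crossing probabilities for three path ensembles.
local = [0.2, 0.1, 0.5]
total = overall_crossing_probability(local)
```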
For trajectories harvested with the RETIS algorithm we calculated additional collective variables: the hydrogen bond wire length (), the number of hydrogen bond donors () and acceptors (), the orientation order parameter (), and the angle formed by Oλ and its closest oxygen neighbors (). Using the first configuration in each trajectory, hydrogen atoms were assigned to the closest oxygen atom, and this defined the initial H2O molecules. Then, the hydrogen bond network was obtained for each configuration in the trajectory. Hydrogen bonds were identified using the criteria of Luzar and Chandler (47), and the shortest hydrogen bond connections between all pairs of water molecules were determined using the Floyd–Warshall algorithm (48). This allowed us to represent the hydrogen bond structure as a graph. Next, the oxygen atom (Oλ) used in the definition of the order parameter was identified. With no OH− present, this is the oxygen atom for which the covalent O–H distance is largest; when OH− is present in the system, this is the OH− oxygen atom. After identifying Oλ, we obtained the number of hydrogen bonds accepted () and donated () by the water species containing it. The relevant hydrogen bond wire was obtained using the following criteria: (i) the wire should contain the oxygen atom used for the order parameter (identified as explained above) when the order parameter first crossed Å, (ii) the wire should contain water species, and (iii) the wire should be the shortest of the wires for which both criteria i and ii are met. The length of the wire was defined as the sum of the O–O distances of consecutive molecules in the wire.
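The graph step described above can be sketched with a plain Floyd–Warshall implementation. This is an illustrative example under stated assumptions: hydrogen bonds are treated as undirected edges of unit weight, periodic boundary conditions are ignored when summing O–O distances, and the function names and the five-molecule toy network are hypothetical.

```python
import math


def floyd_warshall(n_nodes, edges):
    """All-pairs shortest paths, measured in number of hydrogen bonds.

    edges: iterable of (i, j) molecule index pairs (assumed undirected).
    Returns (dist, nxt); nxt[i][j] is the next node on a shortest path i -> j.
    """
    dist = [[math.inf] * n_nodes for _ in range(n_nodes)]
    nxt = [[None] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        dist[i][i] = 0
        nxt[i][i] = i
    for i, j in edges:
        dist[i][j] = dist[j][i] = 1
        nxt[i][j] = j
        nxt[j][i] = i
    for k in range(n_nodes):
        for i in range(n_nodes):
            for j in range(n_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def shortest_path(nxt, start, end):
    """Reconstruct one shortest path; returns [] if the nodes are not connected."""
    if nxt[start][end] is None:
        return []
    path = [start]
    while start != end:
        start = nxt[start][end]
        path.append(start)
    return path


def wire_length(path, oxygen_positions):
    """Sum of O-O distances between consecutive molecules (no periodic images)."""
    return sum(
        math.dist(oxygen_positions[a], oxygen_positions[b])
        for a, b in zip(path, path[1:])
    )


# Toy network: two routes from molecule 0 to molecule 3 (3 bonds vs. 2 bonds).
bonds = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 3)]
oxygens = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 4, 0), (3, 0, 0)]
dist, nxt = floyd_warshall(5, bonds)
wire = shortest_path(nxt, 0, 3)
length = wire_length(wire, oxygens)
```

Selecting a wire with a prescribed number of water species, as in criterion ii, would require filtering candidate paths by length in addition to this shortest-path search.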
The orientation order parameter measures the distortion from a tetrahedral orientation of four water molecules around a central molecule and is defined by (27, 28)
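$$q = 1 - \frac{3}{8} \sum_{j=1}^{3} \sum_{k=j+1}^{4} \left( \cos\psi_{jk} + \frac{1}{3} \right)^{2} \qquad [3]$$
Here, $\psi_{jk}$ is the angle formed by the central oxygen and its nearest oxygen neighbors j and k, where the sums run over the four nearest neighbors. The central oxygen is always Oλ, that is, the oxygen with the largest O–H bond length in pure water or the OH− oxygen if it is present. For a perfect tetrahedral orientation $q = 1$, and $q < 1$ otherwise. The angle order parameter, , was obtained directly as , where and are the two internal angles in the wire.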
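As an illustration, the tetrahedral orientation order parameter in the rescaled form of Errington and Debenedetti (28), q = 1 − (3/8) Σ_{j<k} (cos ψ_jk + 1/3)², can be evaluated from the positions of a central oxygen and its four nearest oxygen neighbors. The sketch below assumes that the four neighbors have already been selected and that positions are given without periodic wrapping; the function name and the test geometries are hypothetical.

```python
import math


def orientation_order(central, neighbors):
    """Tetrahedral order parameter q for a central atom and four neighbors.

    q = 1 - 3/8 * sum over the six neighbor pairs (j < k) of
    (cos(psi_jk) + 1/3)^2, where psi_jk is the neighbor-center-neighbor angle.
    """
    if len(neighbors) != 4:
        raise ValueError("exactly four neighbors are required")
    units = []
    for pos in neighbors:
        vec = [p - c for p, c in zip(pos, central)]
        norm = math.sqrt(sum(x * x for x in vec))
        units.append([x / norm for x in vec])
    total = 0.0
    for j in range(3):
        for k in range(j + 1, 4):
            cos_psi = sum(a * b for a, b in zip(units[j], units[k]))
            total += (cos_psi + 1.0 / 3.0) ** 2
    return 1.0 - 0.375 * total


# Perfect tetrahedron around the origin: every cos(psi) is -1/3, so q = 1.
tetrahedron = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
q_tetra = orientation_order((0, 0, 0), tetrahedron)

# Square-planar arrangement as a distorted reference case: q = 0.5.
square = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
q_square = orientation_order((0, 0, 0), square)
```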
After calculating these additional collective variables, we analyzed the trajectories using the methodology of van Erp et al. (19). For the analysis we used subinterfaces for both and for the range . The histograms in the collective variable space were constructed using bins for , , ; bins for ; and bins for , while the bins (midpoints) were placed at for both and .
The classification models were constructed using CARTs (20) available within the R (49) software package. The mean of sensitivity and specificity was used as the classifier performance measure (50).
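The performance measure, the mean of sensitivity and specificity (also called the balanced accuracy), is less sensitive to class imbalance than the plain accuracy, which matters when reactive trajectories are rare. A minimal sketch follows; it is written in Python for illustration rather than in R, and the function name and the example labels are hypothetical.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


# Hypothetical labels: 1 = reactive, 0 = nonreactive (imbalanced on purpose).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the plain accuracy is 0.8 while the balanced accuracy is 0.6875, because only half of the reactive cases are recovered.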
For the CART models we considered several sets of collective variables and we obtained these variables at the frame in the trajectories where the order parameter first crossed Å. The trajectories were classified as reactive if they reached a and as nonreactive otherwise. The first set of collective variables consisted of all atom–atom separations in the system, which gave a model in which the oxygen–oxygen distances were most important. This model did not lend itself to an easy interpretation and we next considered several models with a reduced number of collective variables.
In the best-performing model (performance measure for training and for testing ) we considered collective variables: all oxygen–hydrogen distances for initially bound water molecules, all oxygen–oxygen distances involving Oλ, the averaged distances between Oλ and its oxygen neighbors, the cosine of all angles formed by Oλ and its closest oxygen neighbors, all of the collective variables considered in the predictive power analysis, and the Steinhardt order parameters of order , , and (32). When performing the predictive power analysis for the collective variables used by the CART analysis, we used bins in the range for oxygen–hydrogen distances and bins in the range for oxygen–oxygen distances, and for angles and the Steinhardt order parameters we used similar bins to those for and given above.
Acknowledgments
The authors thank Øivind Wilhelmsen for fruitful discussions. The authors thank the Research Council of Norway Projects 237423 and 250875 and the Faculty of Natural Sciences and Technology, Norwegian University of Science and Technology (NTNU) for support. This research was supported in part with computational resources at NTNU provided by the Norwegian Metacenter for Computational Science (NOTUR), www.sigma2.no.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. P.L.G. is a guest editor invited by the Editorial Board.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1714070115/-/DCSupplemental.
References
- 1. Stillinger FH. Proton transfer: Reactions and kinetics in water. In: Eyring H, Henderson D, editors. Theoretical Chemistry: Advances and Perspectives. Vol 3. Academic; New York: 1978. pp. 177–234.
- 2. Agmon N, et al. Protons and hydroxide ions in aqueous systems. Chem Rev. 2016;116:7642–7672. doi: 10.1021/acs.chemrev.5b00736.
- 3. Tuckerman M, Laasonen K, Sprik M, Parrinello M. Ab initio molecular dynamics simulation of the solvation and transport of H3O+ and OH− ions in water. J Phys Chem. 1995;99:5749–5752.
- 4. Marx D, Tuckerman ME, Hutter J, Parrinello M. The nature of the hydrated excess proton in water. Nature. 1999;397:601–604.
- 5. Tuckerman ME, Marx D, Parrinello M. The nature and transport mechanism of hydrated hydroxide ions in aqueous solution. Nature. 2002;417:925–929. doi: 10.1038/nature00797.
- 6. Aziz EF, Ottosson N, Faubel M, Hertel IV, Winter B. Interaction between liquid water and hydroxide revealed by core-hole de-excitation. Nature. 2008;455:89–91. doi: 10.1038/nature07252.
- 7. Eigen M, de Maeyer L. Self-dissociation and protonic charge transport in water and ice. Proc R Soc A. 1958;247:505–533.
- 8. Natzle WC, Moore CB. Recombination of H+ and OH− in pure liquid water. J Phys Chem. 1985;89:2605–2612.
- 9. Trout BL, Parrinello M. The dissociation mechanism of H2O in water studied by first-principles molecular dynamics. Chem Phys Lett. 1998;288:343–347.
- 10. Trout BL, Parrinello M. Analysis of the dissociation of H2O in water using first-principles molecular dynamics. J Phys Chem B. 1999;103:7340–7345.
- 11. Sprik M. Computation of the pK of liquid water using coordination constraints. Chem Phys. 2000;258:139–150.
- 12. Dellago C, Bolhuis PG, Chandler D. Efficient transition path sampling: Application to Lennard-Jones cluster rearrangements. J Chem Phys. 1998;108:9236–9245.
- 13. van Erp TS, Moroni D, Bolhuis PG. A novel path sampling method for the sampling of rate constants. J Chem Phys. 2003;118:7762–7774. doi: 10.1063/1.1644537.
- 14. van Erp TS. Reaction rate calculation by parallel path swapping. Phys Rev Lett. 2007;98:268301. doi: 10.1103/PhysRevLett.98.268301.
- 15. van Erp TS. Efficiency analysis of reaction rate calculation methods using analytical models I: The two-dimensional sharp barrier. J Chem Phys. 2006;125:174106. doi: 10.1063/1.2363996.
- 16. Geissler PL, Dellago C, Chandler D, Hutter J, Parrinello M. Autoionization in liquid water. Science. 2001;291:2121–2124. doi: 10.1126/science.1056991.
- 17. Hassanali A, Prakash MK, Eshet H, Parrinello M. On the recombination of hydronium and hydroxide ions in water. Proc Natl Acad Sci USA. 2011;108:20410–20415. doi: 10.1073/pnas.1112486108.
- 18. Cuny J, Hassanali AA. Ab initio molecular dynamics study of the mechanism of proton recombination with a weak base. J Phys Chem B. 2014;118:13903–13912. doi: 10.1021/jp507246e.
- 19. van Erp TS, Moqadam M, Riccardi E, Lervik A. Analyzing complex reaction mechanisms using path sampling. J Chem Theory Comput. 2016;12:5398–5410. doi: 10.1021/acs.jctc.6b00642.
- 20. Breiman L, Friedman J, Olshen R, Stone C. Classification and Regression Trees. Chapman & Hall; New York: 1984.
- 21. Rogal J, Lechner W, Juraszek J, Ensing B, Bolhuis PG. The reweighted path ensemble. J Chem Phys. 2010;133:174109. doi: 10.1063/1.3491817.
- 22. Piccini G, Alessio M, Sauer J. Ab initio calculation of rate constants for molecule-surface reactions with chemical accuracy. Angew Chem Int Edit. 2016;55:5235–5237. doi: 10.1002/anie.201601534.
- 23. Van Speybroeck V, et al. Advances in theory and their application within the field of zeolite chemistry. Chem Soc Rev. 2015;44:7044–7111. doi: 10.1039/c5cs00029g.
- 24. Dellago C, Bolhuis PG. Activation energies from transition path sampling simulations. Mol Simul. 2004;30:795–799.
- 25. van Erp TS, Bolhuis PG. Elaborating transition interface sampling methods. J Comput Phys. 2005;205:157–181.
- 26. Nilsson A, Pettersson LGM. The structural origin of anomalous properties of liquid water. Nat Commun. 2015;6:8998. doi: 10.1038/ncomms9998.
- 27. Chau PL, Hardwick AJ. A new order parameter for tetrahedral configurations. Mol Phys. 1998;93:511–518.
- 28. Errington JR, Debenedetti PG. Relationship between structural order and the anomalies of liquid water. Nature. 2001;409:318–321. doi: 10.1038/35053024.
- 29. Agmon N. The Grotthuss mechanism. Chem Phys Lett. 1995;244:456–462.
- 30. Marx D. Proton transfer 200 years after von Grotthuss: Insights from ab initio simulations. ChemPhysChem. 2006;7:1848–1870. doi: 10.1002/cphc.200600128.
- 31. van Erp TS, Meijer EJ. Proton-assisted ethylene hydration in aqueous solution. Angew Chem Int Ed. 2004;43:1660–1662. doi: 10.1002/anie.200353103.
- 32. Steinhardt PJ, Nelson DR, Ronchetti M. Bond-orientational order in liquids and glasses. Phys Rev B. 1983;28:784–805.
- 33. Ma A, Dinner AR. Automatic method for identifying reaction coordinates in complex systems. J Phys Chem B. 2005;109:6769–6779. doi: 10.1021/jp045546c.
- 34. Mullen RG, Shea JE, Peters B. Transmission coefficients, committors, and solvent coordinates in ion-pair dissociation. J Chem Theory Comput. 2014;10:659–667. doi: 10.1021/ct4009798.
- 35. Friedman JH. Greedy function approximation: A gradient boosting machine. Ann Stat. 2001;29:1189–1232.
- 36. Friedman JH. Stochastic gradient boosting. Comput Stat Data Anal. 2002;38:367–378.
- 37. Riccardi E, Dahlen O, van Erp TS. Fast decorrelating Monte Carlo moves for efficient path sampling. J Phys Chem Lett. 2017;8:4456–4460. doi: 10.1021/acs.jpclett.7b01617.
- 38. Galib M, Hanna G. Mechanistic insights into the dissociation and decomposition of carbonic acid in water via the hydroxide route: An ab initio metadynamics study. J Phys Chem B. 2011;115:15024–15035. doi: 10.1021/jp207752m.
- 39. Saitta AM, Saija F, Giaquinta PV. Ab initio molecular dynamics study of dissociation of water under an electric field. Phys Rev Lett. 2012;108:207801. doi: 10.1103/PhysRevLett.108.207801.
- 40. Hutter J, Iannuzzi M, Schiffmann F, VandeVondele J. cp2k: Atomistic simulations of condensed matter systems. Wiley Interdiscip Rev Comput Mol Sci. 2014;4:15–25.
- 41. VandeVondele J, Hutter J. Gaussian basis sets for accurate calculations on molecular systems in gas and condensed phases. J Chem Phys. 2007;127:114105. doi: 10.1063/1.2770708.
- 42. Sprik M, Hutter J, Parrinello M. Ab initio molecular dynamics simulation of liquid water: Comparison of three gradient-corrected density functionals. J Chem Phys. 1996;105:1142–1152.
- 43. Kumar PP, Kalinichev AG, Kirkpatrick RJ. Hydrogen-bonding structure and dynamics of aqueous carbonate species from Car-Parrinello molecular dynamics simulations. J Phys Chem B. 2009;113:794–802. doi: 10.1021/jp809069g.
- 44. Grimme S, Ehrlich S, Goerigk L. Effect of the damping function in dispersion corrected density functional theory. J Comput Chem. 2011;32:1456–1465. doi: 10.1002/jcc.21759.
- 45. Gillan MJ, Alfè D, Michaelides A. Perspective: How good is DFT for water? J Chem Phys. 2016;144:130901. doi: 10.1063/1.4944633.
- 46. Hassanali A, Giberti F, Cuny J, Kühne TD, Parrinello M. Proton transfer through the water gossamer. Proc Natl Acad Sci USA. 2013;110:13723–13728. doi: 10.1073/pnas.1306642110.
- 47. Luzar A, Chandler D. Effect of environment on hydrogen bond dynamics in liquid water. Phys Rev Lett. 1996;76:928–931. doi: 10.1103/PhysRevLett.76.928.
- 48. Cormen TH, Leiserson CE, Rivest RL, Stein C. Introduction to Algorithms. 3rd Ed. MIT Press; Cambridge, MA: 2009.
- 49. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna: 2017.
- 50. Brodersen KH, Ong CS, Stephan KE, Buhmann JM. 2010. The balanced accuracy and its posterior distribution. Proceedings of the 2010 20th International Conference on Pattern Recognition, ICPR ’10 (IEEE Computer Society, Washington, DC), pp 3121–3124.