2024 Aug 15;64(16):6623–6635. doi: 10.1021/acs.jcim.4c00966

Reinforcing Tunnel Network Exploration in Proteins Using Gaussian Accelerated Molecular Dynamics

Nishita Mandal †, Bartlomiej Surpeta †,‡,*, Jan Brezovsky †,‡,*
PMCID: PMC11351021  PMID: 39143923

Abstract


Tunnels are structural conduits in biomolecules responsible for transporting chemical compounds and solvent molecules to and from the active site. They have been shown to be present in a wide variety of enzymes across all functional and structural classes. However, the study of such pathways is experimentally challenging because they are typically transient. Computational methods, such as molecular dynamics (MD) simulations, have been successfully applied to explore tunnels. Conventional MD (cMD) provides the structural details needed to characterize tunnels but suffers from sampling limitations that prevent it from capturing rare tunnel openings on longer time scales. Therefore, in this study, we explored the potential of Gaussian accelerated MD (GaMD) simulations to improve the exploration of complex tunnel networks in enzymes. We used the haloalkane dehalogenase LinB and its two variants with engineered transport pathways, which are not only well-known for their application potential but have also been extensively studied experimentally and computationally regarding their tunnel networks and their importance in multistep catalytic reactions. Our study demonstrates that GaMD efficiently improves tunnel sampling and allows the identification of all known tunnels for LinB and its two mutants. Furthermore, the improved sampling provided insight into a previously unknown transient side tunnel (ST). The extensive conformational landscape explored by GaMD simulations allowed us to investigate in detail the mechanism of ST opening. We determined variant-specific dynamic properties of ST opening, which were previously inaccessible due to the limited sampling of cMD. Our comprehensive analysis supports multiple indicators of the functional relevance of the ST, emphasizing its potential significance beyond structural considerations. In conclusion, our research demonstrates that the GaMD method can overcome the sampling limitations of cMD for the effective study of tunnels in enzymes, providing further means for identifying rare tunnels in enzymes with the potential for drug development, precision medicine, and rational protein engineering.

Introduction

Enzymes function as natural catalysts to enable the execution of complex reactions in living cells. However, the structural basis of their efficiency and sensitivity remains incompletely understood due to the complexity of their specific biochemical reactions. Enzymatic reactions occur at an active site, which may be localized either on their surfaces, in pockets that are generally surface accessible, or in cavities buried within the protein’s core.1,2 In enzymes with buried active sites, the transport of molecules from the bulk solvent to the active site cavity and their release is carried out through transport pathways known as tunnels,2–7 sometimes called channels.8–10 Enzymes often feature multiple tunnels, some of which are dedicated to transporting particular molecules.6,11–13 Such tunnels are formed from separate continuous voids in biomolecular structures. Their narrow parts, known as bottlenecks, often determine the selectivity for compounds passing into or out of the active site cavity. Bottlenecks are primarily controlled through gates represented by residues, loops, secondary structures, or domains.14 Due to the presence of these gates, the tunnels are transient, making their characterization a nontrivial task and sometimes even impossible in a single static structure.6,14,15 Gates are frequently involved in regulating transport, especially in multistep reactions.14,15 Gating residues often act as selectivity filters, allowing only certain-sized substrates or products to traverse the tunnel, thereby playing a crucial role in the catalytic activity of the enzyme.14

These tunnels are widespread, occurring in over 50% of enzymes across all six Enzyme Commission (EC) classes.1 To understand the structural basis of activity related to these tunnels, in-depth investigation is required to study the dynamic behavior of these pathways, which determines the exchange rate of substrates entering the active site or products being released into the bulk solvent.6 The tunnels also regulate the movement of solvents and products,15 providing additional control over enzymatic reactions. The biological significance of tunnels is underlined by the fact that many enzymes containing molecular tunnels have been associated with various diseases and that inhibitors binding to these tunnels can function as effective medications.15 Mutations in tunnel residues can result in variants with significantly altered properties.3,16 Recent research shows that the catalytic functions of enzymes and their prospects can be altered by the dynamics, geometry, and physicochemical properties of the tunnel.15 Therefore, the dynamics and flexibility of enzymes and their tunnels must be considered important factors when studying structural biology in the context of structure-based drug design.15

Due to advancements in experimental methods such as X-ray crystallography,17 cryo-EM,18 various types of NMR spectroscopy,19 and cutting-edge computational technologies such as deep learning-based protein structure prediction with AlphaFold,20 an increasing number of high-quality three-dimensional static structures are available for in-depth investigation, including the analysis of potential tunnels. Although experimental methods yield the structures of proteins, the study of tunnels within these structures requires dedicated geometry-based tools that explore free van der Waals volume, such as CAVER 3.0,21 MOLE 2.5,22 and MolAxis.23 Because static structures derived from experimental techniques or computational predictions usually do not consider multiple protein conformational states, these techniques are often followed by molecular dynamics (MD) simulations to explore the protein conformational landscape and capture tunnel dynamics, thereby studying their transient nature.15,17 Additionally, the exploration of tunnels by geometry-based methods can be supplemented by ligand-tracking methods using tools such as streamline tracing,24 Visual Abstraction of Solvent Pathlines,25 Watergate,26 AQUA-DUCT,27 and trj_cavity.28

Because biomolecules do not readily cross large energy barriers, sampling on the scale of milliseconds to seconds, or even longer, is required to visit rare structural changes.17,29 The opening of transient tunnels poses a significant challenge for conventional MD (cMD) methods, which are typically limited to tens of microseconds.30 To address this limitation, various biased enhanced sampling methods have been proposed, such as adaptive biasing force (ABF),31 umbrella sampling,32 and metadynamics,33 which have been successfully used to deliver considerable insights into the utilization of particular pathways by selected small molecules.34–42 However, their main limitation lies in defining collective variables (CVs) that allow sufficient convergence to be reached, which is difficult even with very intensive computation,43–45 since this requires thorough knowledge of the system and often extensive optimization to find an acceptable solution.46–49 For systems with complex tunnel networks, multiple CVs would be required to explore the desired transport pathways efficiently. Alternatively, methods that enhance ligand movements more stochastically, without the need for a rigorous definition of transport pathways, like random accelerated MD (RAMD),50,51 accelerated MD (aMD),52 or ligand Gaussian accelerated MD (LiGaMD),53,54 face a challenge in the identification of rarer transport pathways, as most simulation replicates tend to explore the primary pathways.16,43,55–57 Moreover, the obtained information on tunnel usage is usually restricted to the particular ligand explicitly used in the simulations, which cannot always be easily translated to different ligands with different molecular properties.58,59 Hence, efficient exploration of tunnel conformational dynamics directly from the protein structure, without explicitly probing and thus biasing them with particular ligands, remains challenging. From this perspective, aMD and Gaussian accelerated MD (GaMD),60 which have been proposed to enhance the sampling of conformational space without explicitly defined CVs,52 represent a promising solution for tunnel exploration. Furthermore, GaMD provides an opportunity to reconstruct the original free energy landscape due to the appropriately controlled boosting potential. Although GaMD has been investigated for numerous enzymes, including explicit ligand migration simulations,61 its effectiveness for reliably sampling conformations of an overall tunnel network in the absence of ligands remains unknown.

In this study, we examine the efficiency of GaMD in exploring stable and transient tunnels in the tunnel network of the haloalkane dehalogenase LinB. LinB belongs to the α/β hydrolase superfamily and catalyzes the hydrolytic dehalogenation of halogenated compounds,62 with high potential for applications in bioremediation, biosensing, or biocatalysis.63,64 Importantly, due to its multistep reaction and utilization of water during the reaction, this class of enzymes requires multiple tunnels crucial for synchronizing particular steps, making it a perfect model for our investigation.65 Structurally, LinB has a flexible cap domain (Figure 1A) and a stable core domain,62 and its active site consists of a catalytic pentad (E132 as the catalytic acid, D108 as the nucleophile, H272 as the base, and W109 and N38 as the two halide-stabilizing residues).66 LinB has three known tunnels: one permanent tunnel (p1) and two auxiliary tunnels (p2 and p3).16 Therefore, in the present study, we focused on three variants to thoroughly test GaMD capabilities to explore the conformational space of the known LinB tunnel network. These variants include (i) LinB wild-type (LinB-Wt), in which p1 represents the main tunnel, whereas p2 tunnels play an auxiliary role (Figure 1B); (ii) the designed LinB-Closed variant carrying the L177W mutation, which results in a drop of p1 tunnel occurrence (Figure 1C);16 and (iii) the LinB-Open variant, in which the main p1 tunnel was closed by the same L177W mutation as in the former variant, reducing the role of p1 as the main tunnel, whereas the additionally introduced mutations W140A, F143L, and I211L open an additional auxiliary p3 tunnel (Figure 1D).16 Given that the tunnel dynamics and function vary significantly across the selected LinB variants, they represent suitable models for exploring the capabilities of GaMD to capture rare tunnel dynamics, validating its efficiency and sensitivity against the broad knowledge from the literature, and comparing it with cMD simulations.

Figure 1.


Structural representations of LinB and its variants. A. LinB has a more flexible cap domain (pink cartoon) and a more stable core domain (gray cartoon). The catalytic residues of LinB forming the catalytic pentad are shown as gray sticks. B. LinB-Wt. C. LinB-Closed mutant carrying the L177W mutation. D. LinB-Open mutant with four mutations: L177W, F143L, W140A, and I211L. E. Known p1, p2, and p3 tunnels of LinB. The shown representative tunnels were selected from GaMD simulations of LinB-Wt to have their bottleneck radius and length match the average properties of individual tunnel clusters observed in these simulations.

Methods

System Setup and Conventional Molecular Dynamics Simulations

The cMD simulations were performed on three LinB variants: LinB-Wt (PDB code: 1MJ5),62 LinB-Closed (PDB code: 4WDQ),16 and LinB-Open (PDB code: 5LKA).16 First, the systems were protonated using the H++ server67 at pH 8.5 (ref 16) and a salinity of 0.1 M. Subsequently, the systems were solvated using the 4-point OPC water model68 and neutralized with counterions (Na+ and Cl−) to achieve a final NaCl concentration of 0.1 M. Initially, the systems were energy minimized using 500 steps of steepest descent, followed by 500 steps of conjugate gradient, with decreasing harmonic restraints in 5 rounds, using the PMEMD69 module of AMBER18 (ref 79) with the ff14SB force field.71 The following restraints were applied to the system: 500 kcal·mol⁻¹·Å⁻² on all heavy atoms of the enzyme, followed by 500, 125, 25, and 0 kcal·mol⁻¹·Å⁻² on backbone atoms of the enzyme exclusively. Minimization was followed by a 2 ns equilibration MD simulation, with gradual heating to 310 K under constant volume using the Langevin thermostat72 with a collision frequency of 1.0 ps⁻¹ and harmonic restraints of 5.0 kcal·mol⁻¹·Å⁻² on the positions of all enzyme atoms, using periodic boundary conditions with the particle mesh Ewald method.73,74 The Berendsen barostat was used to control the pressure of the system. The simulations were run using a 4 fs time step enabled by SHAKE and the hydrogen mass repartitioning algorithm.75 Finally, these simulations were continued with an unrestrained 200 ns production simulation performed with pmemd.cuda at constant pressure and temperature, with frames stored every 20 ps.

Clustering analysis was performed using the hierarchical agglomerative (HierAgglo) algorithm in cpptraj76 based on the 200 ns trajectory of each of the three systems. A cutoff of 4.5 was used, with the number of clusters set to 5 and average linkage as the linkage method, to obtain five clusters of the most diverse conformations. For each cluster, the frame with the lowest cumulative distance to all other frames within the cluster was selected as its representative. These five frames were used as seed structures for the production cMD, each of which was extended to an unrestrained 5 μs simulation under constant pressure and temperature, with frames stored every 200 ps, for all three LinB variants.
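The representative-selection rule described above (the frame with the lowest cumulative distance to all other frames of its cluster) amounts to picking a cluster medoid. The following is a minimal sketch of that rule only, not of the cpptraj implementation; the function name `cluster_medoids`, the toy distance matrix, and the toy cluster labels are illustrative assumptions.

```python
def cluster_medoids(dist, labels):
    """Return {cluster_label: frame_index} for the frame with the lowest
    summed distance to all frames of the same cluster (the medoid)."""
    members = {}
    for frame, label in enumerate(labels):
        members.setdefault(label, []).append(frame)
    medoids = {}
    for label, frames in members.items():
        # min() keeps the first frame when several frames tie
        medoids[label] = min(
            frames, key=lambda i: sum(dist[i][j] for j in frames)
        )
    return medoids


# Toy symmetric pairwise-distance matrix for 5 frames (hypothetical values)
toy_dist = [
    [0.0, 1.0, 2.0, 6.0, 7.0],
    [1.0, 0.0, 1.0, 6.5, 6.0],
    [2.0, 1.0, 0.0, 7.0, 6.5],
    [6.0, 6.5, 7.0, 0.0, 0.5],
    [7.0, 6.0, 6.5, 0.5, 0.0],
]
toy_labels = [0, 0, 0, 1, 1]  # hypothetical cluster assignment per frame
medoids = cluster_medoids(toy_dist, toy_labels)
```

In this toy case frame 1 sits between frames 0 and 2 and is therefore chosen for cluster 0; the two frames of cluster 1 tie, and the first one is kept.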

Gaussian Accelerated Molecular Dynamics

The five most diverse structures obtained by clustering analysis of cMD were used as seeds for GaMD in all three LinB variants to diversify the sampling space. After preparing the structures for GaMD simulation, a short 8 ns cMD was run to collect the potential energy statistics of the system, including the maximum (Vmax), minimum (Vmin), average (Vav), and standard deviation (σV).60 GaMD equilibration of 8 ns was performed after adding the boost potential.60 The first 16 ns of each run was therefore considered equilibration and excluded from the final analyses. Parameters governing the strength of the applied boosting potential were set as follows: σ0P, the upper limit of the standard deviation of the first potential boost, applied to the total potential energy of the system, was set to 1.3 kcal/mol; σ0D, the upper limit of the standard deviation of the second potential boost, applied to the dihedral energy of the system, was set to 2.5 kcal/mol. These values of σ0P and σ0D were selected by iteratively testing a range of values to bring the coefficients k0P and k0D as close as possible to the maximum value of 1.0, which provides the highest acceleration to the system, or to the highest possible value whenever the system became unstable before reaching the maximum. The parameters used for testing are described in detail in Table S1. The system threshold energy was set to the lower bound (E = Vmax). Simulations were run using a 4 fs time step, analogously to cMD, with hydrogen mass repartitioning applied.75 Finally, these dual-boost GaMD simulations were continued for 5 μs of unrestrained production MD. The simulations were performed analogously to the cMD settings, namely, under constant pressure and temperature using AMBER18 with the ff14SB force field, with frames stored every 200 ps for subsequent analyses.
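To make the relationship between σ0, k0, and the boost concrete, the sketch below evaluates the standard GaMD expressions for the lower-bound threshold (E = Vmax): k0 = min(1, (σ0/σV)·(Vmax − Vmin)/(Vmax − Vav)), k = k0/(Vmax − Vmin), and ΔV = ½·k·(E − V)² whenever V < E. It is an illustration under stated assumptions, not the AMBER implementation; the function names and all energy values are hypothetical.

```python
def gamd_lower_bound_params(v_max, v_min, v_avg, sigma_v, sigma_0):
    """Threshold E, effective k0, and force constant k for the
    lower-bound GaMD setting (E = Vmax). Energies in kcal/mol."""
    k0 = min(1.0, (sigma_0 / sigma_v) * (v_max - v_min) / (v_max - v_avg))
    k = k0 / (v_max - v_min)
    return v_max, k0, k


def boost_potential(v, threshold, k):
    """Harmonic boost added when the potential V lies below the threshold."""
    return 0.5 * k * (threshold - v) ** 2 if v < threshold else 0.0


# Hypothetical potential-energy statistics from a short preparatory cMD run
E, k0, k = gamd_lower_bound_params(
    v_max=-100.0, v_min=-200.0, v_avg=-150.0, sigma_v=10.0, sigma_0=2.5
)
boost_at_average = boost_potential(-150.0, E, k)
boost_at_threshold = boost_potential(-100.0, E, k)

# A large sigma_0 relative to sigma_v saturates k0 at its maximum of 1.0
_, k0_capped, _ = gamd_lower_bound_params(-100.0, -200.0, -150.0, 1.0, 2.5)
```

The capped case mirrors a k0 of 1.0, the highest acceleration, while a small σ0 relative to the energy fluctuations yields a small k0.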

Basic Analysis

Trajectories generated from both methods, for all three LinB variants and corresponding replicates (totaling 30 repetitions of 5 μs sampling), were processed using cpptraj70 implemented in AmberTools17.67 The root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF) were calculated with the initial structure as a reference, considering the backbone heavy atoms of the protein (N, CA, and C). RMSD calculations excluded the N-terminal tail (comprising the first 11 amino acids) due to its high overall flexibility. The protein’s radius of gyration (Rg)77 and solvent accessible surface area (SASA)78 were calculated, considering all heavy atoms of the protein. Distances between residues were calculated using cpptraj, considering the α carbons (Cα) of respective residues.
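As a reminder of what the two stability metrics measure, here is a minimal plain-Python sketch of RMSD and RMSF. It assumes the frames have already been superposed onto the reference structure and that RMSF is taken about the mean atomic position; both are assumptions of this illustration, and the coordinates are toy values rather than LinB data. It does not reproduce the cpptraj commands used in the study.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two pre-superposed
    coordinate sets, each a list of (x, y, z) tuples."""
    squared = sum(
        (a - b) ** 2 for p, q in zip(coords, reference) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(reference))


def rmsf(frames):
    """Per-atom root-mean-square fluctuation about the mean position."""
    n_frames = len(frames)
    fluctuations = []
    for i in range(len(frames[0])):
        mean = [sum(f[i][d] for f in frames) / n_frames for d in range(3)]
        msf = sum(
            sum((f[i][d] - mean[d]) ** 2 for d in range(3)) for f in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Two toy frames of a two-atom system (hypothetical coordinates in Å)
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [reference, [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]]
frame_rmsd = rmsd(frames[1], reference)
atom_rmsf = rmsf(frames)
```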

Tunnel Analysis

The tunnels were calculated using the CAVER 3.0.2 software (ref 21), which identifies pathways in protein structures by constructing a Voronoi diagram79 of their atomic structure. The edges and vertices of such a diagram contain information about the surrounding empty space within the protein. Next, the edges narrower than the user-defined probe radius are removed, and the simplified diagram is searched for continuous tunnels from the user-defined starting point to the protein surface using Dijkstra’s algorithm,80 with an adjustable cost function that prioritizes shorter and wider tunnels. When analyzing an MD trajectory, tunnels from each frame are clustered based on their similarity to identify tunnel ensembles corresponding to a given pathway.81 Here, we used a 6 Å shell radius and a 4 Å shell depth to define the protein surface, specifying the starting point as the center of mass of three catalytic residues (Asn38, Asp108, and His272; numbering corresponds to the crystal structure). A probe radius of 0.9 Å was used to calculate potential tunnels in the enzymes with a time sparsity of 1. Finally, the tunnels were clustered using a clustering threshold of 3.0. Tunnel calculations were performed across the 5 μs simulations of both cMD and GaMD, along with their initial fractions, namely, the first 2.5, 1, and 0.5 μs, for sampling evaluation and comparison between the GaMD and cMD methods. Subsequently, TransportTools82 (TT) software v0.9.2 was used to generate a unified tunnel network across all variants. Transport tunnels from the 5, 2.5, 1, and 0.5 μs GaMD and cMD simulations of the enzyme variants were compared using the comparative analysis module of TT. The clustering method was set to average with a clustering cutoff of 1.
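The search strategy described above (prune edges narrower than the probe, then run Dijkstra with a cost that favors short and wide paths) can be sketched on a toy graph. This is a conceptual illustration, not CAVER's implementation: the edge cost `length / radius**exponent`, the function name `cheapest_tunnel`, and the graph itself are assumptions made for the example.

```python
import heapq


def cheapest_tunnel(edges, start, exits, probe_radius=0.9, exponent=2):
    """Dijkstra search from `start` to the first reachable exit node.
    edges: iterable of (node_a, node_b, length, radius) tuples.
    Assumed cost per edge: length / radius**exponent (short, wide is cheap)."""
    graph = {}
    for a, b, length, radius in edges:
        if radius < probe_radius:  # edge too narrow for the probe
            continue
        cost = length / radius ** exponent
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        if node in exits:
            path = [node]
            while path[-1] != start:
                path.append(previous[path[-1]])
            return cost, path[::-1]
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return None  # no tunnel wide enough for the probe


# Toy network: lengths and radii in Å (hypothetical values)
toy_edges = [
    ("site", "a", 2.0, 1.5), ("a", "surf1", 2.0, 1.5),  # short and wide
    ("site", "b", 1.0, 0.8), ("b", "surf2", 1.0, 2.0),  # blocked at 0.8 Å
    ("site", "c", 3.0, 1.0), ("c", "surf2", 3.0, 1.0),  # long and narrow
]
result = cheapest_tunnel(toy_edges, "site", {"surf1", "surf2"})
```

Here the shortest geometric route through node `b` is discarded because one of its edges is narrower than the 0.9 Å probe, so the search returns the wide route through node `a`.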

Reweighting of GaMD Tunnel Profiles

A reweighting protocol was implemented to obtain properties of the investigated tunnels from the original GaMD trajectories, reweighted according to the boost potential applied during simulation. For this purpose, the output files of individual GaMD runs containing the boost potential were parsed along with the corresponding TT profiles in the CSV format for all superclusters, providing access to tunnel characteristics reweighted to the original free energy landscape without bias. The top-100 tunnel conformations, selected based on reweighted TT throughput from each variant, were subjected to further analysis to study the migration of ligands through them with the CaverDock software.83
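One common way to recover unbiased averages from a boosted trajectory is to weight each frame by exp(ΔV/kBT), where ΔV is the boost applied to that frame. The sketch below shows this exponential-average form as an illustration only; whether the protocol above used exactly this estimator (rather than, for example, a cumulant expansion) is not stated, and the function names and toy values are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def boltzmann_weights(boosts, temperature=310.0):
    """Normalized per-frame weights exp(dV / kT) for boost energies
    given in kcal/mol; the maximum is subtracted for numerical stability."""
    beta = 1.0 / (K_B * temperature)
    shift = max(boosts)
    raw = [math.exp(beta * (dv - shift)) for dv in boosts]
    total = sum(raw)
    return [w / total for w in raw]


def reweighted_mean(values, boosts, temperature=310.0):
    """Boost-reweighted average of a per-frame property,
    e.g. a tunnel bottleneck radius or throughput."""
    weights = boltzmann_weights(boosts, temperature)
    return sum(w * v for w, v in zip(weights, values))


# Toy per-frame bottleneck radii (Å) and boost energies (kcal/mol)
radii = [1.0, 2.0]
unbiased_like = reweighted_mean(radii, [0.0, 0.0])
skewed = reweighted_mean(radii, [0.0, 1.0])
weights = boltzmann_weights([0.0, 1.0])
```

With equal boosts the result reduces to the plain mean; a frame that received a larger boost counts more, because it lies in a region that is more populated on the original energy surface.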

Cryptic Pocket Detection

To analyze the cryptic and allosteric pockets, three different tools were used. The DeepSite84 tool was used to identify viable druggable binding sites on the target protein and pockets likely to bind small molecules. PASSer 2.085 was used to detect probable allosteric site pockets, and FTMove86 was used to search for important cryptic binding hotspots utilizing all known conformers of the protein. Web servers for all three tools used the LinB-Wt crystal structure.

Migration of Ligands Through GaMD Tunnels

To evaluate the efficacy of GaMD tunnels in transporting ligands, CaverDock v1.1 was used to perform molecular docking across the top-100 tunnel conformations with the highest throughput. CaverDock is a computational tool for the study of ligand migration through protein tunnels that utilizes a modified docking algorithm of AutoDock Vina.87 The tunnel is represented by a sequence of spheres that are extracted from CAVER 3.0, and these spheres are then discretized into cross-sectional slices called discs. The ligand is docked at each disc, and its binding energy is calculated using the AutoDock Vina scoring function. This process predicts a migration pathway along the tunnel with an associated estimate of the potential energy.88 Here, four ligands, namely bromide ion (Br−), 2-bromoethanol (be), 1,2-dibromoethane (dbe), and water (H2O), were used to study the transport events in the tunnels, representing the substrate, products, and water important for the catalytic activity of LinB.16 MGLTools v1.5.6 (ref 89) was utilized to prepare the inputs using the prepare_receptor4.py and prepare_ligand4.py scripts with default settings. Finally, the continuous upper-bound migration trajectories obtained from CaverDock for each of the top-100 tunnels in each investigated tunnel cluster were considered to calculate the energy barriers for ligand migration and the successful migration of ligands through a particular tunnel cluster using an in-house Python script.90
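The in-house script is not reproduced in this section, so the following is only a sketch of one plausible way to extract a barrier from a per-disc energy profile: the highest energy along the path minus the lowest energy encountered before reaching it. That definition, the function name `migration_barrier`, and the toy profile are all assumptions of this example.

```python
def migration_barrier(energies):
    """Barrier of a per-disc energy profile (kcal/mol), ordered along
    the migration direction: peak energy minus the lowest energy
    encountered up to and including the peak."""
    peak_index = max(range(len(energies)), key=energies.__getitem__)
    return energies[peak_index] - min(energies[: peak_index + 1])


# Toy upper-bound profile along a tunnel (hypothetical values)
toy_profile = [-4.0, -5.0, -2.0, -3.5, -1.0, -4.5]
barrier = migration_barrier(toy_profile)
```

For this profile the peak is −1.0 kcal/mol and the deepest preceding minimum is −5.0 kcal/mol, giving a barrier of 4.0 kcal/mol; a monotonically decreasing profile gives a barrier of zero.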

Distance-Based Principal Component Analysis

Principal component analysis (PCA)91 was performed on the Cα distances from the side helix (residue IDs: 166–179 in the crystal structure) to the catalytic residue His272 (Figure S1). The Python Scikit-learn package92 was used for the PCA, whereas the distances were calculated using the distance module in cpptraj of AmberTools17. PCA was performed separately for each of the three enzyme variants. Cluster analysis was conducted using HDBSCAN,92 with the following parameters: min_cluster_size and min_samples were kept at 60, allow_single_cluster was set to True, and cluster_selection_epsilon was set to 0.5.

Results

GaMD Overcomes cMD Sampling Limitations and Enables the Exploration of a More Diverse Conformational Space

To determine the optimal acceleration parameters, short GaMD simulations (50 ns) were tested on LinB-Wt, followed by extended simulations (1000 ns). For LinB-Wt, the total potential boost parameter σ0P was set to 1.3 kcal/mol, resulting in a k0P of 0.09, which avoided the prohibitive instability of the system that was observed with higher σ0P values. Regarding the dihedral potential boost, σ0D was set to 2.5 kcal/mol, resulting in a k0D of 1.0, which is the maximum possible boost. Consequently, dual-boost GaMD with σ0P = 1.3 kcal/mol and σ0D = 2.5 kcal/mol, as optimized on LinB-Wt, was performed for all three variants of LinB (Table S1). The relatively low value of k0P for our model system LinB could be attributed to its two domains, namely, a more flexible cap domain and a stable core domain. The presence of the flexible cap domain increases the likelihood of multiple tunnels opening in the protein, which can make the enzyme quite sensitive to boost potential energies in GaMD.

To analyze the stability of the enzymes across the simulations, we calculated the RMSD and RMSF for all replicates of the investigated LinB variants in both cMD and GaMD simulations. The RMSD of the wild-type in cMD and GaMD oscillated below ∼2.5 Å over time, indicating sufficient equilibration and convergence of the simulations (Figure 2A). The LinB mutants behaved even more stably (Figures S2 and S3) and did not display significant increases in RMSD, suggesting more rigid internal dynamics due to the introduced mutations, even in the cap domain, regardless of the applied boosting potential. Furthermore, to verify the compactness of the protein during simulations, especially upon boosting, we evaluated the Rg and SASA from GaMD trajectories and compared them with cMD, which did not reveal any significant differences (Figures S4–S6).

Figure 2.


Analysis of protein stability. A. The RMSD time evolution of the LinB-Wt enzyme without the 11-residue long N-terminal tail during cMD and GaMD simulations. B. RMSF of LinB-Wt from cMD and GaMD simulations. The most fluctuating residues from the GaMD simulation are highlighted by a black dashed circle.

After verifying the stability of the protein core, we further evaluated the behavior of the residues forming the catalytic machinery, because these residues may have been affected by the additional boost potential in GaMD, resulting in the sampling of inactive or unphysical conformations. Therefore, we examined the time evolution of the RMSD of the individual residues Asn38, Asp108, Trp109, Glu132, and His272, which form the catalytic pentad (Figures S7–S11). Most flips are observed for residues Asp108, Glu132, and His272. Specifically, the His272 backbone forms a hydrogen bond with Glu132. During the simulation, this hydrogen bond breaks, and these residues adopt different conformational states before returning to the initial conformation; sometimes, the His272 ring enters a different conformational state, causing fluctuations in the RMSD plot (Figures S7–S11). In the case of Asp108, the side chain frequently fluctuates between different conformations, with an RMSD range of approximately 1.25 to 1.50 Å. When considered separately and cumulatively in both cMD and GaMD, the RMSD values range between 2.0 and 2.5 Å, indicating that the enzymes’ catalytic machineries are not distorted by the added boost potential in GaMD.

GaMD Enabled the Accurate Detection of the Tunnel Network Known for LinB and Led to the Discovery of a New Side Tunnel (ST)

We calculated the tunnels to investigate how enhanced sampling affects the internal dynamics of ligand transport pathways. The tunnel networks were detected using a combined approach, including CAVER 3.0.2 calculations and further tunnel unification across all simulated systems and replicates in TransportTools. This analysis revealed the presence of all known branches of the p1 and p2 tunnels, namely, p1a, p1b, p2a, p2b, and p2c. Additionally, two rare tunnels, the known p3 and the newly discovered side tunnel (ST), were found (Figure 3A,B). The newly discovered ST opening corresponds to a region with high RMSD fluctuations, which display increased values in two replicates of GaMD (replicates 1 and 4), because the boosting potential helps the enzyme to explore a broader sampling space. This observation is consistent with the RMSF profiles showing increased fluctuations in the cap region (residue numbers 166–179) for these replicates, suggesting improved sampling (Figure 2B). Furthermore, to compare the sampling efficiency between cMD and GaMD during the time evolution of the simulation, we considered various fractions of the trajectories, including the full length (5 μs), the first half (2.5 μs), the initial 20% (1 μs), and the initial 10% (500 ns). We noted that at least 1 μs of GaMD is required to observe enhanced exploration of tunnels, which was particularly noticeable for the transient tunnel ST (Figures S12–S14). GaMD enhances exploration of the sampling space and thereby captures the rare tunnels more effectively than cMD (Figure 3A). Besides the tunnel occurrence as defined by the probe radius used for the CAVER calculations (0.9 Å), we also considered other, mostly geometric, properties of the tunnels, such as average length, average bottleneck radius, and maximum bottleneck radius. Interestingly, although tunnels exhibit consistent geometric properties within the various groups, GaMD simulations tend to open tunnels more widely, in particular the primary p1b tunnel (Figures S15–S17), resulting in an increased maximum bottleneck radius exceeding 3 Å (Figures S12–S14).
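The occurrence comparison over trajectory fractions described above can be illustrated with a short sketch: a tunnel counts as open in a frame when its bottleneck radius is at least the probe radius, and occurrence is the fraction of open frames within the first 10%, 20%, 50%, and 100% of the trajectory. The function names and the toy per-frame values are assumptions of this example and do not come from the CAVER or TransportTools outputs.

```python
def tunnel_occurrence(bottlenecks, probe_radius=0.9):
    """Fraction of frames in which the tunnel is open. Each entry is the
    bottleneck radius in Å, or None if no tunnel of this cluster was found."""
    open_frames = sum(
        1 for r in bottlenecks if r is not None and r >= probe_radius
    )
    return open_frames / len(bottlenecks)


def occurrence_by_fraction(bottlenecks, fractions=(0.1, 0.2, 0.5, 1.0)):
    """Occurrence computed over the initial part of the trajectory."""
    result = {}
    for fraction in fractions:
        n_frames = max(1, round(len(bottlenecks) * fraction))
        result[fraction] = tunnel_occurrence(bottlenecks[:n_frames])
    return result


# Toy per-frame bottleneck radii of a rare tunnel (hypothetical values)
toy_bottlenecks = [None, None, 1.0, None, 0.95, 1.2, None, 1.1, 0.5, 1.3]
occurrence = occurrence_by_fraction(toy_bottlenecks)
```

In this toy series the tunnel is absent from the first 20% of frames and only appears later, which is the kind of pattern that makes short trajectories underestimate rare tunnels.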

Figure 3.


Occurrence of tunnels in LinB variants. A. Percentage of tunnels being open in the ensemble for LinB-Wt, LinB-Closed, and LinB-Open variants. Inset plots present the increased sampling of ST captured by GaMD in LinB-Wt. B. Representation of the LinB-Wt crystal structure (gray cartoon) and the tunnel network in the LinB family (colored spheres). The shown representative tunnels were selected from GaMD simulations of LinB-Wt to have their bottleneck radius and length matching the average properties of individual tunnel clusters observed in these simulations. C. Plot of the p3 tunnel from TT analysis in respective Wt and mutants with cMD and GaMD methods (top) and representation of the protein structure region where the p3 tunnel opens in Wt and LinB-Closed mutants, containing bulky residues TRP, PHE, and ILE; also, the region of p3 tunnel opening in LinB-Open, containing the mutation W140A+F143L+I211L, which leads to the widening of this region (bottom). The representatives of p3 tunnels were selected from GaMD simulations of LinB-Wt and LinB-Open, respectively, to have their bottleneck radius and length match the average properties of their tunnel clusters observed in these simulations.

The results obtained from GaMD simulations were compared with the tunnel network in three variants with already published data on de novo tunnel engineering.16 We found that the p3 tunnel opening follows a trend similar to that in previous studies; that is, the characteristic tunnel network known for each variant is reproduced in the GaMD trajectories. In LinB-Wt, the p1 tunnel branches serve as the primary conduits, which undergo changes in mutants due to the closure of p1 tunnels, resulting in a significant decrease in their occurrence. Additionally, we observed a significant increase in the presence of the p3 tunnel in the LinB-Open variant due to p3-opening mutations. Interestingly, the GaMD tunnels have a tendency for more frequent and broader tunnel openings (Figure 3C), particularly in the case of the p3 tunnel (as indicated by the high bottleneck radius shown in Figures S12–S17). Furthermore, we observed increased sampling of the ST in the wild-type protein, with its occurrence being comparable to the rare p3 tunnel in GaMD simulations (Figure 3A), which was not as pronounced in mutants. Moreover, although it is sampled by cMD, we noted its increased sampling, mostly in GaMD for LinB-Wt, whereas the other two variants do not show such prominent ST openings (Figure 3A). Due to the large structural perturbation in the helix caused by boosted GaMD simulation, the opening of a transient ST is enhanced. We monitored the RMSF of the protein throughout the simulation and found an increased fluctuation in the cap domain in two out of five GaMD simulation replicas, indicating that GaMD visited a broader conformational space.

Insights into the Functional Relevance of the Discovered Side Tunnel (ST)

In addition to tunnels, the ligand interaction with enzyme surface pockets or cryptic pockets has been considered important because pocket residues can influence ligand transport through tunnels.93 Interestingly, Raczyńska et al.93 published a study focused on identifying transient binding sites on the enzyme surface as potential sites for engineering enzyme activity. Their study, conducted on LinB-Wt, indicated the ability of the ST entrance pocket to bind 1-chlorohexane. Furthermore, they showed that the experimentally tested mutation of the ST entrance pocket residue A189F increased the enzyme activity by 21.4% for 1-chlorohexane and 26.2% for 1-bromocyclohexane,93 which highlights the importance of the ST pocket and the newly discovered ST path, which connects the active site or active site pocket to the ST pocket for LinB (Figure 4). To study the opening site of the ST at the enzyme surface and its connectivity with the active site, we used cryptic and allosteric pocket detection tools, such as FTMove, PASSer 2.0, and DeepSite. Our analysis revealed a cryptic pocket at the mouth of the ST as one of the major ligand-binding cryptic pockets, suggesting potential allosteric communication via the ST pocket. DeepSite predictions of cryptic pockets showed the ST pocket with the second-highest score after the active site p1 tunnel pocket (Figure S18). Similar results were obtained using the PASSer 2.0 allosteric site prediction tool (Figure S19) and FTMove, where the p1 tunnel pocket was ranked third, the active site pocket fourth, and the ST pocket second among all of the different cryptic pockets detected by the tool (Figure S20).

Figure 4.


Cryptic pockets detected in the LinB-Wt structure by FTMove. The active site and ST pockets are shown in dark gray and yellow, respectively. The ST pocket is located at the mouth of the ST, which connects these two pockets. The shown representative ST was selected from GaMD simulations of LinB-Wt to have its bottleneck radius and length match the average properties of its tunnel clusters observed in these simulations.

ST Can Transport Ligands Similarly to the P3 Tunnel

We observed that the ST is connected to a cryptic pocket, the functional relevance of which has been experimentally verified by mutagenesis.93 Hence, we were interested in probing to what extent the viability of the ST for transporting four relevant substrate and product molecules (Figure 5A) matches that of the auxiliary p3 and primary p1b tunnels. Explicit simulations of transport via gated tunnels of LinB variants were shown to be very hard to execute even for the primary p1b tunnel and a single product molecule.43 Performing rigorous simulations with four molecules and three tunnels, where the ST opening requires a rather pronounced protein conformational change, would therefore represent an extremely complex and computationally intensive task. Instead, we opted for estimating the relative transport propensity of these tunnels with CaverDock, which was successfully applied for predicting ligand unbinding rates,83,94 estimating effects of mutations on ligand transport,83,95,96 and approximating energy profiles in agreement with insights from sophisticated enhanced sampling simulations.96

Figure 5.


Energetic analysis of ligand transport through the transient ST. A. The structures of the four ligands studied: 1,2-dibromoethane (dbe), 2-bromoethanol (be), bromide ion (Br), and water (H2O), used for transport analysis (top). B. Plots showing the energy costs of successful transport events in all three variants, LinB-Wt, LinB-Open, and LinB-Closed. Blue represents transport events across the top-100 conformations of p1b tunnels, yellow represents transport events across the top-100 conformations of p3 tunnels, and pink shows transport events across the top-100 conformations of the ST (bottom).

Here, we performed these docking calculations for each of the four investigated molecules using ensembles of the top-100 tunnel conformations with the highest throughput, to compare the efficiency of the transient ST in transporting these molecules with that of the p3 and p1b tunnels. The top-100 tunnel conformations were obtained from the GaMD trajectories, and using CaverDock, we calculated the energy profile by assessing the binding energy of the ligand to the protein along the entire tunnel length. Next, we calculated the energy barriers that the ligands must overcome for these 100 considered transport events (Supporting Information File 2). This energetic evaluation demonstrated that the ST was able to transport all four ligands, and interestingly, the energetics of these transport events was comparable to the energy barriers sampled for the auxiliary tunnel p3 (Figure 5B). Our evaluation indicated the importance of the ST in conjunction with the known p3 tunnel. Similar observations were obtained for haloalkane dehalogenase DhaA, where an equivalent ST was also repeatedly identified in energy-unbiased high-throughput MD simulations.97 This tunnel showed capacity for transporting water molecules at levels comparable to that of p3.97 Additionally, the energy barrier for water (H2O) transport was found to be low (<5 kcal/mol) in all three tunnels, which shows that all the tunnels were effectively able to transport water through the enzyme. Furthermore, the best-performing tunnel was identified as p1b from the LinB-Open mutant, consistent with studies on de novo tunnel engineering.16
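The step from an energy profile to an energy barrier can be sketched as follows. This is an illustrative convention and not necessarily the definition used in this study: the barrier is taken here as the largest uphill climb encountered when the profile is traversed in a given direction, and the function name `energy_barrier` and the example profile values are hypothetical.

```python
def energy_barrier(profile):
    """Largest uphill climb (kcal/mol) met when traversing the profile in order.

    Assumed convention for illustration: the barrier is the maximum difference
    between an energy value and the lowest energy that precedes it.
    """
    lowest_so_far = profile[0]
    barrier = 0.0
    for energy in profile:
        lowest_so_far = min(lowest_so_far, energy)
        barrier = max(barrier, energy - lowest_so_far)
    return barrier

# Hypothetical binding energies (kcal/mol) along a tunnel, active site first.
profile = [-5.0, -4.0, -1.5, -3.0, -2.0, -4.5]
print(energy_barrier(profile))        # release direction: 3.5
print(energy_barrier(profile[::-1]))  # entry direction: 3.0
```

Because the result depends on the direction of travel, release and entry barriers for the same tunnel conformation generally differ, which is why the direction should be stated whenever barriers across tunnels are compared.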

ST Opening Mechanism is Different in LinB-Wt and Its Mutants

The importance and validity of the ST are confirmed by our results. Therefore, to understand the conformational changes associated with ST opening, we conducted a distance-based PCA in LinB-Wt and its mutants, focusing on the most dynamic region of the protein as confirmed through RMSF, as this approach was shown to be well suited for the investigation of functionally relevant conformational changes.98–100 In all cases, the first two principal components (PCs) explain at least 80% of the variance in the original data (Figure S21), suggesting that they were sufficient for further analyses. In LinB-Wt, we found that GaMD simulations sample two distinct conformational states, in contrast to cMD simulations (Figure 6), indicating that this side helix region provides the protein with the possibility to visit different conformational states (Figure S22). The ST was also detected in cMD simulations, although less frequently, which confirms that the opening is not forced by the biasing potential and highlights that the system requires broad sampling to capture it efficiently. Major conformational states in all variants were further investigated by HDBscan clustering of distance-based PCA data. We found two predominant states for LinB-Wt in GaMD (Figure 6), which were also partially observed for mutants but only sporadically (Figures S23–S25). Importantly, this analysis further highlights the improved sampling in GaMD trajectories compared to that of cMD.
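A distance-based PCA of this kind can be sketched in a few lines of NumPy. The example below is a minimal illustration under stated assumptions, not the protocol of this study: the function names, the three-atom toy system with one "closed" and one "open" geometry, and the noise level are all hypothetical, and the subsequent HDBscan clustering step is not shown.

```python
import numpy as np

def distance_features(coords):
    """Pairwise distances (upper triangle) for coordinates of shape (n_frames, n_atoms, 3)."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(coords.shape[1], k=1)
    return np.linalg.norm(coords[:, i, :] - coords[:, j, :], axis=2)

def pca(features, n_components=2):
    """Project centered features on the leading PCs; also return explained-variance ratios."""
    centered = features - features.mean(axis=0)
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2
    explained = variance / variance.sum()
    projections = centered @ components[:n_components].T
    return projections, explained[:n_components]

# Hypothetical toy data: 20 noisy frames of a closed state and 20 of an open state.
rng = np.random.default_rng(0)
closed = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [0.0, 5.0, 0.0]])
opened = closed.copy()
opened[2] = [0.0, 10.0, 0.0]  # the third atom moves away, mimicking a helix displacement
frames = np.concatenate([
    closed + rng.normal(scale=0.1, size=(20, 3, 3)),
    opened + rng.normal(scale=0.1, size=(20, 3, 3)),
])
projections, explained = pca(distance_features(frames))
print(explained.sum())  # fraction of variance captured by PC1 and PC2
```

Using internal distances rather than Cartesian coordinates makes the features independent of overall rotation and translation, so no prior superposition of frames is needed for this step.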

Figure 6.


Detection of distinct conformational states sampled by LinB variants. Distance-based PCA: PC1 and PC2 were plotted for all three variants of LinB and represented as Clusters 0 and 1 (all representative frames) from the LinB-Wt GaMD simulation, showing two different states of the protein. In the superimposed structures, red represents Cluster 0, and blue represents Cluster 1, showing the different conformational states captured by LinB-Wt.

We investigated in depth the mechanism of the rare ST opening in LinB-Wt and its mutants. In LinB-Wt, the prerequisite for the opening of the ST is the movement of the side helix away from the protein cap domain. Due to the high fluctuation of the side helix of the cap domain, the ST opens frequently in LinB-Wt (Figures 7A and S26), especially in GaMD-boosted simulations. In the mutants (LinB-Closed and LinB-Open), the mutation of the ST bottleneck residue (L177W) introduces a hydrogen bond with D147, and the breakage of this hydrogen bond promotes the opening of the ST. However, this process is notably more energetically unfavorable, resulting in the ST being open less frequently in the mutants than in the wild-type enzyme (Figures 7B, S27, and S28). This agrees with our earlier observation that LinB-Wt samples the ST more frequently than the mutants: in LinB-Wt, the lack of a hydrogen bond donor makes the movement of the side helix away from the protein easier, which contributes to the opening of the side tunnel, whereas in the mutants, the hydrogen bond makes the opening less favorable and, therefore, less frequent.
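How often such a hydrogen bond is present along a trajectory can be estimated with a simple distance criterion, as in the minimal sketch below. The function name `contact_occupancy`, the 3.5 Å donor–acceptor cutoff, and the four-frame example are assumptions made for illustration; a full hydrogen bond analysis would typically also apply an angle criterion, which is omitted here.

```python
import numpy as np

def contact_occupancy(donor_xyz, acceptor_xyz, cutoff=3.5):
    """Fraction of frames in which the donor-acceptor distance is within the cutoff.

    Both inputs have shape (n_frames, 3); the cutoff (in angstroms) is an assumed,
    commonly used value and not a parameter reported in this study.
    """
    donor_xyz = np.asarray(donor_xyz, dtype=float)
    acceptor_xyz = np.asarray(acceptor_xyz, dtype=float)
    distances = np.linalg.norm(donor_xyz - acceptor_xyz, axis=1)
    return float(np.mean(distances <= cutoff))

# Hypothetical positions: the bond is formed in the first two frames and broken afterwards.
donor = np.zeros((4, 3))
acceptor = np.array([[2.9, 0.0, 0.0], [3.1, 0.0, 0.0], [4.5, 0.0, 0.0], [6.0, 0.0, 0.0]])
occupancy = contact_occupancy(donor, acceptor)
print(occupancy)  # 0.5
```

Correlating the frames in which the bond is broken with the frames in which the ST is detected would be one way to relate bond breakage to tunnel opening.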

Figure 7.


Mechanism of side tunnel opening. A. Representation of the ST in the LinB-Wt protein showing the necessary movement of the side helix for the opening of the ST. B. Representation of the ST in mutants showing the necessary movement of the side helix accompanied by the breakage of the hydrogen bond between Trp176 and Asp146.

Discussion

In this study, we examined the potential of the GaMD enhanced sampling method for investigating tunnels and ligand transport pathways in LinB dehalogenases. Our findings demonstrate that GaMD not only identifies all known tunnels (p1a, p1b, p2a, p2b, and p2c) comparably to cMD but also enhances sampling of the enzyme conformational space and visits rare tunnels (the ST and the p3 tunnel). The improved sampling provided by GaMD enabled us to identify a previously unexplored ST and to further understand the mechanism of its opening in the three LinB variants. This opening mechanism was found to be clear and straightforward in the case of the LinB-Wt enzyme but more complex in mutants. Importantly, such a thorough investigation of the newly discovered ST pathway would not have been possible without the sufficient conformational exploration provided by GaMD. Critically, this method not only provided a better picture of conformational dynamics for the three LinB variants but also facilitated an exploration of the known tunnels that is consistent with broad computational and experimental data for all variants.

Our exploration revealed that in LinB-Wt, the movement of the side helix is the main contributing factor to the opening of the ST pathway. For mutants, the sampling of ST pathways is less pronounced because the L177W mutation results in a hydrogen bond between the introduced W177 and D147. This bond hinders the helix opening observed in the wild-type enzyme and indicates that the ST opening is mostly unfavorable in mutants. Thus, the ST is rarely observed in mutants in both standard and enhanced MD simulations. Furthermore, the functional importance of this ST pathway is supported by the exploration of druggable, cryptic, and allosteric pockets, all consistently pointing to the mouth of the ST pathway as the functional site. This is further corroborated by a recent experimental study demonstrating the ability of the ST pathway mouth to bind ligands.93 Additionally, migration analysis using CaverDock with GaMD tunnels confirmed the capacity of the transient ST to transport the substrate and product molecules efficiently at levels comparable to those of auxiliary tunnel p3, which supports the auxiliary role of the ST in LinB.

Regarding the p3 tunnel, we observed that GaMD provides better sampling compared with standard MD simulations. Furthermore, we noted the trend expected for LinB-Wt and mutant enzymes from previous literature data. The p3 tunnel is detected more frequently in the LinB-Open mutant due to the mutations W140A, F143L, and I211L, which replace bulky residues with smaller ones, aiming to provide more space for the opening of the p3 tunnel. Conversely, in LinB-Wt and the LinB-Closed mutant, the amino acids at the mouth of the p3 tunnel are not modified, resulting in a more restricted opening compared with the LinB-Open variant.

Conclusions

Our study demonstrates GaMD as a practically useful approach for investigating tunnels in proteins. By overcoming the sampling limitations of standard MD simulations, GaMD can more effectively explore the conformational dynamics of the protein under study while limiting computational costs. This opens up new possibilities for identifying tunnels by efficiently sampling rare events and overcoming unfavorable energy barriers. Therefore, as shown for the LinB enzyme in our study, this method is suitable for effectively exploring new pathways that can become a druggable site to target in addition to the conventional targeting of the active site itself.

Acknowledgments

The computations were performed at the Poznan Supercomputing and Networking Center.

Glossary

Abbreviations

MD: molecular dynamics
cMD: conventional MD
GaMD: Gaussian accelerated MD
RMSD: root mean square deviation
RMSF: root mean square fluctuation
TT: TransportTools
ST: side tunnel
EC: Enzyme Commission
ABF: adaptive biasing force
CV: collective variable
NMR: nuclear magnetic resonance
aMD: accelerated MD
PC1: principal component 1
PC2: principal component 2
cryo-EM: cryogenic electron microscopy
RAMD: random accelerated MD
LiGaMD: ligand Gaussian accelerated MD

Data Availability Statement

Underlying data are available on the Zenodo repository: (i) https://zenodo.org/doi/10.5281/zenodo.11092891, containing input, output, and analysis files; (ii) https://zenodo.org/doi/10.5281/zenodo.11093856, containing the parameter files and stripped MD trajectory files.

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.4c00966.

  • Testing parameters for GaMD simulations (Table S1); the region used for distance-based PCA calculation (Figure S1); RMSD and RMSF of mutant enzymes (Figures S2 and S3); Rg and SASA of LinB-Wt and its mutants (Figures S4–S6); RMSD of catalytic residues (Figures S7–S11); comparison between cMD and GaMD tunnel properties from TransportTools (Figures S12–S14); bottleneck radius distribution analysis (Figures S15–S17); the results of cryptic and allosteric pocket detection tools (Figures S18–S20); variance and cumulative variance of all fourteen principal components in cMD and GaMD (Figure S21); plots showing PCA comparison between cMD and GaMD (Figure S22); HDBscan cluster analysis of LinB-Wt and its mutants (Figures S23–S25); mechanism of side tunnel opening using helix movement in LinB-Wt (Figure S26); mechanism of side tunnel opening using helix movement and hydrogen bond analysis in the mutant enzymes (Figures S27 and S28) (PDF)

  • Upper bound energy profiles using CaverDock from individual 100 tunnels in all variants with all four ligands (PDF)

Author Contributions

N.M. performed all MD simulations, calculations of tunnels, cryptic pocket analysis, PCA, and migration analysis and drafted the manuscript. B.S. devised the tunnel reweighting, PCA, and clustering analysis protocol. J.B. and B.S. devised and supervised the project. N.M. and B.S. analyzed the data. J.B., N.M., and B.S. interpreted the results. The manuscript was written through the contributions of all authors. All authors have approved the final version of the manuscript.

This work was supported by the National Science Centre, Poland (Grant Number 2017/26/E/NZ1/00548).

The authors declare no competing financial interest.


References

  1. Monzon A. M.; Zea D. J.; Fornasari M. S.; Saldaño T. E.; Fernandez-Alberti S.; Tosatto S. C. E.; Parisi G. Conformational Diversity Analysis Reveals Three Functional Mechanisms in Proteins. PLoS Comput. Biol. 2017, 13, e1005398. 10.1371/journal.pcbi.1005398.
  2. Krone M.; Kozlíková B.; Lindow N.; Baaden M.; Baum D.; Parulek J.; Hege H. C.; Viola I. Visual Analysis of Biomolecular Cavities: State of the Art. Comput. Graph. Forum 2016, 35, 527–551. 10.1111/cgf.12928.
  3. Prokop Z.; Gora A.; Brezovsky J.; Chaloupkova R.; Stepankova V.; Damborsky J. Engineering of Protein Tunnels: Keyhole-Lock-Key Model for Catalysis by the Enzymes with Buried Active Sites; In Protein Engineering Handbook, Wiley-VCH: Weinheim, 2012; pp. 421–464.
  4. Petřek M.; Otyepka M.; Banáš P.; Košinová P.; Koča J.; Damborský J. CAVER: A New Tool to Explore Routes from Protein Clefts, Pockets and Cavities. BMC Bioinf. 2006, 7, 316. 10.1186/1471-2105-7-316.
  5. Pravda L.; Berka K.; Svobodová Vařeková R.; Sehnal D.; Banáš P.; Laskowski R. A.; Koča J.; Otyepka M. Anatomy of Enzyme Channels. BMC Bioinf. 2014, 15, 379. 10.1186/s12859-014-0379-x.
  6. Kingsley L. J.; Lill M. A. Substrate Tunnels in Enzymes: Structure-Function Relationships and Computational Methodology: Protein Tunnel Structure-Function Relationship. Proteins Struct. Funct. Bioinf. 2015, 83, 599–611. 10.1002/prot.24772.
  7. Brezovsky J.; Chovancova E.; Gora A.; Pavelka A.; Biedermannova L.; Damborsky J. Software Tools for Identification, Visualization and Analysis of Protein Tunnels and Channels. Biotechnol. Adv. 2013, 31, 38–49. 10.1016/j.biotechadv.2012.02.002.
  8. Wade R. C.; Winn P. J.; Schlichting I.; Sudarko. A Survey of Active Site Access Channels in Cytochromes P450. J. Inorg. Biochem. 2004, 98 (7), 1175–1182. 10.1016/j.jinorgbio.2004.02.007.
  9. Fishelovitch D.; Shaik S.; Wolfson H. J.; Nussinov R. Theoretical Characterization of Substrate Access/Exit Channels in the Human Cytochrome P450 3A4 Enzyme: Involvement of Phenylalanine Residues in the Gating Mechanism. J. Phys. Chem. B 2009, 113, 13018–13025. 10.1021/jp810386z.
  10. Urban P.; Lautier T.; Pompon D.; Truan G. Ligand Access Channels in Cytochrome P450 Enzymes: A Review. Int. J. Mol. Sci. 2018, 19, 1617. 10.3390/ijms19061617.
  11. Singh S.; Anand R. Tunnel Architectures in Enzyme Systems That Transport Gaseous Substrates. ACS Omega 2021, 6, 33274–33283. 10.1021/acsomega.1c05430.
  12. Cojocaru V.; Winn P. J.; Wade R. C. The Ins and Outs of Cytochrome P450s. Biochim. Biophys. Acta 2007, 1770, 390–401. 10.1016/j.bbagen.2006.07.005.
  13. Mitusińska K.; Wojsa P.; Bzówka M.; Raczyńska A.; Bagrowska W.; Samol A.; Kapica P.; Góra A. Structure-Function Relationship between Soluble Epoxide Hydrolases Structure and Their Tunnel Network. Comput. Struct. Biotechnol. J. 2022, 20, 193–205. 10.1016/j.csbj.2021.10.042.
  14. Gora A.; Brezovsky J.; Damborsky J. Gates of Enzymes. Chem. Rev. 2013, 113, 5871–5923. 10.1021/cr300384w.
  15. Marques S. M.; Daniel L.; Buryska T.; Prokop Z.; Brezovsky J.; Damborsky J. Enzyme Tunnels and Gates As Relevant Targets in Drug Design: TUNNELS AND GATES IN DRUG DESIGN. Med. Res. Rev. 2017, 37, 1095–1139. 10.1002/med.21430.
  16. Brezovsky J.; Babkova P.; Degtjarik O.; Fortova A.; Gora A.; Iermak I.; Rezacova P.; Dvorak P.; Smatanova I. K.; Prokop Z.; Chaloupkova R.; Damborsky J. Engineering a de Novo Transport Tunnel. ACS Catal. 2016, 6, 7597–7610. 10.1021/acscatal.6b02081.
  17. Marques S. M.; Brezovsky J.; Damborsky J. Understanding Enzymes: Function, Design, Engineering, and Analysis; Pan Stanford Publishing, 2016.
  18. Peplow M. Cryo-Electron Microscopy Reaches Resolution Milestone. ACS Cent. Sci. 2020, 6, 1274–1277. 10.1021/acscentsci.0c01048.
  19. Puthenveetil R.; Vinogradova O. Solution NMR: A Powerful Tool for Structural and Functional Studies of Membrane Proteins in Reconstituted Environments. J. Biol. Chem. 2019, 294, 15914–15931. 10.1074/jbc.REV119.009178.
  20. Yang Z.; Zeng X.; Zhao Y.; Chen R. AlphaFold2 and Its Applications in the Fields of Biology and Medicine. Signal Transduct. Target. Ther. 2023, 8, 115. 10.1038/s41392-023-01381-z.
  21. Chovancova E.; Pavelka A.; Benes P.; Strnad O.; Brezovsky J.; Kozlikova B.; Gora A.; Sustr V.; Klvana M.; Medek P.; Biedermannova L.; Sochor J.; Damborsky J. CAVER 3.0: A Tool for the Analysis of Transport Pathways in Dynamic Protein Structures. PLoS Comput. Biol. 2012, 8, e1002708. 10.1371/journal.pcbi.1002708.
  22. Berka K.; Sehnal D.; Bazgier V.; Pravda L.; Svobodova-Varekova R.; Otyepka M.; Koca J. Mole 2.5 - Tool for Detection and Analysis of Macromolecular Pores and Channels. Biophys. J. 2017, 112, 292a–293a. 10.1016/j.bpj.2016.11.1585.
  23. Yaffe E.; Fishelovitch D.; Wolfson H. J.; Halperin D.; Nussinov R. MolAxis: A Server for Identification of Channels in Macromolecules. Nucleic Acids Res. 2008, 36, W210–W215. 10.1093/nar/gkn223.
  24. Vassiliev S.; Comte P.; Mahboob A.; Bruce D. Tracking the Flow of Water through Photosystem II Using Molecular Dynamics and Streamline Tracing. Biochemistry 2010, 49, 1873–1881. 10.1021/bi901900s.
  25. Bidmon K.; Grottel S.; Bös F.; Pleiss J.; Ertl T. Visual Abstractions of Solvent Pathlines near Protein Cavities. Comput. Graph. Forum 2008, 27, 935–942. 10.1111/j.1467-8659.2008.01227.x.
  26. Vad V.; Byška J.; Jurcík A.; Viola I.; Gröller E.; Hauser H.; Marques S. M.; Damborský J.; Kozlíková B. Watergate: Visual Exploration of Water Trajectories in Protein Dynamics. VCBM 17: Eurographics Workshop on Visual Computing for Biology and Medicine; The Eurographics Association, 2017, 33–42, 10.2312/VCBM.2017123.
  27. Magdziarz T.; Mitusińska K.; Gołdowska S.; Płuciennik A.; Stolarczyk M.; Ługowska M.; Góra A. AQUA-DUCT: A Ligands Tracking Tool. Bioinformatics 2017, 33, 2045–2046. 10.1093/bioinformatics/btx125.
  28. Paramo T.; East A.; Garzón D.; Ulmschneider M. B.; Bond P. J. Efficient Characterization of Protein Cavities within Molecular Simulation Trajectories: Trj_cavity. J. Chem. Theory Comput. 2014, 10, 2151–2164. 10.1021/ct401098b.
  29. Henzler-Wildman K.; Kern D. Dynamic Personalities of Proteins. Nature 2007, 450, 964–972. 10.1038/nature06522.
  30. Harvey M. J.; Giupponi G.; Fabritiis G. D. ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time Scale. J. Chem. Theory Comput. 2009, 5, 1632–1639. 10.1021/ct9000685.
  31. Darve E.; Pohorille A. Calculating Free Energies Using Average Force. J. Chem. Phys. 2001, 115, 9169–9183. 10.1063/1.1410978.
  32. Torrie G. M.; Valleau J. P. Nonphysical Sampling Distributions in Monte Carlo Free-Energy Estimation: Umbrella Sampling. J. Comput. Phys. 1977, 23, 187–199. 10.1016/0021-9991(77)90121-8.
  33. Laio A.; Gervasio F. L. Metadynamics: A Method to Simulate Rare Events and Reconstruct the Free Energy in Biophysics, Chemistry and Material Science. Rep. Prog. Phys. 2008, 71, 126601. 10.1088/0034-4885/71/12/126601.
  34. Dickson A. Mapping the Ligand Binding Landscape. Biophys. J. 2018, 115, 1707–1719. 10.1016/j.bpj.2018.09.021.
  35. Dickson A.; Lotz S. D. Multiple Ligand Unbinding Pathways and Ligand-Induced Destabilization Revealed by WExplore. Biophys. J. 2017, 112, 620–629. 10.1016/j.bpj.2017.01.006.
  36. Lotz S. D.; Dickson A. Unbiased Molecular Dynamics of 11 min Timescale Drug Unbinding Reveals Transition State Stabilizing Interactions. J. Am. Chem. Soc. 2018, 140, 618–628. 10.1021/jacs.7b08572.
  37. Dickson A.; Lotz S. D. Ligand Release Pathways Obtained with WExplore: Residence Times and Mechanisms. J. Phys. Chem. B 2016, 120, 5377–5385. 10.1021/acs.jpcb.6b04012.
  38. D’Annessa I.; Raniolo S.; Limongelli V.; Di Marino D.; Colombo G. Ligand Binding, Unbinding, and Allosteric Effects: Deciphering Small-Molecule Modulation of HSP90. J. Chem. Theory Comput. 2019, 15, 6368–6381. 10.1021/acs.jctc.9b00319.
  39. Capelli R.; Carloni P.; Parrinello M. Exhaustive Search of Ligand Binding Pathways via Volume-Based Metadynamics. J. Phys. Chem. Lett. 2019, 10, 3495–3499. 10.1021/acs.jpclett.9b01183.
  40. Croney K. A.; McCarty J. Exploring Product Release from Yeast Cytosine Deaminase with Metadynamics. J. Phys. Chem. B 2024, 128, 3102–3112. 10.1021/acs.jpcb.3c07972.
  41. Nguyen H. L.; Thai N. Q.; Li M. S. Determination of Multidirectional Pathways for Ligand Release from the Receptor: A New Approach Based on Differential Evolution. J. Chem. Theory Comput. 2022, 18, 3860–3872. 10.1021/acs.jctc.1c01158.
  42. Ahmad K.; Rizzi A.; Capelli R.; Mandelli D.; Lyu W.; Carloni P. Enhanced-Sampling Simulations for the Estimation of Ligand Binding Kinetics: Current Status and Perspective. Front. Mol. Biosci. 2022, 9, 899805. 10.3389/fmolb.2022.899805.
  43. Biedermannová L.; Prokop Z.; Gora A.; Chovancová E.; Kovács M.; Damborský J.; Wade R. C. A Single Mutation in a Tunnel to the Active Site Changes the Mechanism and Kinetics of Product Release in Haloalkane Dehalogenase LinB. J. Biol. Chem. 2012, 287, 29062–29074. 10.1074/jbc.M112.377853.
  44. Kingsley L. J.; Lill M. A. Including Ligand-Induced Protein Flexibility into Protein Tunnel Prediction. J. Comput. Chem. 2014, 35, 1748–1756. 10.1002/jcc.23680.
  45. Paloncýová M.; Navrátilová V.; Berka K.; Laio A.; Otyepka M. Role of Enzyme Flexibility in Ligand Access and Egress to Active Site: Bias-Exchange Metadynamics Study of 1,3,7-Trimethyluric Acid in Cytochrome P450 3A4. J. Chem. Theory Comput. 2016, 12, 2101–2109. 10.1021/acs.jctc.6b00075.
  46. Fu H.; Bian H.; Shao X.; Cai W. Collective Variable-Based Enhanced Sampling: From Human Learning to Machine Learning. J. Phys. Chem. Lett. 2024, 15, 1774–1783. 10.1021/acs.jpclett.3c03542.
  47. Rydzewski J.; Nowak W. Ligand Diffusion in Proteins via Enhanced Sampling in Molecular Dynamics. Phys. Life Rev. 2017, 22–23, 58–74. 10.1016/j.plrev.2017.03.003.
  48. Sarkar D. K.; Surpeta B.; Brezovsky J. Incorporating Prior Knowledge in the Seeds of Adaptive Sampling Molecular Dynamics Simulations of Ligand Transport in Enzymes with Buried Active Sites. J. Chem. Theory Comput. 2024, 20, 5807–5819. 10.1021/acs.jctc.4c00452.
  49. Sohraby F.; Nunes-Alves A. Advances in Computational Methods for Ligand Binding Kinetics. Trends Biochem. Sci. 2023, 48, 437–449. 10.1016/j.tibs.2022.11.003.
  50. Lüdemann S. K.; Lounnas V.; Wade R. C. How Do Substrates Enter and Products Exit the Buried Active Site of Cytochrome P450cam? 1. Random Expulsion Molecular Dynamics Investigation of Ligand Access Channels and Mechanisms. J. Mol. Biol. 2000, 303, 797–811. 10.1006/jmbi.2000.4154.
  51. Kokh D. B.; Amaral M.; Bomke J.; Grädler U.; Musil D.; Buchstaller H.-P.; Dreyer M. K.; Frech M.; Lowinski M.; Vallee F.; Bianciotto M.; Rak A.; Wade R. C. Estimation of Drug-Target Residence Times by τ-Random Acceleration Molecular Dynamics Simulations. J. Chem. Theory Comput. 2018, 14, 3859–3869. 10.1021/acs.jctc.8b00230.
  52. Hamelberg D.; Mongan J.; McCammon J. A. Accelerated Molecular Dynamics: A Promising and Efficient Simulation Method for Biomolecules. J. Chem. Phys. 2004, 120, 11919–11929. 10.1063/1.1755656.
  53. Miao Y.; Bhattarai A.; Wang J. Ligand Gaussian Accelerated Molecular Dynamics (LiGaMD): Characterization of Ligand Binding Thermodynamics and Kinetics. J. Chem. Theory Comput. 2020, 16, 5526–5547. 10.1021/acs.jctc.0c00395.
  54. Wang J.; Miao Y. Ligand Gaussian Accelerated Molecular Dynamics 2 (LiGaMD2): Improved Calculations of Ligand Binding Thermodynamics and Kinetics with Closed Protein Pocket. J. Chem. Theory Comput. 2023, 19, 733–745. 10.1021/acs.jctc.2c01194.
  55. Muvva C.; Murugan N. A.; Kumar Choutipalli V. S.; Subramanian V. Unraveling the Unbinding Pathways of Products Formed in Catalytic Reactions Involved in SIRT1–3: A Random Acceleration Molecular Dynamics Simulation Study. J. Chem. Inf. Model. 2019, 59, 4100–4115. 10.1021/acs.jcim.9b00513.
  56. Klvana M.; Pavlova M.; Koudelakova T.; Chaloupkova R.; Dvorak P.; Prokop Z.; Stsiapanava A.; Kuty M.; Kuta-Smatanova I.; Dohnalek J.; Kulhanek P.; Wade R. C.; Damborsky J. Pathways and Mechanisms for Product Release in the Engineered Haloalkane Dehalogenases Explored Using Classical and Random Acceleration Molecular Dynamics Simulations. J. Mol. Biol. 2009, 392, 1339–1356. 10.1016/j.jmb.2009.06.076.
  57. Marques S. M.; Dunajova Z.; Prokop Z.; Chaloupkova R.; Brezovsky J.; Damborsky J. Catalytic Cycle of Haloalkane Dehalogenases Toward Unnatural Substrates Explored by Computational Modeling. J. Chem. Inf. Model. 2017, 57, 1970–1989. 10.1021/acs.jcim.7b00070.
  58. Mitusińska K.; Bzówka M.; Magdziarz T.; Góra A. Geometry-Based versus Small-Molecule Tracking Method for Tunnel Identification: Benefits and Pitfalls. J. Chem. Inf. Model. 2022, 62, 6803–6811. 10.1021/acs.jcim.2c00985.
  59. Kaushik S.; Marques S. M.; Khirsariya P.; Paruch K.; Libichova L.; Brezovsky J.; Prokop Z.; Chaloupkova R.; Damborsky J. Impact of the Access Tunnel Engineering on Catalysis Is Strictly Ligand specific. FEBS J. 2018, 285, 1456–1476. 10.1111/febs.14418.
  60. Miao Y.; Feher V. A.; McCammon J. A. Gaussian Accelerated Molecular Dynamics: Unconstrained Enhanced Sampling and Free Energy Calculation. J. Chem. Theory Comput. 2015, 11, 3584–3595. 10.1021/acs.jctc.5b00436.
  61. Huang Y. M. Multiscale Computational Study of Ligand Binding Pathways: Case of P38 MAP Kinase and Its Inhibitors. Biophys. J. 2021, 120, 3881–3892. 10.1016/j.bpj.2021.08.026.
  62. Okai M.; Ohtsuka J.; Imai L. F.; Mase T.; Moriuchi R.; Tsuda M.; Nagata K.; Nagata Y.; Tanokura M. Crystal Structure and Site-Directed Mutagenesis Analyses of Haloalkane Dehalogenase LinB from Sphingobium Sp. Strain MI1205. J. Bacteriol. 2013, 195, 2642–2651. 10.1128/JB.02020-12.
  63. Swanson P. E. Dehalogenases Applied to Industrial-Scale Biocatalysis. Curr. Opin. Biotechnol. 1999, 10, 365–369. 10.1016/S0958-1669(99)80066-4.
  64. Bidmanova S.; Chaloupkova R.; Damborsky J.; Prokop Z. Development of an Enzymatic Fiber-Optic Biosensor for Detection of Halogenated Hydrocarbons. Anal. Bioanal. Chem. 2010, 398, 1891–1898. 10.1007/s00216-010-4083-z.
  65. Liskova V.; Bednar D.; Prudnikova T.; Rezacova P.; Koudelakova T.; Sebestova E.; Smatanova I. K.; Brezovsky J.; Chaloupkova R.; Damborsky J. Balancing the Stability-Activity Trade-Off by Fine-Tuning Dehalogenase Access Tunnels. ChemCatChem 2015, 7, 648–659. 10.1002/cctc.201402792.
  66. Pavlová M.; Klvaňa M.; Jesenská A.; Prokop Z.; Konečná H.; Sato T.; Tsuda M.; Nagata Y.; Damborský J. The Identification of Catalytic Pentad in the Haloalkane Dehalogenase DhmA from Mycobacterium Avium N85: Reaction Mechanism and Molecular Evolution. J. Struct. Biol. 2007, 157, 384–392. 10.1016/j.jsb.2006.09.004.
  67. Gordon J. C.; Myers J. B.; Folta T.; Shoja V.; Heath L. S.; Onufriev A. H++: A Server for Estimating pKas and Adding Missing Hydrogens to Macromolecules. Nucleic Acids Res. 2005, 33, W368–W371. 10.1093/nar/gki464.
  68. Izadi S.; Anandakrishnan R.; Onufriev A. V. Building Water Models: A Different Approach. J. Phys. Chem. Lett. 2014, 5, 3863–3871. 10.1021/jz501780a.
  69. Mermelstein D. J.; Lin C.; Nelson G.; Kretsch R.; McCammon J. A.; Walker R. C. Fast and Flexible Gpu Accelerated Binding Free Energy Calculations within the Amber Molecular Dynamics Package. J. Comput. Chem. 2018, 39, 1354–1358. 10.1002/jcc.25187.
  70. Case D. A.; Ben-Shalom I.; Brozell S. R.; Cerutti D. S.; Cheatham T.; Cruzeiro V. W. D.; Darden T.; Duke R. E.; Ghoreishi D.; Gilson M. K. Amber 2018, ambermd.org, 2018.
  71. Maier J. A.; Martinez C.; Kasavajhala K.; Wickstrom L.; Hauser K. E.; Simmerling C. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from. J. Chem. Theory Comput. 2015, 11, 3696–3713. 10.1021/acs.jctc.5b00255. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Zwanzig R. Nonlinear Generalized Langevin Equations. J. Stat. Phys. 1973, 9, 215–220. 10.1007/BF01008729.
  73. Darden T.; York D.; Pedersen L. Particle Mesh Ewald: An N·log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089–10092. 10.1063/1.464397.
  74. Essmann U.; Perera L.; Berkowitz M. L.; Darden T.; Lee H.; Pedersen L. G. A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103, 8577–8593. 10.1063/1.470117.
  75. Hopkins C. W.; Le Grand S.; Walker R. C.; Roitberg A. E. Long-Time-Step Molecular Dynamics through Hydrogen Mass Repartitioning. J. Chem. Theory Comput. 2015, 11, 1864–1874. 10.1021/ct5010406.
  76. Roe D. R.; Cheatham T. E. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J. Chem. Theory Comput. 2013, 9, 3084–3095. 10.1021/ct400341p.
  77. Lobanov M. Y.; Bogatyreva N. S.; Galzitskaya O. V. Radius of Gyration as an Indicator of Protein Structure Compactness. Mol. Biol. 2008, 42, 623–628. 10.1134/S0026893308040195.
  78. Shrake A.; Rupley J. A. Environment and Exposure to Solvent of Protein Atoms. Lysozyme and Insulin. J. Mol. Biol. 1973, 79, 351–371. 10.1016/0022-2836(73)90011-9.
  79. Aurenhammer F. Voronoi Diagrams—a Survey of a Fundamental Geometric Data Structure. ACM Comput. Surv. 1991, 23, 345–405. 10.1145/116873.116880.
  80. Dijkstra E. W.; Apt K. R.; Hoare T. A Note on Two Problems in Connexion with Graphs. In Edsger Wybe Dijkstra; ACM: New York, NY, USA, 2022, 287–290. 10.1145/3544585.354460
  81. Pavelka A.; Sebestova E.; Kozlikova B.; Brezovsky J.; Sochor J.; Damborsky J. CAVER: Algorithms for Analyzing Dynamics of Tunnels in Macromolecules. IEEE/ACM Trans. Comput. Biol. Bioinf. 2015, 13 (3), 505–517. 10.1109/TCBB.2015.2459680.
  82. Brezovsky J.; Thirunavukarasu A. S.; Surpeta B.; Sequeiros-Borja C. E.; Mandal N.; Sarkar D. K.; Dongmo Foumthuim C. J.; Agrawal N. TransportTools: A Library for High-Throughput Analyses of Internal Voids in Biomolecules and Ligand Transport through Them. Bioinformatics 2022, 38, 1752–1753. 10.1093/bioinformatics/btab872.
  83. Vavra O.; Filipovic J.; Plhak J.; Bednar D.; Marques S. M.; Brezovsky J.; Stourac J.; Matyska L.; Damborsky J. CaverDock: A Molecular Docking-Based Tool to Analyse Ligand Transport through Protein Tunnels and Channels. Bioinformatics 2019, 35, 4986–4993. 10.1093/bioinformatics/btz386.
  84. Jiménez J.; Doerr S.; Martínez-Rosell G.; Rose A. S.; De Fabritiis G. DeepSite: Protein-Binding Site Predictor Using 3D-Convolutional Neural Networks. Bioinformatics 2017, 33, 3036–3042. 10.1093/bioinformatics/btx350.
  85. Xiao S.; Tian H.; Tao P. PASSer2.0: Accurate Prediction of Protein Allosteric Sites Through Automated Machine Learning. Front. Mol. Biosci. 2022, 9, 879251. 10.3389/fmolb.2022.879251.
  86. Egbert M.; Jones G.; Collins M. R.; Kozakov D.; Vajda S. FTMove: A Web Server for Detection and Analysis of Cryptic and Allosteric Binding Sites by Mapping Multiple Protein Structures. J. Mol. Biol. 2022, 434, 167587. 10.1016/j.jmb.2022.167587.
  87. Trott O.; Olson A. J. AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization, and Multithreading. J. Comput. Chem. 2010, 31, 455–461. 10.1002/jcc.21334.
  88. Filipovic J.; Vavra O.; Plhak J.; Bednar D.; Marques S. M.; Brezovsky J.; Matyska L.; Damborsky J. CaverDock: A Novel Method for the Fast Analysis of Ligand Transport. IEEE/ACM Trans. Comput. Biol. Bioinf. 2020, 17, 1625–1638. 10.1109/TCBB.2019.2907492.
  89. Morris G. M.; Huey R.; Lindstrom W.; Sanner M. F.; Belew R. K.; Goodsell D. S.; Olson A. J. AutoDock4 and AutoDockTools4: Automated Docking with Selective Receptor Flexibility. J. Comput. Chem. 2009, 30, 2785–2791. 10.1002/jcc.21256.
  90. Pakuła K.; Sequeiros-Borja C.; Biała-Leonhard W.; Pawela A.; Banasiak J.; Bailly A.; Radom M.; Geisler M.; Brezovsky J.; Jasiński M. Restriction of Access to the Central Cavity Is a Major Contributor to Substrate Selectivity in Plant ABCG Transporters. Cell. Mol. Life Sci. 2023, 80, 105. 10.1007/s00018-023-04751-6.
  91. Lever J.; Krzywinski M.; Altman N. Principal Component Analysis. Nat. Methods 2017, 14, 641–642. 10.1038/nmeth.4346.
  92. Pedregosa F.; Varoquaux G.; Gramfort A.; Michel V.; Thirion B.; Grisel O.; Blondel M.; Prettenhofer P.; Weiss R.; Dubourg V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  93. Raczyńska A.; Kapica P.; Papaj K.; Stańczak A.; Shyntum D.; Spychalska P.; Byczek-Wyrostek A.; Góra A. Transient Binding Sites at the Surface of Haloalkane Dehalogenase LinB as Locations for Fine-Tuning Enzymatic Activity. PLoS One 2023, 18, e0280776. 10.1371/journal.pone.0280776.
  94. Pinto G. P.; Vavra O.; Filipovic J.; Stourac J.; Bednar D.; Damborsky J. Fast Screening of Inhibitor Binding/Unbinding Using Novel Software Tool CaverDock. Front. Chem. 2019, 7, 709. 10.3389/fchem.2019.00709.
  95. Sequeiros-Borja C. E.; Surpeta B.; Brezovsky J. Recent Advances in User-Friendly Computational Tools to Engineer Protein Function. Brief. Bioinf. 2021, 22, bbaa150. 10.1093/bib/bbaa150.
  96. Marques S. M.; Bednar D.; Damborsky J. Computational Study of Protein-Ligand Unbinding for Enzyme Engineering. Front. Chem. 2019, 6, 650. 10.3389/fchem.2018.00650.
  97. Sequeiros-Borja C.; Surpeta B.; Thirunavukarasu A. S.; Dongmo Foumthuim C. J.; Marchlewski I.; Brezovsky J. Water Will Find Its Way: Transport through Narrow Tunnels in Hydrolases. J. Chem. Inf. Model. 2024, 64, 6014–6025. 10.1021/acs.jcim.4c00094.
  98. Tänzel V.; Jäger M.; Wolf S. Learning Protein-Ligand Unbinding Pathways via Single-Parameter Community Detection. J. Chem. Theory Comput. 2024, 20, 5058–5067. 10.1021/acs.jctc.4c00250.
  99. Ernst M.; Sittel F.; Stock G. Contact- and Distance-Based Principal Component Analysis of Protein Dynamics. J. Chem. Phys. 2015, 143, 244114. 10.1063/1.4938249.
  100. Busch M. R.; Drexler L.; Mahato D. R.; Hiefinger C.; Osuna S.; Sterner R. Retracing the Rapid Evolution of an Herbicide-Degrading Enzyme by Protein Engineering. ACS Catal. 2023, 13, 15558–15571. 10.1021/acscatal.3c04010.

Associated Data

Supplementary Materials

ci4c00966_si_001.pdf (9.1MB, pdf)
ci4c00966_si_002.pdf (6.4MB, pdf)

Data Availability Statement

Underlying data are available on the Zenodo repository: (i) https://zenodo.org/doi/10.5281/zenodo.11092891, containing input, output, and analysis files; (ii) https://zenodo.org/doi/10.5281/zenodo.11093856, containing the parameter files and stripped MD trajectory files.


Articles from Journal of Chemical Information and Modeling are provided here courtesy of American Chemical Society
