2024 Apr 5;128(15):3631–3642. doi: 10.1021/acs.jpcb.4c01271

PaCS-Toolkit: Optimized Software Utilities for Parallel Cascade Selection Molecular Dynamics (PaCS-MD) Simulations and Subsequent Analyses

Shinji Ikizawa, Tatsuki Hori, Tegar Nurwahyu Wijaya†, Hiroshi Kono, Zhen Bai, Tatsuhiro Kimizono, Wenbo Lu, Duy Phuoc Tran, Akio Kitao†,*
PMCID: PMC11033871  PMID: 38578072

Abstract

Parallel cascade selection molecular dynamics (PaCS-MD) is an enhanced conformational sampling method conducted as a “repetition of time leaps in parallel worlds”, comprising cycles of multiple molecular dynamics (MD) simulations performed in parallel and selection of the initial structures of MDs for the next cycle. We developed PaCS-Toolkit, an optimized software utility enabling the use of different MD software and trajectory analysis tools to facilitate the execution of the PaCS-MD simulation and analyze the obtained trajectories, including the preparation for the subsequent construction of the Markov state model. PaCS-Toolkit is coded with Python, is compatible with various computing environments, and allows for easy customization by editing the configuration file and specifying the MD software and analysis tools to be used. We present the software design of PaCS-Toolkit and demonstrate applications of PaCS-MD variations: original targeted PaCS-MD to peptide folding; rmsdPaCS-MD to protein domain motion; and dissociation PaCS-MD to ligand dissociation from adenosine A2A receptor.

1. Introduction

Parallel cascade selection molecular dynamics (PaCS-MD) is an enhanced conformational sampling method based on molecular dynamics (MD) simulation using only standard force fields, and consists of cycles of multiple MD simulations conducted in parallel using different conditions (replicas) and selections of the initial structures closest to a target for the next cycle.1–14 This procedure can be considered as a “repetition of time leaps in parallel worlds” until reaching a target, which enables complete parallel execution of all the MD simulations for each cycle or computation with distributed computing. PaCS-MD has been shown to be very efficient in sampling large protein conformational changes,1,2,6,8–10,13,15 peptide folding,1 and the dissociation and association of protein/ligand, protein/peptide, and protein/DNA complexes4,5,7,11,13,14,16 without applying any bias during each MD simulation.

Typical enhanced sampling simulations with additional bias may require the modification of MD software to add a new function to calculate the bias or to adjust bias parameters to avoid applying too much perturbation to the system. In contrast, PaCS-MD does not require software modifications or such parameter adjustments.4 The sampling efficiency along a certain direction in PaCS-MD is enhanced by introducing a quantity for the selection (“selection feature” hereafter). MD snapshots generated in each cycle are then rank ordered based on the selection feature, and the top nrep snapshots (nrep: the number of replicas simulated in each cycle) are employed as the initial structures for the next cycle (Figure 1). This selection protocol effectively harvests snapshots from the edge of distributions along the direction of the selection feature, which are rarely reached on the MD time scale. Therefore, the selection considerably increases the probabilities of relatively rare events occurring in each cycle, thus greatly enhancing conformational sampling along the selection feature after the accumulation of many cycles. The selection protocol generates a set of MD trajectories evolved in a cascaded manner (see Figure 4 of ref (1)).
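The rank-ordering step described above can be illustrated with a minimal Python sketch. The function name `select_top_snapshots`, the tuple layout, and the `larger_is_better` flag are hypothetical choices made for illustration and are not taken from PaCS-Toolkit; the sketch only assumes that each snapshot already carries one scalar value of the selection feature.

```python
from typing import List, Tuple

# (replica index, frame index, selection-feature value); hypothetical layout
Snapshot = Tuple[int, int, float]


def select_top_snapshots(
    snapshots: List[Snapshot], n_replica: int, larger_is_better: bool = True
) -> List[Snapshot]:
    """Rank snapshots by the selection feature and keep the top n_replica.

    larger_is_better=True mimics selections that push a quantity outward
    (e.g., a growing distance); False mimics selections that approach a
    target (e.g., a shrinking RMSD).
    """
    ranked = sorted(snapshots, key=lambda s: s[2], reverse=larger_is_better)
    return ranked[:n_replica]


# Toy data: 3 replicas with 2 frames each
toy = [(0, 0, 1.2), (0, 1, 1.5), (1, 0, 0.9), (1, 1, 2.1), (2, 0, 1.7), (2, 1, 1.1)]
for replica, frame, value in select_top_snapshots(toy, n_replica=3):
    print(f"replica {replica}, frame {frame}: feature = {value}")
```

Note that several selected snapshots may come from the same replica, which is what produces the branching of trajectories sketched in Figure 1.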

Figure 1.

Schematic illustration of how PaCS-MD trajectories evolve during the cycles, shown as projections onto the space spanned by the direction of the selection feature and other directions for the case of nrep = 3. The circled numbers represent cycle numbers and are marked with the positions of the initial structures of each cycle.

In addition, velocities are typically reinitialized at the beginning of each cycle based on their Maxwell–Boltzmann distribution, so the trajectories in the new cycle are significantly modulated and new routes are explored from the branch points compared to those in the previous cycle (Figure 1). Significant benefits of the reinitializing protocol were recognized in the late 1990s in the case of short (120 ps) MD simulations of a protein, in which considerable differences along dominant principal components are induced only by the initial velocity variations.17 The efficiency of the reinitializing protocol in PaCS-MD was also revisited and reconfirmed recently.18 The ratio of acceleration, estimated as the experimentally observed time scale of the event divided by the simulation time of PaCS-MD, can be 10⁸ or higher. For example, the actual time scale of dissociation of a protein/peptide complex deduced from dissociation rate constants is on the order of seconds, whereas the simulation time of PaCS-MD is nanoseconds to tens of nanoseconds.5,7
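To make the velocity reinitialization concrete, the following sketch draws Cartesian velocity components from a Maxwell–Boltzmann distribution, i.e., each component is Gaussian with standard deviation sqrt(kB·T/m). In practice the MD software performs this step itself; the function name and the unit choice here (masses in g/mol, velocities in nm/ps, kB in kJ/(mol·K)) are assumptions made for illustration.

```python
import math
import random

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def maxwell_boltzmann_velocities(masses_amu, temperature_k, seed):
    """Draw one (vx, vy, vz) tuple per atom, in nm/ps.

    With masses in g/mol and KB in kJ/(mol K), sqrt(KB*T/m) is already
    in nm/ps, because 1 kJ/g equals 1 (nm/ps)^2.
    """
    rng = random.Random(seed)
    velocities = []
    for mass in masses_amu:
        sigma = math.sqrt(KB * temperature_k / mass)
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities


# Different seeds give different starting velocities for the same structure,
# which is what lets replicas started from one snapshot diverge.
carbon_atoms = [12.011] * 4
print(maxwell_boltzmann_velocities(carbon_atoms, 300.0, seed=1)[0])
print(maxwell_boltzmann_velocities(carbon_atoms, 300.0, seed=2)[0])
```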

The choice of the selection feature is essential for enhancing sampling efficiency in PaCS-MD and a variety of selection protocols have been proposed.1–10,13,14 This is relatively straightforward when the target directions along which conformational sampling will be enhanced are identified in advance. The original version of PaCS-MD, sometimes called targeted PaCS-MD or t-PaCS-MD, selects the MD snapshots closest to the target structure in the root-mean-square deviation (RMSD) of the atoms. This approach can efficiently generate pathways for protein conformational transitions and for peptide folding.1 In contrast to t-PaCS-MD, rmsdPaCS-MD moves the system the farthest in RMSD from the initial structure.13 The distance between two regions in a protein, such as an interdomain distance6 or interlid distance,15 is used as a selection feature to enhance the open-close motions of proteins. The intercenter-of-mass distance between two molecules (dcom)4,5,7,11,19 is also used as a selection feature to enhance intermolecular dynamics. Dissociation PaCS-MD (dPaCS-MD), in which the snapshots with the longest dcom are selected, can generate dissociation pathways from complexes, e.g., protein/ligand,4,11 protein/peptide,5,19 and protein/DNA16 complexes within a simulation time of a few to tens of nanoseconds. Association and dissociation PaCS-MD (a/dPaCS-MD) consists of switching between the cycles of association PaCS-MD (aPaCS-MD) conducted by selecting the closest snapshots in intermolecular dcom and the cycles of dPaCS-MD. This procedure generated many encounter complexes between MDM2 protein and an intrinsically disordered region of p53 protein. 
Subsequent additional relaxation MD simulations started from many a/dPaCS-MD-generated encounter complexes and Markov state model (MSM) analysis provided the global free energy minimum structure as the conformation closest and very similar to the crystal complex structure among the sampled conformation.7 LB-PaCS-MD is simulated in a ligand-concentrated environment and selects the closest dcom snapshots for frequent sampling of binding pathways.14 PaCS-Fit expands the concept of PaCS-MD to the use of experimental data, using better fitting to low-resolution structure data as the selection feature and constructing structural models based on small-angle X-ray scattering and cryo-electron microscopy data.20
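The selection feature dcom used by dPaCS-MD and aPaCS-MD is simply the distance between the mass-weighted centers of two atom groups. A minimal sketch is given below; the function names are hypothetical, and the sketch deliberately ignores periodic boundary conditions, which a real trajectory analysis tool would need to handle.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms (coords as (x, y, z))."""
    total_mass = sum(masses)
    return tuple(
        sum(m * xyz[axis] for xyz, m in zip(coords, masses)) / total_mass
        for axis in range(3)
    )


def d_com(coords1, masses1, coords2, masses2):
    """Distance between the centers of mass of two groups (no PBC handling)."""
    com1 = center_of_mass(coords1, masses1)
    com2 = center_of_mass(coords2, masses2)
    return math.dist(com1, com2)


# Toy example in nm: a two-atom group and a one-atom group
group1 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
group2 = [(1.0, 3.0, 4.0)]
print(d_com(group1, [1.0, 1.0], group2, [12.0]))
```

In dPaCS-MD the snapshots with the largest value of this quantity would be kept, and in aPaCS-MD those with the smallest.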

Variations of PaCS-MD enhance sampling, even when specific target directions are unidentified. The first version of this approach is nontargeted PaCS-MD (nt-PaCS-MD),2 in which structures significantly deviating from an average are selected based on Gram–Schmidt orthogonalization of the distribution of the sampled snapshots. Edge expansion PaCS-MD (eePaCS-MD)8,9 selects the edges and vertices of the already sampled conformational space in a multidimensional principal component subspace by solving the “convex hull problem”.21 This approach, when conducted in combination with accelerated MD,22 is called eePaCS-aMD and further improves the efficiency of sampling. Anomaly detection PaCS-MD (ad-PaCS-MD)10 employs an anomaly detection generative adversarial network (anoGAN)23 for the selection process. Independent nontargeted PaCS-MD (Ino-PaCS-MD) is an extension of nt-PaCS-MD, wherein multiple nt-PaCS-MD runs are started from different initial configurations.12 Tree search MD (TS-MD)24 employs a reinforcement learning algorithm, called upper confidence bounds applied to trees (UCT),25 for selection. Harada and co-workers developed many simulation methods comprising cycles of parallel MDs and selections of the initial structures, and proposed them with different names, e.g., fluctuation flooding method (FFM),26 outlier flooding (OFLOOD),27–29 taboo search (TBSA),30,31 structural dissimilarity sampling (SDS),32 extended SDS,33 and self-avoiding conformational sampling (SACS).34
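The idea behind the edge-expansion selection can be illustrated in two dimensions: among all sampled points, only those on the convex hull are candidates for restarting. The sketch below uses Andrew's monotone chain algorithm rather than the Quickhull algorithm cited above, and it works in 2D only, whereas eePaCS-MD operates in a multidimensional principal component subspace; it is an illustration of the concept, not the published implementation.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counterclockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_2d(points):
    """Return hull vertices in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop points that would make a clockwise or collinear turn
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


# Toy projection onto two principal components: interior points are ignored
sampled = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5), (0.2, 0.8)]
print(convex_hull_2d(sampled))
```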

PaCS-MD shares similar features with weighted ensemble (WE)35–41 and forward flux sampling.42,43 PaCS-MD spreads trajectories using the aforementioned cascading procedure, whereas WE is conducted based on the “splitting and merging” of trajectories.38–41 Due to differences between these approaches, trajectory weights in WE can be obtained with the established procedures, whereas PaCS-MD typically requires different postprocessing methods. Early PaCS-MD development used the generated trajectories as starting structures for subsequent sampling simulations, such as umbrella sampling as shown in previous examples,1,4 although recently the Markov state model (MSM)44–46 is more frequently used for further analysis. The velocity reinitialization protocol introduces disconnectivity among a set of PaCS-MD-generated trajectories in phase space, but trajectories are mutually connected in conformational space due to the branching of the trajectories. MSM that assumes a Markov process is thus a suitable analysis method for PaCS-MD-generated trajectories. The PaCS-MD/MSM combination is widely used to calculate the free energy landscape of conformational changes,6,8,9,13,15 standard binding free energy (ΔG°),4,5,7,11,13,16,19 the association (kon) and dissociation rate constants (koff) of protein complexes,5,7 and flux along conformational transition pathways.7,15 If sampling is insufficient to construct an MSM, additional MD simulations can be added afterward.11,19 In the MSM step, the sampled snapshots are grouped into discrete microstates based on certain features (termed MSM features hereafter), which are typically defined as collective coordinates that well characterize matters of interest. PaCS-MD accelerates dynamics along the selection feature but less so dynamics along other directions. Notably, the MSM features can be different from the selection feature and can be decided after PaCS-MD. 
One trial of PaCS-MD samples the conformational space along a relatively narrow pathway.4,5 Therefore, multiple trials of PaCS-MD should be performed to cover a broader conformational space. This process is also important to obtain more statistics, especially when users plan to use MSM for post PaCS-MD analysis. MD simulations in PaCS-MD are typically conducted without bias, but a set of generated MD trajectories might contain artifacts introduced by the selection. MSM analysis likely reduces the effects of potential biases on the set of PaCS-MD trajectories.
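The way an MSM combines many short, mutually disconnected trajectories can be sketched as follows: transitions separated by the lag time are counted within each trajectory only, and the pooled counts are row-normalized. This is a deliberately naive estimator written for illustration; dedicated MSM packages use more careful (e.g., reversible) estimators, and the function names below are hypothetical.

```python
def count_transitions(dtrajs, n_states, lag):
    """Count i -> j transitions at the given lag (in frames).

    Counting is done inside each discretized trajectory, so no artificial
    transition is created between the end of one short MD run and the
    start of another.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in dtrajs:
        for t in range(len(traj) - lag):
            counts[traj[t]][traj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize a count matrix; rows without counts stay all zero."""
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Two short discretized trajectories over three microstates
dtrajs = [[0, 0, 1, 1], [1, 2, 2]]
for row in transition_matrix(count_transitions(dtrajs, n_states=3, lag=1)):
    print(row)
```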

Since PaCS-MD is conducted as a combination of MD simulations using only standard force fields without extra bias, it can be performed without the need to modify MD software. However, PaCS-MD requires scripts for executing multiple MD simulations with different initial conditions, calculating the selection feature from the obtained trajectories, rank ordering of the snapshots based on the selection feature, and preparing the initial structures for the next cycle. We developed PaCS-Toolkit as an optimized package that facilitates PaCS-MD simulation and enables the use of different MD software and trajectory analysis tools. PaCS-Toolkit also aids in the analysis of the obtained trajectories including preparation for subsequent MSM analysis. PaCS-Toolkit includes tools for conducting a variety of PaCS-MD simulations. In this paper, we present the software design of PaCS-Toolkit and examples of its applications using the original PaCS-MD (t-PaCS-MD),1 dPaCS-MD,4,5,11 and rmsdPaCS-MD13 in combination with MSM.

2. Methods

2.1. Features of PaCS-Toolkit

PaCS-Toolkit is written in Python and is compatible with Python 3.7 or later versions. PaCS-Toolkit requires standard Python libraries such as NumPy47 and SciPy48 and libraries for parallel processing. The user’s choice of MD software must be installed in advance. The current version, PaCS-Toolkit 1.0, can employ GROMACS,49,50 AMBER,51 and NAMD.52 PaCS-Toolkit is compatible with various computing environments, including supercomputers with a message passing interface (MPI), servers with graphics processing units (GPUs), and personal computers such as laptops. PaCS-Toolkit heavily incorporates parallelization using MPI, GPU, and the multiprocessing package of Python, enabling optimization of the computation time depending on the available computational environments. PaCS-Toolkit allows easy customization by editing the configuration file and specifying the MD software to be used. PaCS-Toolkit maintains flexibility so that new features, libraries, and software can be added by introducing the corresponding Python “classes”.

The core of the PaCS-Toolkit for running PaCS-MD is composed of simulator, analyzer, and exporter. Simulator performs MD simulations via user-specified MD software. Analyzer analyzes MD trajectory files generated by simulator, calculates the selection feature, and ranks the snapshots. Exporter generates initial structures for the next cycle based on the rankings.
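The division of labor among simulator, analyzer, and exporter can be illustrated with a toy cycle loop in which a one-dimensional random walk stands in for MD. All class and function names below are hypothetical and do not reproduce the PaCS-Toolkit source; the sketch only mirrors the three roles described above.

```python
import random


class ToySimulator:
    """Stand-in for an MD engine: a 1D random walk from each initial value."""

    def __init__(self, n_steps, seed):
        self.n_steps = n_steps
        self.rng = random.Random(seed)

    def run(self, initial_positions):
        trajectories = []
        for x in initial_positions:
            traj = []
            for _ in range(self.n_steps):
                x += self.rng.gauss(0.0, 0.1)
                traj.append(x)
            trajectories.append(traj)
        return trajectories


class ToyAnalyzer:
    """Computes the selection feature (here, x itself) and ranks snapshots."""

    def rank(self, trajectories):
        scored = [
            (x, replica, frame)
            for replica, traj in enumerate(trajectories)
            for frame, x in enumerate(traj)
        ]
        return sorted(scored, reverse=True)  # largest feature first


class ToyExporter:
    """Turns the top-ranked snapshots into initial structures."""

    def __init__(self, n_replica):
        self.n_replica = n_replica

    def export(self, ranked):
        return [x for x, _, _ in ranked[: self.n_replica]]


def run_pacs(n_replica, max_cycle, threshold, seed=0):
    """Run cycles until the best feature reaches threshold or max_cycle."""
    simulator = ToySimulator(n_steps=20, seed=seed)
    analyzer, exporter = ToyAnalyzer(), ToyExporter(n_replica)
    initial = [0.0] * n_replica
    best_per_cycle = []
    for _ in range(max_cycle):
        ranked = analyzer.rank(simulator.run(initial))
        best_per_cycle.append(ranked[0][0])
        if ranked[0][0] >= threshold:
            break
        initial = exporter.export(ranked)
    return best_per_cycle


print(run_pacs(n_replica=5, max_cycle=10, threshold=3.0))
```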

2.2. How to Use PaCS-Toolkit

Users can conduct PaCS-MD by preparing a single configuration file (input.toml) and necessary standard MD input files and by specifying options and I/O files. PaCS-Toolkit is executed by entering the command, “pacs [function] [parameters]”. Available functions are listed in Table 1. For example, PaCS-MD is performed by “pacs mdrun -t 1 -f input.toml”, wherein “mdrun” identifies the function to execute PaCS-MD, and “-t 1” indicates the number of PaCS-MD trials. The specified number is used as a part of the directory name, and “-f input.toml” specifies the configuration file.

Table 1. Functions of the PaCS-Toolkit.

| type of function | function | purpose |
|---|---|---|
| execution of PaCS-MD | mdrun | execute PaCS-MD |
| post-PaCS-MD tools | genrepresent | generate representative conformational pathways along the PaCS-MD cycles |
| | fit | regenerate MD trajectories after best-fitting selected regions/molecules to a reference |
| | gencom | generate the center of mass (COM) trajectories of a molecule in PDB format (.pdb) |
| | rmmol | reduce the size of MD trajectories by selecting necessary molecules and removing the rest |
| | rmfile | remove files unnecessary for PaCS-MD and analysis |
| MSM tool | genfeature | calculate the MSM features for MSM analysis and output them in NumPy format (.npy) |

The content of input.toml controls the setting, I/O files, and use of modules with “keywords”. An example of the content of input.toml is shown in Figure 2. The choice of GROMACS,49,50 AMBER,51 or NAMD52 is indicated by the keyword, simulator = “gromacs”, “amber”, or “namd” in input.toml, respectively. The type of PaCS-MD to be conducted is determined by the keyword type, as shown in Table 2. User-defined selection features can be used by specifying type = “template” and by modifying the file template.py, in which three functions to calculate the selection feature must be described by the users. To analyze the MD trajectories, users should select from GROMACS,49 Cpptraj (the main program for processing trajectory files in AMBER),53 and MDtraj (Python libraries for MD trajectories).54 The choice is indicated by the keyword analyzer = “gmx”, “cpptraj”, or “mdtraj”, respectively.

Figure 2.

Example of the configuration file, input.toml for dPaCS-MD conducted with GROMACS. Major keywords explained in the text are shown in blue.

Table 2. Available PaCS-MD Methods Are Specified by the Keyword type.

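| type of PaCS-MD | value specified by type |
|---|---|
| t-PaCS-MD (ref 1) | target |
| rmsdPaCS-MD (ref 13) | rmsd |
| dPaCS-MD (refs 4, 5, 11) | dissociation |
| aPaCS-MD (ref 7) | association |
| a/dPaCS-MD (ref 7) | a_d |
| eePaCS-MD (refs 8, 9) | ee |
| user-defined feature | template |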
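The allowed values of the keywords simulator, analyzer, and type listed in the text and in Table 2 can be checked before a run with a few lines of Python. The sketch below is not PaCS-Toolkit's own validation code; it assumes, purely for illustration, that the configuration has already been loaded into a flat dictionary, and the function name `check_config` is hypothetical.

```python
# Allowed values as stated in the text and in Table 2
ALLOWED = {
    "simulator": {"gromacs", "amber", "namd"},
    "analyzer": {"gmx", "cpptraj", "mdtraj"},
    "type": {"target", "rmsd", "dissociation", "association", "a_d", "ee", "template"},
}


def check_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key not in config:
            problems.append(f"missing keyword: {key}")
        elif config[key] not in allowed:
            problems.append(f"{key} = {config[key]!r} is not one of {sorted(allowed)}")
    return problems


# Hypothetical flat view of a dPaCS-MD setup run with GROMACS
example = {"simulator": "gromacs", "analyzer": "gmx", "type": "dissociation"}
print(check_config(example))
```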

2.3. Typical Settings for the PaCS-MD

Important setting parameters for PaCS-MD include the number of replicas (nrep), the length of each MD simulation cycle (tMD), and the number of trials (ntrial). PaCS-MD execution should be stopped when the number of cycles (ncyc) reaches a certain limit. In the configuration file, nrep is specified by the keyword n_replica. In principle, the probability of the occurrence of stochastic events is proportional to nrep. The sampling efficiency therefore increases as a greater number of replicas is used, reducing the ncyc required to reach a target.4 Typical nrep values are 10–100. Users planning to construct an MSM from a single trial of PaCS-MD that generates samples along a narrow conformational transition pathway should use an nrep of at least 30, and preferably 50. When dPaCS-MD was conducted to sample dissociation pathways of a protein/peptide complex19 and protein/ligand complexes11 with nrep = 10, 10–30 MD simulations per PaCS-MD cycle were additionally performed for some of the trials so that a one-dimensional MSM (1D-MSM) per pathway was constructed with sufficient statistics. These additional MD simulations were started from the initial structures of PaCS-MD cycles but with different initial velocities until sufficient samples were obtained to construct the MSM. Since the use of a larger nrep reduces the ncyc required to reach a target, PaCS-MD with nrep = 30–50 will likely reduce the total computational cost compared to conducting PaCS-MD with nrep = 10 and additional MD simulations later.

The length of MD simulation per cycle (tMD) is the MD time step Δt multiplied by the number of MD iterations and should be specified inside the MD configuration file assigned by the keyword mdconf. One key factor in determining tMD is the time scale of dynamics along the selection feature, so tMD should be sufficiently long to observe fluctuation of the selection feature. Another key factor is the effect of the initialization protocol, which modulates the system only once per cycle. A shorter tMD introduces a greater chance of modulation caused by initialization per fixed MD cost. A third key factor should be considered if the generated data are analyzed later by MSM to consider transitions between discrete microstates separated by a lag time τ.55–58 If tMD is too short, it may be difficult to construct an MSM. For adequate statistics, τ should be the smaller of tMD/2 and the maximum lag time before the state involved in the slowest process becomes disconnected.59 Thus, tMD should be equal to or greater than 2τ, but the choice of τ also depends on the level of coarse-graining and discretization. The optimal value of tMD should be determined based on the balance between these key factors. The choice of tMD is strongly dependent on the dynamics of interest and can vary from 0.1 ns to tens of nanoseconds or longer. In dPaCS-MD/MSM and a/dPaCS-MD/MSM, tMD = 0.1 ns is frequently used, which is sufficient for constructing an MSM to calculate the free energy landscape and flux along the pathways and to reproduce experimentally determined ΔG°, kon, and koff values.4,5,7,11,16,19
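The two arithmetic relations in this paragraph, tMD = Δt × (number of MD iterations) and tMD ≥ 2τ, can be checked with a short calculation. The helper names below are hypothetical; the numbers are the typical settings quoted in the text (tMD = 0.1 ns with a 1 or 2 fs time step).

```python
def md_steps(t_md_ps, dt_fs):
    """Number of MD iterations needed for one cycle of length t_md_ps."""
    return round(t_md_ps * 1000.0 / dt_fs)


def lag_time_is_compatible(t_md_ps, lag_ps):
    """Check the condition tMD >= 2 * tau stated in the text."""
    return t_md_ps >= 2.0 * lag_ps


t_md_ps = 100.0  # tMD = 0.1 ns
print(md_steps(t_md_ps, dt_fs=1.0))   # iterations with a 1 fs time step
print(md_steps(t_md_ps, dt_fs=2.0))   # iterations with a 2 fs time step
print(lag_time_is_compatible(t_md_ps, lag_ps=50.0))
```

With tMD = 0.1 ns, the largest lag time allowed by this condition is 50 ps.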

The maximum limit of ncyc is specified by the keyword max_cycle. The ncyc required to achieve the expected sampling strongly depends on the targets and typically varies from several to hundreds of cycles. The RMSD target value from a reference or dcom to be reached is specified with the keyword threshold (unit: nm). PaCS-MD stops when either ncyc reaches max_cycle or the condition indicated by threshold is satisfied, except for eePaCS-MD and a/dPaCS-MD, in which no threshold to interrupt the execution is specified. a/dPaCS-MD requires several different parameters.7 When dcom reaches the target value specified by the keyword d_threshold during the execution of dPaCS-MD, it switches to aPaCS-MD, and then when association movements stall (i.e., stop progressing) for several cycles, the moving direction switches back to dPaCS-MD. The switching condition is controlled by two keywords: bound_threshold defines the number of cycles during which association movements stall before switching to dPaCS-MD; and frame_sel sets the number of first frames considered to judge the occurrence of stalling.
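The stopping rule described here can be written as a small decision function. The sketch below is an interpretation, not the PaCS-Toolkit implementation: it assumes that t-PaCS-MD and aPaCS-MD stop once the feature falls to the threshold or below, that rmsdPaCS-MD and dPaCS-MD stop once it rises to the threshold or above, and that eePaCS-MD and a/dPaCS-MD stop only at max_cycle. The function name and the inclusive comparisons are assumptions.

```python
NO_THRESHOLD_TYPES = {"ee", "a_d"}           # stop only at max_cycle
DECREASING_TYPES = {"target", "association"}  # feature should become small
INCREASING_TYPES = {"rmsd", "dissociation"}   # feature should become large


def should_stop(pacs_type, n_cycle, max_cycle, best_feature, threshold):
    """Return True when a PaCS-MD trial should end after the current cycle.

    best_feature is the best selection-feature value of the cycle (in nm).
    """
    if n_cycle >= max_cycle:
        return True
    if pacs_type in NO_THRESHOLD_TYPES:
        return False
    if pacs_type in DECREASING_TYPES:
        return best_feature <= threshold
    if pacs_type in INCREASING_TYPES:
        return best_feature >= threshold
    raise ValueError(f"unknown PaCS-MD type: {pacs_type}")


# t-PaCS-MD with threshold = 0.05 nm, as in the chignolin example
print(should_stop("target", n_cycle=12, max_cycle=30, best_feature=0.04, threshold=0.05))
# dPaCS-MD with a very large threshold runs until max_cycle
print(should_stop("dissociation", n_cycle=12, max_cycle=100, best_feature=6.5, threshold=100.0))
```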

The residues and atoms involved in the RMSD and dcom calculations are specified by the keywords selection1 and selection2 (mandatory) and selection3 and selection4 (optional). In aPaCS-MD, dPaCS-MD, and a/dPaCS-MD, selection1 and selection2 specify the groups of residues and/or atoms (referred to as group1 and group2 hereafter) for which the centers of mass are calculated, and dcom is determined between the two groups. To dissociate or associate two molecules, group1 and group2 should be selected from different molecules. The selection of two domains from the same protein allows interdomain motion to be simulated. In t-PaCS-MD and rmsdPaCS-MD, selection1 indicates the residues and/or atoms of group1 to be best-fitted, and selection2 defines the group2 whose coordinates are translated and rotated according to the best-fit performed with group1 and for which RMSD is calculated. Typically, group1 and group2 can be identical in t-PaCS-MD. The coordinate file of the reference structure for the RMSD calculation is indicated by the keyword reference. The keywords selection3 and selection4 indicate the reference coordinates for group1 and group2, respectively. The specification of selection3 and selection4 is required only when the content of the reference file differs from the information in PaCS-MD. For example, if the original coordinate files of the Protein Data Bank are used as the reference, the residue numbering and contained atoms can be different from those in the MD software. When these two keywords are unspecified, they are automatically set to selection1 and selection2, respectively. The format used to select residues/atoms depends on the choice of analyzer because the selection syntax of the chosen analyzer (gmx, cpptraj, or mdtraj) is used.

The number of trials (ntrial) required for adequate sampling depends on the purpose of the simulation. ntrial should be determined so that the conformational space is sufficiently sampled to calculate the target quantities. The minimum ntrial necessary to calculate the binding free energies of complexes with dPaCS-MD/MSM is around five4,11,19 because the free energy difference between a well-defined bound state and sufficiently distant unbound states should in principle be path independent and can be calculated without extensive sampling of many different pathways. However, at least 10 trials are required to sample significant pathways covering different directions when path-dependent dissociation mechanisms are investigated16 or experimentally measured kon and koff are to be reproduced.5 As mentioned previously, the trial number is specified by the “-t” option when PaCS-MD is executed with “pacs mdrun”.

2.4. Applications

Three different types of simulations conducted with the PaCS-Toolkit are demonstrated in this paper. All the MD and PaCS-MD simulations were performed by using GROMACS49,50 with a Nosé–Hoover thermostat60,61 and Parrinello–Rahman barostat62 employed to achieve isothermal and isobaric conditions, respectively. GROMACS was used as the analyzer. The “genfeature” function of PaCS-Toolkit (Table 1) was used to output the MSM features, and PyEMMA 2 (ref 63) was employed to construct the MSM. k-means clustering was performed with the k-means++ algorithm64 to discretize the MSM features into microstates.
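The seeding step that distinguishes k-means++ from plain k-means can be sketched in a few lines: the first center is drawn uniformly, and each further center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The sketch below covers only this seeding step, not the subsequent k-means iterations, and it is a generic illustration rather than the clustering code used in this work; the function names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, seed=0):
    """Choose k initial cluster centers from points by k-means++ seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight of each point: squared distance to its nearest chosen center
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(weights) == 0.0:
            # All points coincide with existing centers; fall back to uniform
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


# Toy 1D feature (e.g., a distance in nm) with two well-separated groups
features = [(0.9,), (1.0,), (1.1,), (5.9,), (6.0,), (6.1,)]
print(kmeans_pp_seeds(features, k=2, seed=1))
```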

2.4.1. Folding of the “Mini-Protein” Chignolin by t-PaCS-MD/MSM

Chignolin is a designed peptide consisting of 10 amino acid residues (GYDPETGTWG) and is a “mini-protein” that folds into a β-hairpin structure (PDB ID: 1UAO).65 Here, chignolin was first simulated at 500 K so that it completely unfolded; then, using t-PaCS-MD started from considerably different structures, it was folded into a native fold.

The initial structure of chignolin taken from model 1 of the 1UAO PDB file65 was solvated into a 5.4 × 5.4 × 5.4 nm³ water box with 0.15 M NaCl comprising 11,512 atoms in total. The CHARMM36m and TIP3P(CHARMM) force fields66 were used for chignolin and the water molecules, respectively. After energy minimizations with the steepest descent method followed by the conjugate gradient method, MD simulation was conducted at 500 K for 1 ns with restraints to maintain native hydrogen bonds with the NVT ensemble and then continued for 50 ns at 500 K and 1 bar with the NPT ensemble. The MD time step was 1 fs in all of the simulations. The following 200 ns MD simulation was performed without restraints to sample significantly different unfolded structures, and four initial structures for t-PaCS-MD were randomly selected from representative structures of clustered chignolin snapshots.

t-PaCS-MD was conducted 20 times (5 times per initial structure with different initial velocities) with nrep = 50, tMD = 0.1 ns, max_cycle = 30, and threshold = 0.05 nm, employing the backbone RMSD from model 1 of the 1UAO PDB file65 as the selection feature. Although the RMSD threshold of 0.1 nm is sufficient to reach around the native state, the value of 0.05 nm was chosen so that conformations around the native state are sufficiently sampled. The actual PaCS-MD cycles finished within ncyc ≤ 30 in eight cases. Seventeen trials out of 20 satisfied the condition RMSD <0.1 nm within 30 cycles. To obtain more samples around the native state, additional 2 ns MD simulations were performed starting from the last snapshots of these trials.

We constructed the MSM by employing the three hydrogen bond distances, HB1, HB2, and HB3, as the MSM features. HB1 is formed between the main chain amide N atom of Asp3 (hydrogen bond donor) and the main chain carbonyl O atom of Thr8 (acceptor). HB2 is an alternative to HB1 formed between Asp3:N and Gly7:O. HB1 and HB2 are considered good indicators to distinguish the native (HB1) and misfolded (HB2) states.1,67–70 HB3 (Gly7:N–Asp3:O) is predominantly formed before HB1 and HB2 are formed.1 The trajectories in 3D space spanned by the distances of HB1, HB2, and HB3 were clustered into 2000 clusters. The trajectories with the HB3 distance >4 Å were excluded in this step, similar to the methods used in refs 1, 67, and 68. A 3D-MSM was constructed with a lag time of 50 ps, and the free energy landscape of the folding pathways was calculated. The implied time scale versus lag time plot for the MSM analysis is shown in Figure S1.

2.4.2. Domain Motion of the SARS-CoV-2 Nsp15 Monomer by rmsdPaCS-MD/MSM

The Nsp15 protein of coronavirus cleaves the polyuridine (polyU) of negative-sense RNAs, limits the abundance and length of polyU, and delays the type I interferon response in macrophages.71 The crystal structure of the SARS-CoV-2 Nsp15 hexamer in its apo state (PDB ID: 6VWW) reveals a distinctive ring-like complex comprising a dimer of trimers.72 Each monomer structure is characterized by three distinct domains: the N-terminal domain, spanning residues 1 to 68 (called the N-term domain hereafter); the middle domain, encompassing residues 69 to 202 (Mid domain); and the C-terminal domain that contains the NendoU catalytic site and spans residues 203 to 347 (C-term domain). Although the hexamer structure suppresses domain movements, the Nsp15 monomer was shown to be very flexible and could potentially be a target of anti-SARS-CoV-2 drugs.13

The simulated system contained the Nsp15 monomer solvated in a 15.9 × 14.7 × 14.7 nm³ box filled with water molecules and 0.15 M KCl. The AMBER ff19SB force field73 was used for the protein and the OPC water model74 was employed. The equilibrated monomer structures of Nsp15 at 300 K and 1 atm obtained in an earlier paper13 were employed as the initial structures for rmsdPaCS-MD.

Using backbone RMSD from the crystal monomer structure as the selection feature and selecting snapshots with greater RMSD, rmsdPaCS-MD at 300 K and 1 bar was conducted 20 times with nrep = 50 and tMD = 0.1 ns. In all of the trials, ncyc reached max_cycle = 50 because threshold was set to a very large value (10 nm).

1D-MSM was conducted using dcom between the N- and C-term domains as the MSM feature with a lag time of 50 ps, and the free energy profile of the domain motion as a function of dcom was calculated. The implied time scale versus lag time plot for the MSM analysis is shown in Figure S2.

2.4.3. Ligand Dissociation from the Adenosine A2A Receptor by dPaCS-MD/MSM

Adenosine A2A receptor (A2AR) is a member of the G protein-coupled receptor (GPCR) superfamily, plays central roles in sleep regulation, angiogenesis, and immunosuppression, and is considered an important drug target.75 LUF5833 is a nonribose partial agonist of A2AR that binds in the binding pocket.76 The dissociation of LUF5833 from A2AR was simulated by dPaCS-MD. MD simulation of the A2AR/LUF5833 complex was conducted as follows to prepare initial structures for dPaCS-MD. The crystal structure of the A2AR/LUF5833 complex (PDB ID: 7ARO) was embedded in a membrane consisting of 145 POPC (1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine) and 37 cholesterol molecules and was solvated in water with 0.15 M KCl in a 7.7 × 7.7 × 16.6 nm³ box using CHARMM-GUI.77 The system comprised 125,557 atoms. AMBER ff19SB,73 Lipid21,78 and OPC74 were employed for the protein, membrane, and water molecules, respectively. Partial charges of LUF5833 were generated by GAFF2 (ref 79) using Gaussian 16 (ref 80) with B3LYP/6-31G* and Antechamber in AmberTools.81

After energy minimizations with the steepest descent method followed by the conjugate gradient method, MD simulation was conducted for 1 ns at 300 K with the NVT ensemble and continued for 100 ns at 300 K and 1 bar with the NPT ensemble with positional restraints imposed on the A2AR/LUF5833 heavy atoms. The MD time step was 2 fs. Ten independent MD simulations were then conducted without restraints for 1 μs. Snapshots of the second halves of the trajectories were gathered and clustered into 10 groups, and a representative structure was selected from each cluster as the initial structure for the dPaCS-MD. Therefore, ten initial structures were selected for the following PaCS-MD.

dcom between the two molecules was used as the selection feature to dissociate LUF5833 from A2AR, and dPaCS-MD was conducted 30 times (3 times per initial structure with different initial velocities) with nrep = 50, tMD = 0.1 ns, and an MD time step of 1 fs. The simulations were conducted with max_cycle = 100 with an extremely large threshold value (100 nm), and the distance exceeded 6.0 nm within the designated cycles in all trials. k-Means clustering into 50 clusters was conducted for the MSM analysis using snapshots with dcom ≤ 6 nm, and a lag time of 50 ps was employed to construct a 1D-MSM using dcom as the MSM feature. The implied time scale versus lag time plot for the MSM analysis is shown in Figure S3. ΔG° was calculated from the potential of mean force obtained by the MSM with volume correction82,83 conducted with the procedure described elsewhere11 by volume calculation using the convex hull with the Quickhull algorithm21 implemented in SciPy.48

3. Results and Discussion

3.1. Folding of Chignolin

Folding simulation of chignolin by t-PaCS-MD was started from four different structures generated by the MD simulation. The initial structures and native structure of chignolin are shown in Figure 3A. In 17 trials out of 20, chignolin folded into the native structure (RMSD < 0.1 nm) within 30 cycles as indicated by the plot of RMSD versus ncyc (Figure 3B). Figure 3C shows representative folding pathways of chignolin generated by t-PaCS-MD. Most of the pathways led to the native state directly, but some passed through the misfolded state in which HB2 formed instead of HB1. Figure 3D represents the free energy landscape on the space spanned by the distances of HB1 and HB2, showing free energy minima of the native, misfolded, and intermediate states.1,67,68

Figure 3.

Results of t-PaCS-MD/MSM on chignolin. (A) The native structure of chignolin (top image; model 1 of 1UAO(65)) and four unfolded starting structures for t-PaCS-MD (bottom image; green, yellow, orange, and magenta), best-fitted to residues 5 and 6 of the native structure and shown transparently. The images in Figures 1–3 were created by VMD84 unless otherwise specified. (B) Backbone RMSD (the selection feature) as a function of ncyc during t-PaCS-MD. The horizontal lines show 0.1 and 0.05 nm. (C) Projections of folding pathways mapped onto the space spanned by the distances of HB1 and HB2. Representative pathways are selected from each trial and are shown in different colors. The square symbols indicate the positions of the initial structures, and the open triangle represents the native structure. (D) The free energy landscape of chignolin folding mapped onto the HB1–HB2 distance space.

3.2. Domain Motion of Nsp15

The domain motion of Nsp15 was significantly enhanced by rmsdPaCS-MD. The backbone RMSD for the whole molecule was around 0.5–1.1 nm, indicating that large movements were enhanced by rmsdPaCS-MD (Figure 4A). RMSDs for each domain were around 0.1–0.2 nm, except for the Mid domain in six trials, where RMSDs were greater than 0.2 nm. These results show that the domain structures were mostly well maintained while rmsdPaCS-MD enhanced the domain motions, as shown in Figure 4B, especially the large movements of the N-term domain relative to the other domains. The data used for constructing Figure 4B were prepared by the PaCS-Toolkit function “genrepresent” (see Table 1). Figure 4C shows the free energy profile calculated by 1D-MSM as a function of dcom between the N- and C-term domains, showing that the Nsp15 monomer adopts a more open form compared to the apo hexamer structure, consistent with an earlier study.13
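How a free energy profile follows from a 1D-MSM can be sketched as below: transitions between discretized dcom states are counted at a fixed lag, a stationary distribution π is estimated, and the profile is F_i = −kT ln(π_i/π_max). The sketch uses a naive symmetrized-count estimator, which is a simplification chosen for brevity and is not the reversible estimator of dedicated MSM software; the function names, the toy trajectory, and the temperature of 300 K are assumptions.

```python
import math

KT_KCAL = 1.987204e-3 * 300.0  # k_B*T in kcal/mol at 300 K (assumed temperature)

def count_matrix(dtrajs, nstates, lag):
    """Count transitions i -> j separated by `lag` frames in each discrete trajectory."""
    counts = [[0.0] * nstates for _ in range(nstates)]
    for traj in dtrajs:
        for t in range(len(traj) - lag):
            counts[traj[t]][traj[t + lag]] += 1.0
    return counts

def stationary_from_symmetrized(counts):
    """Naive reversible estimate of the stationary distribution.

    With symmetrized counts S = C + C^T and T_ij = S_ij / sum_j S_ij, detailed
    balance holds and pi_i is proportional to the row sums of S.
    """
    n = len(counts)
    sym = [[counts[i][j] + counts[j][i] for j in range(n)] for i in range(n)]
    rows = [sum(r) for r in sym]
    total = sum(rows)
    return [r / total for r in rows]

def free_energy(pi, kt=KT_KCAL):
    """F_i = -kT ln(pi_i / max(pi)); states that were never visited get +inf."""
    pmax = max(pi)
    return [(-kt * math.log(p / pmax)) if p > 0 else math.inf for p in pi]

if __name__ == "__main__":
    # Toy discrete trajectory over three d_com bins (hypothetical data).
    dtrajs = [[0, 0, 1, 1, 2, 1, 0, 0]]
    pi = stationary_from_symmetrized(count_matrix(dtrajs, nstates=3, lag=1))
    for i, f in enumerate(free_energy(pi)):
        print(f"state {i}: pi = {pi[i]:.3f}, F = {f:.3f} kcal/mol")
```

Because PaCS-MD trajectories are short and restarted from selected snapshots, the counts are accumulated per trajectory, so no transition is counted across a restart.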

Figure 4.

Results of rmsdPaCS-MD/MSM on the Nsp15 monomer. (A) Backbone RMSD as a function of ncyc during rmsdPaCS-MD for each trial. RMSDs for the whole molecule (the selection feature) and for the N-term (blue), Mid (red), and C-term (gray) domains are shown. The error bars represent the standard deviations between the replicas. (B) The monomer structure in the apo hexamer72 (cartoon representation) and examples of conformations generated by the rmsdPaCS-MD (thin tube representations) after best-fitting the C-term domain. The last snapshots of all of the trials from the first replica at ncyc = 50 are shown. (C) The free energy profile as a function of dcom between the N- and C-term domains. The vertical line shows the dcom of the apo hexamer.

3.3. Ligand Dissociation from A2AR

Dissociation of LUF5833 from A2AR was investigated by dPaCS-MD/MSM. Figure 5A shows the simulated system containing the A2AR/LUF5833 complex. The plot of the selection feature dcom versus ncyc (Figure 5B) indicates that, in all 30 trials, LUF5833 dissociated up to 6 nm or more from A2AR within 30 cycles, with only up to 3 ns (order of 10⁻⁹ s) required for each dPaCS-MD trial. The experimental value of koff is 0.16 ± 0.08 min⁻¹, meaning that dissociation takes on the order of 1 × 10² s. Therefore, the ratio of acceleration is on the order of 1 × 10¹¹. Figure 5C depicts the center of mass positions of LUF5833 along the dissociation pathways sampled by dPaCS-MD, showing that after exiting from the narrow binding pocket of A2AR, LUF5833 diffused in different directions. The presented data were prepared by the PaCS-Toolkit function “gencom” (see Table 1). Figure 5D shows the potential of mean force (PMF) calculated by 1D-MSM as a function of dcom, which mostly converges between 4.5 and 5.0 nm. The value of ΔG° deduced from the PMF after volume correction82,83 is −10.9 ± 0.2 kcal/mol. Kinetic KD obtained as experimentally measured koff/kon for LUF5833 is 19 ± 0.6 nM,85 which is equivalent to ΔG° = −10.6 kcal/mol. The ΔG° value deduced from dPaCS-MD/MSM is in good agreement with the experimental value.
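The two numerical estimates in this paragraph can be checked with a short calculation. The residence time is taken as 1/koff, the acceleration as the ratio of that time to the 3 ns of dPaCS-MD, and the experimental free energy as ΔG° = RT ln(KD/C°) with C° = 1 M. The temperature of 300 K is an assumption (the simulation temperature); the temperature of the binding experiment is not stated here.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol K)
T = 300.0              # K; assumed, taken from the simulation conditions

# Residence time from the experimental dissociation rate constant.
k_off_per_min = 0.16
residence_s = 60.0 / k_off_per_min        # about 375 s, i.e. order of 1e2 s

# Acceleration relative to the simulated time of one dPaCS-MD trial.
sim_time_s = 3e-9                         # 3 ns
acceleration = residence_s / sim_time_s   # about 1.25e11

# Standard binding free energy from the kinetic K_D (standard state C0 = 1 M).
K_D = 19e-9                               # M
dG_exp = R_KCAL * T * math.log(K_D / 1.0) # kcal/mol

if __name__ == "__main__":
    print(f"residence time  ~ {residence_s:.0f} s")
    print(f"acceleration    ~ {acceleration:.2e}")
    print(f"dG(exp)         ~ {dG_exp:.1f} kcal/mol")
```

Under these assumptions the experimental value comes out at about −10.6 kcal/mol, so the computed −10.9 ± 0.2 kcal/mol differs from it by roughly 0.3 kcal/mol.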

Figure 5.

Results of dPaCS-MD/MSM on the A2AR/LUF5833 complex. (A) A side view of the simulated system containing A2AR (magenta), LUF5833 (spheres), the membrane region (white), water (blue), K⁺ (purple spheres), and Cl⁻ (cyan spheres). The image was created using ChimeraX.86–88 The chemical structure of LUF5833 is shown in the inset. (B) The intercenter-of-mass distance dcom (the selection feature) as a function of ncyc during dPaCS-MD. (C) The dissociation pathways of LUF5833 from A2AR (white cartoon model). The center of mass positions of LUF5833 along the dissociation pathways are shown as spheres in path-dependent colors. (D) The potential of mean force (PMF) as a function of dcom. The thick black line represents the average PMF and the dotted lines show individual PMFs from each trial.

4. Conclusions

In this paper, we presented the software design of the PaCS-Toolkit. PaCS-Toolkit is coded with Python3 and is distributed under the GPLv3 license. PaCS-Toolkit, its documents, and input files for the examples presented in this paper are available from GitHub.a Potential users can download the PaCS-Toolkit and input examples and start to conduct PaCS-MD with the provided inputs. The users who can program with Python should be able to modify the code of the PaCS-Toolkit or implement new methods.

We also demonstrated three different types of applications. The folding of chignolin investigated by t-PaCS-MD/MSM identified direct folding into the native state as well as folding via a misfolded state, similar to the results of previous studies.1,67,68 rmsdPaCS-MD/MSM of the Nsp15 monomer successfully enhanced domain motions without causing large intradomain motion in the majority of cases and identified that the open structure is stable in the monomer state compared to the more closed structure in the hexamer form. The dissociation of LUF5833 from A2AR was successfully simulated within 3 ns of dPaCS-MD and the experimental value of ΔG° was well reproduced by dPaCS-MD/MSM. These results indicated that PaCS-Toolkit can be easily utilized to simulate a wide variety of dynamics for different types of molecular systems. As mentioned earlier, input files for these examples are available.

Since the PaCS-Toolkit is designed to be flexible, new features can be incorporated relatively easily. For example, although PaCS-Toolkit 1.0 supports only GROMACS, AMBER, and NAMD, support for other MD software can be added. Also, variations of PaCS-MD not implemented in the current version and new types of PaCS-MD can be integrated in the future.

Acknowledgments

This work used computational resources of the supercomputer TSUBAME provided by Tokyo Institute of Technology, FUGAKU through the HPCI System Research Project (project IDs: hp230077 and hp230216), and the supercomputers provided by Research Center for Computational Science, The National Institute of Natural Science, and The Institute for Solid State Physics, The University of Tokyo.

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.4c01271.

  • Implied time scale versus lag time plots for the MSM analysis for chignolin folding, dynamics of the Nsp15 monomer, and LUF5833 dissociation from A2AR (PDF)

Author Contributions

S.I., T.H., and T.N.W. contributed equally. S.I., T.H., T.N.W., H.K., Z.B., T.K., W.L., D.P.T., and A.K. contributed to the design of the work, conducted the overall analyses, and wrote the manuscript. S.I., T.H., H.K., and W.L. did the coding. T.N.W., Z.B., T.K., W.L., and D.P.T. conducted simulations and their analyses.

This research was supported by MEXT/JSPS KAKENHI nos. JP22H04745, JP23H02424, JP23H02445, and JP23H04058 to A.K., and JP23K14154 to D.P.T., JSPS Bilateral Program Number JPJSBP120236503 to A.K., and MEXT as “Program for Promoting Researches on the Supercomputer Fugaku” (simulation- and AI-driven next-generation medicine and drug discovery based on “Fugaku”, JPMXP1020230120) to D.P.T.

The authors declare no competing financial interest.

Special Issue

Published as part of The Journal of Physical Chemistry B virtual special issue “Recent Advances in Simulation Software and Force Fields”.

Footnotes

a

PaCS-Toolkit is available from GitHub https://github.com/Kitaolab/PaCS-Toolkit under the GPLv3 and examples of input files for the three applications shown in this paper are available from GitHub https://github.com/Kitaolab/PaCS-MD-example.

Supplementary Material

jp4c01271_si_001.pdf (1.3MB, pdf)

References

  1. Harada R.; Kitao A. Parallel Cascade Selection Molecular Dynamics (PaCS-MD) to generate conformational transition pathway. J. Chem. Phys. 2013, 139 (3), 035103. 10.1063/1.4813023.
  2. Harada R.; Kitao A. Nontargeted Parallel Cascade Selection Molecular Dynamics for Enhancing the Conformational Sampling of Proteins. J. Chem. Theory Comput. 2015, 11 (11), 5493–5502. 10.1021/acs.jctc.5b00723.
  3. Harada R.; Shigeta Y. Temperature-shuffled parallel cascade selection molecular dynamics accelerates the structural transitions of proteins. J. Comput. Chem. 2017, 38 (31), 2671–2674. 10.1002/jcc.25060.
  4. Tran D. P.; Takemura K.; Kuwata K.; Kitao A. Protein-Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J. Chem. Theory Comput. 2018, 14 (1), 404–417. 10.1021/acs.jctc.7b00504.
  5. Tran D. P.; Kitao A. Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model. J. Phys. Chem. B 2019, 123 (11), 2469–2478. 10.1021/acs.jpcb.8b10309.
  6. Inoue Y.; Ogawa Y.; Kinoshita M.; Terahara N.; Shimada M.; Kodera N.; Ando T.; Namba K.; Kitao A.; Imada K.; Minamino T. Structural Insights into the Substrate Specificity Switch Mechanism of the Type III Protein Export Apparatus. Structure 2019, 27 (6), 965–976.e6. 10.1016/j.str.2019.03.017.
  7. Tran D. P.; Kitao A. Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding. J. Chem. Theory Comput. 2020, 16 (4), 2835–2845. 10.1021/acs.jctc.9b01203.
  8. Takaba K.; Tran D. P.; Kitao A. Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins. J. Chem. Phys. 2020, 152 (22), 225101. 10.1063/5.0004654.
  9. Takaba K.; Tran D. P.; Kitao A. Erratum: “Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins” [J. Chem. Phys. 152, 225101 (2020)]. J. Chem. Phys. 2020, 153 (17), 179902. 10.1063/5.0032465.
  10. Harada R.; Yamaguchi K.; Shigeta Y. Enhanced Conformational Sampling Method Based on Anomaly Detection Parallel Cascade Selection Molecular Dynamics: ad-PaCS-MD. J. Chem. Theory Comput. 2020, 16 (10), 6716–6725. 10.1021/acs.jctc.0c00697.
  11. Hata H.; Phuoc Tran D.; Marzouk Sobeh M.; Kitao A. Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model. Biophys. Physicobiol. 2021, 18, 305–316. 10.2142/biophysico.bppb-v18.037.
  12. Yasuda T.; Morita R.; Shigeta Y.; Harada R. Independent Nontargeted Parallel Cascade Selection Molecular Dynamics (Ino-PaCS-MD) to Enhance the Conformational Sampling of Proteins. J. Chem. Theory Comput. 2021, 17 (9), 5933–5943. 10.1021/acs.jctc.1c00558.
  13. Tran D. P.; Taira Y.; Ogawa T.; Misu R.; Miyazawa Y.; Kitao A. Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer. Sci. Rep. 2022, 12 (1), 3860. 10.1038/s41598-022-07792-2.
  14. Aida H.; Shigeta Y.; Harada R. Ligand Binding Path Sampling Based on Parallel Cascade Selection Molecular Dynamics: LB-PaCS-MD. Materials 2022, 15 (4), 1490. 10.3390/ma15041490.
  15. Wijaya T. N.; Kitao A. Energetic and Kinetic Origins of CALB Interfacial Activation Revealed by PaCS-MD/MSM. J. Phys. Chem. B 2023, 127 (34), 7431–7441. 10.1021/acs.jpcb.3c02041.
  16. Sobeh M. M.; Kitao A. Dissociation Pathways of the p53 DNA Binding Domain from DNA and Critical Roles of Key Residues Elucidated by dPaCS-MD/MSM. J. Chem. Inf. Model. 2022, 62 (5), 1294–1307. 10.1021/acs.jcim.1c01508.
  17. Caves L. S.; Evanseck J. D.; Karplus M. Locally accessible conformations of proteins: multiple molecular dynamics simulations of crambin. Protein Sci. 1998, 7 (3), 649–666. 10.1002/pro.5560070314.
  18. Aida H.; Shigeta Y.; Harada R. Regenerations of Initial Velocities in Parallel Cascade Selection Molecular Dynamics (PaCS-MD) Enhance the Conformational Transitions of Proteins. Chem. Lett. 2020, 49 (7), 798–801. 10.1246/cl.200196.
  19. Hata H.; Nishihara Y.; Nishiyama M.; Sowa Y.; Kawagishi I.; Kitao A. High pressure inhibits signaling protein binding to the flagellar motor and bacterial chemotaxis through enhanced hydration. Sci. Rep. 2020, 10 (1), 2351. 10.1038/s41598-020-59172-3.
  20. Peng J.; Zhang Z. Unraveling low-resolution structural data of large biomolecules by constructing atomic models with experiment-targeted parallel cascade selection simulations. Sci. Rep. 2016, 6, 29360. 10.1038/srep29360.
  21. Barber C. B.; Dobkin D. P.; Huhdanpaa H. The quickhull algorithm for convex hulls. ACM Trans. Math Software 1996, 22 (4), 469–483. 10.1145/235815.235821.
  22. Hamelberg D.; Mongan J.; McCammon J. A. Accelerated molecular dynamics: a promising and efficient simulation method for biomolecules. J. Chem. Phys. 2004, 120 (24), 11919–11929. 10.1063/1.1755656.
  23. Schlegl T.; Seeböck P.; Waldstein S. M.; Schmidt-Erfurth U.; Langs G. Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery. In Information Processing in Medical Imaging; Niethammer M., Styner M., Aylward S., Zhu H., Oguz I., Yap P.-T., Shen D., Eds.; Springer International Publishing: Cham, 2017; pp 146–157.
  24. Shin K.; Tran D. P.; Takemura K.; Kitao A.; Terayama K.; Tsuda K. Enhancing Biomolecular Sampling with Reinforcement Learning: A Tree Search Molecular Dynamics Simulation Method. ACS Omega 2019, 4 (9), 13853–13862. 10.1021/acsomega.9b01480.
  25. Kocsis L.; Szepesvári C. Bandit Based Monte-Carlo Planning. In Machine Learning: ECML 2006; Springer, Berlin, Heidelberg, 2006; Vol. 4212, pp 282–293.
  26. Harada R.; Takano Y.; Shigeta Y. Fluctuation Flooding Method (FFM) for accelerating conformational transitions of proteins. J. Chem. Phys. 2014, 140 (12), 125103. 10.1063/1.4869594.
  27. Harada R.; Nakamura T.; Shigeta Y. Automatic detection of hidden dimensions to obtain appropriate reaction coordinates in the Outlier FLOODing (OFLOOD) method. Chem. Phys. Lett. 2015, 639, 269–274. 10.1016/j.cplett.2015.09.031.
  28. Harada R.; Nakamura T.; Takano Y.; Shigeta Y. Protein folding pathways extracted by OFLOOD: Outlier FLOODing method. J. Comput. Chem. 2015, 36 (2), 97–102. 10.1002/jcc.23773.
  29. Harada R.; Nakamura T.; Shigeta Y. Sparsity-weighted outlier FLOODing (OFLOOD) method: Efficient rare event sampling method using sparsity of distribution. J. Comput. Chem. 2016, 37 (8), 724–738. 10.1002/jcc.24255.
  30. Harada R.; Takano Y.; Shigeta Y. Enhanced conformational sampling method for proteins based on the TaBoo SeArch algorithm: application to the folding of a mini-protein, chignolin. J. Comput. Chem. 2015, 36 (10), 763–772. 10.1002/jcc.23854.
  31. Harada R.; Takano Y.; Shigeta Y. Efficient conformational sampling of proteins based on a multi-dimensional TaBoo SeArch algorithm: An application to folding of chignolin in explicit solvent. Chem. Phys. Lett. 2015, 630, 68–75. 10.1016/j.cplett.2015.04.039.
  32. Harada R.; Shigeta Y. Efficient Conformational Search Based on Structural Dissimilarity Sampling: Applications for Reproducing Structural Transitions of Proteins. J. Chem. Theory Comput. 2017, 13 (3), 1411–1423. 10.1021/acs.jctc.6b01112.
  33. Harada R.; Shigeta Y. Structural dissimilarity sampling with dynamically self-guiding selection. J. Comput. Chem. 2017, 38 (22), 1921–1929. 10.1002/jcc.24837.
  34. Harada R.; Shigeta Y. Self-Avoiding Conformational Sampling Based on Histories of Past Conformational Searches. J. Chem. Inf. Model. 2017, 57 (12), 3070–3078. 10.1021/acs.jcim.7b00573.
  35. Huber G. A.; Kim S. Weighted-ensemble Brownian dynamics simulations for protein association reactions. Biophys. J. 1996, 70 (1), 97–110. 10.1016/S0006-3495(96)79552-8.
  36. Zhang B. W.; Jasnow D.; Zuckerman D. M. Efficient and verified simulation of a path ensemble for conformational change in a united-residue model of calmodulin. Proc. Natl. Acad. Sci. U.S.A. 2007, 104 (46), 18043–18048. 10.1073/pnas.0706349104.
  37. Suarez E.; Lettieri S.; Zwier M. C.; Stringer C. A.; Subramanian S. R.; Chong L. T.; Zuckerman D. M. Simultaneous Computation of Dynamical and Equilibrium Information Using a Weighted Ensemble of Trajectories. J. Chem. Theory Comput. 2014, 10 (7), 2658–2667. 10.1021/ct401065r.
  38. Zwier M. C.; Adelman J. L.; Kaus J. W.; Pratt A. J.; Wong K. F.; Rego N. B.; Suarez E.; Lettieri S.; Wang D. W.; Grabe M.; et al. WESTPA: an interoperable, highly scalable software package for weighted ensemble simulation and analysis. J. Chem. Theory Comput. 2015, 11 (2), 800–809. 10.1021/ct5010615.
  39. Zuckerman D. M.; Chong L. T. Weighted Ensemble Simulation: Review of Methodology, Applications, and Software. Annu. Rev. Biophys. 2017, 46, 43–57. 10.1146/annurev-biophys-070816-033834.
  40. Russo J. D.; Zhang S.; Leung J. M. G.; Bogetti A. T.; Thompson J. P.; DeGrave A. J.; Torrillo P. A.; Pratt A. J.; Wong K. F.; Xia J.; et al. WESTPA 2.0: High-Performance Upgrades for Weighted Ensemble Simulations and Analysis of Longer-Timescale Applications. J. Chem. Theory Comput. 2022, 18 (2), 638–649. 10.1021/acs.jctc.1c01154.
  41. Aristoff D.; Copperman J.; Simpson G.; Webber R. J.; Zuckerman D. M. Weighted ensemble: Recent mathematical developments. J. Chem. Phys. 2023, 158 (1), 014108. 10.1063/5.0110873.
  42. Allen R. J.; Valeriani C.; Rein Ten Wolde P. Forward flux sampling for rare event simulations. J. Phys.: Condens. Matter 2009, 21 (46), 463102. 10.1088/0953-8984/21/46/463102.
  43. Becker N. B.; Allen R. J.; ten Wolde P. R. Non-stationary forward flux sampling. J. Chem. Phys. 2012, 136 (17), 174118. 10.1063/1.4704810.
  44. Bowman G. R.; Noé F.; Pande V. S. An Introduction to Markov State Models and Their Application to Long Timescale Molecular Simulation; Advances in Experimental Medicine and Biology; Springer Dordrecht, 2014.
  45. Husic B. E.; Pande V. S. Markov State Models: From an Art to a Science. J. Am. Chem. Soc. 2018, 140 (7), 2386–2396. 10.1021/jacs.7b12191.
  46. Noé F.; Rosta E. Markov Models of Molecular Kinetics. J. Chem. Phys. 2019, 151 (19), 190401. 10.1063/1.5134029.
  47. Harris C. R.; Millman K. J.; van der Walt S. J.; Gommers R.; Virtanen P.; Cournapeau D.; Wieser E.; Taylor J.; Berg S.; Smith N. J.; et al. Array programming with NumPy. Nature 2020, 585 (7825), 357–362. 10.1038/s41586-020-2649-2.
  48. Virtanen P.; Gommers R.; Oliphant T. E.; Haberland M.; Reddy T.; Cournapeau D.; Burovski E.; Peterson P.; Weckesser W.; Bright J.; et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 2020, 17 (3), 261–272. 10.1038/s41592-019-0686-2.
  49. Abraham M. J.; Murtola T.; Schulz R.; Páll S.; Smith J. C.; Hess B.; Lindahl E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19–25. 10.1016/j.softx.2015.06.001.
  50. Pall S.; Zhmurov A.; Bauer P.; Abraham M.; Lundborg M.; Gray A.; Hess B.; Lindahl E. Heterogeneous parallelization and acceleration of molecular dynamics simulations in GROMACS. J. Chem. Phys. 2020, 153 (13), 134110. 10.1063/5.0018516.
  51. Case D. A.; Aktulga H. M.; Belfon K.; Ben-Shalom I. Y.; Berryman J. T.; Brozell S. R.; Cerutti D. S.; Cheatham I. T. E.; Cisneros G. A.; Cruzeiro V. W. D.; et al. Amber 2022; University of California: San Francisco, 2022.
  52. Phillips J. C.; Hardy D. J.; Maia J. D. C.; Stone J. E.; Ribeiro J. V.; Bernardi R. C.; Buch R.; Fiorin G.; Henin J.; Jiang W.; et al. Scalable molecular dynamics on CPU and GPU architectures with NAMD. J. Chem. Phys. 2020, 153 (4), 044130. 10.1063/5.0014475.
  53. Roe D. R.; Cheatham T. E. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J. Chem. Theory Comput. 2013, 9 (7), 3084–3095. 10.1021/ct400341p.
  54. McGibbon R. T.; Beauchamp K. A.; Harrigan M. P.; Klein C.; Swails J. M.; Hernandez C. X.; Schwantes C. R.; Wang L. P.; Lane T. J.; Pande V. S. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophys. J. 2015, 109 (8), 1528–1532. 10.1016/j.bpj.2015.08.015.
  55. Buchete N. V.; Hummer G. Coarse master equations for peptide folding dynamics. J. Phys. Chem. B 2008, 112 (19), 6057–6069. 10.1021/jp0761665.
  56. Bowman G. R.; Beauchamp K. A.; Boxer G.; Pande V. S. Progress and challenges in the automated construction of Markov state models for full protein systems. J. Chem. Phys. 2009, 131 (12), 124101. 10.1063/1.3216567.
  57. Prinz J. H.; Wu H.; Sarich M.; Keller B.; Senne M.; Held M.; Chodera J. D.; Schutte C.; Noe F. Markov models of molecular kinetics: generation and validation. J. Chem. Phys. 2011, 134 (17), 174105. 10.1063/1.3565032.
  58. Trendelkamp-Schroer B.; Wu H.; Paul F.; Noe F. Estimation and uncertainty of reversible Markov models. J. Chem. Phys. 2015, 143 (17), 174101. 10.1063/1.4934536.
  59. Doerr S.; De Fabritiis G. On-the-Fly Learning and Sampling of Ligand Binding by High-Throughput Molecular Simulations. J. Chem. Theory Comput. 2014, 10 (5), 2064–2069. 10.1021/ct400919u.
  60. Nosé S. A unified formulation of the constant temperature molecular dynamics methods. J. Chem. Phys. 1984, 81 (1), 511–519. 10.1063/1.447334.
  61. Hoover W. G. Canonical dynamics: Equilibrium phase-space distributions. Phys. Rev. A: At., Mol., Opt. Phys. 1985, 31 (3), 1695–1697. 10.1103/PhysRevA.31.1695.
  62. Parrinello M.; Rahman A. Polymorphic transitions in single crystals: A new molecular dynamics method. J. Appl. Phys. 1981, 52 (12), 7182–7190. 10.1063/1.328693.
  63. Scherer M. K.; Trendelkamp-Schroer B.; Paul F.; Perez-Hernandez G.; Hoffmann M.; Plattner N.; Wehmeyer C.; Prinz J. H.; Noe F. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J. Chem. Theory Comput. 2015, 11 (11), 5525–5542. 10.1021/acs.jctc.5b00743.
  64. Arthur D.; Vassilvitskii S. k-means++: the advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2007, 2007.
  65. Honda S.; Yamasaki K.; Sawada Y.; Morii H. 10 residue folded peptide designed by segment statistics. Structure 2004, 12 (8), 1507–1518. 10.1016/j.str.2004.05.022.
  66. Huang J.; Rauscher S.; Nawrocki G.; Ran T.; Feig M.; de Groot B. L.; Grubmuller H.; MacKerell A. D. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat. Methods 2017, 14 (1), 71–73. 10.1038/nmeth.4067.
  67. Satoh D.; Shimizu K.; Nakamura S.; Terada T. Folding free-energy landscape of a 10-residue mini-protein, chignolin. FEBS Lett. 2006, 580 (14), 3422–3426. 10.1016/j.febslet.2006.05.015.
  68. Harada R.; Kitao A. Exploring the Folding Free Energy Landscape of a β-Hairpin Miniprotein, Chignolin, Using Multiscale Free Energy Landscape Calculation Method. J. Phys. Chem. B 2011, 115 (27), 8806–8812. 10.1021/jp2008623.
  69. Kitao A. Transform and relax sampling for highly anisotropic systems: application to protein domain motion and folding. J. Chem. Phys. 2011, 135 (4), 045101. 10.1063/1.3613676.
  70. Matsubara D.; Kasahara K.; Dokainish H. M.; Oshima H.; Sugita Y. Modified Protein-Water Interactions in CHARMM36m for Thermodynamics and Kinetics of Proteins in Dilute and Crowded Solutions. Molecules 2022, 27 (17), 5726. 10.3390/molecules27175726.
  71. Hackbart M.; Deng X.; Baker S. C. Coronavirus endoribonuclease targets viral polyuridine sequences to evade activating host sensors. Proc. Natl. Acad. Sci. U.S.A. 2020, 117 (14), 8094–8103. 10.1073/pnas.1921485117.
  72. Kim Y.; Jedrzejczak R.; Maltseva N. I.; Wilamowski M.; Endres M.; Godzik A.; Michalska K.; Joachimiak A. Crystal structure of Nsp15 endoribonuclease NendoU from SARS-CoV-2. Protein Sci. 2020, 29 (7), 1596–1605. 10.1002/pro.3873.
  73. Tian C.; Kasavajhala K.; Belfon K. A. A.; Raguette L.; Huang H.; Migues A. N.; Bickel J.; Wang Y.; Pincay J.; Wu Q.; Simmerling C. ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution. J. Chem. Theory Comput. 2020, 16 (1), 528–552. 10.1021/acs.jctc.9b00591.
  74. Izadi S.; Anandakrishnan R.; Onufriev A. V. Building Water Models: A Different Approach. J. Phys. Chem. Lett. 2014, 5 (21), 3863–3871. 10.1021/jz501780a.
  75. Carpenter B.; Lebon G. Human Adenosine A(2A) Receptor: Molecular Mechanism of Ligand Binding and Activation. Front. Pharmacol 2017, 8, 898. 10.3389/fphar.2017.00898.
  76. Amelia T.; van Veldhoven J. P. D.; Falsini M.; Liu R.; Heitman L. H.; van Westen G. J. P.; Segala E.; Verdon G.; Cheng R. K. Y.; Cooke R. M.; et al. Crystal Structure and Subsequent Ligand Design of a Nonriboside Partial Agonist Bound to the Adenosine A(2A) Receptor. J. Med. Chem. 2021, 64 (7), 3827–3842. 10.1021/acs.jmedchem.0c01856.
  77. Jo S.; Kim T.; Iyer V. G.; Im W. CHARMM-GUI: a web-based graphical user interface for CHARMM. J. Comput. Chem. 2008, 29 (11), 1859–1865. 10.1002/jcc.20945.
  78. Dickson C. J.; Walker R. C.; Gould I. R. Lipid21: Complex Lipid Membrane Simulations with AMBER. J. Chem. Theory Comput. 2022, 18 (3), 1726–1736. 10.1021/acs.jctc.1c01217.
  79. He X.; Man V. H.; Yang W.; Lee T. S.; Wang J. A fast and high-quality charge model for the next generation general AMBER force field. J. Chem. Phys. 2020, 153 (11), 114502. 10.1063/5.0019056.
  80. Frisch M. J.; Trucks G. W.; Schlegel H. B.; Scuseria G. E.; Robb M. A.; Cheeseman J. R.; Scalmani G.; Barone V.; Petersson G. A.; Nakatsuji H.; et al. Gaussian 16, Rev. C.01; Gaussian, Inc.: Wallingford, CT, 2016.
  81. Case D. A.; Aktulga H. M.; Belfon K.; Cerutti D. S.; Cisneros G. A.; Cruzeiro V. W. D.; Forouzesh N.; Giese T. J.; Gotz A. W.; Gohlke H.; et al. AmberTools. J. Chem. Inf. Model. 2023, 63 (20), 6183–6191. 10.1021/acs.jcim.3c01153.
  82. Doudou S.; Burton N. A.; Henchman R. H. Standard Free Energy of Binding from a One-Dimensional Potential of Mean Force. J. Chem. Theory Comput. 2009, 5 (4), 909–918. 10.1021/ct8002354.
  83. Buch I.; Giorgino T.; De Fabritiis G. Complete reconstruction of an enzyme-inhibitor binding process by molecular dynamics simulations. Proc. Natl. Acad. Sci. U.S.A. 2011, 108 (25), 10184–10189. 10.1073/pnas.1103547108.
  84. Humphrey W.; Dalke A.; Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. 1996, 14 (1), 33–38. 10.1016/0263-7855(96)00018-5.
  85. Guo D.; Mulder-Krieger T.; Ijzerman A. P.; Heitman L. H. Functional efficacy of adenosine A(2)A receptor agonists is positively correlated to their receptor residence time. Br. J. Pharmacol. 2012, 166 (6), 1846–1859. 10.1111/j.1476-5381.2012.01897.x.
  86. Goddard T. D.; Huang C. C.; Meng E. C.; Pettersen E. F.; Couch G. S.; Morris J. H.; Ferrin T. E. UCSF ChimeraX: Meeting modern challenges in visualization and analysis. Protein Sci. 2018, 27 (1), 14–25. 10.1002/pro.3235.
  87. Pettersen E. F.; Goddard T. D.; Huang C. C.; Meng E. C.; Couch G. S.; Croll T. I.; Morris J. H.; Ferrin T. E. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci. 2021, 30 (1), 70–82. 10.1002/pro.3943.
  88. Meng E. C.; Goddard T. D.; Pettersen E. F.; Couch G. S.; Pearson Z. J.; Morris J. H.; Ferrin T. E. UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 2023, 32 (11), e4792. 10.1002/pro.4792.
