Abstract
We developed force field parameters for fluorinated, aromatic amino acids enabling molecular dynamics (MD) simulations of fluorinated proteins. These parameters are tailored to the AMBER ff15ipq protein force field and enable the modeling of 4, 5, 6, and 7F-tryptophan, 3F- and 3,5F-tyrosine, and 4F- or 4-CF3-phenylalanine. The parameters include 181 unique atomic charges derived using the implicitly polarized charge (IPolQ) scheme in the presence of SPC/Eb explicit water molecules and 9 unique bond, angle, or torsion terms. Our simulations of benchmark peptides and proteins maintain expected conformational propensities on the μs time scale. In addition, we have developed an open-source Python program to calculate fluorine relaxation rates from MD simulations. The extracted relaxation rates from protein simulations are in good agreement with experimental values determined by 19F NMR. Collectively, our results illustrate the power and robustness of the IPolQ lineage of force fields for modeling the structure and dynamics of fluorine-containing proteins at the atomic level.
I. Introduction
While NMR spectroscopy using the 19F nucleus is emerging as a powerful tool for measuring various structural and dynamical properties of fluorine-labeled biomolecules,1 the availability of force fields for modeling the corresponding conformational dynamics at the atomic level has been limited. To address this unmet need, we have expanded upon the standard amino acids available in the AMBER ff15ipq protein force field2 and present a comprehensive set of parameters for the most commonly used, fluorinated, aromatic amino acids in 19F NMR experiments. Coupled with GPU-accelerated molecular dynamics (MD) simulations,3,4 we can now begin to effectively integrate the results from 19F NMR experiments with atomistic MD simulations.
The fluorine nucleus has several features that render it a particularly useful NMR probe. Specifically, 19F is a 100% naturally abundant isotope with a high gyromagnetic ratio, which makes it almost as sensitive as 1H. Importantly, 19F chemical shifts span a very large range and are exquisitely responsive to the local chemical and electronic environment around the atom.5−7 Fluorine is absent from virtually all naturally occurring biomolecules. Therefore, studies of fluorinated biopolymers can be carried out in any routine buffer system or environment without interference from background signals.8 In general, only one or a handful of fluorine atoms are introduced into a biopolymer, overcoming the common handicap of spectral overlap in proton spectra that necessitated uniform labeling with 13C and 15N, in conjunction with three-dimensional (3D) and four-dimensional (4D) heteronuclear spectroscopy, for resonance assignments in bioNMR. These desirable properties have propelled 19F NMR-based studies of biomolecular systems forward, such as investigations of thermodynamics9 and kinetics,10 protein structure and dynamics,11−13 or protein–protein and protein–ligand interactions.14−19 Efforts are also ongoing toward 19F NMR method development, including relaxation optimization approaches20,21 to extend applicability to larger systems and to exploit 19F paramagnetic relaxation enhancement to determine long distances up to 35 Å.22 19F NMR is particularly useful in the pharmaceutical arena, guiding fragment-based screening23 and drug discovery/design.24 Finally, a more recent and exciting direction is the movement of 19F NMR into a physiological environment, measuring spectra of fluorinated proteins directly in Escherichia coli,25 Xenopus laevis oocytes,26 and mammalian cells.27
Naturally, 19F NMR of proteins is not a panacea and has shortcomings and limitations. The 1.47 Å van der Waals radius of the fluorine atom lies between those of hydrogen (1.2 Å) and oxygen (1.52 Å). Therefore, fluorine is frequently substituted for hydrogens, hydroxyl groups, or carbonyl oxygens. While such substitutions are often only weakly perturbing, with little to no effect on a protein’s structure and biological activity,5,9,12,13 this is not necessarily the case and needs to be verified for each system under study.28 Compared to more traditional NMR probes (1H, 15N, 13C), introducing a highly electronegative fluorine atom changes electronic surroundings and thereby local interactions, especially if the substituted H or OH was involved in hydrogen bonding. In addition, although resonance overlap is avoided, having only a few probes available, compared to thousands with traditional 1H, 15N, 13C labeling, limits the information content and several samples may need to be prepared where single fluorinated amino acids are judiciously placed to probe different parts of the biomolecule. MD simulations can help bridge this gap in information content at the all-atom level.
Here, we have developed new force field parameters for eight commonly used fluorinated, aromatic amino acids, facilitating more accurate atomic-level simulations of 19F NMR observables. As indicated in Figure 1, our study focuses on 4-, 5-, 6-, and 7-fluoro-tryptophan (W4F, W5F, W6F, W7F); 3- and 3,5-fluoro-tyrosine (Y3F, YDF); as well as 4-fluoro- and 4-trifluoromethyl-phenylalanine (F4F, FTF). These fluorinated derivatives of tryptophan, tyrosine, and phenylalanine were selected because they are readily available and can be easily incorporated into proteins for 19F NMR.29−32 Our parameters are intended for use with the AMBER ff15ipq force field and were derived using a general workflow that has been used previously for deriving new classes of noncanonical residues.33 Consistent with the ff15ipq force field, we derived implicitly polarized atomic charges in the presence of explicit solvent. The implicit polarization of atomic charges has been demonstrated to be essential for modeling condensed-phase electrostatics by fixed-charge force fields,2,34 that is, when polarizable force fields that employ Drude oscillators35 or inducible multipoles36 are not feasible. Furthermore, application of the ff15ipq force field with the intended SPC/Eb water model yields more accurate rotational diffusion times of proteins, enabling direct calculation of NMR observables.2,37,38 Using the GPU-accelerated AMBER MD engine,39,40 we have extensively validated our force field parameters by simulating both peptides and proteins that include each of the eight fluorinated amino acids, yielding over 47 μs of aggregate simulation time. Our parameters maintain expected conformational propensities of both fluorinated peptide- and protein-based systems on the μs time scale, and our relaxation rates from simulation agree with those from 19F NMR experiments.
Until now, parameters for the full set of fluorinated, aromatic amino acids listed above for the AMBER ff15ipq protein force field have not been available, although recently, parameters for over 400 nonstandard amino acids were added to the CHARMM36 protein and CHARMM general force fields,41 including some for several fluorinated amino acids. The CHARMM parameters were derived by isolating only the side-chain atoms of the amino acid of interest and optimizing select water interactions based on quantum mechanical (QM) calculations at either the MP2/6-31G(d) or the HF/6-31G(d) level of theory. However, CHARMM parameters are not available for 7-fluoro-tryptophan and 4-fluoro-phenylalanine. In addition, parameters for apolar, nonaromatic, fluorinated amino acids are available for use with the AMBER ff14SB force field, which selectively optimized fluorine specific Lennard-Jones parameters and derived atomic charges at the RHF/6-31G* level of theory.42,43 Also, for use with the ff14SB force field, a set of unnatural phenylalanine and tyrosine derivatives was parameterized, which included 3,5-fluoro-tyrosine.44 Atomic charges for these parameters were based on electrostatic potential calculations of a blocked dipeptide at the HF/6-31G* level of theory. Other efforts to develop parameters for simulating fluorinated proteins have been primarily ad hoc and single-use cases.45−47
II. AMBER ff15ipq Force Field
Parameters for the eight different fluorinated amino acids were developed for use with the AMBER ff15ipq protein force field, the latest version of the implicitly polarized charge (ipq) force field lineage.2,48 These parameters are also compatible with the ff15ipq-m force field for protein mimetics.33 Ipq force fields feature implicitly polarized atomic charges, with each charge optimized to reproduce the mean-field electron density of the molecule in explicit solvent.49 Notably, the ff15ipq force field is parameterized using the three-point, explicit SPC/Eb (extended simple point charge) water model,50 which reproduces the experimental rotational diffusion of proteins and can therefore yield reasonable dynamical observables such as those measured by NMR without the computational burden associated with four-point water model alternatives.
The original motivation behind the ff15ipq force field was to obtain more accurate propensities for salt-bridge formation, which is a common limitation of most contemporary fixed-charge force fields.34 The final ff15ipq implementation was a complete rederivation of its predecessor,48 including a greatly expanded torsion and angle parameter set, atomic charges derived at the MP2/cc-pVTZ level in explicit solvent, and adjustments to atomic radii for polar hydrogens. The ff15ipq force field reproduces the experimental probabilities of salt-bridge formation while maintaining the secondary structure of stably folded and disordered systems on the μs time scale and faithfully predicts J-coupling constants for a penta-alanine peptide as well as NMR relaxation rates for protein systems.2
Additional developments involved a modification of methyl side-chain rotational barriers that help to accurately predict methyl relaxation rates,38,51,52 as well as an expanded force field (ff15ipq-m) for modeling four classes of artificial backbone units, including D- and Cα-methylated amino acids, β-amino acids, and two cyclic β residues.33 These developments are implemented in both the AMBER39,40,53 and open-source OpenMM54 GPU-accelerated biomolecular simulation software packages and are accessible to other such packages through the ParmEd program55 for format conversion.
Our new parameters for the eight fluorinated, aromatic amino acids were derived using a workflow designed to be consistent with the parent ff15ipq derivation process. As done for the ff15ipq-m force field, one minor difference pertained to the generation of blocked dipeptide conformations for both the charge and bonded parameter derivation: to increase and safeguard conformational diversity, we progressively restrained the backbone torsions at evenly spaced intervals before energy minimizing each restrained conformation.33 In the parent ff15ipq force field, conformations were generated using high-temperature MD simulations.2
III. Methods
The general workflow for developing ff15ipq force field parameters is outlined in Figure 2. Throughout this workflow, Lennard-Jones parameters for all atoms, including fluorine, were taken from the parent ff15ipq force field.
III.I. Derivation of IPolQ Atomic Charges
For each fluorinated residue, IPolQ atomic charges were derived using a four-step iterative procedure until convergence was reached:
1. Generate a set of conformations. Each type of fluorinated amino acid was flanked with acetyl (Ace) and N-methylamide (Nme) N- and C-terminal capping groups, and the resulting molecule was named “dipeptide.” Twenty dipeptide conformations were generated by progressively restraining the backbone Φ/Ψ torsion angles within −180° to 180° using a force constant of 32 kcal/mol. Aside from the backbone dihedral angles, no other restraints were applied for generating the initial set of dipeptide conformations. Only for the first iteration, the initial set of atomic charges was derived using the AM1-BCC charge method.56 Each restrained conformation was then subjected to energy minimization and solvated using a truncated octahedral box of SPC/Eb water molecules with at least a 12 Å clearance between the solute and the edge of the box. After another round of energy minimization, each solvated system underwent a two-stage equilibration, which included a 10 kcal/(mol Å2) positional restraint on the entire dipeptide. In the first stage, 20 ps of dynamics were carried out at constant temperature (25 °C) and volume. In the second stage, 100 ps of dynamics were carried out at constant temperature (25 °C) and pressure (1 atm). A final 500 ps simulation was performed at constant temperature (25 °C) and volume during which the solute remained fixed, and the solvent coordinates were used to generate a distribution of point charges that represent the solvent reaction field potential. This distribution consisted of an inner cloud of point charges based on the coordinates of the solvent molecules within 5 Å of the solute and three outer shells of point charges that reproduce contributions to the solvent reaction field potential from the periodic system beyond 5 Å.
2. Calculate the electrostatic potential of each conformation in vacuum and in explicit solvent. Two sets of QM electrostatic potential calculations were carried out for each dipeptide conformation at the MP2/cc-pVTZ57−60 level of theory using the ORCA 4.2.0 software package.61 The first set of QM calculations was in vacuum and the other set included the solvent reaction field potential to represent the surrounding explicit solvent molecules.
3. Fit the average electrostatic potential over all conformations to atomic charges, both in vacuum and in explicit solvent. All eight aromatic fluorinated amino acids were fit together using the fitq module of the mdgx program,53 with atomic charges of the Ace and Nme capping groups fixed to net neutral values during the process.
4. Average the vacuum-phase and solvent-phase point charges to obtain implicitly polarized atomic charges. This averaged charge set was used as the starting point for the next iteration of charge generation and fitting as described above.
The above procedure was repeated four times to reach convergence of the partial atomic charge values within 10% of the previous iteration’s values.
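The charge averaging of step 4 and the 10% convergence test can be illustrated with a minimal Python sketch. The function names, the per-atom reading of the 10% criterion, and the `abs_floor` guard for near-zero charges are assumptions made for illustration only; the actual fitting was carried out with mdgx, and the numbers passed in any example call would be invented values, not fitted charges.

```python
def ipolq_average(q_vacuum, q_solvent):
    """Average vacuum- and solvent-phase charges atom by atom (step 4)."""
    if len(q_vacuum) != len(q_solvent):
        raise ValueError("both charge sets must describe the same atoms")
    return [0.5 * (qv + qs) for qv, qs in zip(q_vacuum, q_solvent)]


def is_converged(q_new, q_old, rel_tol=0.10, abs_floor=1e-3):
    """Return True if every charge changed by at most rel_tol of its old value.

    abs_floor is a hypothetical guard that keeps the relative change
    finite for atoms whose charge is close to zero.
    """
    for new, old in zip(q_new, q_old):
        scale = max(abs(old), abs_floor)
        if abs(new - old) / scale > rel_tol:
            return False
    return True
```

In an iterative loop, `ipolq_average` would be applied to the two fitted charge sets of each round and `is_converged` would compare the result with the previous round's averaged set.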
III.II. Generation and Fitting of the Bonded Parameter Dataset
Bonded parameters were derived using an iterative, multistep procedure: (i) 1000 conformations of each fluorinated dipeptide were generated in vacuum, taking trial torsion and angle parameters from the parent ff15ipq force field, if available, or otherwise from the general AMBER force field 2 (GAFF2).62 Atomic charges were obtained from the vacuum-phase set of converged point charges as described above. Each conformation was generated by progressively restraining backbone Φ and Ψ torsion angles of the dipeptide between −180° and 180° using a force constant of 32 kcal/mol. After energy minimization, the Φ and Ψ distributions of the conformation set were evaluated to ensure extensive sampling of the configurational space. (ii) For each conformation, the quantum mechanical (QM) single-point energy was calculated at the MP2/cc-pVTZ level of theory, along with the molecular mechanical (MM) energy. Linear least-squares fitting for the entire set of bond, angle, and torsion parameters was carried out using mdgx.63 (iii) Force field bonded parameters such as torsional barrier heights, angle equilibria, angle stiffness, bond length equilibria, and bond stiffness were adjusted to minimize the error between the QM and MM energies. Using the updated parameter set, a second round of conformer generation and QM energy calculations was carried out in the absence of restraints to avoid becoming trapped in local minima. This new set of conformations excluded redundant conformations, defined as those with MM energies that differ by <0.01 kcal/mol. Another round of fitting was then performed using both sets of conformations and their respective energies to obtain the final parameter set for each iteration. Steps (i–iii) were repeated until the root-mean-square error (RMSE) between the QM and MM energies changed by less than 1% from the previous iteration.
The accuracy of the molecular mechanical energies (UMM) produced using the optimized parameters for each fluorinated dipeptide was assessed by comparing them to the respective QM energies (UQM) for all generated conformations (Figure 3). During each iteration of the above fitting procedure, optimization of the bonded parameters was monitored by the RMSE, which reached final values of 1.08–1.86 kcal/mol for all eight residues. On average, these errors are slightly higher than those of the nonfluorinated tryptophan (0.89 kcal/mol), tyrosine (0.87 kcal/mol), and phenylalanine (0.93 kcal/mol) counterparts from the ff15ipq force field2 for canonical residues. The difference in RMSE of the fluorinated residue from the canonical counterpart may be due to fewer parameters in the fitting procedure and/or the presence of electronegative fluorine atom(s).
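The bookkeeping around the fit (the RMSE between QM and MM energies, the removal of redundant conformations, and the 1% stopping rule) can be sketched in a few lines of Python. This is an illustrative sketch rather than the mdgx implementation: the function names are hypothetical, and removing each energy set's mean before comparison is an assumption reflecting that the two energy scales differ by an arbitrary constant.

```python
import math


def rmse(u_qm, u_mm):
    """RMSE (kcal/mol) between QM and MM energies after removing each mean.

    Subtracting the means is an assumption of this sketch; it accounts for
    the arbitrary offset between the QM and MM energy scales.
    """
    n = len(u_qm)
    mean_qm = sum(u_qm) / n
    mean_mm = sum(u_mm) / n
    sq = [((q - mean_qm) - (m - mean_mm)) ** 2 for q, m in zip(u_qm, u_mm)]
    return math.sqrt(sum(sq) / n)


def drop_redundant(conformers, tol=0.01):
    """Keep conformers whose MM energies differ by at least tol kcal/mol.

    conformers is a list of (label, mm_energy) pairs. After sorting by
    energy, comparing with the last kept entry is sufficient.
    """
    kept = []
    for label, energy in sorted(conformers, key=lambda c: c[1]):
        if not kept or energy - kept[-1][1] >= tol:
            kept.append((label, energy))
    return kept


def fit_converged(rmse_new, rmse_prev, rel_tol=0.01):
    """True when the RMSE changed by less than rel_tol between iterations."""
    return abs(rmse_new - rmse_prev) / rmse_prev < rel_tol
```

Sorting before filtering keeps the redundancy check linear in the number of conformers after the sort, which matters little for 1000 structures but keeps the logic easy to verify.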
III.III. Preparation of Validation Systems
Models of fluorinated and canonical peptides were built using Avogadro64 and tleap.63 To generate the 4, 5, 6, and 7F-Trp-modified protein systems, the atomic coordinates of cyclophilin A [CypA; Protein Data Bank65 (PDB) ID: 3K0N]66 were modified using Chimera,67 with fluorine atoms substituted individually for hydrogens in Trp121 at the respectively numbered positions on the indole ring. All C–F bonds were initially modeled using a 1.42 Å bond length. Each system was solvated in a truncated octahedral box of explicit SPC/Eb50 water molecules with a 10 Å clearance between the solute and the edge of the box for the peptides and a 12 Å clearance for the proteins. All systems with unpaired charges were neutralized by adding Na+ or Cl– ions, treated with Joung and Cheatham ion parameters.68 Protonation states for ionizable residues were adjusted to represent the major species present at pH 6.5 to match the experimental NMR conditions.69
III.IV. Umbrella Sampling of Fluorinated and Canonical Peptides
To validate the conformational preferences of each fluorinated residue, umbrella sampling simulations were carried out for systems that consisted of a fluorinated amino acid (Xaa) flanked on either side by an alanine and the Ace and Nme capping groups at the N- and C-terminal ends, respectively. These molecules (Ace-Ala-Xaa-Ala-Nme) are referred to as tetrapeptides hereafter. Ramachandran plots were generated for each tetrapeptide by calculating the potential of mean force as a function of the Φ and Ψ torsion angles of the fluorinated residue. Prior to umbrella sampling, each tetrapeptide was solvated and equilibrated as described for the unrestrained simulations (Section III.V), differing only in the duration of the final equilibration stage, which was 100 ps instead of 1 ns. Each window was then subjected to a 200 ps incrementally restrained equilibration prior to a 2 ns restrained simulation at constant temperature (25 °C) and pressure (1 atm). The Φ and Ψ torsions of the central residue were restrained using a harmonic penalty function with a force constant of 8 kcal/(mol rad2) for each window, with windows spaced at 10° intervals along each torsion angle. This restraint scheme resulted in a series of 1296 windows for each set of two torsions, amounting to 22.8 μs of aggregate simulation time for the eight fluorinated residue classes and 10,368 windows. From each set of 1296 windows, the unbiased potential of mean force was reconstructed using the weighted histogram analysis method (WHAM).70−72
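The window count and aggregate simulation time quoted above follow from simple arithmetic, which the short Python sketch below reproduces. The helper names are hypothetical, and the bias is written in the k·Δ² form without a factor of 1/2, which is an assumption about how the restraint energy is defined.

```python
import math
from itertools import product


def window_centers(step_deg=10):
    """Window centers along one torsion; 180 is omitted because it equals -180."""
    return list(range(-180, 180, step_deg))


def harmonic_bias(angle_deg, center_deg, k=8.0):
    """Bias energy in kcal/mol for k in kcal/(mol rad^2).

    The shortest periodic angular distance is used so that windows near
    +/-180 degrees are handled correctly.
    """
    delta = (angle_deg - center_deg + 180.0) % 360.0 - 180.0
    return k * math.radians(delta) ** 2


def aggregate_time_us(n_windows, n_systems, ns_per_window):
    """Total simulation time in microseconds."""
    return n_windows * n_systems * ns_per_window / 1000.0


centers = window_centers()
windows = list(product(centers, centers))  # (phi, psi) center pairs
# 200 ps restrained equilibration plus 2 ns of sampling per window
total_us = aggregate_time_us(len(windows), 8, 0.2 + 2.0)
```

With 36 centers per torsion there are 36 × 36 = 1296 windows per residue, and 1296 × 8 × 2.2 ns gives the 22.8 μs reported for the eight fluorinated residues.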
III.V. MD Simulations of Fluorinated and Wild-Type CypA
Simulations of the 4, 5, 6, 7F-Trp fluorinated and the wild-type CypA proteins were carried out using the GPU-accelerated pmemd module of the AMBER 18 software package,39,40,53 ff15ipq force field,2 and our new fluorinated amino acid parameters. Each system was initially subjected to energy minimization followed by a three-stage equilibration. In the first stage, a 20 ps simulation was carried out at constant volume and temperature (25 °C) in the presence of solute heavy-atom positional restraints using a harmonic potential with a force constant of 1 kcal/(mol Å2). In the second stage, a 1 ns simulation was carried out at constant temperature (25 °C) and pressure (1 atm) using the same harmonic positional restraints. Finally, an unrestrained 1 ns simulation was carried out before performing a 1 μs production simulation with both constant temperature (25 °C) and pressure (1 atm). Five production simulations were run for each CypA protein, yielding 25 μs of aggregate simulation time.
Temperatures were maintained using a Langevin thermostat with a frictional constant of 1 ps–1, while pressure was maintained using a Monte Carlo barostat with 100 fs between system volume changes. Van der Waals and short-range electrostatic interactions were truncated at 10 Å, while long-range electrostatic interactions were calculated using the particle mesh Ewald method.73 To enable a 2 fs time step, all CH and NH bonds were constrained to their equilibrium values using the SHAKE algorithm.74 Coordinates were saved every ps and analysis was performed in CPPTRAJ.75
III.VI. Backbone Conformations of Fluorinated Amino Acids in the PDB
A “ligand expo” search76 was carried out to determine whether each fluorinated amino acid of interest was present in PDB deposited structures. The corresponding three-letter identifiers in the PDB were as follows: 4FW, FTR, FT6, F7W, PFF, 55I, YOF, and F2Y, which are equivalent to the following identifiers in the current study: W4F, W5F, W6F, W7F, F4F, FTF, Y3F, and YDF (Figure 1), respectively. Each identifier was associated with at least one structure and was used to construct a query for structures that included the fluorinated residue as a part of a polymer chain, avoiding fluorinated ligand molecules. For structures determined by X-ray crystallography, only those with a resolution ≤ 2.5 Å were included, resulting in a total of 56 structures with at least one fluorinated residue each (Table S2). The backbone Φ and Ψ torsion angles were then calculated using a custom Python script, omitting the torsion angles belonging to fluorinated C-terminal and N-terminal residues.
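The Φ/Ψ calculation for interior residues can be sketched as follows. This is not the custom script used in the study; the data layout (a list of per-residue dictionaries holding N, CA, and C coordinates for consecutive residues of one chain) and the function names are assumptions chosen for illustration.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees for four points (IUPAC sign)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def backbone_torsions(residues):
    """Return {index: (phi, psi)} for interior residues only.

    Terminal residues are skipped because phi needs the preceding C atom
    and psi needs the following N atom.
    """
    torsions = {}
    for i in range(1, len(residues) - 1):
        prev, cur, nxt = residues[i - 1], residues[i], residues[i + 1]
        phi = dihedral(prev["C"], cur["N"], cur["CA"], cur["C"])
        psi = dihedral(cur["N"], cur["CA"], cur["C"], nxt["N"])
        torsions[i] = (phi, psi)
    return torsions
```

Using `atan2` rather than `acos` gives the sign of the angle directly and avoids numerical problems near 0° and 180°.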
III.VII. Calculation of 19F NMR Relaxation Rates
19F longitudinal (R1) and transverse relaxation rates (R2) were calculated from MD simulations for each of the four 19F-Trp CypA protein systems using eqs 1–4.5,77 R1 and R2 values for these fluorinated CypA protein variants have been previously determined experimentally using 19F NMR.69 Both types of relaxation rates are affected by dipole–dipole (DD) interactions and chemical shift anisotropy (CSA). The influence of DD interactions is described by eqs 1 and 2
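$$R_1^{DD} = \frac{1}{10}\,\frac{\gamma_F^2 \gamma_H^2 \hbar^2}{r_{FH}^6}\,\tau_c \left[ \frac{3}{1+\omega_F^2\tau_c^2} + \frac{1}{1+(\omega_F-\omega_H)^2\tau_c^2} + \frac{6}{1+(\omega_F+\omega_H)^2\tau_c^2} \right] \tag{1}$$

$$R_2^{DD} = \frac{1}{20}\,\frac{\gamma_F^2 \gamma_H^2 \hbar^2}{r_{FH}^6}\,\tau_c \left[ 4 + \frac{3}{1+\omega_F^2\tau_c^2} + \frac{6}{1+\omega_H^2\tau_c^2} + \frac{1}{1+(\omega_F-\omega_H)^2\tau_c^2} + \frac{6}{1+(\omega_F+\omega_H)^2\tau_c^2} \right] \tag{2}$$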
where γ is the gyromagnetic ratio of fluorine or hydrogen, ℏ is the reduced Planck constant, ω is the Larmor frequency of fluorine or hydrogen, and τc is the rotational correlation time.
For each frame in the MD simulations, the distance (rFH) between the fluorine atom and each hydrogen within a 3 Å radius of it was calculated and used in eqs 1 and 2 to calculate the relaxation rate contribution to fluorine from each nearby hydrogen. These contributions were summed to account for the influence of all surrounding hydrogen dipoles.
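The per-frame summation over nearby hydrogens can be sketched in Python as shown below. This is an illustrative sketch, not the fluorelax implementation. It assumes the standard heteronuclear dipolar expressions for isotropic tumbling, written in SI units so that a (μ0/4π)² prefactor appears explicitly. The static field `b0` is a required input that is not stated in the text, and the gyromagnetic ratio used for 19F is an approximate value.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752e8           # rad s^-1 T^-1, 1H
GAMMA_F = 2.518e8            # rad s^-1 T^-1, 19F (approximate)


def dd_rates(r_fh, tau_c, b0):
    """Dipolar R1 and R2 (s^-1) of 19F from one 1H at distance r_fh (m).

    tau_c is the rotational correlation time in seconds and b0 the static
    field in tesla (an input assumed by this sketch).
    """
    w_f, w_h = GAMMA_F * b0, GAMMA_H * b0
    d2 = (MU0_OVER_4PI * GAMMA_F * GAMMA_H * HBAR) ** 2 / r_fh ** 6

    def j(w):
        # Lorentzian term with the tau_c prefactor factored out
        return 1.0 / (1.0 + (w * tau_c) ** 2)

    r1 = 0.1 * d2 * tau_c * (3 * j(w_f) + j(w_f - w_h) + 6 * j(w_f + w_h))
    r2 = 0.05 * d2 * tau_c * (4 + 3 * j(w_f) + 6 * j(w_h)
                              + j(w_f - w_h) + 6 * j(w_f + w_h))
    return r1, r2


def frame_dd_rates(f_xyz, h_xyz_list, tau_c, b0, cutoff=3.0e-10):
    """Sum the dipolar contributions of all hydrogens within cutoff (m)."""
    r1_sum = r2_sum = 0.0
    for h_xyz in h_xyz_list:
        r = math.dist(f_xyz, h_xyz)
        if r <= cutoff:
            r1, r2 = dd_rates(r, tau_c, b0)
            r1_sum += r1
            r2_sum += r2
    return r1_sum, r2_sum
```

Because each contribution scales as r⁻⁶, hydrogens just inside the 3 Å cutoff contribute far less than those in close contact, which is why a short cutoff captures most of the dipolar relaxation.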
The influence of CSA effects is described by eqs 3 and 4
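$$R_1^{CSA} = \frac{3}{10}\,\omega_F^2\,\delta_\sigma^2 \left(1 + \frac{\eta^2}{3}\right) \frac{\tau_c}{1+\omega_F^2\tau_c^2} \tag{3}$$

$$R_2^{CSA} = \frac{1}{20}\,\omega_F^2\,\delta_\sigma^2 \left(1 + \frac{\eta^2}{3}\right) \left[ 4\tau_c + \frac{3\tau_c}{1+\omega_F^2\tau_c^2} \right] \tag{4}$$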
where δσ is the reduced anisotropy and η is the asymmetry parameter, as defined in the Haeberlen convention78 by the following equations
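$$|\delta_{zz} - \delta_{iso}| \geq |\delta_{xx} - \delta_{iso}| \geq |\delta_{yy} - \delta_{iso}| \tag{5}$$

$$\delta_{iso} = \frac{1}{3}\left(\delta_{xx} + \delta_{yy} + \delta_{zz}\right) \tag{6}$$

$$\delta_\sigma = \delta_{zz} - \delta_{iso} \tag{7}$$

$$\eta = \frac{\delta_{yy} - \delta_{xx}}{\delta_\sigma} \tag{8}$$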
The final relaxation rate values were then calculated as the sum of the individual DD and CSA-based components.
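The CSA contribution can be sketched in the same way. The Python below is an illustration rather than the fluorelax implementation: it assumes the Haeberlen ordering of the principal components (the component farthest from the isotropic value is zz, the closest is yy) and the standard CSA relaxation expressions for isotropic tumbling. The static field `b0`, the approximate 19F gyromagnetic ratio, and the function names are assumptions of the sketch.

```python
GAMMA_F = 2.518e8  # rad s^-1 T^-1, 19F (approximate)


def haeberlen(d11, d22, d33):
    """Return (isotropic shift, reduced anisotropy, asymmetry) in input units.

    The three principal components may be given in any order.
    """
    iso = (d11 + d22 + d33) / 3.0
    # Sort by distance from iso: smallest is yy, largest is zz
    yy, xx, zz = sorted((d11, d22, d33), key=lambda d: abs(d - iso))
    delta = zz - iso
    eta = (yy - xx) / delta if delta != 0.0 else 0.0
    return iso, delta, eta


def csa_rates(delta_ppm, eta, tau_c, b0):
    """CSA R1 and R2 (s^-1) for a reduced anisotropy given in ppm.

    tau_c is in seconds and b0 in tesla (an input assumed by this sketch).
    """
    w_f = GAMMA_F * b0
    c2 = (w_f * delta_ppm * 1.0e-6) ** 2 * (1.0 + eta ** 2 / 3.0)
    j = tau_c / (1.0 + (w_f * tau_c) ** 2)
    r1 = 0.3 * c2 * j
    r2 = 0.05 * c2 * (4.0 * tau_c + 3.0 * j)
    return r1, r2


def total_rates(dd, csa):
    """Sum the DD and CSA components; each argument is an (R1, R2) pair."""
    return dd[0] + csa[0], dd[1] + csa[1]
```

Because the CSA term grows with the square of the Larmor frequency, it dominates R2 at high field for slowly tumbling proteins, whereas its R1 contribution is suppressed by the spectral density at ωF.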
These equations assume the following approximations: (i) the protein tumbles isotropically with the rotational correlation time remaining constant, and the fluorinated side-chain motion is governed by the same overall protein rotational correlation time; (ii) fluorine–hydrogen distances close to the F atom remain constant and can be described by a single value; (iii) cross-correlation interactions79 are not accounted for, even though a 10–25% cross-term contribution to the total R2 relaxation rate may be possible;69 and (iv) the effects of chemical exchange on R2 are neither considered nor expected to affect the fluorine on the tryptophan indole ring.
For all CypA variants, a single rotational correlation time (τc) of 8.2 ns80 was used, along with the 19F CSA values from solid-state magic angle spinning (MAS) NMR experiments for 4, 5, 6, or 7-fluoro-tryptophans81 (Table S4) to calculate δσ and η. To sample a representative ensemble, only the last 800 ns of each trajectory, from each independent 1 μs production simulation, was used. All fluorine relaxation calculations were carried out using a custom-made and open-source Python program (https://github.com/chonglab-pitt/fluorelax).
IV. Results and Discussion
Our parameters for the eight fluorinated aromatic amino acids were derived using the IPolQ workflow (Figure 2). Briefly, a set of conformations for a capped dipeptide with the fluorinated residue of interest was generated, and electrostatic potentials of each conformation were calculated quantum mechanically, both in vacuum and in the presence of an explicit solvent. The resulting sets of vacuum- and solvent-phase atomic charges were then optimized to reproduce the respective electrostatic potential calculations, before being averaged to obtain an implicitly polarized charge set. Along with van der Waals interactions, these atomic charges determine the nonbonded contributions of the force field. With the vacuum-optimized charges, the bonded terms of the force field, such as bond angles and dihedrals, were fit to minimize the error between the molecular mechanical and the quantum mechanical energies of each conformation. Both the atomic partial charges and the bonded parameter derivation steps were repeated until they were self-consistent. To validate our force field parameters, we initially carried out peptide simulations to explore the changes in the conformational free-energy landscape of each fluorinated residue. We then performed protein-based simulations of 4, 5, 6, and 7F-Trp-substituted cyclophilin A (CypA) and compared our simulation results with those of the native CypA protein. Finally, we calculated NMR relaxation rates from our MD simulation data of CypA and compared these rates to the respective, experimentally determined 19F relaxation rates from NMR.
IV.I. Conformational Preferences of Individual Fluorinated Residues
Backbone conformational preferences for the blocked tetrapeptide systems (Ace-Ala-Xaa-Ala-Nme) were assessed using umbrella sampling simulations in which the central residue (Xaa) backbone dihedrals were progressively restrained (Figure 4). In umbrella sampling, a chosen reaction coordinate is initially divided into a series of windows or sections and a harmonic restraint is applied. In our case, this coordinate was the backbone Φ and Ψ dihedrals of the central residue. We chose 10° intervals from −180° to 180° using a force constant of 8 kcal/(mol rad2) to ensure that the reaction coordinate remained within the center of the window during MD simulations (see Methods Section III.IV). A series of histograms along the reaction coordinate was generated, and because these distributions were overlapping, our umbrella sampling parameters were determined to be exhaustive and allowed for the corresponding unbiased, free-energy landscape to be recovered using WHAM.70−72 Umbrella sampling was carried out for the fluorinated tetrapeptides as well as their nonfluorinated counterparts (Trp, Tyr, Phe), as depicted in the relative free-energy difference plots (Figure 4). In addition, we compared our simulation results to the experimentally observed conformations for each fluorinated variant extracted from 56 protein structures deposited in the PDB.
Our results show that most free-energy barriers between different secondary structures were either unperturbed or only slightly increased for the fluorinated residues relative to their nonfluorinated counterparts, indicating slightly less favorable sampling of those regions. An exception is the area between the polyproline II (Φ ≈ −70°, Ψ ≈ 140°) and left-handed α helical (Φ ≈ 60°, Ψ ≈ 40°) regions, which was more favorably sampled in all cases for fluorinated residues. For tyrosine and phenylalanine variants, we saw this favorable sampling amplified with more fluorine atoms present in the rings, such as with tyrosine possessing one or two fluorine atoms and phenylalanine with a single fluorine or the trifluoromethyl group. For tryptophan, our resulting energy landscapes produced a similar trend, except that the α helical (Φ ≈ −70°, Ψ ≈ −20°) and γ′ (Φ ≈ −80°, Ψ ≈ 60°) regions were more favorably sampled, with fluorination at the 6-position of the indole ring exhibiting the largest difference. All dihedral angles of fluorinated residues remained consistent with those experimentally observed in structures deposited in the PDB. The backbone Φ and Ψ torsion angle energies for fluorine-containing peptides are very similar to those of nonfluorinated peptides, with ΔΔG values ranging from −0.63 to 0.64 kcal/mol for the tryptophan variants, −0.55 to 0.65 kcal/mol for the tyrosine variants, and −0.53 to 0.88 kcal/mol for the phenylalanine variants (Table S3). The average ΔΔG values of our nonfluorinated versus fluorinated peptides are all close to zero, further supporting our conclusion that only minimal perturbations are induced upon fluorine substitution. The only exception is 4-fluoro-phenylalanine, for which an average ΔΔG value of 0.273 ± 0.186 kcal/mol was observed.
Overall, these findings are consistent with previous studies where fluorinated tryptophan,13,82 tyrosine,83 and phenylalanine9 substitutions introduced minimal to no differences in the global protein structure or the local dihedral angles around the fluorinated residue.
IV.II. Simulations of Fluorinated Cyclophilin A
To evaluate our force field parameters in the context of a protein, we carried out multiple μs time scale simulations for each of four variants of cyclophilin A (CypA), an 18.3 kDa peptidyl–prolyl isomerase that is a known host factor for HIV-1 infection.84 The variants are singly fluorinated at four different indole ring positions of Trp121, which is close to the active site of the protein. These fluorinated variants of CypA have been studied previously69 and serve as useful benchmark proteins for our simulation studies.
Our data show that the wild-type and the fluorinated CypA variants all remained stable over the course of multiple μs time scale simulations (Figure 5), and only small deviations are noted. For the wild-type CypA simulations, the average backbone RMSD value is 1.37 ± 0.37 Å (average ± one standard deviation), while the fluorinated Trp121 CypA variants exhibited average backbone RMSD values of 1.28 ± 0.23 Å (W4F CypA), 1.22 ± 0.27 Å (W5F CypA), 1.35 ± 0.31 Å (W6F CypA), and 1.21 ± 0.20 Å (W7F CypA). RMSD values for each individual 1 μs simulation as a function of simulation time are shown in Figure S2.
In addition to backbone RMSD values, we also assessed the dihedral angle propensities of our aggregate simulation data by generating Ramachandran (Φ and Ψ) and Janin (χ1 and χ2) plots. All dihedral probabilities for the fluorinated CypA proteins remained within the same distribution as that of wild-type CypA (Figure 6). Only for W6F were a slightly larger range of conformational sampling around the α helical region in the Ramachandran plot and a slightly restricted conformational sampling distribution in the Janin plots noted. These findings are similar to those with our peptide systems, where the W6F substitution also indicated a change in sampling of the free-energy landscape near the α helical region, compared to the other fluorinated tryptophan moieties (Figure 4).
When evaluating the backbone RMSD and secondary structure predictions on a per-residue basis (Figure S4), excellent stability was maintained throughout multiple simulation replicates. However, a sharp increase in RMSD and a corresponding change in secondary structure were noted for residues 146–153 and 102–107 during a fraction of both the wild-type and fluorinated CypA simulations. We isolated the characteristics of these increases in RMSD and found that they were largely driven by compensatory peptide-plane flips, which occur when changes in |Ψi| + |Φi+1| are large, while changes in |Ψi + Φi+1| are comparatively small.85 We assessed the per-residue torsion angles of each trajectory (Figure S5) and found that residues 146–153 are a part of a loop that undergoes a hinge motion that is associated with a peptide-plane flip of residues 149 and 150. Furthermore, residues 102–107 are a part of a β turn that becomes twisted throughout our simulations and is compensated by peptide-plane flips of residues 103/104 and 107/108 (Figure S6).
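One way to flag such compensatory flips between two frames is sketched below in Python. The sketch interprets the criterion as a large summed magnitude of the individual changes in Ψi and Φi+1 combined with a small net change in their sum; this interpretation, the two thresholds (`large_deg`, `small_deg`), and the function names are assumptions chosen for illustration and are not values taken from the analysis described above.

```python
def wrap(delta_deg):
    """Map an angular difference into the interval [-180, 180)."""
    return (delta_deg + 180.0) % 360.0 - 180.0


def is_compensatory_flip(psi_i, phi_next, psi_i_new, phi_next_new,
                         large_deg=180.0, small_deg=60.0):
    """Flag a peptide-plane flip between two frames (angles in degrees).

    psi_i and phi_next are the torsions flanking one peptide plane in the
    first frame; the *_new values are the same torsions in the second frame.
    Both thresholds are hypothetical.
    """
    d_psi = wrap(psi_i_new - psi_i)
    d_phi = wrap(phi_next_new - phi_next)
    individually_large = abs(d_psi) + abs(d_phi) > large_deg
    net_small = abs(wrap(d_psi + d_phi)) < small_deg
    return individually_large and net_small
```

Wrapping each difference before combining them keeps rotations that cross the ±180° boundary from being mistaken for large changes.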
IV.III. 19F Relaxation of CypA
In addition to structural properties, we also evaluated whether longitudinal and transverse fluorine NMR relaxation rates would be accessible from our aggregate CypA simulation data. NMR relaxation rates function as useful probes to study dynamics and depend on the properties of a nucleus, modulated by its local environment. Fluorine relaxation is governed by both dipole–dipole interactions with neighboring proton spins as well as chemical shift anisotropy (CSA).5 While dipole–dipole interactions and CSA affect both the longitudinal (R1) and transverse (R2) relaxation rates of the fluorine nucleus, dipole–dipole-based contributions dominate R1 and CSA dominates R2.
In a first approximation, we calculated R1 and R2 from our aggregate CypA simulation data using an overall rotational correlation time τc of 8.2 ns80 (Methods Section III.VII). For each frame of the MD ensemble, we estimated the dipole–dipole-based relaxation contributions from each hydrogen atom within 3 Å of the fluorine atom. The CSA-based relaxation contributions used previously measured chemical shift tensors from solid-state NMR84 (Table S4) and were kept constant for each frame of the MD ensemble. We compared these calculated rates to the measured experimental rates69 (Figure 7). Overall, our calculated relaxation rates from aggregate MD simulation data are in good agreement with the experimental values. The average calculated longitudinal rates are systematically somewhat smaller than the experimental values, although within a reasonable error (Table S5). W4F CypA exhibited a somewhat larger R1 value (∼2 s–1) than all others, which are similar and grouped around 1 s–1. The calculated transverse relaxation rates are in good agreement with experimental values for all CypA variants, except for W6F. They can be grouped into two sets comprising W5F and W6F CypA with R2 values around 60–80 s–1, and W4F and W7F CypA with R2 values around 110–120 s–1. Overall, the calculated relaxation values follow the same trends and agree well with the experimental 19F NMR relaxation rates.
At this juncture, it should be pointed out that our methodology for calculating relaxation rates from MD simulation data includes several key assumptions. In particular, our calculations assume that (i) the tryptophan side chain essentially moves like the overall protein, i.e., it does not exhibit internal motions in addition to the overall molecular tumbling of the protein; (ii) only hydrogens within a radius of 3 Å around the fluorine atom contribute to the relaxation rate; and (iii) any contribution from dipole–dipole and CSA cross-correlated relaxation79 can be neglected, since it may contribute only 10–25% to the total R2 relaxation rate.69 Furthermore, the experimental relaxation rates are extracted using single exponential fitting of the signal intensity decays, although multiexponential fitting may more accurately capture cross-correlation-induced relaxation.69,77,86 Despite these differences between simulation and experiment, as well as their respective assumptions, the simulation-based calculated rates agree surprisingly well with the experimental relaxation rates. In the future, more complex methods52,87 for fluorine relaxation rate calculations from MD simulations can be tested, which will help to dissect which components predominantly affect relaxation decay rates.
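The first-approximation calculation described in this section can be sketched in Python as follows. This is not the fluorelax implementation. It uses the standard textbook expressions for heteronuclear dipole–dipole and CSA relaxation of a rigid, isotropically tumbling molecule, with spectral density J(ω) = (2/5)τc/(1 + ω²τc²) and no cross-correlation. The function names, the static field strength (left as a parameter), and the CSA convention (Δσ = σzz − (σxx + σyy)/2 with asymmetry η) are assumptions made for illustration.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad s^-1 T^-1, 1H
GAMMA_F = 2.5181e8           # rad s^-1 T^-1, 19F (approximate)


def spectral_density(omega, tau_c):
    """Rigid isotropic tumbling: J(w) = (2/5) tau_c / (1 + (w tau_c)^2)."""
    return 0.4 * tau_c / (1.0 + (omega * tau_c) ** 2)


def dipolar_rates(distance_angstrom, b0_tesla, tau_c):
    """R1 and R2 (s^-1) of 19F from one 1H at the given distance."""
    r = distance_angstrom * 1.0e-10
    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_F / r ** 3) ** 2
    w_f, w_h = GAMMA_F * b0_tesla, GAMMA_H * b0_tesla
    j0 = spectral_density(0.0, tau_c)
    j_f = spectral_density(w_f, tau_c)
    j_h = spectral_density(w_h, tau_c)
    j_diff = spectral_density(w_h - w_f, tau_c)
    j_sum = spectral_density(w_h + w_f, tau_c)
    r1 = d2 / 4.0 * (j_diff + 3.0 * j_f + 6.0 * j_sum)
    r2 = d2 / 8.0 * (4.0 * j0 + j_diff + 3.0 * j_f + 6.0 * j_h + 6.0 * j_sum)
    return r1, r2


def csa_rates(delta_sigma_ppm, eta, b0_tesla, tau_c):
    """R1 and R2 (s^-1) of 19F from chemical shift anisotropy."""
    w_f = GAMMA_F * b0_tesla
    c2 = (w_f * delta_sigma_ppm * 1.0e-6) ** 2 * (1.0 + eta ** 2 / 3.0) / 3.0
    j0 = spectral_density(0.0, tau_c)
    j_f = spectral_density(w_f, tau_c)
    return c2 * j_f, c2 / 6.0 * (4.0 * j0 + 3.0 * j_f)


def fluorine_rates(fh_distances_angstrom, delta_sigma_ppm, eta,
                   b0_tesla, tau_c, cutoff_angstrom=3.0):
    """Total R1 and R2 for one frame: constant CSA term plus the dipolar
    terms from every hydrogen within the cutoff of the fluorine atom."""
    r1, r2 = csa_rates(delta_sigma_ppm, eta, b0_tesla, tau_c)
    for distance in fh_distances_angstrom:
        if distance <= cutoff_angstrom:
            dd1, dd2 = dipolar_rates(distance, b0_tesla, tau_c)
            r1 += dd1
            r2 += dd2
    return r1, r2
```

Averaging the per-frame output of `fluorine_rates` over the MD ensemble would give the aggregate rates. The CSA inputs should come from measured tensors such as those in Table S4; none are hard-coded here. Because the dipolar term scales as r⁻⁶, the result is sensitive to the closest F–H contacts, which is one reason a short distance cutoff is a reasonable first approximation.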
V. Conclusions
We have reported the development and validation of force field parameters for a set of eight fluorinated, aromatic amino acids that are commonly used for 19F NMR, for use with the AMBER ff15ipq protein force field. Our parameters include 181 implicitly polarized atomic charges and 9 unique bonded terms for 4-, 5-, 6-, and 7-fluoro-tryptophan; 3- and 3,5-fluoro-tyrosine; as well as 4-fluoro- and 4-trifluoromethyl-phenylalanine. We validated that our new parameters maintain the expected conformational propensities of the fluorinated amino acids, consistent with both the respective canonical residues and the experimental propensities extracted from X-ray structures deposited in the PDB. Fluorinated amino acid-containing proteins, such as CypA, maintain the overall globular protein fold over multiple μs time scale simulations, and the extracted 19F NMR relaxation rates are in good agreement with the corresponding experimental rates.
Overall, our results demonstrate the robustness of the “sweeping optimization” approach using the mdgx program of the AmberTools20 distribution63 and the power of the IPolQ lineage of AMBER force fields for modeling fluorinated proteins. Our workflow is readily applicable to other residue classes33 and can be easily expanded to include other fluorinated amino acids, if so desired. On the practical side, our force field parameters have numerous potential applications, particularly for use with complementary 19F NMR studies and when considering structural ensembles. Since 19F NMR data can be used to guide MD simulations and vice versa, our parameters will aid in deriving an integrated, all-atom view of any fluorinated protein. Our parameters extend the macromolecular chemical space available to the AMBER ff15ipq force field and bridge the gap between computation and experiment for the collaborative study of fluorinated proteins at the atomic level.
Similar to the IPolQ methodology, the Force Balance approach and the corresponding AMBER-FB15 force field88 perform sweeping optimization of hundreds of parameters simultaneously and include nonlinear optimization methods, which can incorporate alternative datasets, such as in vitro experiments, directly into the parameter optimization process. The FB15 force field has also recently been expanded to include parameters for phosphorylated amino acids,89 but at this time, it does not have parameters for halogenated residues such as the fluorinated amino acids presented in this work. In contrast to the IPolQ workflow, which is fully physics-based, the burgeoning development of machine learning-based force fields90 is promising but is heavily dependent on the training dataset being used. Thus, while physics-based approaches are more generalizable, deep learning methods can be more specialized for specific atoms or dependent on dataset properties. The goal of such machine learning-based approaches is to narrow the gap between the efficiency of classical models and the accuracy of ab initio methods, by training a machine learning model to predict molecular potential energies, usually using quantum mechanical datasets. Examples of these approaches include the ANAKIN-ME91,92 neural network potential and the OrbNet93 method, which, once fully trained, can accurately predict the energetic properties of organic molecules within chemical accuracy of ab initio density functional theory calculations but at a fraction of the computational cost. In the future, machine learning-based methods may be readily paired with the IPolQ method to further enhance the efficiency of the parameter fitting and optimization process.
Acknowledgments
This work was supported by the NIH Pittsburgh AIDS Research Training (PART) program grant T32AI065380 to D.T.Y., NIH grant P50AI150481 and NSF grant CHE-1708773 to A.M.G., and NIH grant R01GM115805 and NSF grant CHE-1807301 to L.T.C. Computational resources were provided by the University of Pittsburgh’s Center for Research Computing and through NSF XSEDE grant TG-BIO210161 to D.T.Y. The authors thank David Case, Rieko Ishima, Manman Lu, and Anthony Bogetti for helpful discussions.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpca.2c00255.
Atom types for fluorinated amino acids and further analysis of CypA simulations: individual RMSD plots, carbonyl to backbone distances, secondary structure, per-residue RMSD, and per-residue backbone dihedral angles; tables include individual force field bonded parameters, PDB identifier codes for each fluorinated amino acid, relative free-energy differences between blocked tetrapeptides, CSA tensors used for relaxation calculations, and comparison of average relaxation rates to experimental values (PDF)
The authors declare the following competing financial interest(s): L.T.C. is a current member of the Scientific Advisory Board of OpenEye Scientific and an Open Science Fellow with Roivant Sciences.
Notes
Input files, analysis scripts, and the force field parameter files are available from https://github.com/chonglab-pitt/ff15ipq-19F. The Python program for fluorine relaxation calculations is open source and available from https://github.com/chonglab-pitt/fluorelax.
References
- Gronenborn A. M. Small, but Powerful and Attractive: 19F in Biomolecular NMR. Structure 2022, 30, 6–14. 10.1016/j.str.2021.09.009.
- Debiec K. T.; Cerutti D. S.; Baker L. R.; Gronenborn A. M.; Case D. A.; Chong L. T. Further along the Road Less Traveled: AMBER ff15ipq, an Original Protein Force Field Built on a Self-Consistent Physical Model. J. Chem. Theory Comput. 2016, 12, 3926–3947. 10.1021/acs.jctc.6b00567.
- Friedrichs M. S.; Eastman P.; Vaidyanathan V.; Houston M.; Legrand S.; Beberg A. L.; Ensign D. L.; Bruns C. M.; Pande V. S. Accelerating Molecular Dynamic Simulation on Graphics Processing Units. J. Comput. Chem. 2009, 30, 864–872. 10.1002/jcc.21209.
- Stone J. E.; Hardy D. J.; Ufimtsev I. S.; Schulten K. GPU-Accelerated Molecular Modeling Coming of Age. J. Mol. Graphics Modell. 2010, 29, 116–125. 10.1016/j.jmgm.2010.06.010.
- Gerig J. T. Fluorine NMR of Proteins. Prog. Nucl. Magn. Reson. Spectrosc. 1994, 26, 293–370. 10.1016/0079-6565(94)80009-X.
- Luck L. A.; Falke J. J. Fluorine-19 NMR Studies of the D-Galactose Chemosensory Receptor. 1. Sugar Binding Yields a Global Structural Change. Biochemistry 1991, 30, 4248–4256. 10.1021/bi00231a021.
- Sykes B. D.; Hull W. E. [13] Fluorine Nuclear Magnetic Resonance Studies of Proteins. Methods Enzymol. 1978, 49, 270–295. 10.1016/S0076-6879(78)49015-9.
- Sharaf N. G.; Gronenborn A. M. 19F-Modified Proteins and 19F-Containing Ligands as Tools in Solution NMR Studies of Protein Interactions. Methods Enzymol. 2015, 565, 67–95. 10.1016/bs.mie.2015.05.014.
- Welte H.; Zhou T.; Mihajlenko X.; Mayans O.; Kovermann M. What Does Fluorine Do to a Protein? Thermodynamic, and Highly-Resolved Structural Insights into Fluorine-Labelled Variants of the Cold Shock Protein. Sci. Rep. 2020, 10, 2640. 10.1038/s41598-020-59446-w.
- Stadmiller S. S.; Aguilar J. S.; Waudby C. A.; Pielak G. J. Rapid Quantification of Protein-Ligand Binding via 19F NMR Lineshape Analysis. Biophys. J. 2020, 118, 2537–2548. 10.1016/j.bpj.2020.03.031.
- Sharaf N. G.; Ishima R.; Gronenborn A. M. Conformational Plasticity of the NNRTI-Binding Pocket in HIV-1 Reverse Transcriptase: A Fluorine Nuclear Magnetic Resonance Study. Biochemistry 2016, 55, 3864–3873. 10.1021/acs.biochem.6b00113.
- Danielson M. A.; Falke J. J. Use of 19F NMR to Probe Protein Structure and Conformational Changes. Annu. Rev. Biophys. Biomol. Struct. 1996, 25, 163–195. 10.1146/annurev.bb.25.060196.001115.
- Campos-Olivas R.; Aziz R.; Helms G. L.; Evans J. N. S.; Gronenborn A. M. Placement of 19F into the Center of GB1: Effects on Structure and Stability. FEBS Lett. 2002, 517, 55–60. 10.1016/S0014-5793(02)02577-2.
- Peng J. W. Cross-Correlated 19F Relaxation Measurements for the Study of Fluorinated Ligand-Receptor Interactions. J. Magn. Reson. 2001, 153, 32–47. 10.1006/jmre.2001.2422.
- Marsh E. N. G.; Suzuki Y. Using 19F NMR to Probe Biological Interactions of Proteins and Peptides. ACS Chem. Biol. 2014, 9, 1242–1250. 10.1021/cb500111u.
- Lu M.; Wang M.; Sergeyev I. V.; Quinn C. M.; Struppe J.; Rosay M.; Maas W.; Gronenborn A. M.; Polenova T. 19F Dynamic Nuclear Polarization at Fast Magic Angle Spinning for NMR of HIV-1 Capsid Protein Assemblies. J. Am. Chem. Soc. 2019, 141, 5681–5691. 10.1021/jacs.8b09216.
- Campomizzi C. S.; Ghanatios G. E.; Estrada D. F. 19F-NMR Reveals Substrate Specificity of CYP121A1 in Mycobacterium tuberculosis. J. Biol. Chem. 2021, 297, 101287. 10.1016/j.jbc.2021.101287.
- Didenko T.; Liu J. J.; Horst R.; Stevens R. C.; Wüthrich K. Fluorine-19 NMR of Integral Membrane Proteins Illustrated with Studies of GPCRs. Curr. Opin. Struct. Biol. 2013, 23, 740–747. 10.1016/j.sbi.2013.07.011.
- Loewen M. C.; Klein-Seetharaman J.; Getmanova E. V.; Reeves P. J.; Schwalbe H.; Khorana H. G. Solution 19F Nuclear Overhauser Effects in Structural Studies of the Cytoplasmic Domain of Mammalian Rhodopsin. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 4888–4892. 10.1073/pnas.051633098.
- Boeszoermenyi A.; Chhabra S.; Dubey A.; Radeva D. L.; Burdzhiev N. T.; Chanev C. D.; Petrov O. I.; Gelev V. M.; Zhang M.; Anklin C.; et al. Aromatic 19F-13C TROSY: A Background-Free Approach to Probe Biomolecular Structure, Function, and Dynamics. Nat. Methods 2019, 16, 333–340. 10.1038/s41592-019-0334-x.
- Becette O. B.; Zong G.; Chen B.; Taiwo K. M.; Case D. A.; Dayie T. K. Solution NMR Readily Reveals Distinct Structural Folds and Interactions in Doubly 13C- and 19F-Labeled RNAs. Sci. Adv. 2020, 6, eabc6572. 10.1126/sciadv.abc6572.
- Matei E.; Gronenborn A. M. 19F Paramagnetic Relaxation Enhancement: A Valuable Tool for Distance Measurements in Proteins. Angew. Chem. 2016, 128, 158–162. 10.1002/ange.201508464.
- Buchholz C. R.; Pomerantz W. C. K. 19F NMR Viewed through Two Different Lenses: Ligand-Observed and Protein-Observed 19F NMR Applications for Fragment-Based Drug Discovery. RSC Chem. Biol. 2021, 2, 1312–1330. 10.1039/D1CB00085C.
- Dalvit C.; Vulpetti A. Ligand-Based Fluorine NMR Screening: Principles and Applications in Drug Discovery Projects. J. Med. Chem. 2019, 62, 2218–2244. 10.1021/acs.jmedchem.8b01210.
- Li C.; Wang G. F.; Wang Y.; Creager-Allen R.; Lutz E. A.; Scronce H.; Slade K. M.; Ruf R. A. S.; Mehl R. A.; Pielak G. J. Protein 19F NMR in Escherichia coli. J. Am. Chem. Soc. 2010, 132, 321–327. 10.1021/ja907966n.
- Ye Y.; Liu X.; Xu G.; Liu M.; Li C. Direct Observation of Ca2+-Induced Calmodulin Conformational Transitions in Intact Xenopus laevis Oocytes by 19F NMR Spectroscopy. Angew. Chem. 2015, 127, 5418–5420. 10.1002/ange.201500261.
- Veronesi M.; Giacomina F.; Romeo E.; Castellani B.; Ottonello G.; Lambruschini C.; Garau G.; Scarpelli R.; Bandiera T.; Piomelli D.; et al. Fluorine Nuclear Magnetic Resonance-Based Assay in Living Mammalian Cells. Anal. Biochem. 2016, 495, 52–59. 10.1016/j.ab.2015.11.015.
- Sun X.; Dyson H. J.; Wright P. E. Fluorotryptophan Incorporation Modulates the Structure and Stability of Transthyretin in a Site-Specific Manner. Biochemistry 2017, 56, 5570–5581. 10.1021/acs.biochem.7b00815.
- Crowley P. B.; Kyne C.; Monteith W. B. Simple and Inexpensive Incorporation of 19F-Tryptophan for Protein NMR Spectroscopy. Chem. Commun. 2012, 48, 10681–10683. 10.1039/c2cc35347d.
- Arntson K. E.; Pomerantz W. C. K. Protein-Observed Fluorine NMR: A Bioorthogonal Approach for Small Molecule Discovery. J. Med. Chem. 2016, 59, 5158–5171. 10.1021/acs.jmedchem.5b01447.
- Shu Q.; Frieden C. Urea-Dependent Unfolding of Murine Adenosine Deaminase: Sequential Destabilization As Measured by 19F NMR. Biochemistry 2004, 43, 1432–1439. 10.1021/bi035651x.
- Frieden C.; Hoeltzli S. D.; Bann J. G. The Preparation of 19F-Labeled Proteins for NMR Studies. Methods Enzymol. 2004, 380, 400–415. 10.1016/s0076-6879(04)80018-1.
- Bogetti A. T.; Piston H. E.; Leung J. M. G.; Cabalteja C. C.; Yang D. T.; Degrave A. J.; Debiec K. T.; Cerutti D. S.; Case D. A.; Horne W. S.; et al. A Twist in the Road Less Traveled: The AMBER ff15ipq-m Force Field for Protein Mimetics. J. Chem. Phys. 2020, 153, 064101. 10.1063/5.0019054.
- Debiec K. T.; Gronenborn A. M.; Chong L. T. Evaluating the Strength of Salt Bridges: A Comparison of Current Biomolecular Force Fields. J. Phys. Chem. B 2014, 118, 6561–6569. 10.1021/jp500958r.
- Lin F. Y.; Huang J.; Pandey P.; Rupakheti C.; Li J.; Roux B.; Mackerell A. D. Further Optimization and Validation of the Classical Drude Polarizable Protein Force Field. J. Chem. Theory Comput. 2020, 16, 3221–3239. 10.1021/acs.jctc.0c00057.
- Shi Y.; Xia Z.; Zhang J.; Best R.; Wu C.; Ponder J. W.; Ren P. Polarizable Atomic Multipole-Based AMOEBA Force Field for Proteins. J. Chem. Theory Comput. 2013, 9, 4046–4063. 10.1021/ct4003702.
- Koes D. R.; Vries J. K. Evaluating Amber Force Fields Using Computed NMR Chemical Shifts. Proteins 2017, 85, 1944–1956. 10.1002/prot.25350.
- Hoffmann F.; Mulder F. A. A.; Schäfer L. V. Predicting NMR Relaxation of Proteins from Molecular Dynamics Simulations with Accurate Methyl Rotation Barriers. J. Chem. Phys. 2020, 152, 084102. 10.1063/1.5135379.
- Götz A. W.; Williamson M. J.; Xu D.; Poole D.; Le Grand S.; Walker R. C. Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 1. Generalized Born. J. Chem. Theory Comput. 2012, 8, 1542–1555. 10.1021/ct200909j.
- Salomon-Ferrer R.; Götz A. W.; Poole D.; Le Grand S.; Walker R. C. Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 2. Explicit Solvent Particle Mesh Ewald. J. Chem. Theory Comput. 2013, 9, 3878–3888. 10.1021/ct400314y.
- Croitoru A.; Park S. J.; Kumar A.; Lee J.; Im W.; Mackerell A. D.; Aleksandrov A. Additive CHARMM36 Force Field for Nonstandard Amino Acids. J. Chem. Theory Comput. 2021, 17, 3554–3570. 10.1021/acs.jctc.1c00254.
- Robalo J. R.; Vila Verde A. Unexpected Trends in the Hydrophobicity of Fluorinated Amino Acids Reflect Competing Changes in Polarity and Conformation. Phys. Chem. Chem. Phys. 2019, 21, 2029–2038. 10.1039/C8CP07025C.
- Robalo J. R.; Huhmann S.; Koksch B.; Vila Verde A. The Multiple Origins of the Hydrophobicity of Fluorinated Apolar Amino Acids. Chem 2017, 3, 881–897. 10.1016/j.chempr.2017.09.012.
- Wang X.; Li W. Development and Testing of Force Field Parameters for Phenylalanine and Tyrosine Derivatives. Front. Mol. Biosci. 2020, 7, 350. 10.3389/fmolb.2020.608931.
- Tobola F.; Lelimousin M.; Varrot A.; Gillon E.; Darnhofer B.; Blixt O.; Birner-Gruenberger R.; Imberty A.; Wiltschi B. Effect of Noncanonical Amino Acids on Protein-Carbohydrate Interactions: Structure, Dynamics, and Carbohydrate Affinity of a Lectin Engineered with Fluorinated Tryptophan Analogs. ACS Chem. Biol. 2018, 13, 2211–2219. 10.1021/acschembio.8b00377.
- Panek J. J.; Ward T. R.; Jezierska A.; Novič M. Effects of Tryptophan Residue Fluorination on Streptavidin Stability and Biotin-Streptavidin Interactions via Molecular Dynamics Simulations. J. Mol. Model. 2009, 15, 257–266. 10.1007/s00894-008-0382-0.
- Rashid S.; Lee B. L.; Wajda B.; Spyracopoulos L. Side-Chain Dynamics of the Trifluoroacetone Cysteine Derivative Characterized by 19F NMR Relaxation and Molecular Dynamics Simulations. J. Phys. Chem. B 2019, 123, 3665–3671. 10.1021/acs.jpcb.9b01741.
- Cerutti D. S.; Swope W. C.; Rice J. E.; Case D. A. ff14ipq: A Self-Consistent Force Field for Condensed-Phase Simulations of Proteins. J. Chem. Theory Comput. 2014, 10, 4515–4534. 10.1021/ct500643c.
- Cerutti D. S.; Rice J. E.; Swope W. C.; Case D. A. Derivation of Fixed Partial Charges for Amino Acids Accommodating a Specific Water Model and Implicit Polarization. J. Phys. Chem. B 2013, 117, 2328–2338. 10.1021/jp311851r.
- Takemura K.; Kitao A. Water Model Tuning for Improved Reproduction of Rotational Diffusion and NMR Spectral Density. J. Phys. Chem. B 2012, 116, 6279–6287. 10.1021/jp301100g.
- Hoffmann F.; Xue M.; Schäfer L. V.; Mulder F. A. A. Narrowing the Gap between Experimental and Computational Determination of Methyl Group Dynamics in Proteins. Phys. Chem. Chem. Phys. 2018, 20, 24577–24590. 10.1039/C8CP03915A.
- Hoffmann F.; Mulder F. A. A.; Schäfer L. V. Accurate Methyl Group Dynamics in Protein Simulations with AMBER Force Fields. J. Phys. Chem. B 2018, 122, 5038–5048. 10.1021/acs.jpcb.8b02769.
- Case D. A.; Cheatham T. E.; Darden T.; Gohlke H.; Luo R.; Merz K. M.; Onufriev A.; Simmerling C.; Wang B.; Woods R. J. The Amber Biomolecular Simulation Programs. J. Comput. Chem. 2005, 26, 1668–1688. 10.1002/jcc.20290.
- Eastman P.; Swails J.; Chodera J. D.; McGibbon R. T.; Zhao Y.; Beauchamp K. A.; Wang L. P.; Simmonett A. C.; Harrigan M. P.; Stern C. D.; et al. OpenMM 7: Rapid Development of High Performance Algorithms for Molecular Dynamics. PLoS Comput. Biol. 2017, 13, e1005659. 10.1371/journal.pcbi.1005659.
- Shirts M. R.; Klein C.; Swails J. M.; Yin J.; Gilson M. K.; Mobley D. L.; Case D. A.; Zhong E. D. Lessons Learned from Comparing Molecular Dynamics Engines on the SAMPL5 Dataset. J. Comput.-Aided Mol. Des. 2017, 31, 147–161. 10.1007/s10822-016-9977-1.
- Jakalian A.; Jack D. B.; Bayly C. I. Fast, Efficient Generation of High-Quality Atomic Charges. AM1-BCC Model: II. Parameterization and Validation. J. Comput. Chem. 2002, 23, 1623–1641. 10.1002/jcc.10128.
- Dunning T. H.; Peterson K. A.; Wilson A. K. Gaussian Basis Sets for Use in Correlated Molecular Calculations. X. The Atoms Aluminum through Argon Revisited. J. Chem. Phys. 2001, 114, 9244–9253. 10.1063/1.1367373.
- Woon D. E.; Dunning T. H. Gaussian Basis Sets for Use in Correlated Molecular Calculations. III. The Atoms Aluminum through Argon. J. Chem. Phys. 1993, 98, 1358–1371. 10.1063/1.464303.
- Dunning T. H. Gaussian Basis Sets for Use in Correlated Molecular Calculations. I. The Atoms Boron through Neon and Hydrogen. J. Chem. Phys. 1989, 90, 1007–1023. 10.1063/1.456153.
- Kendall R. A.; Dunning T. H.; Harrison R. J. Electron Affinities of the First-Row Atoms Revisited. Systematic Basis Sets and Wave Functions. J. Chem. Phys. 1992, 96, 6796–6806. 10.1063/1.462569.
- Neese F.; Wennmohs F.; Becker U.; Riplinger C. The ORCA Quantum Chemistry Program Package. J. Chem. Phys. 2020, 152, 224108. 10.1063/5.0004608.
- He X.; Man V. H.; Yang W.; Lee T. S.; Wang J. A Fast and High-Quality Charge Model for the next Generation General AMBER Force Field. J. Chem. Phys. 2020, 153, 114502. 10.1063/5.0019056.
- Case D. A.; Ben-Shalom I. Y.; Brozell S. R.; Cerutti D. S.; Cheatham T. E. III; Cruzeiro V. W. D.; Darden T. A.; Duke R. E.; Ghoreishi D.; Gilson M. K.; et al. AMBER 18; University of California: San Francisco, 2018.
- Hanwell M. D.; Curtis D. E.; Lonie D. C.; Vandermeersch T.; Zurek E.; Hutchison G. R. Avogadro: An Advanced Semantic Chemical Editor, Visualization, and Analysis Platform. J. Cheminf. 2012, 4, 17. 10.1186/1758-2946-4-17.
- Berman H. M.; Westbrook J.; Feng Z.; Gilliland G.; Bhat T. N.; Weissig H.; Shindyalov I. N.; Bourne P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242. 10.1093/nar/28.1.235.
- Fraser J. S.; Clarkson M. W.; Degnan S. C.; Erion R.; Kern D.; Alber T. Hidden Alternative Structures of Proline Isomerase Essential for Catalysis. Nature 2009, 462, 669–673. 10.1038/nature08615.
- Pettersen E. F.; Goddard T. D.; Huang C. C.; Couch G. S.; Greenblatt D. M.; Meng E. C.; Ferrin T. E. UCSF Chimera—A Visualization System for Exploratory Research and Analysis. J. Comput. Chem. 2004, 25, 1605–1612. 10.1002/jcc.20084.
- Joung I. S.; Cheatham T. E. Determination of Alkali and Halide Monovalent Ion Parameters for Use in Explicitly Solvated Biomolecular Simulations. J. Phys. Chem. B 2008, 112, 9020–9041. 10.1021/jp8001614.
- Lu M.; Ishima R.; Polenova T.; Gronenborn A. M. 19F NMR Relaxation Studies of Fluorosubstituted Tryptophans. J. Biomol. NMR 2019, 73, 401–409. 10.1007/s10858-019-00268-y.
- Kumar S.; Rosenberg J. M.; Bouzida D.; Swendsen R. H.; Kollman P. A. Multi Dimensional Free-Energy Calculations Using the Weighted Histogram Analysis Method. J. Comput. Chem. 1995, 16, 1339–1350. 10.1002/jcc.540161104.
- Torrie G. M.; Valleau J. P. Nonphysical Sampling Distributions in Monte Carlo Free-Energy Estimation: Umbrella Sampling. J. Comput. Phys. 1977, 23, 187–199. 10.1016/0021-9991(77)90121-8.
- Grossfield A. WHAM: The Weighted Histogram Analysis Method; Scientific Research Publishing Inc.: Rochester, NY, 2006.
- Essmann U.; Perera L.; Berkowitz M. L.; Darden T.; Lee H.; Pedersen L. G. A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103, 8577–8593. 10.1063/1.470117.
- Ryckaert J.-P.; Ciccotti G.; Berendsen H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys. 1977, 23, 321–341. 10.1016/0021-9991(77)90098-5.
- Roe D. R.; Cheatham T. E. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J. Chem. Theory Comput. 2013, 9, 3084–3095. 10.1021/ct400341p.
- Feng Z.; Chen L.; Maddula H.; Akcan O.; Oughtred R.; Berman H. M.; Westbrook J. Ligand Depot: A Data Warehouse for Ligands Bound to Macromolecules. Bioinformatics 2004, 20, 2153–2155. 10.1093/bioinformatics/bth214.
- Goldman M. Interference Effects in the Relaxation of a Pair of Unlike Spin-1/2 Nuclei. J. Magn. Reson. 1984, 60, 437–452. 10.1016/0022-2364(84)90055-6.
- Haeberlen U. High Resolution NMR in Solids Selective Averaging; Elsevier, 1976.
- Grace R. C. R.; Kumar A. Observation of Cross Correlations in a Weakly Coupled 19F-1H Four-Spin System. J. Magn. Reson., Ser. A 1995, 115, 87–93. 10.1006/jmra.1995.1151.
- Ottiger M.; Zerbe O.; Güntert P.; Wüthrich K. The NMR Solution Conformation of Unligated Human Cyclophilin A. J. Mol. Biol. 1997, 272, 64–81. 10.1006/jmbi.1997.1220.
- Lu M.; Sarkar S.; Wang M.; Kraus J.; Fritz M.; Quinn C. M.; Bai S.; Holmes S. T.; Dybowski C.; Yap G. P. A.; et al. 19F Magic Angle Spinning NMR Spectroscopy and Density Functional Theory Calculations of Fluorosubstituted Tryptophans: Integrating Experiment and Theory for Accurate Determination of Chemical Shift Tensors. J. Phys. Chem. B 2018, 122, 6148–6155. 10.1021/acs.jpcb.8b00377.
- Lau E. Y.; Gerig J. T. Effects of Fluorine Substitution on the Structure and Dynamics of Complexes of Dihydrofolate Reductase (Escherichia coli). Biophys. J. 1997, 73, 1579–1592. 10.1016/S0006-3495(97)78190-6.
- Ayala I.; Perry J. J. P.; Szczepanski J.; Tainer J. A.; Vala M. T.; Nick H. S.; Silverman D. N. Hydrogen Bonding in Human Manganese Superoxide Dismutase Containing 3-Fluorotyrosine. Biophys. J. 2005, 89, 4171–4179. 10.1529/biophysj.105.060616.
- Colgan J.; Yuan H.; Franke E. K.; Luban J. Binding of the Human Immunodeficiency Virus Type 1 Gag Polyprotein to Cyclophilin A Is Mediated by the Central Region of Capsid and Requires Gag Dimerization. J. Virol. 1996, 70, 4299–4310. 10.1128/jvi.70.7.4299-4310.1996.
- Hayward S. Peptide-Plane Flipping in Proteins. Protein Sci. 2001, 10, 2219–2227. 10.1110/ps.23101.
- Kay L. E.; Nicholson L. K.; Delaglio F.; Bax A.; Torchia D. A. Pulse Sequences for Removal of the Effects of Cross Correlation between Dipolar and Chemical-Shift Anisotropy Relaxation Mechanisms on the Measurement of Heteronuclear T1 and T2 Values in Proteins. J. Magn. Reson. 1992, 97, 359–375. 10.1016/0022-2364(92)90320-7.
- Lipari G.; Szabo A. Model-Free Approach to the Interpretation of Nuclear Magnetic Resonance Relaxation in Macromolecules. 1. Theory and Range of Validity. J. Am. Chem. Soc. 1982, 104, 4546–4559. 10.1021/ja00381a009.
- Wang L. P.; McKiernan K. A.; Gomes J.; Beauchamp K. A.; Head-Gordon T.; Rice J. E.; Swope W. C.; Martínez T. J.; Pande V. S. Building a More Predictive Protein Force Field: A Systematic and Reproducible Route to AMBER-FB15. J. Phys. Chem. B 2017, 121, 4023–4039. 10.1021/acs.jpcb.7b02320.
- Stoppelman J. P.; Ng T. T.; Nerenberg P. S.; Wang L. P. Development and Validation of AMBER-FB15-Compatible Force Field Parameters for Phosphorylated Amino Acids. J. Phys. Chem. B 2021, 125, 11927–11942. 10.1021/acs.jpcb.1c07547.
- Unke O. T.; Chmiela S.; Sauceda H. E.; Gastegger M.; Poltavsky I.; Schütt K. T.; Tkatchenko A.; Müller K. R. Machine Learning Force Fields. Chem. Rev. 2021, 121, 10142–10186. 10.1021/acs.chemrev.0c01111.
- Smith J. S.; Isayev O.; Roitberg A. E. ANI-1: An Extensible Neural Network Potential with DFT Accuracy at Force Field Computational Cost. Chem. Sci. 2017, 8, 3192–3203. 10.1039/C6SC05720A.
- Devereux C.; Smith J. S.; Davis K. K.; Barros K.; Zubatyuk R.; Isayev O.; Roitberg A. E. Extending the Applicability of the ANI Deep Learning Molecular Potential to Sulfur and Halogens. J. Chem. Theory Comput. 2020, 16, 4192–4202. 10.1021/acs.jctc.0c00121.
- Qiao Z.; Welborn M.; Anandkumar A.; Manby F. R.; Miller T. F. OrbNet: Deep Learning for Quantum Chemistry Using Symmetry-Adapted Atomic-Orbital Features. J. Chem. Phys. 2020, 153, 124111. 10.1063/5.0021955.