Data in Brief. 2016 Mar 9;7:582–590. doi: 10.1016/j.dib.2016.02.086

Molecular dynamics simulations data of the twenty encoded amino acids in different force fields

F Vitalini, F Noé, BG Keller
PMCID: PMC4802541  PMID: 27054161

Abstract

We present extensive all-atom Molecular Dynamics (MD) simulation data of the twenty encoded amino acids in explicit water, simulated with different force fields. The termini of the amino acids have been capped to ensure that the dynamics of the Φ and ψ torsion angles are analogous to the dynamics within a peptide chain. We use representatives of each of the four major force field families: AMBER ff99SB-ILDN [1], AMBER ff03 [2], OPLS-AA/L [3], CHARMM27 [4] and GROMOS43a1 [5], [6]. Our data represent a library and test bed for method development for MD simulations and for force field development. Part of the data set has previously been used for a comparison of the dynamic properties of force fields (Vitalini et al., 2015) [7] and for the construction of peptide basis functions for the variational approach to molecular kinetics [8].

Keywords: Molecular dynamics, Force field, Amino acid


Specifications Table

Subject area: Chemistry, Biology
More specific subject area: Computational molecular biophysics
Type of data: Molecular Dynamics (MD) simulations
How data was acquired: Classical all-atom MD simulation in explicit solvent
Data format: GROMACS [9] trajectory output (.xtc)
Experimental factors: NVT ensemble at 300 K
Experimental features: GROMACS 4.5.5 [9] software
Data source location: Freie Universität Berlin, Germany
Data accessibility: Data within this article

Value of the data

  • The dataset constitutes a library of extensive all-atom simulations of the basic dynamic building block of peptides and proteins. The total simulation time in the data set amounts to 200 μs.

  • MD simulations are a powerful tool for investigating the time evolution of biomolecules at atomistic resolution. For small systems, such as amino acids, they are the only technique that can probe the structural details of the dynamic processes.

  • New methods for the analysis of MD simulations are often tested on alanine dipeptide (Ac-A-NHMe), for which many groups have trajectories available. Our data set extends the set of available test cases to all twenty encoded amino acids.

  • The data can be used to construct peptide basis functions for the variational approach to molecular kinetics [10], [11], as described in Ref. [8].

  • Our data set allows the user to probe and compare the properties of the force fields [7] and to make an informed decision when choosing a force field for the simulation of larger systems.

1. Data

The public repository (ftp://bdg.chemie.fu-berlin.de/Ac-X-NHMe/) is structured as follows. The main folder contains a README.txt file that describes the simulation details common to all set-ups, which are also described in Section 2 (Experimental design, materials and methods). A GROMACS-specific [9] simulation parameter file (nvt_production_1mus.mdp) is also provided for clarity.

The data are sorted by force field. Each force-field subfolder contains twenty folders, one per amino acid. The folders are named Ac-X-NHMe, where X is replaced by the one-letter code of the amino acid. Within each amino-acid-specific folder, another README.txt summarizes the number and length of the independent runs. Each independent run has its own sub-folder, which contains the initial configuration Ac-X-NHMe_run0.gro and the trajectory in GROMACS binary format (.xtc). In addition, a topology file (Ac-X-NHMe.top) is provided, which contains the atom types and the bonded and non-bonded parameters of the chosen force field, and lists the constraints. The initial configuration (.gro), the topology file (.top) and the simulation parameter file (.mdp) allow the simulations to be re-run.
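The naming scheme described above can be sketched in a few lines of Python. This is only an illustration of the stated conventions (Ac-X-NHMe folders, Ac-X-NHMe.top, Ac-X-NHMe_run0.gro); the function name and dictionary keys are hypothetical, and the names of the force-field folders and run sub-folders are not given in the text, so they are deliberately left out.

```python
# One-letter codes of the twenty encoded amino acids.
ONE_LETTER_CODES = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_files(code):
    """Return the folder and file names stated in the text for one amino acid.

    The helper and its keys are illustrative; only the name patterns
    themselves come from the data description.
    """
    folder = f"Ac-{code}-NHMe"
    return {
        "folder": folder,
        "topology": f"{folder}.top",
        "initial_configuration": f"{folder}_run0.gro",
    }


# One entry per amino acid, to be looked up inside a force-field subfolder.
layout = [amino_acid_files(code) for code in ONE_LETTER_CODES]

for entry in layout[:3]:
    print(entry["folder"], entry["topology"], entry["initial_configuration"])
```

The README.txt inside each amino-acid folder remains the authoritative source for the number and length of the runs actually present.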

2. Experimental design, materials and methods

2.1. MD simulations

We performed all-atom MD simulations in explicit solvent of terminally blocked amino acids, acetyl-X-methylamide (Ac-X-NHMe), where X stands for any of the twenty encoded amino acids. All twenty amino acids were simulated with five different force fields: AMBER ff99SB-ILDN [1], AMBER ff03 [2], OPLS-AA/L [3], CHARMM27 [4] and GROMOS43a1 [5], [6]. The water model was chosen to match the one used for the validation of each force field, i.e. TIP3P [12] for AMBER ff99SB-ILDN, AMBER ff03, OPLS-AA/L and CHARMM27, and SPC [13] for GROMOS43a1. Simulations were performed with the GROMACS 4.5.5 simulation package [9].

The number of particles and the volume were fixed during the simulations. The temperature was maintained at 300 K using the V-Rescale thermostat [14]. Each initial set-up was minimised using the steepest descent algorithm and equilibrated in the NVT ensemble for 100 ps. Subsequently, two independent production runs of 1 μs each were carried out for each amino acid/force field combination (exception: aliphatic amino acids A, G, I, L and P in the ff99SB-ILDN [1] force field, for which the production runs were 200 ns each). This yields a total simulation time of 2 μs per simulation set-up (exceptions: aliphatic amino acids A, G, I, L and P in the ff99SB-ILDN [1] force field, 4 μs; A and V in ff03 [2], OPLS-AA/L [3], CHARMM27 [4] and GROMOS43a1 [5], [6], 4 μs).

The integration time step was 2 fs, and atom positions of the solute were written to file every 1 ps. In the production runs, the leap-frog integrator was used and bonds to hydrogen atoms were constrained using the LINCS algorithm [15] (lincs_iter = 1, lincs_order = 4). A cut-off of 1 nm was used for Lennard–Jones interactions. Electrostatic interactions were treated with the Particle-Mesh Ewald (PME) algorithm [16] in combination with a real-space cut-off of 1 nm, a grid spacing of 0.16 nm, and an interpolation order of 4. Periodic boundary conditions were applied in all three dimensions. For further details refer to Table 1.
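As a rough illustration of how the settings above map onto a GROMACS parameter file, the following Python sketch derives the step counts for a 1 μs run and renders a partial set of .mdp options. It is not the authors' nvt_production_1mus.mdp: the option names are standard GROMACS 4.5 keywords, but the selection is an assumption reconstructed from the text, and options not stated there (for example the thermostat coupling groups and time constant, or the output group for the solute) are omitted, so the fragment is incomplete on purpose.

```python
# Timings stated in the text, kept as integers in femtoseconds to avoid
# floating-point rounding in the step counts.
RUN_LENGTH_FS = 10**9        # one 1 μs production run
TIME_STEP_FS = 2             # integration time step
OUTPUT_INTERVAL_FS = 1000    # positions written every 1 ps

nsteps = RUN_LENGTH_FS // TIME_STEP_FS          # integration steps per run
nstxtcout = OUTPUT_INTERVAL_FS // TIME_STEP_FS  # steps between xtc frames

# Partial, illustrative parameter set (assumed mapping, not the original file).
mdp = {
    "integrator": "md",            # leap-frog integrator
    "dt": TIME_STEP_FS / 1000,     # time step in ps
    "nsteps": nsteps,
    "nstxtcout": nstxtcout,
    "tcoupl": "v-rescale",         # V-Rescale thermostat
    "ref_t": 300,                  # target temperature in K
    "constraints": "h-bonds",      # constrain bonds to hydrogen atoms
    "constraint_algorithm": "lincs",
    "lincs_iter": 1,
    "lincs_order": 4,
    "rvdw": 1.0,                   # Lennard-Jones cut-off in nm
    "coulombtype": "PME",
    "rcoulomb": 1.0,               # real-space cut-off in nm
    "fourierspacing": 0.16,        # PME grid spacing in nm
    "pme_order": 4,                # PME interpolation order
    "pbc": "xyz",                  # periodic in all three dimensions
}


def render_mdp(params):
    """Format the options as 'key = value' lines in .mdp style."""
    return "\n".join(f"{key:<24}= {value}" for key, value in params.items())


print(render_mdp(mdp))
```

With these numbers, a 1 μs run corresponds to 5 × 10^8 integration steps with a frame written every 500 steps. The parameter file in the repository should be preferred for actually re-running the simulations.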

Table 1.

Simulation parameters per amino acid and force field: number of water molecules, size of simulation box, number of independent runs and total simulation time.

Each cell lists: # H2O; box size; simulation time.

| Amino acid | AMBER ff99SB-ILDN | AMBER ff03 | CHARMM27 | OPLS-AA/L | GROMOS43a1 |
|---|---|---|---|---|---|
| Ac-A-NHMe | 651; (2.7 nm)³; 20×200 ns | 646; (2.7 nm)³; 4×1 μs | 680; (2.8 nm)³; 4×1 μs | 684; (2.8 nm)³; 4×1 μs | 531; (2.5 nm)³; 4×1 μs |
| Ac-C-NHMe | 717; (2.8 nm)³; 2×1 μs | 717; (2.8 nm)³; 2×1 μs | 717; (2.8 nm)³; 2×1 μs | 717; (2.8 nm)³; 2×1 μs | 610; (2.7 nm)³; 2×1 μs |
| Ac-D-NHMe | 763; (2.9 nm)³; 2×1 μs | 763; (2.9 nm)³; 2×1 μs | 751; (2.8 nm)³; 2×1 μs | 763; (2.9 nm)³; 2×1 μs | 648; (2.7 nm)³; 2×1 μs |
| Ac-E-NHMe | 796; (2.9 nm)³; 2×1 μs | 796; (2.9 nm)³; 2×1 μs | 796; (2.9 nm)³; 2×1 μs | 796; (2.9 nm)³; 2×1 μs | 749; (2.8 nm)³; 2×1 μs |
| Ac-F-NHMe | 903; (3.0 nm)³; 2×1 μs | 903; (3.0 nm)³; 2×1 μs | 911; (3.0 nm)³; 2×1 μs | 903; (3.0 nm)³; 2×1 μs | 854; (3.0 nm)³; 2×1 μs |
| Ac-G-NHMe | 795; (2.9 nm)³; 20×200 ns | 643; (2.7 nm)³; 3×1 μs | 628; (2.7 nm)³; 2×1 μs | 644; (2.7 nm)³; 2×1 μs | 534; (2.5 nm)³; 2×1 μs |
| Ac-H-NHMe | 848; (2.9 nm)³; 2×1 μs | 848; (2.9 nm)³; 2×1 μs | 854; (2.9 nm)³; 2×1 μs | 848; (2.9 nm)³; 2×1 μs | 785; (2.9 nm)³; 2×1 μs |
| Ac-I-NHMe | 766; (2.9 nm)³; 20×200 ns | 751; (2.9 nm)³; 4×1 μs | 751; (2.8 nm)³; 2×1 μs | 751; (2.8 nm)³; 2×1 μs | 636; (2.7 nm)³; 2×1 μs |
| Ac-K-NHMe | 924; (3.0 nm)³; 2×1 μs | 924; (3.0 nm)³; 2×1 μs | 924; (3.0 nm)³; 2×1 μs | 924; (3.0 nm)³; 2×1 μs | 876; (3.0 nm)³; 2×1 μs |
| Ac-L-NHMe | 796; (2.9 nm)³; 20×200 ns | 726; (2.8 nm)³; 4×1 μs | 739; (2.8 nm)³; 2×1 μs | 726; (2.8 nm)³; 2×1 μs | 622; (2.7 nm)³; 2×1 μs |
| Ac-M-NHMe | 782; (2.9 nm)³; 2×1 μs | 782; (2.9 nm)³; 2×1 μs | 782; (2.9 nm)³; 2×1 μs | 782; (2.9 nm)³; 2×1 μs | 664; (2.7 nm)³; 2×1 μs |
| Ac-N-NHMe | 722; (2.8 nm)³; 2×1 μs | 722; (2.8 nm)³; 2×1 μs | 727; (2.8 nm)³; 2×1 μs | 722; (2.8 nm)³; 2×1 μs | 679; (2.8 nm)³; 2×1 μs |
| Ac-P-NHMe | 860; (3.0 nm)³; 20×200 ns | 704; (2.8 nm)³; 4×1 μs | 723; (2.8 nm)³; 2×1 μs | 704; (2.8 nm)³; 2×1 μs | 610; (2.7 nm)³; 2×1 μs |
| Ac-Q-NHMe | 881; (3.0 nm)³; 2×1 μs | 881; (3.0 nm)³; 2×1 μs | 881; (3.0 nm)³; 2×1 μs | 881; (3.0 nm)³; 2×1 μs | 824; (2.9 nm)³; 2×1 μs |
| Ac-R-NHMe | 1026; (3.2 nm)³; 2×1 μs | 1026; (3.2 nm)³; 2×1 μs | 1026; (3.2 nm)³; 2×1 μs | 1026; (3.2 nm)³; 2×1 μs | 997; (3.1 nm)³; 2×1 μs |
| Ac-S-NHMe | 691; (2.8 nm)³; 2×1 μs | 691; (2.8 nm)³; 2×1 μs | 706; (2.8 nm)³; 2×1 μs | 691; (2.8 nm)³; 2×1 μs | 614; (2.7 nm)³; 2×1 μs |
| Ac-T-NHMe | 748; (2.8 nm)³; 2×1 μs | 748; (2.8 nm)³; 2×1 μs | 724; (2.8 nm)³; 2×1 μs | 748; (2.8 nm)³; 2×1 μs | 626; (2.7 nm)³; 2×1 μs |
| Ac-V-NHMe | 676; (2.8 nm)³; 20×200 ns | 672; (2.8 nm)³; 4×1 μs | 672; (2.8 nm)³; 4×1 μs | 672; (2.8 nm)³; 4×1 μs | 577; (2.6 nm)³; 4×1 μs |
| Ac-W-NHMe | 930; (3.0 nm)³; 2×1 μs | 930; (3.0 nm)³; 2×1 μs | 916; (3.0 nm)³; 2×1 μs | 930; (3.0 nm)³; 2×1 μs | 869; (3.0 nm)³; 2×1 μs |
| Ac-Y-NHMe | 99-; (3.1 nm)³; 2×1 μs | 990; (3.1 nm)³; 2×1 μs | 990; (3.1 nm)³; 2×1 μs | 990; (3.1 nm)³; 2×1 μs | 925; (3.0 nm)³; 2×1 μs |
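The simulation-time entries in Table 1 have the form "runs × length", so the total per system is a simple product. The short Python sketch below shows one way to tally them, using the Ac-A-NHMe row as printed in the table; the function name and the parsing approach are illustrative assumptions, not part of the published data set.

```python
# Conversion factors to nanoseconds for the two units used in Table 1.
UNIT_TO_NS = {"ns": 1, "μs": 1000}


def parse_sim_time_us(entry):
    """Convert an entry such as '20×200 ns' or '4×1 μs' to microseconds."""
    count, length = entry.split("×")
    value, unit = length.split()
    total_ns = int(count) * int(value) * UNIT_TO_NS[unit]
    return total_ns / 1000


# Simulation-time entries of the Ac-A-NHMe row, one per force field.
ac_a_row = {
    "AMBER ff99SB-ILDN": "20×200 ns",
    "AMBER ff03": "4×1 μs",
    "CHARMM27": "4×1 μs",
    "OPLS-AA/L": "4×1 μs",
    "GROMOS43a1": "4×1 μs",
}

per_force_field = {ff: parse_sim_time_us(t) for ff, t in ac_a_row.items()}
total_ac_a = sum(per_force_field.values())

print(per_force_field)
print("Ac-A-NHMe total (μs):", total_ac_a)
```

For alanine, twenty runs of 200 ns and four runs of 1 μs both amount to 4 μs, so each force field contributes the same simulation time for this residue.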

2.2. Ramachandran plots

Backbone dihedral angles Φ and ψ are good reaction coordinates for the dynamics of amino acids and short peptides. Using the GROMACS command g_rama, we extracted the Φ and ψ time series from the MD trajectories. The space spanned by the {Φ−ψ} combinations of a single amino acid (capped or within a peptide chain) has a well-defined distribution and can be represented in a two-dimensional plot known as a Ramachandran plot. We constructed a Ramachandran plot for each amino acid/force field combination by histogramming the Φ and ψ time series over 2 μs of simulation time on a regular grid of 360×360 bins (bin width 1° per coordinate). The natural logarithm of the histogram counts is shown in Figs. 1–5 as an indication of how the configurational space was explored over the simulation time.
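The histogramming step can be sketched with NumPy as follows. This is a minimal illustration, not the authors' analysis script: it assumes the Φ and ψ time series are already available as arrays in degrees within [−180°, 180°], uses synthetic random angles in place of real g_rama output, and marks empty bins as NaN (a choice made here, since the logarithm of zero counts is undefined). The function name is hypothetical.

```python
import numpy as np


def log_ramachandran(phi_deg, psi_deg, n_bins=360):
    """Histogram (phi, psi) pairs on a regular grid and take the natural log.

    Returns the raw counts and the log counts; bins with no samples are
    set to NaN in the log array.
    """
    # 360 bins of width 1 degree spanning [-180, 180] in each coordinate.
    edges = np.linspace(-180.0, 180.0, n_bins + 1)
    counts, _, _ = np.histogram2d(phi_deg, psi_deg, bins=[edges, edges])
    with np.errstate(divide="ignore"):
        log_counts = np.where(counts > 0, np.log(counts), np.nan)
    return counts, log_counts


# Synthetic stand-in for a real time series (assumption for illustration).
rng = np.random.default_rng(0)
phi = rng.uniform(-180.0, 180.0, size=1000)
psi = rng.uniform(-180.0, 180.0, size=1000)

counts, log_counts = log_ramachandran(phi, psi)
print(counts.shape, int(counts.sum()))
```

The first axis of the returned arrays corresponds to Φ and the second to ψ, so the array has to be transposed before plotting with ψ on the vertical axis.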

Fig. 1.

Ramachandran plots of all terminally capped amino acids simulated with AMBER ff99SB-ILDN force field. Represented is the logarithm of the {Φ-ψ}-pairs counts on a 1°-grid.

Fig. 2.

Ramachandran plots of all terminally capped amino acids simulated with AMBER ff03 force field. Represented is the logarithm of the {Φ-ψ}-pairs counts on a 1°-grid.

Fig. 3.

Ramachandran plots of all terminally capped amino acids simulated with CHARMM27 force field. Represented is the logarithm of the {Φ-ψ}-pairs counts on a 1°-grid.

Fig. 4.

Ramachandran plots of all terminally capped amino acids simulated with OPLS-AA/L force field. Represented is the logarithm of the {Φ-ψ}-pairs counts on a 1°-grid.

Fig. 5.

Ramachandran plots of all terminally capped amino acids simulated with GROMOS43a1 force field. Represented is the logarithm of the {Φ-ψ}-pairs counts on a 1°-grid.

Acknowledgements

F.V. acknowledges funding from the Dahlem Research School. F.N. acknowledges funding from the ERC starting grant “pcCell”. This research has been partially funded by the Deutsche Forschungsgemeinschaft (DFG) through grant CRC 1114. The computer facilities of the Freie Universität Berlin (ZEDAT) are acknowledged for computer time.

Footnotes

Appendix A

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.dib.2016.02.086.

Contributor Information

F. Noé, Email: frank.noe@fu-berlin.de.

B.G. Keller, Email: bettina.keller@fu-berlin.de.

Appendix A. Supplementary material

mmc1.pdf (276.5KB, pdf)

References

  • 1. Lindorff-Larsen K., Piana S., Palmo K., Maragakis P., Klepeis J.L., Dror R.O., Shaw D.E. Proteins. 2010;78:1950–1958. doi: 10.1002/prot.22711.
  • 2. Duan Y., Wu C., Chowdhury S., Lee M.C., Xiong G., Zhang W., Yang R., Cieplak P., Luo R., Lee T., Caldwell J., Wang J., Kollman P. J. Comput. Chem. 2003;24:1999–2012. doi: 10.1002/jcc.10349.
  • 3. Kaminski G.A., Friesner R.A., Tirado-Rives J., Jorgensen W.L. J. Phys. Chem. B. 2001;105:6474–6487.
  • 4. MacKerell A.D., Banavali N., Foloppe N. Biopolymers. 2001;56:257–265. doi: 10.1002/1097-0282(2000)56:4<257::AID-BIP10029>3.0.CO;2-W.
  • 5. Daura X., Mark A.E., Van Gunsteren W.F. J. Comput. Chem. 1998;19:535–547.
  • 6. Scott W.R.P., Hünenberger P.H., Tironi I.G., Mark A.E., Billeter S.R., Fennen J., Torda A.E., Huber T., Krüger P., van Gunsteren W.F. J. Phys. Chem. A. 1999;103:3596–3607.
  • 7. Vitalini F., Mey A.S.J.S., Noé F., Keller B.G. J. Chem. Phys. 2015;142:084101. doi: 10.1063/1.4909549.
  • 8. Vitalini F., Noé F., Keller B.G. J. Chem. Theory Comput. 2015;11:3992–4004. doi: 10.1021/acs.jctc.5b00498.
  • 9. Van Der Spoel D., Lindahl E., Hess B., Groenhof G., Mark A.E., Berendsen H.J.C. J. Comput. Chem. 2005;26:1701–1718. doi: 10.1002/jcc.20291.
  • 10. Noé F., Nüske F. Multiscale Model. Simul. 2013;11:635–655.
  • 11. Nüske F., Keller B.G., Pérez-Hernández G., Mey A.S.J.S., Noé F. J. Chem. Theory Comput. 2014;10:1739–1752. doi: 10.1021/ct4009156.
  • 12. Jorgensen W.L., Chandrasekhar J., Madura J.D., Impey R.W., Klein M.L. J. Chem. Phys. 1983;79:926–935.
  • 13. Berendsen H.J.C., Grigera J.R., Straatsma T.P. J. Phys. Chem. 1987;91:6269–6271.
  • 14. Bussi G., Donadio D., Parrinello M. J. Chem. Phys. 2007;126:014101. doi: 10.1063/1.2408420.
  • 15. Hess B., Bekker H., Berendsen H.J.C., Fraaije J.G.E.M. J. Comput. Chem. 1997;18:1463–1472.
  • 16. Darden T., York D., Pedersen L. J. Chem. Phys. 1993;98:10089–10092.
