Table I.
Simulation set | # of proteins | # of simulations | Simulation time (μs) | # of structures | Simulation data (TB)^a | Analysis data (TB) |
---|---|---|---|---|---|---|
298 K | 996 | 1259 | 39.4 | 56.6 × 10^6 | 7.6 | 1.3 |
498 K | 922 | 5355 | 111.5 | 159.3 × 10^6 | 35.1 | 6.4 |
SNPs (310 K) | 229 | 649 | 30.2 | 30.2 × 10^6 | 11.2 | 2.0 |
DB Total^b | 1225 | 7263 | 181.1 | 246.1 × 10^6 | 53.9 | 9.7 |
Top 100 (298 K)^c | 100 | 100 | 3.2 | 3.2 × 10^6 | 1.0 | 0.2 |
SLIRP:GGXGG (298 K)^c | 23 | 38 | 3.8 | 3.8 × 10^6 | 0.017 | 0.007 |
Simulations waiting to be loaded into the Dynameomics Database (in Linux Warehouse)^d | | | | | | |
Dynameomics | 168 | 434 | 9.3 | 14.9 × 10^6 | N/A | N/A |
SLIRP | 230 | 306 | 16.9 | 16.9 × 10^6 | N/A | N/A |
Amyloid Proteins | 149 | 607 | 32.7 | 32.7 × 10^6 | N/A | N/A |
Other folding + native | 119 | 1065 | 58.4 | 64.5 × 10^6 | N/A | N/A |
Peptide Design | 219 | 705 | 14.6 | 14.6 × 10^6 | N/A | N/A |
SNPs | 123 | 454 | 19.7 | 19.7 × 10^6 | N/A | N/A |
Linux Total | 1008 | 3571 | 151.6 | 163.3 × 10^6 | N/A | N/A |
Grand Total | 2233 | 10,834 | 332.7 | 409.4 × 10^6 | | |
These simulations represent all targets from the v2009 consensus domain dictionary as well as multiple proteins simulated for some highly populated metafolds. The set contains representatives of all autonomous protein domains, and all simulations and their metadata have been loaded into the Dynameomics Database; this represents the core of Dynameomics. In addition, the SLIRP portion of the database contains simulations of the 20 amino acids (with Asp, Glu and His both protonated and deprotonated) within the GGXGG peptide, and the database has been expanded to include SNP-associated proteins.
^a Only protein coordinates, not solvent, are loaded into the database at this time.
^b Note that the proteins simulated at 498 K were also simulated at 298 K. There were 11 additional proteins simulated at 298 K that were not run at 498 K and 23 GGXGG peptide simulations, giving a total of 1225 comprising 1202 protein simulations and 23 peptide simulations.
^c These simulation data are available at www.dynameomics.org.
^d These simulations have not yet been loaded into the database, but they are held in a structured, queryable warehouse until they can be added. (Note that it took ~6 months to load the data already contained in the database.) The combined Windows database and Linux warehouse are nonoverlapping and contain simulations of a total of 2233 distinct protein/peptide systems, of which 1761 are proteins. The DB/warehouse comprises 10,834 simulations and 4.1 × 10^8 structures. The simulations listed here on the Linux system occupy approximately 40 TB of the 90 TB awaiting incorporation into the database; however, these files also contain solvent and use different compression techniques, so the file sizes are not directly comparable with the database figures and are not broken down for the Linux side.
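The totals in Table I follow from simple addition, and the check below is a minimal sketch of that arithmetic. The numbers are copied from the table; the variable and function names are illustrative only and are not part of any Dynameomics software. The number of proteins in the database is not summed from its rows, because the table notes state that the 498 K proteins were also simulated at 298 K, so the reported figure of 1225 is used instead.

```python
# Database rows from Table I: (simulations, time in microseconds, structures in millions).
DATABASE_ROWS = {
    "298 K": (1259, 39.4, 56.6),
    "498 K": (5355, 111.5, 159.3),
    "SNPs (310 K)": (649, 30.2, 30.2),
}

# Warehouse rows from Table I: (proteins, simulations, time in microseconds, structures in millions).
WAREHOUSE_ROWS = {
    "Dynameomics": (168, 434, 9.3, 14.9),
    "SLIRP": (230, 306, 16.9, 16.9),
    "Amyloid Proteins": (149, 607, 32.7, 32.7),
    "Other folding + native": (119, 1065, 58.4, 64.5),
    "Peptide Design": (219, 705, 14.6, 14.6),
    "SNPs": (123, 454, 19.7, 19.7),
}

# Reported in the table; not additive across database rows because the sets overlap.
DB_DISTINCT_SYSTEMS = 1225


def column_sums(rows):
    """Sum each column over all rows, rounding to one decimal place as in the table."""
    return tuple(round(sum(column), 1) for column in zip(*rows.values()))


db_sims, db_time, db_structs = column_sums(DATABASE_ROWS)
wh_systems, wh_sims, wh_time, wh_structs = column_sums(WAREHOUSE_ROWS)

grand_total = (
    DB_DISTINCT_SYSTEMS + wh_systems,     # distinct protein/peptide systems
    db_sims + wh_sims,                    # simulations
    round(db_time + wh_time, 1),          # simulation time in microseconds
    round(db_structs + wh_structs, 1),    # structures in millions
)

print("DB Total:", (DB_DISTINCT_SYSTEMS, db_sims, db_time, db_structs))
print("Linux Total:", (wh_systems, wh_sims, wh_time, wh_structs))
print("Grand Total:", grand_total)
```

With these inputs the sums agree with the DB Total, Linux Total and Grand Total rows, and the structure count of 409.4 million corresponds to the 4.1 × 10^8 quoted in the notes.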