Abstract
We recently observed artificial temperature gradients in molecular dynamics (MD) simulations of phase-separating ternary lipid mixtures using the Martini 2 force field. We traced this artifact to insufficiently converged bond length constraints with typical time steps and default settings for the linear constraint solver (LINCS). Here, we systematically optimize the constraint scaffold of cholesterol. With massive virtual sites in an equimomental arrangement, we accelerate bond constraint convergence while preserving the original cholesterol force field and dynamics. The optimized model does not induce nonphysical temperature gradients even at relaxed LINCS settings and is at least as fast as the original model at the strict LINCS settings required for proper thermal sampling. We provide a python script to diagnose possible problems with constraint convergence for other molecules and force fields. Equimomental constraint topology optimization can also be used to boost constraint convergence in atomistic MD simulations of molecular systems.
1. Introduction
Molecular dynamics (MD) simulations give us a molecularly detailed view of biological processes. To reach relevant length and time scales, the number of degrees of freedom can be reduced by coarse graining (CG). As one of the most successful approaches, the Martini force field1,2 on average maps four non-hydrogen atoms into a single bead. With the Martini force field, integration time steps of up to about Δt = 30 fs can be used, compared to the 2 fs typical in all-atom simulations. CG force fields make it possible to study phase separation in lipid bilayers3,4 as possible models for the lipid rafts5,6 implicated in a myriad of cellular processes.7−9 However, we recently observed unphysical temperature gradients across the liquid-ordered (Lo) and liquid-disordered (Ld) phase boundaries in Martini 2 simulations of phase-separating ternary lipid mixtures.10 We traced these gradients to an insufficient convergence of the highly coupled bond-length constraints in the Martini 2 cholesterol model,10,11 with cholesterol being a major component of phase-separating lipid membranes. Insufficient constraint convergence also affects other observables at typical time steps, such as the diffusion coefficient or the contact fraction,12 which describes the degree of the phase separation in the system.
Slow convergence of the linear constraint solver (LINCS)13,14 is expected for any system with strongly coupled constraints. In Martini 2, this includes sterols such as cholesterol and ergosterol. In atomistic simulations, the issue arises, e.g., for angle-constrained butane or pentane, where LINCS fails entirely13 or cholesterol with all bonds constrained.10 Thus, an easy-to-use tool to estimate the required LINCS settings for proper convergence and a strategy to ameliorate bond length constraint issues would be valuable.
Here, we present a general method based on rigid-body mechanics15 and the use of virtual sites16 to optimize the LINCS convergence behavior of highly constrained topologies. Applied to the Martini 2 cholesterol topology, we obtain a cholesterol model that is optimized in terms of LINCS convergence but fully retains the original parametrization of the molecule in terms of force field and dynamics. First, we recapitulate the fundamental problem with the joint use of coupled constraints and the LINCS algorithm. We provide a set of guidelines that can help to avoid the creation of such constraints, and we present a script that enables the detection of constraint-related issues without explicitly performing any simulation. We apply the script to assess the behavior of constrained molecules in the Martini 3 small-molecule library17 and different atomistic cholesterol topologies. Second, we modify the cholesterol constraint topology to reduce the time-step dependence and the temperature gradients of the system. Finally, we perform Martini 2 simulations of (i) a single cholesterol molecule, (ii) a phase-separating lipid bilayer, and (iii) a G-protein-coupled receptor (GPCR) embedded in a cholesterol-containing membrane using both the original Martini 2 and our optimized cholesterol models to demonstrate the improvements introduced by our model.
2. Theory
2.1. LINCS Constraint Algorithm
As discussed in detail in the original LINCS papers,13,14 the intramolecular bond-length constraints are enforced after an unconstrained step according to
1 |
where rn+1unc is the position after the unconstrained time step n + 1, M is the diagonal matrix of particle masses, Bn is the gradient matrix of the constraint equations at time step n, and the vector d contains the prescribed length of the constraints. A key step in enforcing the constraints with eq 1 is the inversion of BnM–1BnT, a square matrix of dimensions K × K with K being the number of distance constraints. To reduce the computational cost, one rewrites the inverse in the form
2 |
where S is a diagonal matrix defined as the inverse square root13 of the diagonal of BnM–1BnT. Its elements are with iA and iB indexing the two sites in distance constraint i = 1, ..., K. Note that eq 20 in ref (13) actually defines the inverse of S. An is a sparse symmetric matrix with zeros along its diagonal. The off-diagonal elements of An are given by the cosine of the angle between the constraints multiplied by a dimensionless mass factor. In LINCS,13 the inverse is evaluated approximately in terms of a truncated geometric series
3 |
This expansion is only applicable if the largest absolute value of the eigenvalues of An is less than one, , and it converges poorly as the magnitude of the eigenvalue approaches one. As the authors of LINCS note, angle-constrained butane has λmax = 0.8 while angle-constrained pentane has λmax = 1.2.13 The reason for coupled constraints being prone to fail is that increasing powers (p) of the An matrix represent the coupling effect of constraints that are p constraints away. The largest power p in the truncated series in eq 3 corresponds to the requested LINCS order (lincs_order). In coupled triangles, the third constraint away from a given constraint is already the constraint itself. As such, for highly coupled geometries, the expansion in eq 3 usually converges slowly or not at all. Consequently, the Gromacs MD simulation engine18 internally doubles the requested lincs_order for all constraints involved in triangular arrangements.14
As the An matrix can be diagonalized, one can conveniently assess the convergence of eq 3 with increasing lincs_order based on how fast λmaxp decreases. Under typical bond distortions, λmax is approximately 0.4.14 Combined with lincs_order = 4, the error of the expansion is proportional to . Here, we use this relationship as a rule-of-thumb to estimate the lincs_order required for convergence.
2.2. Rigid-Body Mechanics
In Newton’s equations of motion, the mechanics of a rigid body is completely specified by the zeroth, first, and second moments of its mass distribution, namely, the total mass (M), the center of mass (CoM) vector (R), and the inertia tensor (X). Therefore, one is free to alter the positions of a set of rigidly connected, massive, noninteracting particles as long as M, R, and X are kept constant. Sets of points that possess the same M, R, and X are called equimomental systems.15 Naturally, to preserve the dynamics of the system, the massless interacting sites must subsequently be reconstructed around the new “scaffold” of massive noninteracting particles. This well-known concept is the basis of virtual sites in MD.16
While finding equimomental systems is trivial for rigid linear molecules, it becomes increasingly complicated for planar and nonplanar molecules due to couplings between M, R, and X. Recently, Laus and Selig presented a general procedure to generate equimomental systems from a regular tetrahedron.15 Here, we only present a brief outline of the theory. The central object of the formalism is the pseudo inertia tensor given by
4 |
The X̃ matrix can be readily constructed from outer products of the homogeneous coordinates of the particle positions as
5 |
where mi are the masses of the individual particles. It is clear from eq 4 that setting the center of mass of a point cloud R = 0 and orienting it such that X is diagonal produces a pseudo inertia tensor that is also diagonal. This transformation can be expressed as
6 |
where G is an element of the special Euclidean group SE(3) in the 4 × 4 matrix representation, and it encodes the translation and rotation of the point cloud of rigidly connected particles. Moreover, the individual terms in eq 5 are rank 1 matrices. Because four points not in a common plane already result in a full-rank matrix, any general rigid body can be replaced by a rigid structure with a minimum of four points.
Following ref (15) and without loss of generality, we assume in the following that we are in the principal frame of the rigid body, where by an appropriate translation and rotation the center of mass has been placed at the origin and the tensor of inertia is diagonal. The pseudo tensor of inertia is then also diagonal, . We define the diagonal matrix and vectors that satisfy . Then, X̃ can be decomposed as
7 |
where I is the 4 × 4 identity matrix. The last identity
8 |
follows from the fact that by construction. Following ref (15), we now introduce a rotation in four dimensions, U ∈ SO(4), to define rotated 4-vectors . This 4D rotation leaves the pseudo inertia tensor unchanged
9 |
where we used eq 8 and UUT = I. The 4D rotations by U thus correspond to the required equimomental transformations.15
However, to interpret the 4D-rotated vectors in terms of point masses in 3D, care must be taken to adjust also the masses mi by mutiplying them with the square of the fourth element of the rotated 4-vector, . This ensures that defines transformed 3D positions with transformed masses mi′ that together leave X̃ and thus the inertia tensor, center of mass, and total mass unchanged. Importantly, the resulting masses are bounded by 0 ≤ mi′ ≤ M with ∑imi = ∑imi′ = M. In addition to the fact that 4D rotations allow one to create all possible equimomental systems, one has considerable freedom in fixing certain positions or masses of the final system.15 We note that in this process one can also change the number of mass points, e.g., by reducing the number of sites to the minimum of four for a general rigid body.15
3. Methods
3.1. Topology Optimization to Minimize λmax
We developed a python script using the MDAnalysis19 package to compute λmax from a single molecular configuration. The use of a single configuration is justified only if the constrained particles do not undergo significant fluctuations during their motion. This condition holds for the Martini 2 cholesterol model as the two coupled triangles never substantially deviate from coplanarity. On the basis of the framework for the generation of equimomental systems15 introduced in the previous section, we minimized λmax of An computed by the script with respect to equimomental configurations of a fixed number of rigid sites. In this way, we aimed at reducing the lincs_order required for properly constraining Martini 2 cholesterol.
We optimized the cholesterol model by minimizing the largest eigenvalue λmax of the constraint matrix An as follows (see Figure S1 for illustration).
-
1.
First, we decoupled the masses of the four beads involved in the two coupled constraint triangles from the interaction sites by introducing four additional noninteracting sites positioned initially at the location of the respective interacting bead. The cholesterol tail bead forming the fifth massive site was left unchanged throughout the optimization, being connected to the rest of the molecule by a flexible bond and thus not part of any constraint. The four newly introduced beads initially inherited the masses of the original beads, while the original beads became massless virtual interaction sites. As a result of the decoupling, the optimized model has 12 beads, 4 more than the original model.
-
2.
Second, we iteratively optimized the positions and masses of the four newly introduced massive sites in an equimomental manner. We used eq 7 to find the matrix D̃ and vectors (i = 1, ..., 4) for the four massive sites. Then, at each iteration step, we generated random four-dimensional rotations U ∈ SO(4) for small rotation angles. The application of D̃U to produced new homogeneous coordinate vectors and hence provided new positions ri′ and masses mi′. Following Laus and Selig,15 this procedure guaranteed that M, R, and X remained fixed throughout the optimization process.
-
3.
Then, we computed the An matrix of the newly generated configuration with modified positions and masses of the four massive constrained particles and determined λmax and the estimate of the required lincs_order. Following a Monte Carlo scheme, we accepted any new configuration that lowered λmax and repeated the refinement (steps 2 and 3) until a sufficiently low value was reached. To avoid pathological structures, we specified a minimum distance of 2.7 Å between any two beads as a constraint for the optimization.
-
4.
After the iterative equimomental optimization of the masses and positions of the four newly introduced sites (black circles in Figure 1, right), the relative positions of the seven massless virtual sites (four originally massive interaction sites plus the three original virtual sites; shown as pink and magenta circles in Figure 1, right) were reconstructed in the reference frame of the four new sites.
3.2. MD Simulations
To compare the geometric properties of the original Martini 2 cholesterol model and our optimized model, we performed MD simulations of single isolated cholesterol molecules in the NVT ensemble using the two topologies. The simulations were 50 ns long each with time steps of 10 fs and lincs_order = 4. This combination of settings ensured sufficient convergence of the molecular properties.
We also performed MD simulations of ternary phase-separating lipid mixtures based on the simulations of Thallmair et al.10 Systems consisting of 1276 dipalmitoyl-phosphatidyl-choline (DPPC), 912 cholesterol, and 950 dilinoleoyl-phosphatidyl-choline (DLiPC) molecules corresponding to a molar ratio of 0.42/0.28/0.30 were built using the program insane.py in a random distribution.20 The bilayers were solvated in a 0.15 M NaCl solution. This resulted in an overall number of 38 917 CG water beads (10% of which were antifreeze particles) and 428 NaCl ion pairs. All simulations were performed with Gromacs18 version 2020.1. Energy minimization of the initial structures using a steepest descent algorithm was followed by two 500 ps pre-equilibration runs with time steps of 1 and 10 fs, respectively. The equilibration was concluded by a run of 1 μs, which is long enough for phase separation to occur. During production, we simulated 750 000 000 steps corresponding to 7.5 μs with Δt = 10 fs, 15 μs with Δt = 20 fs, and 22.5 μs with Δt = 30 fs. The MD parameters and settings corresponded to the “New-RF” values.21
During the simulations, the pressure was maintained at 1 bar using a Parrinello–Rahman barostat22 with semi-isotropic coupling. We employed a coupling constant of τp = 12.0 ps and a compressibility of β = 3 × 10–4 bar–1. The temperature was kept constant at 310 K using a velocity rescaling thermostat23 with a coupling constant of τT = 1.0 ps. One thermostat was used for the solvent beads (water, antifreeze, and ion beads) and a second independent thermostat for the lipids.
We assessed the two different constraint topologies of cholesterol on the membrane properties by performing runs with the original Martini 2 cholesterol model and with the optimized geometry of virtual sites. We simulated both models using lincs_order = 4, 6, and 8 and time steps of Δt = 10, 20, and 30 fs to check for possible temperature gradients and to evaluate the dependence of structural and dynamic properties on the lincs_order and the time step.
To investigate the differences in lipid–protein interactions, we performed simulations of a β2-adrenergic receptor (β2AR, PDB ID: 2RH1) embedded in an asymmetric lipid bilayer consisting of 65% POPC and 35% cholesterol in the upper leaflet and 55% POPC, 35% cholesterol, and 10% PIP2 in the lower leaflet. The initial configuration was obtained from Song et al.24 (personal communication) and simulated using lincs_order = 4, 6, and 8 and time steps of Δt = 10, 20, and 30 fs to assess the impact of improper constraining on the interactions of β2AR and cholesterol. Three replicas of 15 μs length were simulated with each combination of lincs_order and time step, and the first 5 μs of every trajectory were discarded from analysis as equilibration. All reported quantities were averaged over the three replicas.
3.3. Analysis
To assess the achieved improvements in the optimized cholesterol model, we computed a range of observables for the original and optimized cholesterol models. To show that our approach does not alter the original model in any detectable fashion, we extensively compared the solvent-accessible surface area (SASA) and the equilibrium and RMSD values of pairwise bead distances of the single isolated cholesterols. The SASA was computed using gmx sasa and a probe sphere of radius 1.85 Å, while the bead distances were computed using a PLUMED 2.725 script.
In the systems containing lipid bilayers, we computed the temperature of the different lipids using the Gromacs18 tool gmx traj. For the different cholesterol models, the temperatures calculated from the kinetic energies were corrected for the respective numbers of free and constrained degrees of freedom. The standard Martini 2 cholesterol consisting of eight beads has 3N = 24 degrees of freedom. However, the three massless virtual sites do not contribute to the kinetic energy. Together with the 5 constraints imposed on the structure, they leave only 10 degrees of freedom. Hence, a correction of 24/10 was applied to the temperature values. The same reasoning results in a correction factor of 36/10 for our optimized model.
To gain deeper insights into the temperature gradients of the phase-separating systems, we computed the lateral distribution of temperature in the membrane and its difference between DLiPC and DPPC lipids (ΔT) with an in-house Python script using the MDAnalysis library.19 The kinetic energy of the lipids was calculated through Ek = mv2/2, where v is the velocity and m is the mass of the particle. According to the equipartition theorem, the temperature was then obtained as , where kB is Boltzmann’s constant and NDoF is the number of actual degrees of freedom (corrected for the constraints). Finally, the temperature was binned into 2D histograms. For ease of representation, we averaged the 2D temperature maps along the axis parallel to the boundaries between the Lo and Ld regions, that is, between the low- and high-temperature domains.
The gmx mindist tool was used to compute the contact fraction between DPPC and DLiPC lipids defined as12
10 |
where cA–B is the number of contacts between species A and B within a cutoff radius of 0.7 nm. The contacts were evaluated based on the PO4 bead of the lipids.
The lateral diffusion coefficients of the phospholipid species were calculated from the mean square displacement of their CoM. The trajectories of individual lipids were unwrapped using the NPT-corrected scheme of von Bülow et al.26 and analyzed with a Generalized Least Squares estimator27 to obtain the diffusion coefficients. The lateral motion of the lipids was measured with respect to the CoM of the entire bilayer, thereby eliminating the drift of the overall CoM.
We analyzed the protein–lipid interaction between the β2-adrenergic receptor and cholesterol using the PyLipID package.24 Similar to the original paper, we used 0.475 and 0.80 nm for the dual cutoffs that deal with the “rattling effect” in lipid binding.24 The binding sites were required to consist of at least four residues. We computed the per-residue and per-binding-site lipid count, occupancy, duration, residence time, and unbinding rate koff for each combination of lincs_order and time step, as in simulations of membranes without proteins. Additionally, we calculated the SASA and binding pose RMSD for the binding sites. The per-residue and per-binding-site observables computed by PyLipID were averaged over the 50 highest scoring residues and the top 3 scoring binding sites, respectively. See the original PyLipID paper24 for more details about the quantities.
4. Results and Discussion
4.1. Cholesterol Optimization Decreases λmax
In addition to three virtual sites, the Martini 2 model of cholesterol contains five massive sites, four of which form two coupled triangles. The original and optimized cholesterol models are illustrated in Figure 1.
Due to the presence of the coupled triangles, λmax ≈ 0.95 and the estimated lincs_order is 72. Here and in the following, the internal doubling by Gromacs is not taken into account.14 The reason behind the high eigenvalues is closely related to the unequal masses and coupled, far-from-equilateral triangles involved in the constraints. While the use of equal masses and equilateral triangles would completely distort the topology of cholesterol, it would result in λmax ≈ 0.50 and lincs_order = 5. The excessively large lincs_order of the original Martini 2 cholesterol is not only infeasible in simulations but would also be applied to all constraints in the system, not just the coupled ones with convergence issues.
Using the procedure outlined in the Methods section, we reduced λmax from 0.95 to 0.80, corresponding to a decrease in the required lincs_order from 72 to 16. Even though the topology optimization described above efficiently reduced the required lincs_order, large distortions of the geometry of the original model can cause other instabilities. In the case of cholesterol, a further reduction in lincs_order was not possible, because the resulting topologies had massive beads that were too close. Due to the proximity of these massive beads, the integration of the equations of motion produced overly large deviations from the prescribed values, leading to a different kind of LINCS instability.
4.2. Optimized Model Leaves Cholesterol Geometry Intact
We verified that the optimization procedure does not alter the configurations of the interacting beads by running simulations of a single cholesterol molecule in vacuum with both models. Our optimized model excellently reproduces the mean SASA value and its distribution (see Figure S2). To further prove the correctness of the optimized model, we computed all pairwise particle distances (Tables S1 and S2). The optimized model reproduces the mean values of all distances of the original model up to a precision of 0.02 Å as well as their standard deviation (compare Tables S3 and S4). Furthermore, we computed the probability density functions of the single bond, angle, and dihedral angle of the cholesterol models that do not rely on constraints or virtual sites. The distributions are virtually indistinguishable (Figure 2).
4.3. Optimized Model Eliminates Artificial Temperature Gradients in Phase-Separating Systems
We extensively compared the properties of lipid bilayers containing cholesterol described with the original and optimized model using a range of Δt and lincs_order values. As a first test, we evaluated the average temperature difference between DLiPC and DPPC lipids, ΔT = TDLiPC – TDPPC. For the original cholesterol model, the insufficient convergence of the LINCS algorithm led to the development of significant temperature differences, as shown in Figure 3. With the least strict LINCS settings (lincs_order = 4) and largest time step (30 fs), the differences can reach ΔT = 56 K between the two lipid types (Figure 3a). Decreasing the time step to 20 fs is not adequate even when lincs_order = 8 is used. To recover a temperature difference below 2 K with the original model, one has to use a 10 fs time step, incurring a significant penalty in the simulation performance.
The magnitude of the temperature gradient is even more striking when one examines the different membrane domains as a function of position (see Figure 3b and 3c). The reasons for the even larger temperature differences are that the individual phases are not composed uniquely of single lipid types and that the Lo phase contains the majority of cholesterol along with DPPC. We found a temperature difference between the two halves of the simulation box as high ΔT ≈ 80 K.
In contrast to the original cholesterol model, we observed only small temperature differences between the two phospholipid types with the optimized model. Even for a low lincs_order = 4 and long time step of 30 fs, the temperature difference is only 3.9 K (Figure 3a). For lincs_order = 6 and a 30 fs time step, ΔT drops to 0.2 K. Moreover, the temperature difference between the two membrane domains was in all cases under 8 K and became negligible for time steps below 30 fs or lincs_order = 8 for a 30 fs time step.
4.4. Properties of Phase-Separating System Converge When Temperature Gradients Are Eliminated
As a test of the dynamic properties, we computed the ratio of diffusion coefficients of DPPC and DLiPC lipids. As a probe of the local structure, we calculated the lipid–lipid contact fraction defined in eq 10. Results are shown as a function of the observed artificial temperature difference ΔT between DLiPC and DPPC lipids (Figure 4). Both quantities are greatly affected by ΔT and can only be considered converged within the sampling uncertainty when ΔT < 2 K (vertical dashed line). For the original cholesterol model, this requires a short time step of 10 fs; by contrast, for the optimized cholesterol model we have converged results in all cases except for the lowest lincs_order = 4 combined with the longest time step of 30 fs. Both models converge to the same values at small ΔT, further supporting the consistency of our optimization procedure.
As another property impacted by temperature gradients, we computed the distribution of cholesterol along the membrane normal. Figure 5 indicates that while the cholesterol distribution is quite sensitive to the combination of time step and LINCS settings in the original model, all curves are on top of each other in the optimized model. Moreover, the original model converges to our optimized model in the limit of ΔT ≤ 2 K, that is, at small time steps and high lincs_order settings. The slight asymmetry in the width and height of the cholesterol populations of the two leaflets is due to the position restraints along the z axis of the DLiPC lipids in one of the two leaflets, which was applied to suppress membrane undulations.29
4.5. Cholesterol−β2AR Interactions Are Only Weakly Affected by Insufficient Constraining
We investigated the interactions between cholesterol and a β2AR in a mixed-lipid, asymmetric bilayer, as described in the Methods section. Here, we analyzed the difference between the temperature of cholesterol and the reference temperature of the thermostat ΔTchol = Tref – Tchol. The observed ΔTchol values as a function of time step and lincs_order are shown in Figure 6 for the original (top) and optimized models (bottom). The large (>2 K) values of ΔTchol indicate that the protein–lipid systems also suffer from the LINCS convergence issues observed in the phase-separating bilayers, albeit to a much lesser extent. This is due to a lower local cholesterol concentration compared to the Lo phase of the ternary system. By contrast, the systems simulated using the optimized model show virtually no temperature differences (see Figure 6 bottom) irrespective of the chosen parameters.
Despite the significant values of ΔTchol, the observables computed with PyLipID24 for the system containing β2AR do not exhibit systematic changes as a function of ΔTchol in the case of the original model or significant differences between the two models (see Figures 7, S3, and S4). However, the system with the least strict settings (and largest ΔTchol) tends to be an outlier. Therefore, we discourage the use of lincs_order = 4 in combination with a long time step of Δt = 30 fs in the investigated system. We also note that while there are no systematic differences between the two cholesterol topologies, the quantities determined using the optimized model have smaller variances (Figures S3 and S4). Moreover, if the lipid bilayers exhibit phase separation and thus have larger local cholesterol concentrations, we expect an increased impact of the nonconverged constraints on the protein–cholesterol interactions.
4.6. Optimal Cholesterol Model Improves Computational Performance
Results reporting on the computational efficiency of the original and the optimized cholesterol models in the phase-separating lipid bilayer are listed in Table 1. For details about the overall hardware configuration and the efficiency of the β2AR-containing simulations, we refer to section 6 of the Supporting Information and Table S5, respectively. For the given settings, MD simulations with the optimized cholesterol model incurred a performance penalty of ∼10–15% in all cases. Because this comparison does not take into account whether the physics of the system is correct or not, we also compared the performance of the previously recommended parameters for proper LINCS convergence (lincs_iter = 2, lincs_order = 12, Δt = 20 fs) and (lincs_iter = 3, lincs_order = 12, Δt = 30 fs).10 Requiring the temperature gradient to be negligible (|ΔT| < 2 K, based on the convergence of properties in Figure 4), one needs to perform simulations of the original model at Δt = 20 fs using lincs_iter = 2, lincs_order = 12, resulting in 1623 ns/day, while it suffices to run the optimized model using lincs_iter = 1, lincs_order = 4, which allows the simulation of 1752 ns/day. The same is true at 30 fs, where one would have to run the original model with lincs_iter = 3, lincs_order = 12, giving 2264 ns/day. Our optimized model can run with lincs_iter = 1, lincs_order = 6 with a performance of 2453 ns/day, which represents a similar gain in performance. We conclude that MD simulations of the phase-separating bilayer model are at least as fast with the optimized cholesterol model as with the original model.
Table 1. Performance Comparison for MD Simulations of Phase-Separating Lipid Bilayers Using the Original and Optimized Cholesterol Modelsa.
lincs_order |
||||||
---|---|---|---|---|---|---|
original |
optimized |
|||||
Δt [fs] | 4 | 6 | 8 | 4 | 6 | 8 |
30 | 2925 | 2801 | 2741 | 2608 | 2453 | 2403 |
20 | 2081 | 1887 | 1892 | 1752 | 1714 | 1664 |
10 | 1150 | 1076 | 1050 | 963 | 931 | 904 |
Listed are simulated times in units of nanoseconds per day of wall-clock time. Results are shown as a function of time step Δt and lincs_order at fixed lincs_iter=1.
Crucially, using our optimized model, other constrained molecules present in the simulation box are not subjected to the overly high LINCS requirements of the original cholesterol model. We evaluated the computational cost of unnecessarily constraining molecules in the simulations of the membrane-embedded β2AR using the same LINCS settings as for the protein-less phase-separating lipid bilayer. The protein model contains 462 constraints that do not require strict LINCS settings. The use of the optimized model results in an ∼30% performance increase compared to stricter LINCS settings10 (see Table S5).
4.7. Perspective on Other Potentially Affected Molecular Topologies
4.7.1. Martini 3 Small-Molecule Library Does Not Suffer from LINCS Convergence Issues
The representation of a rigid topology by two connected triangular constraints, the so-called hinge model, as it is used in cholesterol, served as a blueprint for the Martini 3 topologies of a number of small molecules. Therefore, we assessed the quality of the constraint topology in terms of λmax for 77 constrained Martini 3 small molecules17 (available at https://github.com/ricalessandri/Martini3-small-molecules) by estimating the required LINCS order for convergence and by performing explicit simulations (see Tables S6–S8). Our python script identified the molecules BZTA (benzothiazole), BZTH (benzothiophene), and MINDA (1-methylindazole) as having the largest eigenvalues of λmax = 0.76. While these λmax values are not excessively large, the reason behind them is the same as that for cholesterol: uneven masses and slightly distorted triangles. All three molecules were of planar, trapezoidal geometry with constraints applied to the four sides and the longer diagonal. As a test, we “flipped” the constraint along the diagonal of the molecules to constrain the other, shorter diagonal, which resulted in a decrease of λmax in all three cases (BZTA and BZTH, 0.76–0.71; MINDA, 0.76–0.65).
In MD simulations using the new-rf input parameters21 and various combinations of lincs_order and time step, we found that all differences between the solute temperature and the thermostat reference temperature were less than 1.5 K and that the temperature difference between the solute and the solvent never exceeded 2 K (Tables S6–S8). Interestingly, “flipping” the diagonal constraint in the molecular topologies did not produce a clear improvement (Tables S6–S8), most likely due to the only moderately large λmax values.
The explicit simulations fully support our conclusions drawn based on λmax. Remarkably, the eigenvalue analysis of all 77 molecules took less than 3 min on a standard laptop, while the explicit simulations require tests using various lincs_order and time step values and take a few hours per system using high-performance computers (running on a single node, 12 000 particles, and 15 million integration steps).
Finally, the topologies involve 2-to-1 mappings of non-hydrogen atoms to CG beads and contain “tiny” beads. The standard Martini 3 parameters of the “tiny” beads restrict the time step Δt to well below 30 fs.30 While we did not encounter any crashes during the explicit simulations of the above systems, caution must be taken when other molecules with “tiny” beads are present.
4.7.2. Atomistic Topologies with All Bonds Constrained Suffer from Poor Constraint Convergence in LINCS
In our previous study, we showed that atomistic systems containing cholesterol also can suffer from temperature gradients due to nonconverged constraints.10 Two examples are the CHARMM36 force field with hydrogen mass repartitioning (HMR)31 and the CHARMM36 model with hydrogens modeled as virtual sites (VIS).32,33 To allow time steps of up to Δt = 5 fs, both models constrain all bonds. Although these cholesterol models do not contain any coupled triangles, larger rings of five or six atoms are present (Figure 1, left). Similar to three-membered rings, the resulting coupled constraints affect the convergence of the LINCS algorithm.
We analyzed the largest eigenvalues λmax of the An matrix for the standard CHARMM36 cholesterol model as well as the HMR and VIS ones using our script. While the standard CHARMM36 cholesterol model is typically run by constraining solely bonds involving hydrogen atoms, HMR and VIS constrain all bonds to enable larger time steps. As expected, the standard CHARMM36 model34 has a low λmax value of 0.06 because no coupled constraints are present. The other two models, however, exhibit considerably higher λmax values of 0.73 (HMR) and 0.71 (VIS). For proper convergence, they would require lincs_order = 11 and 10, respectively. Note that the internal doubling of lincs_order is not initiated by Gromacs for these topologies because no constrained triangles are present.
This shows that also for atomistic systems in which all bonds are constrained, the analysis of the eigenvalues of the An matrix is a valuable diagnosis tool. Our script can be used to detect potential convergence issues of the LINCS algorithm and estimate the required LINCS settings.18
5. Conclusions
For phase-separating lipid bilayers, Martini 2 simulations with typical parameter settings have recently been found to suffer from substantial artificial temperature gradients across the phase boundaries. The locally different temperatures impacted other physical properties of the system such as the ratio of diffusion coefficients between the saturated and the unsaturated lipids, the degree of phase separation,10,11 and the distribution of cholesterol along the membrane normal direction (Figures 4 and 5). The origin of the artifact was traced back to insufficient convergence of the highly coupled bond constraints in cholesterol, one of the major components in such bilayers.
Here, we used the mechanics of rigid bodies15 to develop an optimization strategy for constraint molecular topologies to achieve quicker constraint convergence with the LINCS algorithm. We did not consider alternative ways of solving the constraint equations or other numerical methods to invert the matrix I – An in LINCS. In the optimization of the constraint topology, we minimized the largest absolute value of the eigenvalues of the An matrix, λmax. We also provide a python script to rapidly evaluate the quality of the constraint topology in terms of λmax (available at https://github.com/bio-phys/constraint-coupling-analysis). We demonstrate the optimization strategy for the Martini 2 cholesterol model. By fully preserving the force field and the dynamics in the limit of infinitesimal time steps and perfect accounting for the constraints, the optimized model reproduces the single-molecule properties of the original model such as the solvent-accessible surface area or the bond/angle/dihedral distributions. With the exception of the largest time step and lowest lincs_order considered, the new model did not develop artificial temperature gradients in the phase-separating bilayer. The optimized model is publicly available at the Martini Web site (http://cgmartini.nl/images/parameters/ITP/martini_v2.0_CHOL_02-optLINCS.itp).
We further investigated the magnitude of the artifacts and their impact on cholesterol–protein interactions using a membrane-embedded β2-adrenergic receptor. Whereas the temperature of the original cholesterol model deviated significantly from the target temperature of the thermostat at larger time steps, there were no significant differences observed in lipid organization and dynamics around the protein between simulations with the original and the optimized cholesterol model. For MD simulations of membrane proteins with the optimized cholesterol model, we recommend the combined use of at least lincs_order = 6 with at most a 30 fs time step.
In the optimization of the cholesterol model, we ensure that the energetic and dynamic properties of the original model are fully maintained. The four additional beads in the constraint topology incur a computational cost. On the other hand, the stricter LINCS settings required for the original Martini 2 cholesterol model also impact the computational cost. In MD simulations of phase-separating ternary lipid mixtures with LINCS settings chosen to ensure similarly small temperature gradients, we achieved comparable performance with the original and optimized cholesterol model in terms of simulated time per wall-clock time (ns/day). The performance advantage of the optimized model increased to ∼30% in the presence of the membrane protein β2AR because the increase in the lincs_order required for the original cholesterol model applies also to constraints in the membrane protein.
We also analyzed the constraint topologies of the Martini 3 small-molecule library17 for the highest λmax, and we performed explicit simulations for the three molecules with the largest eigenvalues λmax that corroborated the eigenvalue analysis. Overall, even for the largest λmax = 0.76 we did not observe appreciable temperature gradients. The analysis and optimization method presented here can be readily incorporated into automatic topology builders and is potentially useful for other constrained molecules as well as rigid-body simulations.
We conclude by emphasizing the generality of the procedure described here to optimize the molecular constraint scaffold for rapid constraint convergence with LINCS. Possible applications include automated topology building of molecules,35,36 e.g., at the Martini 3 level of coarse graining.35,36
Acknowledgments
B.F. acknowledges Wanling Song, Michael Horrell and Mark Sansom for providing the B2AR input files, and Jakob T. Bullerjahn for fruitful discussions about 4D rotations and statistical data analysis. B.F. thanks the Alexander von Humboldt-Foundation for funding. B.F. and G.H. thank the Max Planck Society for support and the Max Planck Computing and Data Facility for computational resources. S.T. and G.H. acknowledge the Center for Multiscale Modeling in Life Sciences (CMMS) sponsored by the Hessian Ministry of Science and Art for funding. S.T. acknowledges the Alfons und Gertrud Kassel Foundation and the Dr. Rolf M. Schwiete Foundation for funding. The algorithm used to sample 4D rotations is described in ref (37).
Data Availability Statement
Simulation input files and analysis scripts of this study are openly available on Zenodo at 10.5281/zenodo.7199702. The optimized topology is available at http://cgmartini.nl, while the python analysis script for the eigenvalues is available at https://github.com/bio-phys/constraint-coupling-analysis.
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.2c01032.
Solvent-accessible surface area of the original and optimized models, mean values and standard deviations of pairwise intramolecular distances of the original and optimized models, cholesterol−β2AR interactions computed by PyLipID, hardware configuration for the comparison of model performance, performance comparison of membrane-embedded β2AR simulations, and temperature differences in the Martini 3 small-molecule library (PDF)
Open access funded by Max Planck Society.
The authors declare no competing financial interest.
Supplementary Material
References
- Marrink S. J.; De Vries A. H.; Mark A. E. Coarse grained model for semiquantitative lipid simulations. J. Phys. Chem. B 2004, 108, 750–760. 10.1021/jp036508g. [DOI] [Google Scholar]
- Marrink S. J.; Risselada H. J.; Yefimov S.; Tieleman D. P.; De Vries A. H. The MARTINI force field: coarse grained model for biomolecular simulations. J. Phys. Chem. B 2007, 111, 7812–7824. 10.1021/jp071097f. [DOI] [PubMed] [Google Scholar]
- Weiner M. D.; Feigenson G. W. Presence and role of midplane cholesterol in lipid bilayers containing registered or antiregistered phase domains. J. Phys. Chem. B 2018, 122, 8193–8200. 10.1021/acs.jpcb.8b03949. [DOI] [PubMed] [Google Scholar]
- Fowler P. W.; Williamson J. J.; Sansom M. S.; Olmsted P. D. Roles of interleaflet coupling and hydrophobic mismatch in lipid membrane phase-separation kinetics. J. Am. Chem. Soc. 2016, 138, 11633–11642. 10.1021/jacs.6b04880. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Marsh D. Cholesterol-induced fluid membrane domains: a compendium of lipid-raft ternary phase diagrams. Biochimica et Biophysica Acta (BBA)-Biomembranes 2009, 1788, 2114–2123. 10.1016/j.bbamem.2009.08.004. [DOI] [PubMed] [Google Scholar]
- Friedman R.; Khalid S.; Aponte-Santamaría C.; Arutyunova E.; Becker M.; Boyd K. J.; Christensen M.; Coimbra J. T. S.; Concilio S.; Daday C.; van Eerden F. J.; Fernandes P. A.; Gräter F.; Hakobyan D.; Heuer A.; Karathanou K.; Keller F.; Lemieux M. J.; Marrink S. J.; May E. R.; Mazumdar A.; Naftalin R.; Pickholz M.; Piotto S.; Pohl P.; Quinn P.; Ramos M. J.; Schiøtt B.; Sengupta D.; Sessa L.; Vanni S.; Zeppelin T.; Zoni V.; Bondar A.-N.; Domene C. Understanding Conformational Dynamics of Complex Lipid Mixtures Relevant to Biology. J. Membr. Biol. 2018, 251, 609–631. 10.1007/s00232-018-0050-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lingwood D.; Simons K. Lipid rafts as a membrane-organizing principle. Science 2010, 327, 46–50. 10.1126/science.1174621. [DOI] [PubMed] [Google Scholar]
- Sengupta P.; Seo A. Y.; Pasolli H. A.; Song Y. E.; Johnson M. C.; Lippincott-Schwartz J. A lipid-based partitioning mechanism for selective incorporation of proteins into membranes of HIV particles. Nat. Cell Biol. 2019, 21, 452–461. 10.1038/s41556-019-0300-y. [DOI] [PubMed] [Google Scholar]
- Levental I.; Levental K. R.; Heberle F. A. Lipid rafts: controversies resolved, mysteries remain. Trends in Cell Biology 2020, 30, 341–353. 10.1016/j.tcb.2020.01.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Thallmair S.; Javanainen M.; Fábián B.; Martinez-Seara H.; Marrink S. J. Nonconverged Constraints Cause Artificial Temperature Gradients in Lipid Bilayer Simulations. J. Phys. Chem. B 2021, 125, 9537–9546. 10.1021/acs.jpcb.1c03665. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Javanainen M.; Fábián B.; Martinez-Seara H.. Comment on “Capturing Phase Behavior of Ternary Lipid Mixtures with a Refined Martini Coarse-Grained Force Field”. arXiv:2009.07767 2020. 10.48550/arXiv.2009.07767 [DOI] [PubMed] [Google Scholar]
- Barnoud J.; Rossi G.; Marrink S. J.; Monticelli L. Hydrophobic compounds reshape membrane domains. PLoS Computational Biology 2014, 10, e1003873 10.1371/journal.pcbi.1003873. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hess B.; Bekker H.; Berendsen H. J.; Fraaije J. G. LINCS: a linear constraint solver for molecular simulations. J. Comput. Chem. 1997, 18, 1463–1472. . [DOI] [Google Scholar]
- Hess B. P-LINCS: A parallel linear constraint solver for molecular simulation. J. Chem. Theory Comput. 2008, 4, 116–122. 10.1021/ct700200b. [DOI] [PubMed] [Google Scholar]
- Laus L.; Selig J. Rigid body dynamics using equimomental systems of point-masses. Acta Mechanica 2020, 231, 221–236. 10.1007/s00707-019-02543-3. [DOI] [Google Scholar]
- Larsson P.; Kneiszl R. C.; Marklund E. G. MkVsites: A tool for creating GROMACS virtual sites parameters to increase performance in all-atom molecular dynamics simulations. J. Comput. Chem. 2020, 41, 1564–1569. 10.1002/jcc.26198. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alessandri R.; Barnoud J.; Gertsen A. S.; Patmanidis I.; de Vries A. H.; Souza P. C.; Marrink S. J. Martini 3 Coarse-Grained Force Field: Small Molecules. Advanced Theory and Simulations 2022, 5, 2100391. 10.1002/adts.202100391. [DOI] [Google Scholar]
- Abraham M. J.; Murtola T.; Schulz R.; Páll S.; Smith J. C.; Hess B.; Lindahl E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1, 19–25. 10.1016/j.softx.2015.06.001. [DOI] [Google Scholar]
- Michaud-Agrawal N.; Denning E. J.; Woolf T. B.; Beckstein O. MDAnalysis: a toolkit for the analysis of molecular dynamics simulations. J. Comput. Chem. 2011, 32, 2319–2327. 10.1002/jcc.21787. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wassenaar T. A.; Ingólfsson H. I.; Böckmann R. A.; Tieleman D. P.; Marrink S. J. Computational lipidomics with insane: a versatile tool for generating custom membranes for molecular simulations. J. Chem. Theory Comput. 2015, 11, 2144–2155. 10.1021/acs.jctc.5b00209. [DOI] [PubMed] [Google Scholar]
- De Jong D. H.; Baoukina S.; Ingólfsson H. I.; Marrink S. J. Martini straight: Boosting performance using a shorter cutoff and GPUs. Comput. Phys. Commun. 2016, 199, 1–7. 10.1016/j.cpc.2015.09.014. [DOI] [Google Scholar]
- Parrinello M.; Rahman A. Polymorphic transitions in single crystals: A new molecular dynamics method. J. Appl. Phys. 1981, 52, 7182–7190. 10.1063/1.328693. [DOI] [Google Scholar]
- Bussi G.; Donadio D.; Parrinello M. Canonical sampling through velocity rescaling. J. Chem. Phys. 2007, 126, 014101. 10.1063/1.2408420. [DOI] [PubMed] [Google Scholar]
- Song W.; Corey R. A.; Ansell T. B.; Cassidy C. K.; Horrell M. R.; Duncan A. L.; Stansfeld P. J.; Sansom M. S. PyLipID: A python package for analysis of protein–lipid interactions from molecular dynamics simulations. J. Chem. Theory Comput. 2022, 18, 1188–1201. 10.1021/acs.jctc.1c00708. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bonomi M. Promoting transparency and reproducibility in enhanced molecular simulations. Nat. Methods 2019, 16, 670–673. 10.1038/s41592-019-0506-8. [DOI] [PubMed] [Google Scholar]
- von Bülow S.; Bullerjahn J. T.; Hummer G. Systematic errors in diffusion coefficients from long-time molecular dynamics simulations at constant pressure. J. Chem. Phys. 2020, 153, 021101. 10.1063/5.0008316. [DOI] [PubMed] [Google Scholar]
- Bullerjahn J. T.; von Bülow S.; Hummer G. Optimal estimates of self-diffusion coefficients from molecular dynamics simulations. J. Chem. Phys. 2020, 153, 024116. 10.1063/5.0008312. [DOI] [PubMed] [Google Scholar]
- Humphrey W.; Dalke A.; Schulten K. VMD: visual molecular dynamics. J. Mol. Graphics 1996, 14, 33–38. 10.1016/0263-7855(96)00018-5. [DOI] [PubMed] [Google Scholar]
- Ingólfsson H. I.; Melo M. N.; Van Eerden F. J.; Arnarez C.; Lopez C. A.; Wassenaar T. A.; Periole X.; De Vries A. H.; Tieleman D. P.; Marrink S. J. Lipid organization of the plasma membrane. J. Am. Chem. Soc. 2014, 136, 14554–14559. 10.1021/ja507832e. [DOI] [PubMed] [Google Scholar]
- Fábián B.; Thallmair S.; Hummer G. Small ionic radii limit time step in Martini 3 molecular dynamics simulations. J. Chem. Phys. 2022, 157, 034101. 10.1063/5.0095523. [DOI] [PubMed] [Google Scholar]
- Balusek C.; Hwang H.; Lau C. H.; Lundquist K.; Hazel A.; Pavlova A.; Lynch D. L.; Reggio P. H.; Wang Y.; Gumbart J. C. Accelerating membrane simulations with hydrogen mass repartitioning. J. Chem. Theory Comput. 2019, 15, 4673–4686. 10.1021/acs.jctc.9b00160. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Loubet B.; Kopec W.; Khandelia H. Accelerating all-atom MD simulations of lipids using a modified virtual-sites technique. J. Chem. Theory Comput. 2014, 10, 5690–5695. 10.1021/ct500100f. [DOI] [PubMed] [Google Scholar]
- Olesen K.; Awasthi N.; Bruhn D. S.; Pezeshkian W.; Khandelia H. Faster simulations with a 5 fs time step for lipids in the CHARMM force field. J. Chem. Theory Comput. 2018, 14, 3342–3350. 10.1021/acs.jctc.8b00267. [DOI] [PubMed] [Google Scholar]
- Lim J. B.; Rogaski B.; Klauda J. B. Update of the cholesterol force field parameters in CHARMM. J. Phys. Chem. B 2012, 116, 203–210. 10.1021/jp207925m. [DOI] [PubMed] [Google Scholar]
- Bereau T.; Kremer K. Automated Parametrization of the Coarse-Grained Martini Force Field for Small Organic Molecules. J. Chem. Theory Comput. 2015, 11, 2783–2791. 10.1021/acs.jctc.5b00056. [DOI] [PubMed] [Google Scholar]
- Graham J. A.; Essex J. W.; Khalid S. PyCGTOOL: Automated Generation of Coarse-Grained Molecular Dynamics Models from Atomistic Trajectories. J. Chem. Inf. Model. 2017, 57, 650–656. 10.1021/acs.jcim.7b00096. [DOI] [PubMed] [Google Scholar]
- Bullerjahn J. T.; Fabian B.; Hummer G.. Efficient generation of random rotation matrices in four dimensions. arXiv, 2023, 10.48550/arXiv.2302.06230. [DOI] [PubMed]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
Simulation input files and analysis scripts of this study are openly available on Zenodo at 10.5281/zenodo.7199702. The optimized topology is available at http://cgmartini.nl, while the python analysis script for the eigenvalues is available at https://github.com/bio-phys/constraint-coupling-analysis.