Toward the Evolutionary Optimisation of Small Molecules Within Coarse-Grained Simulations: Training Molecules to Hide Behind Lipid Head Groups

Sebastian Lütge; Maximilian Krebs; Herre Jelger Risselada

doi:10.1021/acs.jpcb.4c08200

2025 Feb 21;129(9):2482–2492.


PMCID: PMC11891906 PMID: 39984164

Abstract


Exploring the vast chemical space of small molecules poses a significant challenge. We develop a new strategy to efficiently explore this space using coarse-grained toy-like molecules utilizing the Martini3 force field and graph representations. This yields initial proof-of-concept results for the approach enabling the identification of optimal molecules with specific properties targeting lipid bilayers. By leveraging genetic algorithms and coarse-grained molecular dynamics simulations, we demonstrate the potential of our method in designing simple, linear molecules. Our findings show a good convergence toward molecules with weak amphiphilic properties, resembling known (general anesthetic) molecules. While this study demonstrates the feasibility of our method, further refinement is needed to fully realize its potential and explore more complex molecular topologies. Nevertheless, these encouraging results suggest a new path for future research in small molecule discovery and design without relying on extensive data sets.

1. Introduction

The biggest and most commonly cited challenge in the search for new small molecules is the vastness of their solution space. The search space of biologically relevant small molecules is on the order of 10⁶⁰ molecules when both the topological and the chemical properties of molecules are considered.¹ Given that the available data banks usually contain 10⁸ to 10¹² molecules,² the sampling problem is immediately apparent. Furthermore, the graph-based methodology applied to the GDB-17 database reveals that some molecular topologies are disproportionately represented among the molecules uncovered so far, whereas large sections of the search space remain entirely uncovered.³ This suggests that exploration of the small molecule universe is localized: new discoveries frequently occur near existing compounds, and the sampling methods employed lack comprehensive coverage of the entire potential solution space.

Multiple strategies have been developed to effectively reduce the dimensionality of the small molecule space.^4,5 A widely used strategy is to construct a latent space based on relevant descriptors that relate a measured functionality to their physicochemical properties.^4,5 This approach leverages the inherent structure and composition of molecules to encode complex information into a lower-dimensional representation, facilitating easier analysis and interpretation. Techniques such as principal component analysis (PCA) and variational autoencoders (VAEs) are instrumental in this process.⁴ PCA, a statistical method, identifies the principal components that maximize variance in the data set, allowing for the transformation of high-dimensional data into fewer dimensions while retaining the essential characteristics of the molecules. On the other hand, VAEs, a type of generative model, employ encoders and decoders to learn a probabilistic mapping between the input space and a latent space. This method not only reduces dimensionality but also ensures that the encoded representations retain the ability to generate new, plausible molecules. While these methods can successfully generate new molecules, the necessity of constructing a latent space along with variational encoders/decoders inherently requires large data on functionality while limiting the generation to interpolation within existing molecular spaces (applicability domain).

An underexplored method for reducing the complexity of chemical space involves transforming actual molecular structures into coarse-grained approximations.⁶⁻⁸ This technique leverages the overlap in thermodynamic characteristics, such as partitioning free energies between water and octanol, among various chemically mapped groups that typically comprise 2–4 heavy atoms.^9,10 The mapping process is often guided by predefined, top-down parametrization using near-atomic resolution coarse-grained force-fields such as the Martini force-field.¹¹ Notably, the latest iteration, Martini 3,¹² has demonstrated effectiveness in modeling the interactions between small drug-like molecules and proteins across a broad spectrum of systems, including well-studied cases like T4 lysozyme, GPCRs, nuclear receptors, and various enzymes.¹³ The significant computational efficiency gains from coarse-grained models pave the way for high-throughput screening of ligand libraries via molecular dynamics simulations using the Martini model. However, a critical hurdle in applying this approach to accurately screen databases lies in the rapid, automated creation of realistic coarse-grained representations. While several automated parametrization methods exist for deriving coarse-grained models from atomic structures of small molecules,^9,10 they often produce preliminary models that are imprecise and that would require additional fine-tuning of bonded or even nonbonded forces to recover realistic behavior.
Enhancing the fidelity of these models is both challenging and time-intensive, though recent advancements like CGCompiler, which employs discrete particle swarm optimization, offer promising avenues for automated improvement.^14,15 However, since these automated approaches are computationally expensive and additionally rely on the labor-intensive collection of target data (experiments, atomistic simulations or ab initio calculations), fine-tuning the coarse-grained models of thousands, let alone millions, of different small molecules is still not realistic. Therefore, despite the demonstrated potential utility of coarse-graining small molecules within the Martini framework for understanding the relationship between molecular structure and function,⁷ the creation of large and accurate molecular databases is still far from trivial.

The question posed is whether the limiting step of systematic near-atomic coarse-graining in small molecule databases represents an essential, strategic step for exploring new areas in chemical space. A potential strategy is to drastically reduce the dimensionality of the interaction space describing these molecules to very few basic hydrophobic, hydrophilic and charged superatom types. This enables the derivation of fundamental design rules governing functionality within a low dimensional latent space and the exploitation of these design rules to identify potential targets in existing small molecule databases.⁵

Our present study aims to pioneer a novel approach for generating small molecules with near atomic detail without relying on preexisting databases or constructing latent spaces, VAEs and decoders. Instead, we employ graph structures coined “toy-models” that iteratively evolve within the framework of evolutionary algorithms toward functionality convergence, as measured in coarse-grained molecular dynamics simulations. The primary objective is to produce solutions within the toy-model space defined by the Martini 3 force field, assigning meaningful chemical atomistic features to these solutions at a later stage. This approach ultimately shifts the focus from assigning force-field parameters to known small molecules to translating an unknown optimized coarse-grained representation into its likely corresponding ensemble of atomistic structures. Any uncertainties in the accuracy of assigning CG force field parameters to small molecules are inherently moved to the translational step, and thus the Martini 3 representation is exact in the given simulation, aside from the inherent inaccuracy of the Martini force-field. Since near-atomic coarse-grained models represent known chemical groups of atoms, they inherently enable universal translation (i.e., application domain independent) to corresponding atomistic structures. We contend that this approach is feasible and circumvents the challenges of systematically parametrizing small molecule databases or constructing application domain representative latent spaces. Moreover, this generative strategy is particularly adept at sampling new configurations within the small molecule space, as it optimizes molecular space independent of existing databases.
Finally, we argue that even without translating toy-models to atomistic representations, understanding optimal points within the chemically intuitive, resolved near-atomic coarse-grained space can offer valuable, highly detailed insights into the chemical properties that dictate functionality, akin to the well-known concept of pharmacophore models used in rational drug design.

To advance the field of molecular design, we have innovated upon the evolutionary molecular dynamics (Evo-MD) approach,¹⁶⁻¹⁸ initially designed for optimizing amino acid sequences interacting with lipid membranes. By integrating coarse-grained MD simulations with genetic algorithms (GAs), Evo-MD leverages the GROMACS software to iteratively refine molecules. Our present work represents a pioneering application of Evo-MD for the optimization of small molecules, aiming to generate small molecules with specific desired properties. In light of the critical need to advance the targeting of biologically relevant liquid interfaces, such as lipid membranes and membrane-less protein condensates, our initial case study focuses on investigating the potential of optimizing small molecule interactions with lipid membranes. Considering the underdeveloped state of this field for small molecules due to the lack of adequate methods akin to those utilized in structure-based drug design, we propose to test this method and its efficacy in identifying optimal small molecule properties. Specifically, our attention will be directed toward the physics-driven generation of molecules which exhibit a preference for positioning themselves within the hydrophobic region near the lipid headgroups in biological membranes. More precisely, they target the area where the oil–water surface tension in the lateral pressure profile of lipid bilayers attains its maximum value.¹⁹⁻²¹ The selective distribution of small molecules in this area leads to a decrease in oil–water surface tension while simultaneously increasing the effective surface area of lipid head groups and the membrane thickness.¹⁹⁻²¹ This mechanism has significant implications for the phase transition temperatures of lipid membranes,²² plays a crucial role in determining the potential of these molecules to induce positive curvature in the lipid bilayer,²³ and holds importance in understanding the actions of general anesthetics. 
This is because anesthetics’ effectiveness correlates with their ability to influence phase potential transitions and separations within lipid membranes.^22,24,25 Additionally, there is a hypothesis suggesting that the distribution of small molecules in this region influences the opening of membrane channels by altering the pressure profile and the concomitant work involved with channel opening.²⁶ Our study primarily aims to demonstrate the feasibility of using the Evo-MD framework in the targeted design of small molecules that can modulate specific structural and dynamic membrane properties.

In this first study, we explore the directed evolution of linear structures composed of 840 different Martini3²⁷ bead types, allowing these structures to adjust their dimensions autonomously. These linear chains serve as simplified models for well-known general anesthetics, including various known long-chain alcohol compounds.²⁸ Building upon this foundation, our subsequent research will introduce a novel module within Evo-MD termed the small molecule generator (SMOG). Currently in development, SMOG aims to enhance our methodology by incorporating the ability to sample topological spaces within the framework of GAs. Our study demonstrates that Evo-MD evolves toward realistic molecules that mimic features akin to recognized general anesthetic molecules. Furthermore, we establish that these molecules significantly influence the lateral pressure profile of lipid membranes. Additionally, we observe that the molecular length of these linear chains tends toward an optimal size; however, this ideal length varies depending on the fitness function employed to guide the evolutionary process. Our research opens avenues for uniquely and strategically exploring the relationships between molecular features and their ability to alter both structural and dynamical properties of lipid membranes.

2. Methods

To address the challenge of uncovering new molecules in the CG search space of Martini3, the recently developed Evo-MD framework is extended to handle the directed evolution of small molecules. To this end, we introduce a SMOG that facilitates topological crossover operations and mutations within nonlinear graph topologies. The molecules generated by SMOG are then simulated with the GROMACS²⁹ simulation software. Our sampling task is somewhat analogous to a previous study, which assessed the log P of molecules via brute-force sampling of all possible bead combinations for molecules consisting of only two interaction sites.^30,31 That analysis was done within the former Martini 2 model.³² However, the search space of linear chains up to a length of 10 interaction sites within the newer Martini3 model²⁷ is vast, comprising about 840¹⁰ (on the order of 10²⁹) possible sequences. Brute-force sampling is therefore not possible, and the sampling requires an efficient directed scheme to uncover optima within this high dimensional search space. The details of the procedure are explained in the following paragraphs.
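The size of this search space can be checked with a few lines of arithmetic. The sketch below assumes 840 selectable bead types and chain lengths from 1 to 10, as quoted above, and counts ordered bead sequences; the names `N_BEAD_TYPES`, `MAX_LENGTH` and `n_sequences` are introduced here for illustration only.

```python
# Count ordered bead sequences for linear chains.
# Assumptions: 840 bead types (figure quoted in the text), lengths 1..10.
N_BEAD_TYPES = 840
MAX_LENGTH = 10


def n_sequences(length: int, n_types: int = N_BEAD_TYPES) -> int:
    """Number of ordered bead sequences with exactly `length` beads."""
    return n_types ** length


# Chains of exactly the maximal length.
fixed_length = n_sequences(MAX_LENGTH)

# Chains of any length from 1 bead up to the maximal length.
up_to_max = sum(n_sequences(n) for n in range(1, MAX_LENGTH + 1))

print(f"length {MAX_LENGTH} only : {fixed_length:.3e}")
print(f"lengths 1 to {MAX_LENGTH}: {up_to_max:.3e}")
```

Because a linear chain read backwards describes the same molecule, the number of distinct molecules is roughly half of this count, which does not change the order of magnitude.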

2.1. Evo-MD

The standard procedure of Evo-MD, described and designed by Methorst et al.,³³ consists of multiple steps to achieve evolutionary convergence. A purely random initial population is generated from a defined list of genes. Here, the genes are all possible bead types within the Martini3²⁷ model except its water type. We employ SMOG to generate systems comprising 50 identical linear chain molecules, with the chain length and bead type combination differing between systems. Each system further contains a POPC membrane consisting of 128 lipids and 1928 water beads, to which the 50 identical chains are added. This procedure is repeated iteratively for each new generation in the course of evolution. The fitness value for each system is calculated by ensemble averaging over the last 100 ns of a production run of at least 600 ns, and the population is sorted accordingly. After the necessary genetic steps of crossover and mutation, a new population can be evaluated with the same simulation setup. For simple, chain-like molecules the genetic operations are straightforward: the crossover simply cuts two parent chains and assembles a new molecule from one part of each. Mutations can occur as point mutations, in which only the bead type at a randomly chosen position changes. In addition, a bead can be deleted from or added to the chain, subject to the maximal allowed length of a molecule. An overview of the whole procedure is depicted in Figure 1.

Figure 1. Schematic depiction of the Evo-MD³³ workflow. In the first step a random population of small molecules is generated. In this example, the molecules consist of simple linear chains with a maximal length of 10 beads. This population is evaluated and ranked with respect to the fitness of each sampled test molecule. Recombination with different genetic steps is conducted on the best N performers to create a new population. This cycle is repeated until the fitness converges or a cutoff condition is reached. Adapted from ref (18). Copyright 2024 American Chemical Society.
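The genetic operations described above for linear chains can be sketched as follows. This is a minimal illustration and not the Evo-MD or SMOG implementation: the bead list is a small hypothetical subset of names, the function names are invented here, and truncating an over-long child to the maximal length is an assumption.

```python
import random

# Hypothetical subset of bead names, for illustration only; the study
# draws from all Martini 3 bead types except water.
BEAD_TYPES = ["C1", "SC1", "TC1", "TN1dq", "SN1dq", "TN3aq", "P6r", "SP6r", "TP6r"]
MAX_LENGTH = 10


def crossover(parent_a, parent_b, rng):
    """One-point crossover: head of parent_a joined to tail of parent_b."""
    cut_a = rng.randint(1, len(parent_a))  # keep at least one bead of parent_a
    cut_b = rng.randint(0, len(parent_b))
    child = parent_a[:cut_a] + parent_b[cut_b:]
    return child[:MAX_LENGTH]  # assumption: over-long children are truncated


def point_mutation(chain, rng):
    """Replace the bead type at one random position with a different type."""
    pos = rng.randrange(len(chain))
    alternatives = [b for b in BEAD_TYPES if b != chain[pos]]
    mutated = list(chain)
    mutated[pos] = rng.choice(alternatives)
    return mutated


def insert_bead(chain, rng):
    """Add one random bead unless the chain is already at maximal length."""
    if len(chain) >= MAX_LENGTH:
        return list(chain)
    pos = rng.randint(0, len(chain))
    return chain[:pos] + [rng.choice(BEAD_TYPES)] + chain[pos:]


def delete_bead(chain, rng):
    """Remove one random bead unless only a single bead is left."""
    if len(chain) <= 1:
        return list(chain)
    pos = rng.randrange(len(chain))
    return chain[:pos] + chain[pos + 1:]


rng = random.Random(0)
parent_a = ["C1", "C1", "SN1dq", "TP6r"]
parent_b = ["TN1dq", "SN1dq", "SN1dq", "TN3aq", "C1", "C1"]
print(crossover(parent_a, parent_b, rng))
print(point_mutation(parent_a, rng))
```

Representing a chain as a plain list of bead names keeps all four operations to simple slicing, which is what makes the linear case easier than general graph topologies.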

The evolutionary hyperparameters are set to a population size of N_Population = 288 or 128, depending on the system and the test being conducted, with N_Parents = 56 or 32 parents, respectively. With this setup the fitness score converges in roughly 20 iterations; this was tested for 20 different initial populations.
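How the population size and the number of parents enter one generation step can be sketched as below. The fitness here is a toy stand-in for the value obtained from MD (it simply rewards chains close to 5 beads), chains are tuples of placeholder letters, and all function names are hypothetical. Whether parents are carried over into the next population is not stated in the text, so the sketch returns parents and children separately.

```python
import random

N_POPULATION = 128  # one of the two population sizes quoted in the text
N_PARENTS = 32      # the matching number of parents
MAX_LENGTH = 10


def toy_fitness(chain):
    """Stand-in for the MD-derived fitness (assumption): prefer 5 beads."""
    return -abs(len(chain) - 5)


def next_generation(population, fitness_of, rng,
                    n_parents=N_PARENTS, n_population=N_POPULATION):
    """Rank by fitness, keep the best n_parents, refill by one-point crossover."""
    ranked = sorted(population, key=fitness_of, reverse=True)
    parents = ranked[:n_parents]
    children = []
    while len(children) < n_population:
        a, b = rng.sample(parents, 2)
        cut_a = rng.randint(1, len(a))
        cut_b = rng.randint(0, len(b))
        children.append((a[:cut_a] + b[cut_b:])[:MAX_LENGTH])
    return parents, children


rng = random.Random(1)
# Random initial population of chains built from placeholder "genes".
population = [
    tuple(rng.choice("ABC") for _ in range(rng.randint(1, MAX_LENGTH)))
    for _ in range(N_POPULATION)
]
parents, children = next_generation(population, toy_fitness, rng)
print(len(parents), len(children))  # 32 128
```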

2.2. Fitness-Function Design

Ideally, the lateral pressure profile of a membrane is calculated explicitly (e.g., using the GROMACS_LS implementation from mdstress^34,35). The difference between a disturbed and a nondisturbed membrane can then be used to measure the effect of a SMOG molecule in the system. However, the calculation of the pressure profile requires high precision trajectories for every system, including the velocities and forces needed to calculate the pressure. These trajectories then have to be rerun to evaluate all information. This process requires significant storage for the trajectories and stalls the iterative process of evolution while waiting for all rerun calculations to finish. Thus, it is not feasible for a high-throughput environment such as Evo-MD. Therefore, we take a slightly different approach to approximate the impact of a molecule on the membrane. As an external constraint that accounts for the chain length of a tested molecule, the fitness value is normalized by the molecule’s size. The size is expressed as a weight summed over all beads in the molecule. Martini 3 has three bead sizes, regular, small and tiny, describing four, three and two heavy atoms, respectively; the weight is the sum of these values over all beads in the molecule. This normalization ensures that the fitness does not depend on the size of the molecules, as the evolution could otherwise escape into generating ever longer molecules to maximize the impact on the membrane.
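The size-based weight described above can be illustrated with a short sketch. It assumes that bead size can be read from the bead name, with an "S" prefix for small and a "T" prefix for tiny beads, following the Martini 3 naming convention; the function names are hypothetical.

```python
# Heavy atoms represented per bead size, as stated in the text.
SIZE_WEIGHT = {"regular": 4, "small": 3, "tiny": 2}


def bead_size(bead_name: str) -> str:
    """Infer the bead size from its name (assumed prefix convention)."""
    if bead_name.startswith("S"):
        return "small"
    if bead_name.startswith("T"):
        return "tiny"
    return "regular"


def molecule_weight(sequence) -> int:
    """Sum of the per-bead size weights over the whole chain."""
    return sum(SIZE_WEIGHT[bead_size(bead)] for bead in sequence)


# Example sequence taken from the text: two tiny and two small beads.
print(molecule_weight(["TN1dq", "SN1dq", "SN1dq", "TN3aq"]))  # 2 + 3 + 3 + 2 = 10
```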

In our example, we use changes in the lateral density profile imposed by small molecules as an approximation for the concomitant changes in the lateral pressure profile. The hypothesis is the following: If a molecule shifts the pressure profile, it has to locate in close proximity to the head-groups of the lipids in the membrane. This fitness value is computationally far less expensive and therefore better suited for the Evo-MD framework. To generate such a bias, a pressure profile calculation is conducted and the resulting pressure profile smoothed, normalized and shifted to represent values between 0 and 1. This process is shown in Figure 2. The fitness is then assigned by calculating
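A form of eq 1 consistent with the description that follows can be written as shown below. The symbols ρ_i (density of the small molecule in slice i) and b_i (rescaled pressure-profile bias in slice i) are names assumed here for illustration; N, m_M and the 223 slices are taken from the text.

```latex
F = \frac{1}{N \, m_M} \sum_{i=1}^{223} \rho_i \, b_i \tag{1}
```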

Eq 1 biases the density profile with the location of the pressure maxima of the membrane, normalizes by the calculated weight m_M of the given molecule times the number of molecules N, and sums the discrete contributions i along slices of the z-axis. To ensure a sufficiently high resolution of the pressure profile, the z-axis is discretized into 223 segments in eq 1. The limiting factor for this precision is the sampling of the pressure profile, which becomes more expensive the finer the discretization.

Figure 2. Pressure profile of a simple POPC membrane consisting of 128 lipids. Local pressures are calculated using mdstress.^34,35 (upper) Lateral pressure profile of the reference state used to calculate the fitness for a new molecule. The colored lines are the three individual components of the local pressure profile. The black line (“bias”) shows the lateral pressure profile rescaled to a value between 0 and 1. The oil–water surface tension peaks occur at z = 3 and z = 7. (lower) Comparison between the density profile of a small molecule within the lipid bilayer (a simple alcohol) and the bias pressure profile. The density based fitness seeks to align the weighted average position of the molecule with the oil–water surface tension peak. In this example, the molecule penetrates deeper into the membrane than desired.

The density is always referenced as the density of the whole system including the membrane. The fitness penalises molecules further away from the headgroups and enforces molecules or sections within molecules to come in closer proximity to the defined density reference. Our calculations showed that swelling of the membrane with newly added small molecules near the headgroups only slightly shifts the position of the reference.
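The density-based fitness can be sketched as a bias-weighted sum over z-slices. The profiles below are synthetic Gaussians rather than simulation data, the slice indices of the peaks are arbitrary, and the exact functional form is an assumption based on the verbal description of the fitness (bias-weighted density, normalized by the molecular weight times the number of molecules).

```python
import math

N_SLICES = 223  # number of z-slices quoted in the text


def rescale_to_unit_interval(profile):
    """Shift and scale a profile so its values lie between 0 and 1."""
    lo, hi = min(profile), max(profile)
    return [(p - lo) / (hi - lo) for p in profile]


def density_fitness(density, bias, n_molecules, molecule_weight):
    """Bias-weighted density summed over slices, normalized by N * m_M."""
    if len(density) != len(bias):
        raise ValueError("density and bias must share the same z-grid")
    overlap = sum(rho * b for rho, b in zip(density, bias))
    return overlap / (n_molecules * molecule_weight)


def gaussian(i, centre, width):
    return math.exp(-(((i - centre) / width) ** 2))


# Synthetic reference profile with two peaks (one per leaflet).
raw_pressure = [gaussian(i, 67, 10) + gaussian(i, 156, 10) - 0.2
                for i in range(N_SLICES)]
bias = rescale_to_unit_interval(raw_pressure)

# Two synthetic molecule densities with the same total amount of material.
near_heads = [gaussian(i, 67, 8) + gaussian(i, 156, 8) for i in range(N_SLICES)]
in_centre = [2.0 * gaussian(i, 111, 8) for i in range(N_SLICES)]

fit_heads = density_fitness(near_heads, bias, n_molecules=50, molecule_weight=10)
fit_centre = density_fitness(in_centre, bias, n_molecules=50, molecule_weight=10)
print(fit_heads > fit_centre)  # True
```

A molecule whose density sits under the bias peaks scores higher than one that accumulates in the bilayer centre, which is the behavior the fitness is meant to reward.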

The definition of fitness based on explicit pressure profile calculations compares the difference in absolute peak height between the reference calculation (membrane system only) and the system with the small molecules inserted. To discern a molecule’s effect on the pressure profile, we consider the average of the heights of the two oil–water surface tension peaks within the pressure profile. The additional consideration of peak width does not significantly change the results and is therefore not considered further.
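A minimal sketch of this peak-height comparison is given below. Locating each peak as the maximum within a fixed index window, and defining the fitness as reference minus perturbed (so that a lowered peak gives a positive value), are both assumptions made for illustration; the profiles are toy numbers.

```python
def mean_peak_height(profile, lower_window, upper_window):
    """Average of the maxima found in two index windows, one per leaflet."""
    lower_peak = max(profile[lower_window[0]:lower_window[1]])
    upper_peak = max(profile[upper_window[0]:upper_window[1]])
    return 0.5 * (lower_peak + upper_peak)


def pressure_fitness(reference, perturbed, lower_window, upper_window):
    """Reduction of the mean peak height relative to the pure membrane."""
    ref = mean_peak_height(reference, lower_window, upper_window)
    per = mean_peak_height(perturbed, lower_window, upper_window)
    return ref - per  # assumed sign convention: positive = peaks lowered


# Toy profiles: two peaks each, lowered in the perturbed system.
reference = [0, 1, 5, 1, 0, 1, 6, 1, 0]
perturbed = [0, 1, 3, 1, 0, 1, 4, 1, 0]
print(pressure_fitness(reference, perturbed, (0, 4), (5, 9)))  # 2.0
```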

2.3. Simulation Details and Settings

The molecular dynamics simulation settings and mdp parameters for the GROMACS engine, on which the Evo-MD method is based, are chosen as described below.

The initially generated molecules are first energy minimized with the steep integrator until emtol = 20 and then further with the cg integrator until emtol = 5. After a short equilibration with the md integrator for 75,000 steps with a time step dt = 0.05 fs in vacuum, the molecule is inserted into the membrane. Next, an equilibration with a very small time step dt of 0.05 fs is conducted to prevent overlapping coordinates from destabilizing the system. Here, the temperature of the system is also set to 310 K with the v-rescale thermostat and the pressure to 1 bar with the berendsen barostat.

In the following production run, the md-vv integrator is used to simulate the system for at least 600 ns with a time step of 20 fs, which is in line with the Martini3 suggestions. Here, the nose-hoover coupling is used to set the system temperature to 300 K and a semi-isotropic Parrinello–Rahman barostat to keep the pressure at 1 bar. These proved to be the most robust settings for pressure profile calculations with mdstress at a later stage.
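The production settings quoted above translate into step counts and an mdp fragment roughly as sketched below. Only values stated in the text are filled in; coupling time constants, compressibilities and coupling groups are not given and are therefore omitted, so the fragment is incomplete. The helper names are hypothetical, and whether a given GROMACS version accepts every combination listed is not verified here.

```python
FS_PER_PS = 1000.0


def steps_for(duration_ns: float, dt_fs: float) -> int:
    """Number of MD steps needed to cover duration_ns at a time step of dt_fs."""
    return round(duration_ns * 1e6 / dt_fs)


def render_mdp(params: dict) -> str:
    """Format a parameter dictionary as 'key = value' lines."""
    return "\n".join(f"{key:<12}= {value}" for key, value in params.items())


production = {
    "integrator": "md-vv",
    "dt": 20 / FS_PER_PS,              # mdp time steps are given in ps
    "nsteps": steps_for(600, 20),      # at least 600 ns at 20 fs
    "tcoupl": "nose-hoover",
    "ref_t": 300,                      # K
    "pcoupl": "parrinello-rahman",
    "pcoupltype": "semiisotropic",
    "ref_p": "1.0 1.0",                # bar, lateral and normal
}

print(render_mdp(production))
print("steps in the final 100 ns averaging window:", steps_for(100, 20))
```

At a 20 fs time step, 600 ns corresponds to 30,000,000 steps, of which the last 5,000,000 fall into the 100 ns window used for ensemble averaging of the fitness.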

3. Results and Discussion

3.1. Systematic Fitness Comparison

Our primary goal is to utilize a simple yet effective fitness function to guide the development of molecules efficiently. We have chosen a density profile-based fitness function over one based on pressure profiles. This decision was made because the density profile approach enables rapid and accurate determination of fitness values, thereby streamlining the evaluation of small molecules within an evolutionary algorithm framework. To assess the efficacy of our simplified methodology compared to traditional pressure profile calculations, we performed a comparative analysis using simple, poly alcohol like molecules. These molecules were constructed using various polar bead types attached to chains of different apolar bead types within the Martini 3 model. Our experiments involved systematically varying the alkyl chain lengths to observe the impact on both fitness functions. Figure 3 visually represents the outcomes of our investigation. Our analysis revealed the existence of an alkyl length-dependent optimum for both fitness functions. This implies that the fitness value achieves a maximum at a particular alkyl chain length, with this optimal length being distinctive for each polar bead and alkyl chain combination. Yet, the positioning of these optima tends to diverge between the two methodologies. Furthermore, this effect depends on the chemical nature of the polar headgroup within the poly alcohol. The density-based fitness function commonly identifies an optimum with longer alkyl chains, ranging between 2 and 5 alkyl chain beads. For TP6r headgroup types, the pressure profile-based optimum favors molecules consisting of only a single alkyl chain bead. In contrast, for SP6r and P6r headgroup types, the pressure profile-based optimum favors longer molecules with lengths exceeding 5 alkyl chain beads. 
A critical aspect of our methodology is its emphasis on a consistent position within the lipid bilayer, aligning with the oil–water surface tension peak observed in the reference bilayer. Nonetheless, this targeted position may adjust due to the integration of small molecules and the consequent swelling of the membrane. It is essential to acknowledge that while small molecules evolved to reduce the oil–water surface tension peak through close association with lipid head groups, this behavior does not inherently lead to superior performance in minimizing pressure profiles. Essentially, these represent separate objectives, even though the molecules produced are anticipated to possess overlapping characteristics, with a notable crossover in fitness trends. Given our focus on method development, we will exclusively employ the density-based fitness function to guide the evolution of small molecules throughout the remainder of this project. Subsequently, we will retrospectively analyze how these outcomes influence the pressure profile for the best performers.

Figure 3. Effects of fitness function. Comparison of a wide selection of poly alcohol molecules with the density based fitness versus the pressure profile based fitness as a function of alkyl length. Sequences are denoted as the Martini bead type for the backbone and its repetition number N, similar to the notation of polymers, followed by the polar bead. For both fitness functions a clear optimum in alkyl length is observed, though the position of this optimum differs between fitness functions. Generally, the pressure profile based fitness favors molecules consisting of only one alkyl chain bead whereas the density based fitness favors longer molecules.

3.2. Evo-MD

To evaluate the consistency of evolutionary convergence among small molecules, we initiated 20 Evo-MD simulations, each beginning from distinct initial conditions, namely, different initial genetic pools. The evolution is designed such that it favors molecules that bind closely behind the lipid head groups. Highly hydrophobic molecules tend to cluster in the lipid tails, where the density profile reaches its highest point between the headgroup peaks. Conversely, highly hydrophilic molecules exhibit a denser concentration in the aqueous environment, outside the membrane. Regardless of their nature, both types experience a penalty in the form of evolutionary suppression due to this behavior. Consequently, the evolution converges to molecules that preferentially bind behind the lipid head groups. An example of such a small molecule system is depicted in Figure 4.

Figure 4. Visualization of the combined small molecule and membrane system formed within an Evo-MD production run. (left) The SMOG generated molecules are highlighted in yellow and red according to bead type, and the water is shown in blue. The POPC lipid bilayer is depicted as lines with gray carbon tails and orange headgroups. The molecule shown in this snapshot consists of four beads with the sequence TN1dq–SN1dq–SN1dq–TN3aq. (right) The corresponding density profile (orange) and its overlap with the rescaled pressure profile (black) of the reference membrane system. Note that the evolution seeks to match the density peak of small molecules with the oil–water surface tension peak.

Figure 5 illustrates the evolution of linear chain molecules with varying lengths toward an optimal performer. Notably, we found that the molecules converge toward an optimal length of 5 or 6 coarse-grained interaction sites. An example of such a coarse-grained structure of an optimal performer is given in Figure 6. This convergence toward an optimal length below the maximal allowed chain length of 10 interaction sites is in line with the simulations of simple poly alcohols in which the chain length was systematically increased (Figure 3), although for poly alcohols the optimal length was found to be somewhat shorter, typically consisting of 3–4 interaction sites. This distinction arises because an unconstrained evolution tends to favor the repetitive occurrence of slightly polar groups within alkyl chains (e.g., see Figure 5), which seemingly favors longer molecules than poly alcohols consisting of an alkyl chain with a fixed apolar bead type. The occurrence of repetitive slightly polar groups evidently provides an evolutionary advantage in optimally aligning the molecules at the targeted location within the lipid membrane.

Figure 5. Convergence of evolution for small molecules with freely varying lengths. (upper) Results for the evolution of linear chains allowing a maximal length of 10 beads using the density based fitness function. The overall result comprises 20 Evo-MD runs in total starting from different initial populations. Each of the populations converges toward an optimum with shared bead types. (lower) Retrospective comparison to explicit pressure profile calculations performed on the top 7 sequences of each iteration. Though convergence of the fitness is observed, the occurrence of generation-best performers with higher fitness values (red colored points) early in the evolution suggests that better solutions consisting of much shorter molecules exist but are not favored by the density based fitness function. The reduction of the error bars in the course of evolution is due to a gradual decrease in molecular diversity.

Figure 6. Visualization of a well-performing toy-model molecule obtained from Evo-MD. Its potential atomic representation is not unique, and the translation step is currently a work in progress. The toy model’s chemical sequence is given as TN1dq–SN1dq–SN1dq–TN3aq within the Martini 3 model.²⁷ The coarse grained structure exhibits a combination of both more hydrophobic (+, red color) and more hydrophilic (−, blue color) chemical features. This indicates that the hydrophilic interaction sites are located toward the center of the molecule, which contrasts with the structure of the previously examined example alcohols. The size of the bead types shown indicates the differences in bead size within the simulations.

3.3. Pressure-Profile Comparison

In this work the evolution of small molecules was optimized using a density based fitness function. It is crucial to restate that guiding evolution via the calculation of local pressure inherently involves large noise and thus demands extensive sampling to accurately reveal significant differences in fitness when ranking molecules within the framework of GAs. Consequently, tracking pressures and differences therein during evolution requires substantial computational expense as well as temporary storage of large simulation trajectories to precisely detect minor variations in fitness, especially near the thermodynamic optimum where the differences in fitness become subtle. Due to these challenges, focusing directly on predefined locations within lipid membranes proves more practical than attempting to calculate local pressures, since it allows for faster and more accurate averaging of fitness values.

To also assess molecular performance through explicit, precise pressure profile calculations, we retrospectively analyzed the pressure profiles of the top 7 performers from each iteration. These recalculations focused on their ability to reduce the oil–water surface tension peak relative to the system’s hydrostatic pressure. The results of this analysis are depicted in Figure 5. An interesting observation from the evolutionary plot is the decline in genetic diversity across iterations. As the genetic pool converges toward an optimal solution, diversity diminishes, and the pressure calculation results within the genetic pool become increasingly uniform, as is evident from a constant fitness value and a simultaneous reduction of the variance between different molecules. This similarity stems from the molecules’ growing homogeneity in both chemical properties and length.

Therefore, the convergence of fitness toward a fixed value offers limited insight into actual performance. Instead, comparing the fitness values at evolutionary convergence, which lie around 1.0, with those observed in our poly alcohol benchmark systems (see Figure 3) provides more valuable information. The analysis of six poly alcohol combinations reveals that a fitness value of 1.0 occupies the highest range of achievable fitness values (optimal performance ranges from 0.5 to 1.2). While this solution demonstrates strong performance in reducing the oil–water interface, it likely falls short of optimal performance in terms of pressure profile peak reduction.

3.4. Translation to Atomistic Representation

As evolution successfully converges toward optimal performers, the next challenge is to translate the interaction features of these resolved molecules into a tangible chemical representation. To address this, we compile a list of common bead types identified within the top-performing molecules. This compilation is presented in Table S1. Given that previous studies (e.g., ref (6)) have not comprehensively covered the chemical translation of all bead types contributing to the optimal solutions, the translations remain speculative attempts. Considering the composition of linear molecules, it is plausible that they consist of apolar carbon backbones complemented by more polar nitrogen and oxygen groups. This arrangement helps maintain a balance between hydrophobic and hydrophilic properties. Furthermore, the 4:1 and 2:1 mappings of heavy atoms in the Martini model suggest the potential inclusion of side chains such as alcohol groups (−OH), ether groups (C=O), and inorganic atoms (such as chlorine and fluorine). Such structures are likely to be stable in environments containing both lipids and water, allowing the molecule to locate near the membrane surface without sinking due to excessively long carbon chains. Additionally, incorporating inorganic chemical groups into the translation might interpret these more polar interaction types as fluorinated carbon groups (fluoranes), commonly found in a broad category of general anesthetics.

Translating the example molecule depicted in the simulation snapshot (Figure 4), which is built from the sequence depicted in Figure 6, into an atomistic representation is currently beyond the capabilities of our SMOG generator tool. Assuming we loosely map the chemical groups encoded by the Martini 3 bead types, as listed in Table S1, this translational step can be performed for future samples. It has to be noted that, because of the mapping nature of the Martini force field, this translation will not result in one unique representation. Instead, it will generate an ensemble of possible atomistic molecules, each associated with its own distinct likelihood. After translating the Evo-MD molecules into the atomistic world, future work can also probe our findings further, using simulations featuring atomic-scale variants or even laboratory experiments. These measurements could provide further insights and validation, helping us refine our understanding of the complex processes involved.

This resolved molecule serves as a proof of concept that the membrane is a targetable system for Evo-MD and that convergence within the solution space of small molecules is achievable at least up to a molecular size of 10 interaction sites. We anticipate that such convergence would remain achievable even if we allow for larger diversity by including different topologies within the genetic pool, though convergence is expected to be more challenging because of the larger number of degrees of freedom. This suggests that effective sampling of most of the potentially existing small molecules consisting of up to 40 heavy atoms would be realistic. Furthermore, our results on optimal performers show that the obtained solutions are intuitive and well understood in terms of the required physicochemical features, though the translation toward a corresponding ensemble of atomistic chemical representations remains a future improvement.

3.5. Discussion

Our study aimed to optimize or simply generate molecules with linear topologies that specifically target membrane regions identified by the peak of the oil–water surface tension, as revealed through the lateral pressure profile of the lipid bilayer. The molecules, designed to reduce the oil–water surface tension of lipid bilayers, can essentially be considered surfactants for lipid assemblies. Therefore, the absorption of these molecules is expected to modify critical structural properties of lipid membranes, such as the area occupied by each lipid and the overall thickness of the membrane.¹⁹,²⁰ These modifications will, in turn, impact the dynamic, elastic, and thermodynamic behaviors of lipid membranes.

In the context of comparing the efficiency of evolutionary optimization with random sampling, the vastness of the targeted solution space plays a crucial role. Our evolutionary directed sampling procedure generated tens of thousands of unique molecules, which is several orders of magnitude smaller than the actual solution space of about 5¹². This disparity highlights the limitations of random sampling methods, which struggle to efficiently explore such vast solution spaces. In contrast, the GA employs a more sophisticated approach. It initiates multiple random and independent starting positions within the solution space, allowing the evolutionary algorithm to explore its optimal solutions given the initial conditions. Interestingly, all random starting populations eventually converged toward a common subset of coarse-grained Martini 3 beads, demonstrating the algorithm’s effectiveness in sampling the entire search space efficiently. This approach is particularly advantageous in complex topologies, where the solution space becomes even more intricate.

The small molecules generated in this study are anticipated to possess characteristics and molecular attributes somewhat similar to those of known general anesthetics. Specifically, these compounds likely feature one or more weakly polar functional groups combined with short to intermediate-length apolar alkyl chains. It has been hypothesized that general anesthetics modulate the lateral pressure profile within cell membranes by preferentially positioning themselves close to the polar lipid head groups.¹⁹,²⁰,²⁶,³⁶,³⁷ This localization could potentially alter the structure and function of membrane proteins. Such alterations may involve changes in the free energy required for channel activation and associated shifts in protein conformation.²⁶,³⁶ These modifications could have significant implications for cellular processes and overall organismal physiology.

The emergence of the protein-focused hypothesis on general anesthetics has significantly challenged traditional lipid-centric views.³⁸ Notably, the observation of a cutoff effect in long-chain alcohols, where anesthetic effectiveness decreases beyond a certain chain length, suggests the specific binding of anesthetic molecules into confined pockets within membrane proteins and challenges lipid-centric models.²⁸ However, lipid-centric models can still explain this cutoff effect through differences in membrane partitioning behavior between short-chain and long-chain alkanols.³⁹ Short-chain alkanols exhibit rigid segments near the hydroxyl group, efficiently transferring stress from the membrane core to its surface due to their proximity to the water interface. Conversely, long-chain alkanols become more flexible as the chain lengthens, reducing the efficiency of stress redistribution and leading to diminished anesthetic effects beyond a certain chain length. Polyalkanols mimic short-chain alkanols’ anesthetic properties if the distance between adjacent hydroxyl groups is less than the cutoff. This concept is supported by experimental evidence, as hexadecanetetraol and octadecanetetraol showed significant anesthetic activity.⁴⁰

Our Evo-MD simulations employing toy models hinted toward the potential existence of a cutoff length akin to the known characteristics of general anesthetic molecules. However, in contrast to a direct pressure profile calculation, the directed evolution process in our simulations targets preferred molecular positions within lipid bilayers to indirectly lower the oil–water surface tension peak within pressure profiles. We found, however, that such an alternative fitness strategy biases the precise location of this optimum. Further research will therefore focus on overcoming the computational hurdles involved in directly optimizing pressure profiles in our Evo-MD simulations. Additionally, the normalization of fitness may influence the location or even the existence of the optimal length. Maintaining a constant number of anesthetic molecules could unintentionally result in maximizing molecular length, thus maximizing their impact on the membrane’s pressure profile. To mitigate this, it is essential to adjust fitness metrics based on the total number of beads representing all anesthetic molecules or to maintain a constant bead count while adjusting the number of anesthetic molecules. A more equitable approach might involve running these simulations under conditions of constant chemical potential.
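The two normalization options mentioned here can be sketched in a few lines. This is an illustrative sketch only, not the fitness function used in this work: the function names, the per-bead division, and the bead budget of 120 are assumptions introduced for the example.

```python
def fitness_per_bead(raw_fitness: float, n_molecules: int,
                     beads_per_molecule: int) -> float:
    """Option 1 (hypothetical form): divide the raw fitness by the total
    number of beads contributed by all inserted molecules."""
    total_beads = n_molecules * beads_per_molecule
    if total_beads <= 0:
        raise ValueError("total bead count must be positive")
    return raw_fitness / total_beads


def molecules_for_bead_budget(bead_budget: int,
                              beads_per_molecule: int) -> int:
    """Option 2: keep the total bead count (approximately) constant by
    adjusting how many molecules of a given length are inserted."""
    if beads_per_molecule <= 0:
        raise ValueError("beads_per_molecule must be positive")
    return bead_budget // beads_per_molecule


# With an assumed budget of 120 beads, longer chains mean fewer molecules.
for length in (4, 6, 10):
    print(length, molecules_for_bead_budget(120, length))
```

Under either option, a longer molecule no longer gains fitness simply by adding more interaction sites to the membrane, which is the bias described above for a constant number of molecules.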

Our study primarily focuses on pioneering the concept of evolving coarse-grained representations of small molecules using evolutionary algorithms. Although our current investigation is limited to simple linear chains of varying lengths, even this basic concept provides a powerful tool for discerning essential molecular features relevant to optimizing desired functionality. Notably, the solutions obtained serve as an intuitive proxy for the corresponding atomistic solutions. While our focus has been on optimizing lipid membrane targeting, this concept can be readily extended to include binding pockets within proteins and protein–protein interfaces. The feasibility of this extension relies on the accuracy of the coarse-grained model, ensuring a realistic representation of the scenario, and the availability of an appropriate fitness function for optimization in high-throughput simulations. In the realm of lipid-centric theories explaining the general anesthetic effect, our methodology would provide a significantly more focused approach for computational anesthetic design compared to traditional methods. These traditional strategies primarily concentrate on leveraging the Meyer-Overton correlation, which seeks to enhance octanol–water partitioning in phase-separated systems composed of water and octanol phases. We also anticipate that our approach will be particularly valuable for addressing complex, dynamic problems, such as protein–protein and protein–membrane interfaces as well as protein liquid–liquid interfaces (e.g., protein condensates). In such cases, existing strategies for molecular design are scarce.

Currently, our ongoing research focuses on enabling molecules to freely alter their topology during evolution. When extending SMOG to more complex molecules with varying topologies, the genetics become more intricate. This problem can be tackled by using the information and features of graph-like adjacency matrices and linear algebra. We will cover this strategy and the concomitant SMOG framework in more detail in an upcoming work, where we additionally enable a diversity of topologies to occur in the process of evolutionary optimization. The handling of linear chains is a simpler case, yet our current implementation of SMOG already uses adjacency matrices as the molecular representations on which the evolutionary operations are performed. This allows easier adaptations of the generator in the future.
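To illustrate what an adjacency-matrix representation of a linear chain can look like, the following minimal sketch builds such a matrix with NumPy and applies two simple evolutionary operations. It is not the SMOG implementation: the function names and the specific operations (point mutation and chain growth) are assumptions chosen for illustration, and the bead sequence is the example sequence quoted earlier in the text.

```python
import numpy as np


def linear_chain_adjacency(n_beads: int) -> np.ndarray:
    """Symmetric adjacency matrix of a linear chain (bead i bonded to i+1)."""
    adj = np.zeros((n_beads, n_beads), dtype=int)
    idx = np.arange(n_beads - 1)
    adj[idx, idx + 1] = 1
    adj[idx + 1, idx] = 1
    return adj


def point_mutation(beads, position, new_type):
    """Return a copy of the bead sequence with one bead type replaced.
    The topology (adjacency matrix) is unchanged by this operation."""
    mutated = list(beads)
    mutated[position] = new_type
    return mutated


def append_bead(beads, adj, new_type):
    """Grow the chain by one bead bonded to the current last bead."""
    n = adj.shape[0]
    grown = np.zeros((n + 1, n + 1), dtype=int)
    grown[:n, :n] = adj
    grown[n - 1, n] = grown[n, n - 1] = 1
    return list(beads) + [new_type], grown


beads = ["TN1dq", "SN1dq", "SN1dq", "TN3aq"]  # example sequence from the text
adj = linear_chain_adjacency(len(beads))
degrees = adj.sum(axis=0)  # terminal beads have degree 1, inner beads degree 2
mutant = point_mutation(beads, 1, "TN3aq")
longer_beads, longer_adj = append_bead(beads, adj, "SN1dq")
```

Keeping bead types and connectivity separate means that a branched or cyclic topology would only require additional nonzero entries in the matrix, while the bead-type operations stay the same.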

A previous study employed a graph-based coarse-grained approach combined with VAEs and decoders to identify novel cardiolipin-selective small molecules.⁵ To reduce the dimensionality of latent space, the interaction space describing these molecules was simplified to very few basic hydrophobic, hydrophilic and charged superatom types, enabling the derivation of fundamental design rules governing functionality. These rules were subsequently applied to molecular databases to identify optimal compounds. Although effective, such an approach natively introduces inaccuracies due to the crude coarsening of interactions, potentially compromising encoding and decoding precision.

Our proposed methodology advances upon existing approaches by leveraging the full complexity of the Martini 3 model in searching for optimal solutions. This enables the identification of finer design features that latent space-based methods may overlook. Notably, our outcomes are exact within the definition of the Martini 3 model. However, it is important to acknowledge that the precise atomistic nature of the obtained solutions remains unknown.

We argue that machine learning techniques can effectively bridge this gap without necessitating drastic dimensionality reduction. While multiple potential chemical translations exist, the process itself is universal and does not depend on applicability of domain-specific knowledge. This universality stems from the inherent chemical universality of near-atomic coarse-grained models like the Martini 3 model. Therefore, chemical translation after optimization exhibits inherent advantages over latent space-based domain-specific design approaches.

Furthermore, we propose exploring systematic dictionary-based approaches (translation rules) that facilitate rapid convergence of toy-model solutions toward potentially corresponding atomistic structures. These approaches leverage the Martini force field’s inherent design to encode known chemical groups. While synthesizing predicted novel small molecules presents greater challenges than synthesizing novel peptide sequences, our future objectives include guiding the translation of toy models toward chemical feasibility and thus synthesizability. This approach not only eases experimental validation of predicted solutions but also facilitates a more straightforward transition toward applications.⁴²⁻⁴⁸
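A dictionary-based translation of this kind could, in its simplest form, be sketched as follows. The sketch is purely illustrative: the bead labels, fragment names, and weights are placeholders and do not reproduce Table S1 or any published Martini 3 mapping, and the assumption that per-bead choices are independent (so that weights multiply) is ours.

```python
from itertools import product

# HYPOTHETICAL translation rules: bead type -> candidate fragments with
# relative weights. All names and numbers are placeholders.
TRANSLATION_RULES = {
    "bead_A": [("fragment_A1", 0.7), ("fragment_A2", 0.3)],
    "bead_B": [("fragment_B1", 1.0)],
}


def enumerate_translations(sequence, rules):
    """Enumerate all fragment sequences for a bead sequence and assign each
    a normalized likelihood, assuming independent per-bead choices."""
    options = [rules[bead] for bead in sequence]
    candidates = []
    for combo in product(*options):
        fragments = tuple(name for name, _ in combo)
        weight = 1.0
        for _, w in combo:
            weight *= w
        candidates.append((fragments, weight))
    norm = sum(w for _, w in candidates)
    ranked = [(frags, w / norm) for frags, w in candidates]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


ensemble = enumerate_translations(["bead_A", "bead_B", "bead_A"],
                                  TRANSLATION_RULES)
for fragments, likelihood in ensemble:
    print(fragments, round(likelihood, 3))
```

The output is an ensemble rather than a single structure, in line with the non-unique nature of the translation. A practical version would additionally need to filter candidates for chemical validity and synthesizability, which this sketch does not attempt.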

Acknowledgments

The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS⁴¹ (CGEVOMD: 60266) and the national supercomputer HPE Apollo Hawk at the High Performance Computing Center Stuttgart (HLRS) under the grant number (RECOGUN: 44249). The work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy—EXC-2033–390677874–RESOLV.

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.4c08200.

Table S1 stating a first assumed decoding of SMOG molecules according to information from ref (6). Most common bead types in good performers and potential chemical groups (PDF)

The authors declare no competing financial interest.

Special Issue

Published as part of The Journal of Physical Chemistry B special issue “The Dynamic Structure of the Lipid Bilayer and Its Modulation by Small Molecules”.

References

Bohacek R. S.; McMartin C.; Guida W. C. The art and practice of structure-based drug design: A molecular modeling perspective. Med. Res. Rev. 1996, 16 (1), 3–50.
Ruddigkeit L.; van Deursen R.; Blum L. C.; Reymond J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database gdb-17. J. Chem. Inf. Model. 2012, 52 (11), 2864–2875. 10.1021/ci300415d.
Ruddigkeit L.; Van Deursen R.; Blum L. C.; Reymond J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database gdb-17. J. Chem. Inf. Model. 2012, 52 (11), 2864–2875. 10.1021/ci300415d.
Sanchez-Lengeling B.; Aspuru-Guzik A. Inverse molecular design using machine learning: Generative models for matter engineering. Science 2018, 361 (6400), 360–365. 10.1126/science.aat2663.
Mohr B.; Shmilovich K.; Kleinwächter I. S.; Schneider D.; Ferguson A. L.; Bereau T. Data-driven discovery of cardiolipin-selective small molecules by computational active learning. Chem. Sci. 2022, 13, 4498–4511. 10.1039/D2SC00116K.
Alessandri R.; Barnoud J.; Gertsen A. S.; Patmanidis I.; de Vries A. H.; Souza P. C. T.; Marrink S. J. Martini 3 coarse-grained force field: Small molecules. Adv. Theory Simul. 2022, 5 (1), 2100391. 10.1002/adts.202100391.
Menichetti R.; Kanekal K. H.; Bereau T. Drug–membrane permeability across chemical space. ACS Cent. Sci. 2019, 5 (2), 290–298. 10.1021/acscentsci.8b00718.
Kjølbye L. R.; Pereira G. P.; Bartocci A.; Pannuzzo M.; Albani S.; Marchetto A.; Jiménez-García B.; Martin J.; Rossetti G.; Cecchini M.; et al. Towards design of drugs and delivery systems with the martini coarse-grained model. QRB Discovery 2022, 3, e19. 10.1017/qrd.2022.16.
Bereau T.; Kremer K. Automated parametrization of the coarse-grained martini force field for small organic molecules. J. Chem. Theory Comput. 2015, 11 (6), 2783–2791. 10.1021/acs.jctc.5b00056.
Potter T. D.; Barrett E. L.; Miller M. Automated coarse-grained mapping algorithm for the martini force field and benchmarks for membrane–water partitioning. J. Chem. Theory Comput. 2021, 17 (9), 5777–5791. 10.1021/acs.jctc.1c00322.
Marrink S. J.; Risselada H. J.; Yefimov S.; Tieleman D. P.; De Vries A. H. The martini force field: coarse grained model for biomolecular simulations. J. Phys. Chem. B 2007, 111 (27), 7812–7824. 10.1021/jp071097f.
Souza P. C. T.; Alessandri R.; Barnoud J.; Thallmair S.; Faustino I.; Grünewald F.; Patmanidis I.; Abdizadeh H.; Bruininks B. M. H.; Wassenaar T. A.; et al. Martini 3: a general purpose force field for coarse-grained molecular dynamics. Nat. Methods 2021, 18 (4), 382–388. 10.1038/s41592-021-01098-3.
Souza P. C. T.; Thallmair S.; Conflitti P.; Ramírez-Palacios C.; Alessandri R.; Raniolo S.; Limongelli V.; Marrink S. J. Protein–ligand binding with the coarse-grained martini model. Nat. Commun. 2020, 11 (1), 3714. 10.1038/s41467-020-17437-5.
Stroh K. S.; Souza P. C. T.; Monticelli L.; Risselada H. J. Cgcompiler: Automated coarse-grained molecule parametrization via noise-resistant mixed-variable optimization. J. Chem. Theory Comput. 2023, 19 (22), 8384–8400. 10.1021/acs.jctc.3c00637.
Empereur-Mot C.; Pesce L.; Doni G.; Bochicchio D.; Capelli R.; Perego C.; Pavan G. M. Swarm-cg: automatic parametrization of bonded terms in martini-based coarse-grained models of simple to complex molecules via fuzzy self-tuning particle swarm optimization. ACS Omega 2020, 5 (50), 32823–32843. 10.1021/acsomega.0c05469.
Methorst J.; Verwei N.; Hoffmann C.; Chodnicki P.; Sansevrino R.; Wang H.; van Hilten N.; Aschmann D.; Kros A.; Andreas L.; et al. Physics-based inverse design of cholesterol attracting transmembrane helices reveals a paradoxical role of hydrophobic length. bioRxiv 2021, 2021. 10.1101/2021.07.01.450699.
van Hilten N.; Methorst J.; Verwei N.; Risselada H. J. Physics-based generative model of curvature sensing peptides; distinguishing sensors from binders. Sci. Adv. 2023, 9 (11), eade8839. 10.1126/sciadv.ade8839.
Methorst J.; van Hilten N.; Hoti A.; Stroh K. S.; Risselada H. J. When data are lacking: Physics-based inverse design of biopolymers interacting with complex, fluid phases. J. Chem. Theory Comput. 2024, 20 (5), 1763–1776. 10.1021/acs.jctc.3c00874.
Terama E.; Ollila O. H. S.; Salonen E.; Rowat A. C.; Trandum C.; Westh P.; Patra M.; Karttunen M.; Vattulainen I. Influence of ethanol on lipid membranes: from lateral pressure profiles to dynamics and partitioning. J. Phys. Chem. B 2008, 112 (13), 4131–4139. 10.1021/jp0750811.
Griepernau B.; Böckmann R. A. The influence of 1-alkanols and external pressure on the lateral pressure profiles of lipid bilayers. Biophys. J. 2008, 95 (12), 5766–5778. 10.1529/biophysj.108.142125.
Hantal G.; Fábián B.; Sega M.; Jójárt B.; Jedlovszky P. Effect of general anesthetics on the properties of lipid membranes of various compositions. Biochim. Biophys. Acta Biomembr. 2019, 1861 (3), 594–609. 10.1016/j.bbamem.2018.12.008.
Gray E.; Karslake J.; Machta B. B.; Veatch S. L. Liquid general anesthetics lower critical temperatures in plasma membrane vesicles. Biophys. J. 2013, 105 (12), 2751–2759. 10.1016/j.bpj.2013.11.005.
Różycki B.; Lipowsky R. Spontaneous curvature of bilayer membranes from molecular simulations: Asymmetric lipid densities and asymmetric adsorption. J. Chem. Phys. 2015, 142 (5), 054101. 10.1063/1.4906149.
Heimburg T.; Jackson A. D. The thermodynamics of general anesthesia. Biophys. J. 2007, 92 (9), 3159–3165. 10.1529/biophysj.106.099754.
Pavel M. A.; Petersen E. N.; Wang H.; Lerner R. A.; Hansen S. B. Studies on the mechanism of general anesthesia. Proc. Natl. Acad. Sci. U.S.A. 2020, 117 (24), 13757–13766. 10.1073/pnas.2004259117.
Cantor R. S. The lateral pressure profile in membranes: a physical mechanism of general anesthesia. Biochemistry 1997, 36 (9), 2339–2344. 10.1021/bi9627323.
Souza P. C. T.; Alessandri R.; Barnoud J.; Thallmair S.; Faustino I.; Grünewald F.; Patmanidis I.; Abdizadeh H.; Bruininks B. M. H.; Wassenaar T. A.; et al. Martini 3: a general purpose force field for coarse-grained molecular dynamics. Nat. Methods 2021, 18 (4), 382–388. 10.1038/s41592-021-01098-3.
Franks N. P.; Lieb W. R. Mapping of general anaesthetic target sites provides a molecular basis for cutoff effects. Nature 1985, 316 (6026), 349–351. 10.1038/316349a0.
Abraham M.; Alekseenko A.; Basov V.; Bergh C.; Briand E.; Brown A.; Doijade M.; Fiorin G.; Fleischmann S.; Gorelov S.; et al. Gromacs 2024.2 manual, May 2024. 10.5281/zenodo.11148638.
Centi A.; Dutta A.; Parekh S. H.; Bereau T. Inserting small molecules across membrane mixtures: Insight from the potential of mean force. Biophys. J. 2020, 118 (6), 1321–1332. 10.1016/j.bpj.2020.01.039.
Bereau T.; Andrienko D.; von Lilienfeld O. A. Transferable atomic multipole machine learning models for small organic molecules. J. Chem. Theory Comput. 2015, 11 (7), 3225–3233. 10.1021/acs.jctc.5b00301.
de Jong D. H.; Singh G.; Bennett W. F. D.; Arnarez C.; Wassenaar T. A.; Schäfer L. V.; Periole X.; Tieleman D. P.; Marrink S. J. Improved parameters for the martini coarse-grained protein force field. J. Chem. Theory Comput. 2013, 9 (1), 687–697. 10.1021/ct300646g.
Methorst J.; van Hilten N.; Risselada H. J. Inverse design of cholesterol attracting transmembrane helices reveals a paradoxical role of hydrophobic length. bioRxiv 2021, 10.1101/2021.07.01.450699. https://www.biorxiv.org/content/early/2021/07/05/2021.07.01.450699
Vanegas J. M.; Torres-Sánchez A.; Arroyo M. Importance of force decomposition for local stress calculations in biomembrane molecular simulations. J. Chem. Theory Comput. 2014, 10 (2), 691–702. 10.1021/ct4008926.
Vanegas J. M.; Arroyo M. Force transduction and lipid binding in mscl: A continuum-molecular approach. PLoS One 2014, 9 (12), 1139477. 10.1371/journal.pone.0113947.
Ollila O. H. S.; Louhivuori M.; Marrink S. J.; Vattulainen I. Protein shape change has a major effect on the gating energy of a mechanosensitive channel. Biophys. J. 2011, 100 (7), 1651–1659. 10.1016/j.bpj.2011.02.027.
Ollila O. H. S.; Risselada H. J.; Louhivuori M.; Lindahl E.; Vattulainen I.; Marrink S. J. 3d pressure field in lipid membranes and membrane-protein complexes. Phys. Rev. Lett. 2009, 102 (7), 078101. 10.1103/physrevlett.102.078101.
Franks N. P.; Lieb W. R. Do general anaesthetics act by competitive binding to specific receptors? Nature 1984, 310 (5978), 599–601. 10.1038/310599a0.
Cantor R. S. Breaking the meyer-overton rule: predicted effects of varying stiffness and interfacial activity on the intrinsic potency of anesthetics. Biophys. J. 2001, 80 (5), 2284–2297. 10.1016/s0006-3495(01)76200-5.
Mohr J. T.; Gribble G. W.; Lin S. S.; Eckenhoff R. G.; Cantor R. S. Anesthetic potency of two novel synthetic polyhydric alkanols longer than the n-alkanol cutoff: evidence for a bilayer-mediated mechanism of anesthesia? J. Med. Chem. 2005, 48 (12), 4172–4176. 10.1021/jm049459k.
Alvarez D. JUWELS Cluster and Booster: Exascale Pathfinder with Modular Supercomputing Architecture at Juelich Supercomputing Centre. J. Large Scale Res. Facil. 2021, 7 (A138), A183. 10.17815/jlsrf-7-183.
Abraham M. J.; Murtola T.; Schulz R.; Páll S.; Smith J. C.; Hess B.; Lindahl E. Gromacs: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19–25. 10.1016/j.softx.2015.06.001. https://www.sciencedirect.com/science/article/pii/S2352711015000059
Van Rossum G.; Drake F. L. Python 3 Reference Manual; CreateSpace: Scotts Valley, CA, 2009.
Harris C. R.; Millman K. J.; van der Walt S. J.; Gommers R.; Virtanen P.; Cournapeau D.; Wieser E.; Taylor J.; Berg S.; Smith N. J.; et al. Array programming with NumPy. Nature 2020, 585 (7825), 357–362. 10.1038/s41586-020-2649-2.
Hunter J. D. Matplotlib: A 2d graphics environment. Comput. Sci. Eng. 2007, 9 (3), 90–95. 10.1109/MCSE.2007.55.
Gabriel E.; Fagg G. E.; Bosilca G.; Angskun T.; Dongarra J. J.; Squyres J. M.; Sahay V.; Kambadur P.; Barrett B.; Lumsdaine A.; et al. Open MPI: Goals, concept, and design of a next generation MPI implementation. In Proceedings, 11th European PVM/MPI Users’ Group Meeting; Springer: Budapest, Hungary, 2004; pp 97–104.
Rogowski M.; Aseeri S.; Keyes D.; Dalcin L. mpi4py.futures: Mpi-based asynchronous task execution for python. IEEE Trans. Parallel Distr. Syst. 2023, 34 (2), 611–622. 10.1109/TPDS.2022.3225481.
Dalcin L.; Fang Y.-L. L. mpi4py: Status update after 12 years of development. Comput. Sci. Eng. 2021, 23 (4), 47–54. 10.1109/MCSE.2021.3083216.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

jp4c08200_si_001.pdf^{(129.3KB, pdf)}

[ref1] Bohacek R. S.; McMartin C.; Guida W. C. The art and practice of structure-based drug design: A molecular modeling perspective. Med. Res. Rev. 1996, 16 (1), 3–50. . [DOI] [PubMed] [Google Scholar]

[ref2] Ruddigkeit L.; van Deursen R.; Blum L. C.; Reymond J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database gdb-17. J. Chem. Inf. Model. 2012, 52 (11), 2864–2875. 10.1021/ci300415d. [DOI] [PubMed] [Google Scholar]

[ref3] Ruddigkeit L.; Van Deursen R.; Blum L. C.; Reymond J.-L. Enumeration of 166 billion organic small molecules in the chemical universe database gdb-17. J. Chem. Inf. Model. 2012, 52 (11), 2864–2875. 10.1021/ci300415d. [DOI] [PubMed] [Google Scholar]

[ref4] Sanchez-Lengeling B.; Aspuru-Guzik A. Inverse molecular design using machine learning: Generative models for matter engineering. Science 2018, 361 (6400), 360–365. 10.1126/science.aat2663. [DOI] [PubMed] [Google Scholar]

[ref5] Mohr B.; Shmilovich K.; Kleinwächter I. S.; Schneider D.; Ferguson A. L.; Bereau T. Data-driven discovery of cardiolipin-selective small molecules by computational active learning. Chem. Sci. 2022, 13, 4498–4511. 10.1039/D2SC00116K. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref6] Alessandri R.; Barnoud J.; Gertsen A. S.; Patmanidis I.; de Vries A. H.; Souza P. C. T.; Marrink S. J. Martini 3 coarse-grained force field: Small molecules. Adv. Theory Simul. 2022, 5 (1), 2100391. 10.1002/adts.202100391. [DOI] [Google Scholar]

[ref7] Menichetti R.; Kanekal K. H.; Bereau T. Drug–membrane permeability across chemical space. ACS Cent. Sci. 2019, 5 (2), 290–298. 10.1021/acscentsci.8b00718. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref8] Kjølbye L. R.; Pereira G. P.; Bartocci A.; Pannuzzo M.; Albani S.; Marchetto A.; Jiménez-García B.; Martin J.; Rossetti G.; Cecchini M.; et al. Towards design of drugs and delivery systems with the martini coarse-grained model. QRB Discovery 2022, 3, e19 10.1017/qrd.2022.16. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref9] Bereau T.; Kremer K. Automated parametrization of the coarse-grained martini force field for small organic molecules. J. Chem. Theory Comput. 2015, 11 (6), 2783–2791. 10.1021/acs.jctc.5b00056. [DOI] [PubMed] [Google Scholar]

[ref10] Potter T. D.; Barrett E. L.; Miller M. Automated coarse-grained mapping algorithm for the martini force field and benchmarks for membrane–water partitioning. J. Chem. Theory Comput. 2021, 17 (9), 5777–5791. 10.1021/acs.jctc.1c00322. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref11] Marrink S. J.; Risselada H. J.; Yefimov S.; Tieleman D. P.; De Vries A. H. The martini force field: coarse grained model for biomolecular simulations. J. Phys. Chem. B 2007, 111 (27), 7812–7824. 10.1021/jp071097f. [DOI] [PubMed] [Google Scholar]

[ref12] Souza P. C. T.; Alessandri R.; Barnoud J.; Thallmair S.; Faustino I.; Grünewald F.; Patmanidis I.; Abdizadeh H.; Bruininks B. M. H.; Wassenaar T. A.; et al. Martini 3: a general purpose force field for coarse-grained molecular dynamics. Nat. Methods 2021, 18 (4), 382–388. 10.1038/s41592-021-01098-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref13] Souza P. C. T.; Thallmair S.; Conflitti P.; Ramírez-Palacios C.; Alessandri R.; Raniolo S.; Limongelli V.; Marrink S. J. Protein–ligand binding with the coarse-grained martini model. Nat. Commun. 2020, 11 (1), 3714. 10.1038/s41467-020-17437-5. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref14] Stroh K. S.; Souza P. C. T.; Monticelli L.; Risselada H. J. Cgcompiler: Automated coarse-grained molecule parametrization via noise-resistant mixed-variable optimization. J. Chem. Theory Comput. 2023, 19 (22), 8384–8400. 10.1021/acs.jctc.3c00637. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref15] Empereur-Mot C.; Pesce L.; Doni G.; Bochicchio D.; Capelli R.; Perego C.; Pavan G. M. Swarm-cg: automatic parametrization of bonded terms in martini-based coarse-grained models of simple to complex molecules via fuzzy self-tuning particle swarm optimization. ACS Omega 2020, 5 (50), 32823–32843. 10.1021/acsomega.0c05469. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref16] Methorst J.; Verwei N.; Hoffmann C.; Chodnicki P.; Sansevrino R.; Wang H.; van Hilten N.; Aschmann D.; Kros A.; Andreas L.; et al. Physics-based inverse design of cholesterol attracting transmembrane helices reveals a paradoxical role of hydrophobic length. bioRxiv 2021, 2021. 10.1101/2021.07.01.450699.

[ref17] van Hilten N.; Methorst J.; Verwei N.; Risselada H. J. Physics-based generative model of curvature sensing peptides; distinguishing sensors from binders. Sci. Adv. 2023, 9 (11), eade8839. 10.1126/sciadv.ade8839.

[ref18] Methorst J.; van Hilten N.; Hoti A.; Stroh K. S.; Risselada H. J. When data are lacking: Physics-based inverse design of biopolymers interacting with complex, fluid phases. J. Chem. Theory Comput. 2024, 20 (5), 1763–1776. 10.1021/acs.jctc.3c00874.

[ref19] Terama E.; Ollila O. H. S.; Salonen E.; Rowat A. C.; Trandum C.; Westh P.; Patra M.; Karttunen M.; Vattulainen I. Influence of ethanol on lipid membranes: from lateral pressure profiles to dynamics and partitioning. J. Phys. Chem. B 2008, 112 (13), 4131–4139. 10.1021/jp0750811.

[ref20] Griepernau B.; Böckmann R. A. The influence of 1-alkanols and external pressure on the lateral pressure profiles of lipid bilayers. Biophys. J. 2008, 95 (12), 5766–5778. 10.1529/biophysj.108.142125.

[ref21] Hantal G.; Fábián B.; Sega M.; Jójárt B.; Jedlovszky P. Effect of general anesthetics on the properties of lipid membranes of various compositions. Biochim. Biophys. Acta Biomembr. 2019, 1861 (3), 594–609. 10.1016/j.bbamem.2018.12.008.

[ref22] Gray E.; Karslake J.; Machta B. B.; Veatch S. L. Liquid general anesthetics lower critical temperatures in plasma membrane vesicles. Biophys. J. 2013, 105 (12), 2751–2759. 10.1016/j.bpj.2013.11.005.

[ref23] Różycki B.; Lipowsky R. Spontaneous curvature of bilayer membranes from molecular simulations: Asymmetric lipid densities and asymmetric adsorption. J. Chem. Phys. 2015, 142 (5), 054101. 10.1063/1.4906149.

[ref24] Heimburg T.; Jackson A. D. The thermodynamics of general anesthesia. Biophys. J. 2007, 92 (9), 3159–3165. 10.1529/biophysj.106.099754.

[ref25] Pavel M. A.; Petersen E. N.; Wang H.; Lerner R. A.; Hansen S. B. Studies on the mechanism of general anesthesia. Proc. Natl. Acad. Sci. U.S.A. 2020, 117 (24), 13757–13766. 10.1073/pnas.2004259117.

[ref26] Cantor R. S. The lateral pressure profile in membranes: a physical mechanism of general anesthesia. Biochemistry 1997, 36 (9), 2339–2344. 10.1021/bi9627323.

[ref27] Souza P. C. T.; Alessandri R.; Barnoud J.; Thallmair S.; Faustino I.; Grünewald F.; Patmanidis I.; Abdizadeh H.; Bruininks B. M. H.; Wassenaar T. A.; et al. Martini 3: a general purpose force field for coarse-grained molecular dynamics. Nat. Methods 2021, 18 (4), 382–388. 10.1038/s41592-021-01098-3.

[ref28] Franks N. P.; Lieb W. R. Mapping of general anaesthetic target sites provides a molecular basis for cutoff effects. Nature 1985, 316 (6026), 349–351. 10.1038/316349a0.

[ref29] Abraham M.; Alekseenko A.; Basov V.; Bergh C.; Briand E.; Brown A.; Doijade M.; Fiorin G.; Fleischmann S.; Gorelov S.; et al. Gromacs 2024.2 manual, May 2024. 10.5281/zenodo.11148638.

[ref30] Centi A.; Dutta A.; Parekh S. H.; Bereau T. Inserting small molecules across membrane mixtures: Insight from the potential of mean force. Biophys. J. 2020, 118 (6), 1321–1332. 10.1016/j.bpj.2020.01.039.

[ref31] Bereau T.; Andrienko D.; von Lilienfeld O. A. Transferable atomic multipole machine learning models for small organic molecules. J. Chem. Theory Comput. 2015, 11 (7), 3225–3233. 10.1021/acs.jctc.5b00301.

[ref32] de Jong D. H.; Singh G.; Bennett W. F. D.; Arnarez C.; Wassenaar T. A.; Schäfer L. V.; Periole X.; Tieleman D. P.; Marrink S. J. Improved parameters for the martini coarse-grained protein force field. J. Chem. Theory Comput. 2013, 9 (1), 687–697. 10.1021/ct300646g.

[ref33] Methorst J.; van Hilten N.; Risselada H. J. Inverse design of cholesterol attracting transmembrane helices reveals a paradoxical role of hydrophobic length. bioRxiv 2021, 10.1101/2021.07.01.450699. https://www.biorxiv.org/content/early/2021/07/05/2021.07.01.450699

[ref34] Vanegas J. M.; Torres-Sánchez A.; Arroyo M. Importance of force decomposition for local stress calculations in biomembrane molecular simulations. J. Chem. Theory Comput. 2014, 10 (2), 691–702. 10.1021/ct4008926.

[ref35] Vanegas J. M.; Arroyo M. Force transduction and lipid binding in mscl: A continuum-molecular approach. PLoS One 2014, 9 (12), e113947. 10.1371/journal.pone.0113947.

[ref36] Ollila O. H. S.; Louhivuori M.; Marrink S. J.; Vattulainen I. Protein shape change has a major effect on the gating energy of a mechanosensitive channel. Biophys. J. 2011, 100 (7), 1651–1659. 10.1016/j.bpj.2011.02.027.

[ref37] Ollila O. H. S.; Risselada H. J.; Louhivuori M.; Lindahl E.; Vattulainen I.; Marrink S. J. 3d pressure field in lipid membranes and membrane-protein complexes. Phys. Rev. Lett. 2009, 102 (7), 078101. 10.1103/physrevlett.102.078101.

[ref38] Franks N. P.; Lieb W. R. Do general anaesthetics act by competitive binding to specific receptors? Nature 1984, 310 (5978), 599–601. 10.1038/310599a0.

[ref39] Cantor R. S. Breaking the meyer-overton rule: predicted effects of varying stiffness and interfacial activity on the intrinsic potency of anesthetics. Biophys. J. 2001, 80 (5), 2284–2297. 10.1016/s0006-3495(01)76200-5.

[ref40] Mohr J. T.; Gribble G. W.; Lin S. S.; Eckenhoff R. G.; Cantor R. S. Anesthetic potency of two novel synthetic polyhydric alkanols longer than the n-alkanol cutoff: evidence for a bilayer-mediated mechanism of anesthesia? J. Med. Chem. 2005, 48 (12), 4172–4176. 10.1021/jm049459k.

[ref41] Alvarez D. JUWELS Cluster and Booster: Exascale Pathfinder with Modular Supercomputing Architecture at Juelich Supercomputing Centre. J. Large Scale Res. Facil. 2021, 7 (A138), A183. 10.17815/jlsrf-7-183.

[ref42] Abraham M. J.; Murtola T.; Schulz R.; Páll S.; Smith J. C.; Hess B.; Lindahl E. Gromacs: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19–25. 10.1016/j.softx.2015.06.001. https://www.sciencedirect.com/science/article/pii/S2352711015000059

[ref43] Van Rossum G.; Drake F. L. Python 3 Reference Manual; CreateSpace: Scotts Valley, CA, 2009.

[ref44] Harris C. R.; Millman K. J.; van der Walt S. J.; Gommers R.; Virtanen P.; Cournapeau D.; Wieser E.; Taylor J.; Berg S.; Smith N. J.; et al. Array programming with NumPy. Nature 2020, 585 (7825), 357–362. 10.1038/s41586-020-2649-2.

[ref45] Hunter J. D. Matplotlib: A 2d graphics environment. Comput. Sci. Eng. 2007, 9 (3), 90–95. 10.1109/MCSE.2007.55.

[ref46] Gabriel E.; Fagg G. E.; Bosilca G.; Angskun T.; Dongarra J. J.; Squyres J. M.; Sahay V.; Kambadur P.; Barrett B.; Lumsdaine A.; et al. Open MPI: Goals, concept, and design of a next generation MPI implementation. In Proceedings, 11th European PVM/MPI Users’ Group Meeting; Springer: Budapest, Hungary, 2004; pp 97–104.

[ref47] Rogowski M.; Aseeri S.; Keyes D.; Dalcin L. mpi4py.futures: Mpi-based asynchronous task execution for python. IEEE Trans. Parallel Distr. Syst. 2023, 34 (2), 611–622. 10.1109/TPDS.2022.3225481.

[ref48] Dalcin L.; Fang Y.-L. L. mpi4py: Status update after 12 years of development. Comput. Sci. Eng. 2021, 23 (4), 47–54. 10.1109/MCSE.2021.3083216.
