Design of proteins by parallel tempering in the sequence space

Preet Kalani; Vojtěch Spiwok

doi:10.1002/pro.70246

2025 Sep 24;34(10):e70246. doi: 10.1002/pro.70246

PMCID: PMC12459223 PMID: 40990840

Abstract

Computational design of new proteins is often performed by optimizing the amino acid sequence. This sequence is characterized by an energy (lower energy means better propensity to form the desired 3D structure) that is sampled and minimized. Here, we use the parallel tempering algorithm to accelerate this task. ESMfold was used to predict the structures of the sampled proteins and calculate energy. Starting from random amino acid sequences, each sequence was sampled using the Monte Carlo method at one of a series of temperatures, and these replicas were exchanged by the parallel tempering method. A series of 100 or 200 residue proteins was designed to maximize confidence in structure prediction and globularity and to minimize surface hydrophobic residues. We show that parallel tempering is a viable alternative to Monte Carlo sampling without replica exchanges, simulated annealing, or related energy‐based protein design methods, especially in the situation where a continuous flow of designed sequences is desired.

Keywords: ESMfold, machine learning, Monte Carlo, parallel tempering, protein design, replica exchange

1. INTRODUCTION

The design of new proteins has become a viable strategy to obtain new functional (e.g., binding or catalytic) proteins, as an alternative to the exploration of natural sources and protein engineering (Listov et al., 2024). Many of the protein design approaches are based on iterative modification of the amino acid sequence from, for example, a random sequence, until desired properties of the protein are reached. Such protein design campaigns are based on two components.

First, we need a function to determine whether the protein of the given amino acid sequence folds into the desired 3D structure and fulfills other requirements for the target application. These functions may be energy‐based (based on potential or free energy) versus knowledge‐based (based on known sequences and 3D structures), traditional physics versus machine learning, or free (favoring any compactly folded 3D structure) versus targeted (favoring only proteins with desired 3D structure, symmetry, etc.).

Second, it is necessary to employ an algorithm that samples various sequences and optimizes the above described function. Again, various approaches can be used in this stage, including physics‐inspired algorithms such as the Monte Carlo method and simulated annealing (Hie et al., 2022; Liu & Kuhlman, 2006; Verkuil et al., 2022; Wicky et al., 2022) as well as machine learning methods (Dauparas et al., 2022; Watson et al., 2023).

Despite recent success in this field, the success rate of protein design projects is not 100%. Usually, multiple proteins must be designed to obtain one that passes the experimental evaluation. Furthermore, some features can be easily controlled during the design process, whereas some other properties are very difficult to control, thus requiring intensive experimental testing. Current protein design methods usually converge toward one designed sequence in one optimization run, rather than a continuous flow of designed sequences. Methods providing a continuous flow of designed sequences may be more suitable in the situation when multiple protein design candidates must be tested experimentally.

Monte Carlo (Metropolis et al., 1953) and Monte Carlo‐simulated annealing have been widely used to evolve sequences in protein design (Hie et al., 2022; Liu & Kuhlman, 2006; Verkuil et al., 2022; Wicky et al., 2022). In general, the sequence undergoes small changes that can be accepted or rejected. In this work, random single‐point changes were made in every step. A change leading to a protein with better properties (lower energy $E$ ) is always accepted. A change leading to a protein with worse properties is accepted or rejected with the probability $p$ calculated by the Metropolis criterion (Metropolis et al., 1953):

$$p = \min (1, \exp (- β Δ E)),$$

where $Δ E$ is the energy difference and $β$ is the inverse temperature ( $1 / kT$ , where $k$ is the Boltzmann constant and $T$ is temperature). The fact that even sequences with higher energy can be accepted ensures that the system can escape a local minimum and proceed towards the global one (the optimal sequence in the case of protein design).
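The Metropolis criterion above can be sketched in a few lines of Python. This is only an illustration, not the code of the ESMfold design module; the function names `metropolis_probability` and `accept_move` and the `rng` argument are hypothetical, and the temperature is assumed to be given in $kT$ units so that $β = 1 / T$.

```python
import math
import random


def metropolis_probability(delta_e: float, temperature: float) -> float:
    """Return p = min(1, exp(-beta * delta_e)) with beta = 1 / temperature."""
    if delta_e <= 0.0:
        # A change that lowers (or keeps) the energy is always accepted.
        return 1.0
    return math.exp(-delta_e / temperature)


def accept_move(delta_e: float, temperature: float, rng: random.Random) -> bool:
    """Accept the change with the Metropolis probability."""
    # rng.random() is uniform on [0, 1), so the test succeeds with probability p.
    return rng.random() < metropolis_probability(delta_e, temperature)
```

For the same uphill change, a higher temperature gives a larger acceptance probability, which is why high‐temperature replicas sample faster.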

Monte Carlo sampling can be extended to simulated annealing. High temperature accelerates sampling, whereas low temperature favors low‐energy states. Simulated annealing uses a predefined change of temperature to ensure that a wide sequence space is sampled at high temperature and, at the same time, a sequence close to the minimum of the sampled space can be identified at low temperatures. Most applications of Monte Carlo simulated annealing in protein design use a schedule with a single monotonic or stepwise decrease of temperature. Unfortunately, one such run leads to one designed protein, which may be inefficient in the situation when multiple designed proteins must be tested.

Here, we explore the application of the parallel tempering algorithm (Swendsen & Wang, 1986) (also known as the temperature replica exchange algorithm) in protein design. Parallel tempering is a popular method used in modeling physical and chemical systems. It can be combined with the Monte Carlo method or molecular dynamics simulation.

The application of parallel tempering to protein design is schematically depicted in Figure 1. In the beginning, a series of random protein sequences is generated (the first column in Figure 1). Next, they are subjected to single‐temperature Monte Carlo sampling, each at a different pre‐selected temperature. The temperature increases from $T_{0}$ to $T_{3}$ . After a predefined number of sampling steps (gray arrows in Figure 1), a replica exchange attempt is made. The probability of the exchange is calculated as:

$$p = \min \left( 1, \exp \left( (E_{i} - E_{j}) (β_{i} - β_{j}) \right) \right) .$$

Figure 1. Schematic depiction of protein design by parallel tempering.

The indexes $i$ and $j$ refer to the temperature IDs. The first replica exchange attempt (Figure 1) is made between temperatures 0 and 1 and between temperatures 2 and 3. Since the protein sampled at temperature 1 contains a secondary structure element, we can assume that its energy is lower than that of the protein sampled at temperature 0. This leads to $p = 1$ , that is, the replica exchange is accepted. After the exchange, the sequence previously sampled at $T_{0}$ is sampled at $T_{1}$ and vice versa. The same is true for replica exchange between $T_{2}$ and $T_{3}$ in Step 1, and $T_{1}$ and $T_{2}$ in Step 2.

In replica exchange attempt number 3 (Figure 1), we can assume that the energy of the protein sampled at $T_{1}$ is slightly higher than that of the protein sampled at $T_{0}$ . This leads to $p$ between 0 and 1. At this point, a uniformly distributed random number between 0 and 1 is drawn. If this number is lower than $p$ , replicas are exchanged. This ensures that replica exchanges are performed with probability $p$ .

At the same replica exchange attempt, we can assume that the energy of the protein sampled at $T_{3}$ is significantly higher than that of the protein sampled at $T_{2}$ . This leads to $p$ close to zero. Therefore, the exchange is not performed and the sampling continues at the respective temperatures.
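The three situations described above (certain, possible, and very unlikely exchange) can be reproduced with a small Python sketch of the exchange probability. The function names and the numerical energies and temperatures used in the note below are hypothetical and chosen only for illustration; temperatures are assumed to be in $kT$ units, so $β = 1 / T$.

```python
import math
import random


def exchange_probability(e_i: float, e_j: float, t_i: float, t_j: float) -> float:
    """p = min(1, exp((E_i - E_j) * (beta_i - beta_j))) with beta = 1 / T."""
    exponent = (e_i - e_j) * (1.0 / t_i - 1.0 / t_j)
    if exponent >= 0.0:
        return 1.0
    return math.exp(exponent)


def attempt_exchange(e_i: float, e_j: float, t_i: float, t_j: float,
                     rng: random.Random) -> bool:
    """Draw a uniform random number and exchange if it is lower than p."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)
```

With the made‐up values $T_{i} = 0.1$ and $T_{j} = 0.2$ , a lower‐temperature replica with the higher energy ( $E_{i} = 2.0$ , $E_{j} = 1.0$ ) gives $p = 1$ ; a slightly higher energy at the higher temperature ( $E_{i} = 1.0$ , $E_{j} = 1.1$ ) gives $p = \exp (- 0.5)$ , about 0.61; and a much higher energy at the higher temperature ( $E_{i} = 1.0$ , $E_{j} = 3.0$ ) gives $p = \exp (- 10)$ , which is close to zero.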

Parallel tempering is in general significantly more efficient in exploration of different states of the system and in optimization than a series of equally long single‐temperature sampling or simulated annealing runs. This is due to the fact that a sequence corresponding to an unstable protein (with high energy $E$ ) is likely to be exchanged for higher‐temperature replicas and thus climb on the temperature scale. At high temperature, it can (due to faster sampling) reach a sequence corresponding to a stable protein (with low energy $E$ ). This sequence tends to decline on the temperature scale. Stable sequences, therefore, accumulate at low temperature. In other words, parallel tempering actively “pulls” promising sequences from higher to lower temperatures, and it “pushes” poor sequences from low to high temperatures.

Here, we combine parallel tempering with protein design for the first time. Protein sequences were optimized to provide stable folded proteins. Parallel tempering was used as part of this algorithm. Our approach is based on evolutionary scale modeling (ESM) (Rives et al., 2021), which is a large language model used in the 3D structure prediction tool ESMfold (Lin et al., 2023). The modified version of the “protein programming language” of ESMfold (Hie et al., 2022) was used as an engine that generates and scores sequence candidates and runs the Monte Carlo method. The “protein programming language” makes it possible to easily define properties of the desired protein (e.g., compactness, few hydrophobic residues on the surface, secondary structure content, or similarity to a reference 3D structure).

We demonstrate our protocol in the free (untargeted) design of proteins with 100 and 200 amino acid residues. We show that our method is efficient in the continuous design of folds differing in overall shape and secondary structure composition.

2. RESULTS

The design of proteins by parallel tempering used a modified version of ESMfold, namely its protein design module (“protein programming language”) (Hie et al., 2022). This module uses the Monte Carlo simulated annealing method to minimize a score calculated from the sequence. This score can be defined by a modular language to combine features of the designed protein. Using a Python program, we introduced parallel tempering into this code. The initial sequences were generated randomly with the same probability of all amino acids except cysteine. They were subjected to Monte Carlo sampling with replica exchanges made, in most cases, every 100 Monte Carlo steps.

Figure 2 presents the progress in the design of 100‐residue proteins (1000 replica exchange attempts, that is, 100,000 Monte Carlo steps). Figure 2a shows the profile of replica exchanges. Each color represents one replica. The selected replica (starting from temperature ID 11) is highlighted by a thick line. It is clear that replica exchanges were frequent and there were no problematically selected temperatures. Figure 2b shows the evolution of the score. At first, the amino acid sequences were set randomly with equal amino acid preferences. Therefore, the score was very high at all temperatures, indicating the unstructured nature of the proteins. After relatively few steps (approx. 5000 steps, 50 replica exchange attempts), stable designs were formed at low temperatures. The proteins remained poorly structured at high temperatures. Figure 2c shows evolution of mean Cα predicted local distance difference test (pLDDT) values. It shows a trend similar to Figure 2b. This figure also highlights the demultiplexed design profile (red trace) with representative 3D structures predicted by ESMfold.

Figure 2. Design of 100‐residue proteins by parallel tempering. (a) Replica exchanges. Selected replica starting from temperature ID 16 is highlighted by a thick line. (b) Evolution of the score. (c) Evolution of mean Cα‐pLDDT. The replica starting from temperature ID 16 is highlighted by red boxes. Snapshots of ESMfold models from demultiplexed trace are depicted at top (colored by pLDDT, numbers indicate the mean Cα‐pLDDT). (d) Boxplot of the mean Cα‐pLDDT at different temperatures (first 50 replica exchange attempts skipped, sampled at replica exchange points). (e) Refinement of models highlighted in (c) by simulated annealing (numbers indicate the mean Cα‐pLDDT). pLDDT, predicted local distance difference test.

Demultiplexing (also referred to as demuxing) is a reconstruction of a continuous evolution of the sequence, regardless of the temperature. For example, the red profile in Figure 2c started at the temperature number 11. In the next replica exchange attempt (after 100 Monte Carlo steps), it was exchanged for the replica at temperature number 10, and so forth. The demultiplexed evolution thus joins structures at temperatures 11 in Step 0, 10 in Step 100, and so forth.
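Demultiplexing amounts to simple bookkeeping, as the following Python sketch shows. It assumes a hypothetical log format that is not taken from the authors' code: at every replica exchange point, a list records which replica (identified by its starting temperature ID) currently occupies each temperature. The function names and the accepted exchanges in the note below are made up for illustration.

```python
from typing import List, Sequence, Tuple


def apply_exchanges(occupancy: Sequence[int],
                    accepted_pairs: Sequence[Tuple[int, int]]) -> List[int]:
    """Return the occupancy after swapping replicas at accepted temperature pairs.

    occupancy[t] is the ID of the replica currently sampled at temperature t.
    """
    updated = list(occupancy)
    for t_low, t_high in accepted_pairs:
        updated[t_low], updated[t_high] = updated[t_high], updated[t_low]
    return updated


def demultiplex(occupancy_history: Sequence[Sequence[int]],
                replica_id: int) -> List[int]:
    """Temperature ID visited by one replica at each recorded exchange point."""
    return [list(occupancy).index(replica_id) for occupancy in occupancy_history]
```

For four temperatures and accepted exchanges (0, 1) and (2, 3) in the first attempt, (1, 2) in the second, and (0, 1) in the third, the occupancy history is `[0, 1, 2, 3]`, `[1, 0, 3, 2]`, `[1, 3, 0, 2]`, `[3, 1, 0, 2]`. The replica that started at temperature 3 then follows the temperature trace 3, 2, 1, 0, and the structures along this trace form its demultiplexed profile.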

Representative structures depicted in Figure 2c show that the initial protein with a random sequence was unstable (yellow on the pLDDT scale). After approximately 20 replica exchange attempts, a stable design (blue/purple in the pLDDT scale) was obtained. This was associated with a decrease in temperature. The resulting designs contained both $α$ ‐helices and $β$ ‐sheets. At replica exchange attempt approximately 120 it destabilized again (this time it was associated with an increase of temperature), and at replica exchange attempt approx. 480 it formed a stable $β$ ‐sheet‐rich design. It was destabilized again in the 510th replica exchange attempt and then started exploring another design composed mostly of $β$ ‐sheets. Finally, it was destabilized in the 800th replica exchange attempt. In this (1 of 17) demultiplexed design profile, the system visited three conformational families. In contrast, conventional single‐temperature or simulated annealing Monte Carlo would converge toward one conformational family. The plots for all 17 replicas can be obtained as Supporting Information (Figures S1–S17).

Figure 2d shows a boxplot with mean Cα‐pLDDT values. It further supports the notion that designs at low temperatures are stable (pLDDT around 90%), whereas they are unstable at high temperatures.

It is possible that the lowest temperature used in our replica exchange scheme is not low enough and that the designed proteins can be further optimized by reducing the temperature. To address this, we took the snapshots shown in Figure 2c and subjected them to Monte Carlo simulated annealing optimization to reduce temperature. This refinement led to some improvement in the mean Cα‐pLDDT, between 0 and 5 percentage points (Figure 2e).

A common problem in applying replica exchange methods is a low replica exchange probability. For this reason, we tested a higher frequency of replica exchange attempts: attempts were made every 20 steps instead of every 100 steps. The results are presented in Supporting Information (Figure S18). The increase in replica exchange frequency did not significantly increase overall design efficiency.

The question is how efficient the parallel tempering algorithm is compared to conventional methods, namely Monte Carlo sampling at fixed temperatures and simulated annealing. First, we performed sampling at fixed temperatures. The calculation had exactly the same setup as for parallel tempering (same temperatures, 100,000 Monte Carlo steps sampled every 100 steps, the same definition of the score, etc.), except that replica exchanges were switched off. This allowed for evaluation of the effect of replica exchanges (parallel tempering). Next, the structures sampled in parallel tempering and fixed‐temperature sampling were pooled, the Cα atoms were superimposed onto a common reference structure and their Cartesian coordinates were analyzed by t‐SNE (van der Maaten & Hinton, 2008). The result is depicted in Figure 3.

Figure 3. Comparison of parallel tempering with the sampling at fixed temperatures. Each point represents one sampled design. (a) Plot colored by sampling temperature. (b) Plot colored by Cα‐pLDDT. (c) Plot colored depending on whether the design was sampled by parallel tempering or by sampling at fixed temperatures. pLDDT, predicted local distance difference test.

Each point in Figure 3 represents a single designed structure. Similar designs are located close to each other. Figure 3a shows that the structures sampled at high temperatures are located in the center of the plot. Structures sampled at low temperature are located in the corona. Figure 3b shows that the structures in the corona are likely to fold, as indicated by a high ESMfold pLDDT. Finally, Figure 3c shows that the high‐temperature structures were sampled equally well by both methods. However, all clusters corresponding to stable designs (the corona) were explored by parallel tempering, but only some of them were explored at fixed temperatures. This clearly shows the advantage of parallel tempering. This trend was further promoted with the progress of sampling. Parallel tempering continues to sample new folds, whereas fixed‐temperature sampling is stuck in local minima.

Another alternative to parallel tempering is simulated annealing. The temperature in this type of sampling is not fixed. Instead, it changes according to a predefined schedule. Exponential cooling is the most common schedule in protein design. In this work, we tested different cooling rates $r$ (Figure 4). The temperature started from 1.0 and was multiplied by the rate $r$ in each Monte Carlo step ( $T_{i + 1} = r T_{i}$ ). Rates of 0.97–0.9999 were tested.
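To give a feeling for how strongly the cooling rate controls the length of an annealing run, the exponential schedule $T_{i + 1} = r T_{i}$ can be evaluated directly. The sketch below is illustrative only; the function names are hypothetical, and using the lowest parallel tempering temperature for 100‐residue proteins (0.00984) as the target is an assumption made here for comparison, not a stopping rule taken from the text.

```python
import math


def temperature_at_step(step: int, rate: float, t_start: float = 1.0) -> float:
    """Temperature after `step` applications of T_{i+1} = rate * T_i."""
    return t_start * rate ** step


def steps_to_reach(t_target: float, rate: float, t_start: float = 1.0) -> int:
    """Smallest number of steps after which the temperature is at or below t_target."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be between 0 and 1 for cooling")
    if t_target >= t_start:
        return 0
    return math.ceil(math.log(t_target / t_start) / math.log(rate))


if __name__ == "__main__":
    for rate in (0.97, 0.999, 0.9995, 0.9999):
        print(rate, steps_to_reach(0.00984, rate))
```

Starting from 1.0, a rate of 0.97 reaches this target after roughly 150 steps, whereas a rate of 0.9999 needs roughly 46,000 steps.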

Figure 4. Design by simulated annealing. Evolution of scores for different cooling rates. Models of designs sampled at the end of each run are also depicted.

Small rates (0.97–0.999) led to premature designs due to too rapid cooling. Some runs resulted in low scores, whereas others got stuck in local minima. Higher rates (0.9995 or 0.9999) were necessary to obtain mature designs. However, this was associated with a large number of steps. In conclusion, to design a mature protein, thousands to tens of thousands of Monte Carlo steps are necessary. A single replica in an equally long parallel tempering run provides multiple diverse designs at the same computational cost.

After successful application to 100‐residue proteins, we applied our pipeline to 200‐residue proteins. The results are depicted in Figure 5. It is clear that our approach is efficient for larger proteins. It was necessary to use a larger number of replicas (24 instead of 17). Figure 5a shows that the replica exchange rate for temperatures 0–3 and 14–18 was lower than for 100‐residue proteins; nevertheless, it was still sufficient.

Figure 5. Design of 200‐residue proteins by parallel tempering. (a) Replica exchanges. Selected replica starting from temperature ID 22 is highlighted by a thick line. (b) Evolution of the score. (c) Evolution of pLDDT. The replica starting from temperature ID 20 is highlighted by red boxes. Snapshots of ESMfold models from demultiplexed trace are depicted at top (colored by pLDDT, numbers indicate the mean Cα‐pLDDT). (d) Boxplot of the mean Cα‐pLDDT at different temperatures (first 50 replica exchange attempts skipped). (e) Refinement of models highlighted in (c) by simulated annealing (numbers indicate the mean Cα‐pLDDT). pLDDT, predicted local distance difference test.

As expected, a higher number of steps was needed for equilibration, that is, to get to the state at which the distributions of energy at all temperatures become stable. Similarly to 100‐residue proteins, the protein sequences sampled at the lowest temperature had favorable pLDDT values from ESMfold (Figure 5c,d). Refinement of sequences by simulated annealing slightly increased pLDDT (by 0–4 percentage points). Demultiplexed plots for all replicas can be found in Supporting Information (Figures S19–S42).

In order to further verify designed sequences, we predicted their structures using Alphafold 2 (Jumper et al., 2021). Sequences of 100‐residue proteins sampled at the lowest temperature (temperature ID 0, between replica exchange attempt 50 and the end, sampled every 10 replica exchanges) were submitted to Alphafold 2 (Jumper et al., 2021) (the version for monomers) and the resulting pLDDTs were compared. The results are depicted in Figure 6. As expected, the Alphafold 2 pLDDT values were, in general, lower than the ESMfold pLDDT values because we optimized the ESMfold pLDDT. Alphafold 2 pLDDT values ranged from 36% to 91% with a median of 60%. In total, 11% had pLDDT greater than 80, and 23% had pLDDT greater than 70, indicating a stable fold. The best Alphafold 2 score is highlighted in Figure 6.

Figure 6. pLDDT values of 100‐residue proteins by Alphafold and ESMfold (scatterplot with histograms). The sequence with the highest Alphafold pLDDT score is depicted as Alphafold and ESMfold model colored by pLDDT (pLDDT indicated). pLDDT, predicted local distance difference test.

3. DISCUSSION

An interesting feature of our approach is that a demultiplexed replica shows a continuous evolution of the sequence and structure (Figures 2c and 5c, https://youtu.be/4n4awaRnEOY). Figure 2c shows episodes of increased temperature associated with destabilization and decreases in temperature associated with the stabilization of designed proteins. The run can be prolonged as necessary. Therefore, design by parallel tempering provides a continuous flux of designed sequences.

Many replicas in our protein design campaigns followed this pattern; however, there were some demultiplexed replicas that stayed most of the time at low or high temperatures. Videos showing the evolution in Figures 2c and 5c can be found on YouTube (https://youtu.be/vGT44WrFPek, https://youtu.be/4n4awaRnEOY).

It must be kept in mind that in parallel tempering not only temperature influences protein stability, but also stability (energy $E$ ) influences temperature. High temperature naturally leads to faster sampling and destabilization. In contrast, low temperature favors low‐energy states. In parallel, increasing $E$ causes the replica to climb the temperature scale. Decrease of $E$ leads to descent on the temperature scale. These two effects take place simultaneously.

Parallel tempering is in general significantly more efficient than a series of equally long single‐temperature Monte Carlo or molecular dynamics simulations. The gain in efficiency depends strongly on the studied system and the setup of the method. Yamamoto and Kob (2000) used the parallel tempering (replica exchange) Monte Carlo method to accelerate sampling of the molecular structure of supercooled water. They observed that, depending on the setup, parallel tempering is 10 to 100 times more efficient than a series of equally long simulations. Figure 3 shows that parallel tempering sampled more diverse sequences than a series of equally long fixed‐temperature Monte Carlo sampling runs.

A critical step of the parallel tempering method is the choice of temperatures. They must be chosen in such a way that the highest temperatures allow for efficient sampling and, simultaneously, the lowest temperatures sample stable proteins. Neighboring temperatures $T_{i}$ and $T_{i + 1}$ must be close enough so that the distributions of the energy values sampled $E_{i}$ and $E_{i + 1}$ overlap. The overlap of the energy distributions ensures that $E_{i + 1}$ is occasionally lower than $E_{i}$ or at least comparable, allowing for replica exchanges.

We used a trial‐and‐error approach to choose temperatures. In most cases, it is useful to use the constant ratio $T_{i + 1} / T_{i}$ . We used the highest temperature equal to 1 (in $kT$ units), which we know from simulated annealing as a temperature at which sequences are sampled rapidly and lead to unstable proteins (data not shown).

We set the ratio $T_{i + 1} / T_{i}$ equal to 2 (temperatures 1, 0.5, 0.25, etc.) as a first attempt. Next, we performed a short parallel tempering trial and calculated replica exchange rates (successful vs. total number of replica exchange attempts). The replica exchange rate of approximately 30% is recognized as optimal in molecular simulations. For some adjacent temperatures, it was necessary to add extra temperatures to increase the replica exchange rate. Finally, we reached replica exchange rates 25%–43% for design of proteins of 100 residues and 10%–47% for 200 residues.
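The trial‐and‐error procedure described above can be sketched as follows. This is only one possible way to organize it, not the authors' script: the function names, the 10% threshold, the exchange counts in the note below, and the choice to place an extra temperature at the geometric mean of two neighbors are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Sequence


def geometric_ladder(t_max: float, ratio: float, n: int) -> List[float]:
    """Temperatures t_max, t_max / ratio, ... sorted from lowest to highest."""
    return sorted(t_max / ratio ** k for k in range(n))


def exchange_rates(attempted: Sequence[int],
                   accepted: Sequence[int]) -> List[Optional[float]]:
    """Successful / total exchange attempts for each neighboring temperature pair."""
    return [a / n if n else None for n, a in zip(attempted, accepted)]


def pairs_below(rates: Sequence[Optional[float]], threshold: float) -> List[int]:
    """Indices i of pairs (i, i + 1) whose exchange rate is below the threshold."""
    return [i for i, r in enumerate(rates) if r is not None and r < threshold]


def insert_geometric_midpoints(ladder: Sequence[float],
                               pair_indices: Sequence[int]) -> List[float]:
    """Add one extra temperature between ladder[i] and ladder[i + 1] for each flagged pair."""
    flagged = set(pair_indices)
    refined: List[float] = []
    for i, t in enumerate(ladder):
        refined.append(t)
        if i in flagged and i + 1 < len(ladder):
            refined.append(math.sqrt(t * ladder[i + 1]))
    return refined
```

For example, a first ladder with ratio 2 and four temperatures is 0.125, 0.25, 0.5, 1.0. If a short trial gave 2, 15, and 20 successful exchanges out of 50 attempts for the three neighboring pairs, only the lowest pair would fall below 10%, and the refined ladder would gain a temperature of about 0.177 between 0.125 and 0.25. The same value appears in the 200‐residue ladder listed in Methods.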

We used different temperatures and different numbers of temperatures for the design of 100‐ and 200‐residue proteins. This is analogous to the number of atoms in a system in a parallel tempering molecular dynamics simulation. Larger systems require a higher number of replicas because potential energy in a large system is averaged, and its variance is therefore lower. Lower variance (narrower energy distribution) means lower energy overlap of neighboring replicas, and thus lower replica exchange rate.

In addition to the size of the protein, the choice of optimized variables (pLDDT, predicted template modeling [pTM], surface hydrophobicity, etc.) and their weights may influence the choice of temperatures. However, temperatures can be set using a simple trial‐and‐error approach described above. In the future, we may develop a robust method for the choice of temperatures.

Both parallel tempering runs underwent an equilibration phase necessary to reach the steady distribution of the score among temperatures. Naturally, this phase required more steps for the design of 200 residue proteins than for the design of 100 residue proteins, due to the higher complexity of the task.

Parallel tempering has been extremely successful in connection with molecular dynamics simulation to model the folding of various small fast‐folding proteins (Earl & Deem, 2005). A fundamental difference between parallel tempering in protein design and protein folding simulation (or any other biomolecular simulation) is the fact that the former aims at finding the global or close to the global minimum of $E$ , whereas the latter aims at determining the distribution of states at the biologically relevant temperature. In protein design, we want to identify the sequence with the lowest possible $E$ . In contrast, in protein folding simulations, we want to determine the fraction of states (e.g., unfolded and folded) at the biological temperature.

We see the potential for the application of parallel tempering in a manner similar to that in molecular simulations. For example, a user can ask the following question: What is the distribution of net charge of 100‐residue proteins designed using ESMfold? This question can be answered by a large number of simulated annealing runs, extremely long fixed‐temperature Monte Carlo sampling or, more efficiently, by parallel tempering. The advantage of parallel tempering is that it is designed not to disturb the distribution of states (e.g., net charges).

The choice of the lowest temperature in parallel tempering simulation of protein folding is usually straightforward: it is the biological temperature. In our application, we tried to overcome the fact that we cannot set the lowest temperature to 0 K by applying simulated annealing to selected designs (Figures 2e and 5e). The temperature dropped in these simulated annealing runs below $10^{- 6} kT$ . In general, there was an improvement in pLDDT.

The advantage of parallel tempering is its inherent parallelism. The design in each replica can proceed independently of the others, so they can be performed at different nodes of a parallel computer. Communication between nodes is necessary only in replica exchange attempts. Because we did not have access to a parallel computer with 17 or 24 GPUs, we performed all simulations in a serial manner. Users of parallel clusters may use a combination of message passing interface with Python to parallelize the task.

The proteins designed in this article were not verified experimentally, that is, by recombinant production of designed proteins and their structural characterization. We understand that this may discourage potential users of our approach. However, we believe that the quality of protein design campaigns is mainly determined by the quality of scoring of the designed sequences, rather than by the optimization method.

As an alternative, we performed prediction of structures of designed proteins by Alphafold 2 (Jumper et al., 2021). We identified proteins predicted to be stable by ESMfold as well as Alphafold 2 (Figure 6). Many other designed sequences were predicted to be stable by ESMfold but not Alphafold 2. This may be explained by failure of the design, that is, prediction by ESMfold was wrong and Alphafold 2 was correct. Another explanation is that the design was successful and that, for example, ESMfold is more general than Alphafold 2. Hie et al. observed a similar trend for proteins designed by the design module of ESMfold (see fig. 2B,C in Hie et al., 2022).

To our knowledge, the protein design engine used in this work (“protein programming language”) (Hie et al., 2022) has not been verified experimentally. A similar engine, also based on ESMfold (“language model design”) (Verkuil et al., 2022), from the same laboratory has been extensively verified experimentally, and a high success rate (67%) was reported. We initially tested our approach with the “language model design” engine; however, we observed that over‐optimization of the designed proteins leads to low‐diversity α‐helix bundles. The authors of this engine report a similar finding for a suboptimal setup of the annealing scheme.

We demonstrate here that parallel tempering can be applied in protein design. We must acknowledge that the current development of machine learning programs for protein design leads to very fast tools such as Protein Message Passing Neural Network (Dauparas et al., 2022), Alphafold backpropagation (Frank et al., 2024; Goverde et al., 2023) or ProteinGenerator (Lisanza et al., 2024). These tools can design stable proteins in a very short time. In many applications, we can expect that these tools can outperform our approach, that is, a series of independent Protein Message Passing Neural Network or ProteinGenerator runs may be more efficient than our parallel tempering design.

However, we see great potential in the combination of parallel tempering protein design with some other physics‐based approaches developed to improve sampling in molecular simulations or other applications (Spiwok et al., 2015). We will follow this path in the future.

4. METHODS

ESMfold was obtained from the GitHub repository (github.com/facebookresearch/esm). We used the code of the “protein design programming language” (Hie et al., 2022). This code starts from a random amino acid sequence. The user can define key structure descriptors such as pLDDT (Jumper et al., 2021), pTM (Jumper et al., 2021), number of nonpolar surface residues, root mean square deviation from a target structure, symmetry descriptors and other variables. These variables are combined into a score (a lower score means higher agreement with desired properties). The code uses Monte Carlo simulated annealing to optimize the amino acid sequence.

The code was modified to allow starting from a user‐supplied amino acid sequence and to improve monitoring of the evolution of amino acid sequences. Parallel tempering was performed using a Python script.

The proteins were designed to maximize pLDDT (Jumper et al., 2021), maximize pTM (Jumper et al., 2021), minimize the number of surface hydrophobic residues, and maximize globularity (with weights 1, 1, 1 and 0.1, respectively). All these measures are available in the protein design module of ESMfold (Hie et al., 2022). Globularity is calculated as the variance of distances of the atoms from the centroid.
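The globularity measure and the weighted combination of terms can be illustrated with a short Python sketch. It follows the verbal definition above (variance of atom distances from the centroid) and is not the implementation in the ESMfold design module. The function names, the term names, and the assumption that every term is already expressed so that lower values are better (for example, one minus pLDDT on a 0 to 1 scale) are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]

# Weights stated in the text: pLDDT, pTM, surface hydrophobics, globularity.
WEIGHTS: Dict[str, float] = {
    "plddt": 1.0,
    "ptm": 1.0,
    "surface_hydrophobics": 1.0,
    "globularity": 0.1,
}


def globularity(coords: Sequence[Point]) -> float:
    """Variance of the distances of atoms from their centroid (lower is more compact)."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    distances = [math.dist(c, centroid) for c in coords]
    mean = sum(distances) / n
    return sum((d - mean) ** 2 for d in distances) / n


def total_score(terms: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of terms; each term is assumed to be 'lower is better'."""
    return sum(weights[name] * value for name, value in terms.items())
```

Atoms lying at equal distances from the centroid give a globularity of zero, whereas an elongated arrangement gives a positive value, so minimizing this term favors compact shapes.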

We performed design of proteins with 100 and 200 amino acid residues. The initial sequences were generated randomly. The temperatures were selected using a trial‐and‐error approach. We used 17 replicas (temperatures 0.00984, 0.0124, 0.0156, 0.0197, 0.0248, 0.0313, 0.0394, 0.0496, 0.0625, 0.0787, 0.099, 0.125, 0.157, 0.198, 0.25, 0.5, and 1.0 in $kT$ units) for proteins with 100 residues. We used 24 replicas (temperatures 0.00164, 0.00195, 0.00232, 0.00276, 0.00328, 0.00391, 0.00465, 0.00552, 0.00657, 0.00781, 0.00929, 0.0110, 0.0131, 0.0156, 0.0221, 0.0313, 0.0442, 0.0625, 0.0884, 0.125, 0.177, 0.25, 0.5, and 1.0 in $kT$ units) for proteins with 200 residues. Replica exchange attempts were made every 100 Monte Carlo steps, unless otherwise stated. Exchanges were attempted between replicas 0 and 1, 2 and 3, and so forth in odd exchange attempts, and between replicas 1 and 2, 3 and 4, and so forth in even exchange attempts. Due to restrictions in available hardware, we performed design campaigns as jobs consisting of 25 or 10 exchange attempts, for proteins of 100 and 200 residues, respectively. There were no replica exchanges between the end of one job and the beginning of the subsequent job.
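The overall scheme (fixed‐temperature Monte Carlo blocks alternating with odd and even exchange attempts) can be summarized in a compact Python sketch. It is not the authors' script: `toy_energy` is a hypothetical stand‐in for the ESMfold‐based score, all function names are made up, exchange attempts are assumed to be numbered from 1, and drawing mutations from the same cysteine‐free alphabet as the initial sequences is an assumption.

```python
import math
import random

AMINO_ACIDS = "ADEFGHIKLMNPQRSTVWY"  # 19 residue types; cysteine is left out


def toy_energy(seq):
    """Hypothetical stand-in for the score: fraction of non-alanine residues."""
    return sum(1 for aa in seq if aa != "A") / len(seq)


def exchange_pairs(attempt, n_replicas):
    """Odd attempts pair (0, 1), (2, 3), ...; even attempts pair (1, 2), (3, 4), ..."""
    first = 0 if attempt % 2 == 1 else 1
    return [(i, i + 1) for i in range(first, n_replicas - 1, 2)]


def monte_carlo(seq, temperature, n_steps, energy, rng):
    """Single-point-mutation Metropolis sampling at a fixed temperature."""
    e = energy(seq)
    for _ in range(n_steps):
        pos = rng.randrange(len(seq))
        trial = seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]
        e_trial = energy(trial)
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / temperature):
            seq, e = trial, e_trial
    return seq, e


def parallel_tempering(temperatures, length, n_attempts, steps_per_attempt,
                       energy=toy_energy, seed=0):
    """Return the sequences and energies held at each temperature at the end."""
    rng = random.Random(seed)
    seqs = ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
            for _ in temperatures]
    energies = [energy(s) for s in seqs]
    for attempt in range(1, n_attempts + 1):
        # Sampling block: every temperature is sampled independently.
        for i, t in enumerate(temperatures):
            seqs[i], energies[i] = monte_carlo(seqs[i], t, steps_per_attempt, energy, rng)
        # Exchange block: neighboring temperatures swap their sequences with probability p.
        for i, j in exchange_pairs(attempt, len(temperatures)):
            exponent = (energies[i] - energies[j]) * (1 / temperatures[i] - 1 / temperatures[j])
            if exponent >= 0 or rng.random() < math.exp(exponent):
                seqs[i], seqs[j] = seqs[j], seqs[i]
                energies[i], energies[j] = energies[j], energies[i]
    return seqs, energies
```

A call such as `parallel_tempering([0.01, 0.02, 0.04, 0.08], length=20, n_attempts=10, steps_per_attempt=50)` runs a small toy campaign. Because the sampling block treats each temperature independently, it is the part that could be distributed over separate GPUs, with communication needed only in the exchange block.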

Standard Monte Carlo simulated annealing, consisting of 1000 steps, was performed on selected structures (Figures 2e and 3e) to improve their stability. The initial temperature was set to the temperature at which the design was sampled. The temperature was reduced by a factor of 0.97 after each step.
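The resulting geometric cooling schedule, $T_k = T_0 \cdot 0.97^k$, can be written out explicitly. The sketch below only generates the temperatures; the function name is hypothetical and the Monte Carlo moves themselves are omitted.

```python
def annealing_temperatures(t_start, n_steps=1000, factor=0.97):
    """Temperature used at each step: T_k = t_start * factor**k."""
    temperatures = []
    t = t_start
    for _ in range(n_steps):
        temperatures.append(t)
        t *= factor  # cool down after each Monte Carlo step
    return temperatures
```

Because the decay is geometric, the temperature falls to about 5% of its starting value after 100 steps, so most of the 1000 steps are effectively a greedy minimization.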

For visualizations, we used ESMfold (Lin et al., 2023), Python, VMD (Humphrey et al., 1996), and R. All materials (code, raw data, visualization scripts, etc.) are available at Zenodo (DOI: 10.5281/zenodo.15738587).

5. CONCLUSIONS

We extend the parallel tempering algorithm to protein design. It can be applied as an extension of methods commonly used in protein design, namely Monte Carlo sampling and simulated annealing. Parallel tempering shows the potential to optimize protein sequences more efficiently than conventional methods. Our approach was not experimentally validated by protein expression and characterization. However, we believe that the concept can be used with different protein design engines (the way the energy of a protein sequence is calculated) and that the performance in experimental validation is most likely determined by the performance of the protein design engine. The advantage of our approach is that it can provide a continuous flow of designed sequences.

AUTHOR CONTRIBUTIONS

Preet Kalani: Methodology; investigation; writing – review and editing; visualization; software. Vojtěch Spiwok: Conceptualization; methodology; software; investigation; writing – original draft; writing – review and editing; visualization; data curation; resources; supervision; funding acquisition; project administration.

FUNDING INFORMATION

This research was supported by COST and the Ministry of Education, Youth and Sports of the Czech Republic (ML4NGP, CA21160, LUC 24136, LM 2023055).

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

Supporting information

Data S1. Supporting Information.

PRO-34-e70246-s001.pdf (18.4 MB, pdf)

ACKNOWLEDGMENTS

This work was supported by COST (project ML4NGP, CA21160). Participation in the COST project was supported by the Ministry of Education, Youth, and Sports of the Czech Republic (LUC 24136). Long‐term availability of the resulting tools and data is supported by ELIXIR CZ (LM 2023055). Open access publishing facilitated by Vysoka skola chemicko‐technologicka v Praze, as part of the Wiley ‐ CzechELib agreement.

Kalani P, Spiwok V. Design of proteins by parallel tempering in the sequence space. Protein Science. 2025;34(10):e70246. 10.1002/pro.70246

Review Editor: Nir Ben‐Tal.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are openly available in Zenodo at https://zenodo.org, reference number 15738587.

REFERENCES

Dauparas J, Anishchenko I, Bennett N, Bai H, Ragotte RJ, Milles LF, et al. Robust deep learning–based protein sequence design using ProteinMPNN. Science. 2022;378(6615):49–56.
Earl DJ, Deem MW. Parallel tempering: theory, applications, and new perspectives. Phys Chem Chem Phys. 2005;7:3910–3916.
Frank C, Khoshouei A, Fuß L, Schiwietz D, Putz D, Weber L, et al. Scalable protein design using optimization in a relaxed sequence space. Science. 2024;386(6720):439–445.
Goverde CA, Wolf B, Khakzad H, Rosset S, Correia BE. De novo protein design by inversion of the AlphaFold structure prediction network. Protein Sci. 2023;32(6):e4653.
Hie B, Candido S, Lin Z, Kabeli O, Rao R, Smetanin N, et al. A high‐level programming language for generative protein design. bioRxiv. 2022, 10.1101/2022.12.21.521526.
Humphrey W, Dalke A, Schulten K. VMD – visual molecular dynamics. J Mol Graph. 1996;14:33–38.
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589.
Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, et al. Evolutionary‐scale prediction of atomic‐level protein structure with a language model. Science. 2023;379(6637):1123–1130.
Lisanza SL, Gershon JM, Tipps SWK, Sims JN, Arnoldt L, Hendel SJ, et al. Multistate and functional protein design using RoseTTAFold sequence space diffusion. Nat Biotechnol. 2024, 10.1038/s41587-024-02395-w.
Listov D, Goverde CA, Correia BE, Fleishman SJ. Opportunities and challenges in design and optimization of protein function. Nat Rev Mol Cell Biol. 2024;25:639–653.
Liu Y, Kuhlman B. RosettaDesign server for protein design. Nucleic Acids Res. 2006;34(Suppl 2):W235–W238.
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. J Chem Phys. 1953;21(6):1087–1092.
Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci USA. 2021;118(15):e2016239118.
Spiwok V, Sucur Z, Hosek P. Enhanced sampling techniques in biomolecular simulations. Biotech Adv. 2015;33(6, Part 2):1130–1140. BioTech 2014 and 6th Czech‐Swiss Biotechnology Symposium.
Swendsen RH, Wang JS. Replica Monte Carlo simulation of spin‐glasses. Phys Rev Lett. 1986;57:2607–2609.
van der Maaten L, Hinton G. Visualizing Data using t‐SNE. J Mach Learn Res. 2008;9:2579–2605.
Verkuil R, Kabeli O, Du Y, Wicky BIM, Milles LF, Dauparas J, et al. Language models generalize beyond natural proteins. bioRxiv. 2022, 10.1101/2022.12.21.521521.
Watson JL, Juergens D, Bennett NR, Trippe BL, Yim J, Eisenach HE, et al. De novo design of protein structure and function with RFdiffusion. Nature. 2023;620:1089–1100.
Wicky BIM, Milles LF, Courbet A, Ragotte RJ, Dauparas J, Kinfu E, et al. Hallucinating symmetric protein assemblies. Science. 2022;378(6615):56–61.
Yamamoto R, Kob W. Replica‐exchange molecular dynamics simulation for supercooled liquids. Phys Rev E. 2000;61:5473–5476.

