Abstract
We explore the potential application of quantum annealing to address the protein structure problem. To this end, we compare several proposed ab initio protein folding models for quantum computers and analyze their scaling and performance for classical and quantum heuristics. Moreover, we introduce a novel encoding of coordinate-based models on the tetrahedral lattice, based on interleaved grids. Our findings reveal significant variations in model performance, with one model yielding unphysical configurations within the feasible solution space. Furthermore, we conclude that current quantum annealing hardware is not yet suited for tackling problems beyond a proof-of-concept size, primarily due to challenges in the embedding. Nonetheless, we observe a possible scaling advantage over our in-house simulated annealing implementation, which, however, is only noticeable when comparing performance on the embedded problems.
Subject terms: Mathematics and computing, Physics
Introduction
The prediction of a protein’s three-dimensional structure from its amino acid sequence has long been a central challenge in computational biology. Beyond the fundamental question of how a protein folds, a protein’s structure governs most of its interactions and is therefore critical for applications such as virtual ligand screening1 and the study of protein–protein interactions2, both of which are of great importance for modern in silico pharmacology. Recent advances in artificial intelligence (AI) models have enabled the successful prediction of structures for a diverse array of proteins3–5. However, structures with no known homologues6 or the estimation of the physical folding pathway7 still remain challenging. Furthermore, incorporating non-canonical amino acids into the bioengineering process offers new opportunities and provides a dataset for which little training data exists8. In contrast, physics-based approaches, either simulating the physical folding process or searching for a conformation that minimizes a physics-inspired energy function, struggle with the immense conformational space and the rugged free-energy landscape characteristic of proteins9,10. As a result, efforts to solve physics-inspired problems have increasingly focused on heuristic algorithms designed to efficiently navigate complex energy landscapes. Finding the energy minimum of complex systems is a well-studied problem in statistical physics and combinatorial optimization. Naturally, several of the proposed heuristics to estimate the global minimum of a complex energy function have been adapted to the protein folding problem. Prominent examples include simulated annealing11 and parallel tempering (which is often also denoted replica exchange method in the literature)10,12,13.
The principal obstacle of this strategy is the rugged nature of the free-energy landscape: deep wells are separated by steep energy barriers, so gradient-based optimizers often stall in sub-optimal minima9. In the software Rosetta14 this difficulty is partly alleviated by progressively ramping up the strength of repulsive terms, which helps the search to escape isolated wells.
An alternative route could be given by the advent of new quantum technologies, especially quantum annealing, an algorithm originally derived as a quantum analogue to simulated annealing15. By utilizing quantum tunneling, quantum annealing can potentially overcome energy barriers more rapidly than classical algorithms, accelerating the optimization process16,17. Since proteins possess notoriously rugged free-energy landscapes, quantum annealing could be pivotal for solving larger or novel protein structures, especially those with which AI-based models currently struggle6.
Perdomo-Ortiz et al.18 were the first to suggest a model to tackle the protein structure problem (PSP) on a quantum annealer. Due to the limitations of current hardware, such approaches are restricted to coarse-grained models, folding the protein on a discrete lattice. Subsequent work has refined the approach through more efficient problem encodings19 or alternative lattice architectures20. This has given rise to multiple variants, each offering its own benefits and compromises18–24. Apart from the PSP, recent studies have examined the feasibility of quantum computing approaches for both protein design25 and protein–peptide docking26. However, current implementations have not yet escaped the proof-of-principle stage. In this work, we investigate the scaling of these models, both in terms of their resource requirements and in terms of the projected scaling of the time it takes to find the native fold. Our findings highlight that the correct choice of model is crucial in the noisy intermediate-scale quantum (NISQ) era and even beyond.
Related work and main contributions The application of quantum computing to protein folding has recently attracted significant attention due to its widespread applications. Despite current quantum computers not yet being capable of handling the complexities of relevant protein sizes, many studies have investigated their potential future use in this field. For example, Boulebane et al.27 investigated the potential of the quantum approximate optimization algorithm (QAOA)28 for the protein structure problem, finding rather negative results in comparison to classical methods. Further approaches to the problem use digitized counterdiabatic protocols, as presented by Chandarana et al.29 or Romero et al.30, which show more promising results than QAOA. Outeiral et al.31 explored the potential for limited quantum speedups by examining the scaling of the spectral gap for a dense problem encoding as the peptide chain length increases, finding exponentially quickly closing gaps for worst-case examples but only polynomially closing gaps in the average case. They also compared the performance of simulated annealing with ideal quantum annealing through direct numerical simulations of the Schrödinger equation for short peptide sequences. Furthermore, Linn et al.32 conducted a resource estimation for various approaches to the protein folding problem using gate-based quantum computers and QAOA. Doga et al.6 investigated the protein structure prediction problem with a focus on practical applications. They developed a framework to identify proteins that could benefit from quantum computing-based approaches, particularly those with a rugged free-energy landscape and limited homologues, in order to demonstrate an advantage over AI-based methods. Notably, they showed that for a proof-of-principle protein (PDB: 5GJB), the quantum computing approach combined with classical post-processing can lead to lower root-mean-square errors at all-atom resolution than AlphaFold2.
Our work builds on previous research by focusing specifically on the paradigm of quantum annealing. To this end, we compare and revise several proposed formulations of the coarse-grained lattice protein folding problem in terms of their scaling in resource cost. We aim to identify which of these models could benefit from a quantum annealing approach by calculating the spin overlap distribution, a proxy for the complexity of the free-energy landscape. Finally, we compare the performance of classical heuristic solvers with quantum annealing hardware. To the best of our knowledge, our study is the first to perform a scaling analysis for multiple protein sequences on real, currently available quantum hardware and the first formulation-dependent comparison.
By performing benchmark calculations, we found that the model of Ref.20 produces non-physical lowest-energy folds when the amino acid sequence is longer than 10 residues; see Fig. 6 and Appendix B for more details. Apart from this, we introduce a novel encoding scheme for the PSP on quantum computers based on the work of Robert et al.20 in combination with the works of Babbush et al.21 and Irbäck et al.23. We find that for the shorter sequences considered in this work, our encoding provides the best observed performance. Furthermore, with the presented encoding we are able to embed larger sequences of up to 18 amino acids onto current-generation quantum hardware. However, we were not able to solve them on the hardware using standard quantum annealing procedures.
Fig. 6.
Example of an unphysical ground state configuration obtained from the turn-based tetrahedral model next to the ideal physical configuration. The considered sequence is given by HPPPPHPPPPH in the HP-model. (a) Unphysical lowest energy fold where beads 0 and 11 overlap. Since in the HP-model there is no interaction between H and P beads the chain can self-intersect without sacrificing energy. (b) Alternative ground state without overlap. Both folds have the same energy rendering them simultaneous ground states of the model.
The remainder of this article is structured as follows. In Section Methods, we provide the details of the considered methods, such as the models and solvers, used in this study. Section Resource estimation for quantum annealing aims to establish the suitability of the considered problem formulations by investigating their resource scaling as well as the spin overlap distributions. In Section Quantum annealing vs. simulated annealing, we compare the scaling of the time to solution for the best projected model between simulated annealing and quantum annealing. Finally, we conclude the study in Section Conclusion and outlook.
Methods
Quadratic unconstrained binary optimization
Quadratic Unconstrained Binary Optimization (QUBO) is the task of finding the minimal configuration for the problem

$$\min_{\mathbf{x}}\; \mathbf{x}^{T} Q\, \mathbf{x} = \min_{\mathbf{x}} \sum_{i \leq j} Q_{ij}\, x_i x_j, \tag{1}$$

where $x_i \in \{0, 1\}$ are Boolean variables, and $Q \in \mathbb{R}^{n \times n}$ is the real-valued QUBO matrix.
A closely related problem is the Ising spin glass problem from condensed matter physics. In this case, one seeks low-energy configurations for the generic Ising Hamiltonian

$$H(\mathbf{s}) = \sum_{i < j} J_{ij}\, s_i s_j + \sum_{i} h_i\, s_i, \tag{2}$$

where the spin variables $s_i \in \{-1, +1\}$ represent the two possible states of a spin. The Boolean variables $x_i$ and the spin variables $s_i$ are related through the linear transformation $s_i = 2 x_i - 1$. This transformation allows the Ising Hamiltonian to be directly mapped to a QUBO matrix, making the two formulations mathematically equivalent. This relation between QUBOs and spin glasses triggered the development of several physics-inspired optimization algorithms, which originally were used to tackle many-particle problems in condensed matter physics. The most prominent examples, which we also use in this study, are simulated annealing (see Sec. Simulated annealing), quantum annealing (see Sec. Quantum annealing) and parallel tempering (see Sec. Parallel tempering and problem hardness). Finding the ground state of Ising Hamiltonians is a notoriously difficult optimization problem and is generally known to be NP-hard33. Throughout this manuscript, we will use the convention that $s$ refers to variables in Ising space and $x$ refers to variables in QUBO space.
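The linear transformation above can be made concrete in code. The following is a minimal sketch (function and variable names are ours) that converts Ising parameters $(h, J)$ into a QUBO matrix plus a constant offset, assuming $J$ is given as a dictionary over ordered index pairs $i < j$:

```python
import numpy as np

def ising_to_qubo(h, J):
    """Convert Ising parameters (h, J) to a QUBO matrix Q and offset.

    Uses the linear map s_i = 2 x_i - 1, so that for all x in {0, 1}^n:
        sum_{i<j} J_ij s_i s_j + sum_i h_i s_i == x^T Q x + offset.
    """
    n = len(h)
    Q = np.zeros((n, n))
    offset = 0.0
    for i in range(n):
        Q[i, i] += 2.0 * h[i]   # h_i s_i -> 2 h_i x_i - h_i
        offset -= h[i]
    for (i, j), Jij in J.items():
        Q[i, j] += 4.0 * Jij    # s_i s_j -> 4 x_i x_j - 2 x_i - 2 x_j + 1
        Q[i, i] -= 2.0 * Jij
        Q[j, j] -= 2.0 * Jij
        offset += Jij
    return Q, offset
```

A brute-force comparison over all $2^n$ bit strings confirms that both formulations assign identical energies.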
In many applications the optimization problem that needs to be solved does not naturally take the form of a QUBO or Ising formulation and may contain higher-order terms,

$$H(\mathbf{x}) = \sum_{i} a_i\, x_i + \sum_{i < j} b_{ij}\, x_i x_j + \sum_{i < j < k} c_{ijk}\, x_i x_j x_k + \dots \tag{3}$$

For the later sections it will become important to map these higher-order unconstrained binary optimization (HUBO) problems to QUBO problems. For example, in Boolean space, this can be accomplished using Rosenberg’s polynomial34, which can be used to reduce the order of a term by one at the cost of introducing additional variables. This way a term of degree three,

$$x_i\, x_j\, x_k, \tag{4}$$

can be transformed into a 2-local term, $x_a\, x_k$, by introducing an auxiliary variable $x_a = x_i x_j$. To ensure that $x_a = 1$ if and only if both $x_i = 1$ and $x_j = 1$, an additional penalty term needs to be added in the form of Rosenberg’s polynomial

$$P(x_i, x_j, x_a) = \lambda \left( x_i x_j - 2\, x_i x_a - 2\, x_j x_a + 3\, x_a \right). \tag{5}$$

The strength $\lambda$ is crucial for the formulation and must be chosen sufficiently large to ensure that the original structure of the energy landscape is conserved. That is, there should not be a potential energy gain when the condition is not satisfied. By applying this method iteratively, any HUBO can be reduced to a QUBO at the cost of additional variables.
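The degree reduction of Eqs. (4) and (5) can be verified by brute force. The sketch below (function names are ours) encodes the penalty and the reduced 2-local term; for a sufficiently large $\lambda$, minimizing over the auxiliary variable reproduces the original cubic term:

```python
from itertools import product

def rosenberg_penalty(xi, xj, xa, lam):
    """Rosenberg's polynomial, Eq. (5): equals 0 iff xa == xi * xj,
    and is at least lam otherwise (for lam > 0)."""
    return lam * (xi * xj - 2 * xi * xa - 2 * xj * xa + 3 * xa)

def reduced_cubic(xi, xj, xk, xa, lam):
    """2-local replacement of the cubic term xi * xj * xk, Eq. (4)."""
    return xa * xk + rosenberg_penalty(xi, xj, xa, lam)
```

Minimizing `reduced_cubic` over `xa` for every assignment of `(xi, xj, xk)` yields exactly `xi * xj * xk`, confirming that the reduction conserves the energy landscape.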
Coarse-grained protein folding
To find the native fold of a given protein on current NISQ hardware, a problem formulation that adheres to the restrictions of the hardware must be adopted. The most commonly used approach involves formulating the problem on a coarse-grained lattice18–24. In this formulation, the protein is represented as a chain of beads, where each bead corresponds to a single amino acid or a group of amino acids (see Fig. 1).
Fig. 1.
Example of a 10 amino acid mini protein folded on two different lattices. (a) A 2-dimensional Cartesian grid and (b) A three-dimensional tetrahedral/diamond grid. Each amino acid corresponds to a single bead in the chain. Interactions between amino acids are established via nearest-neighbor interactions. Images were produced using the NGL viewer37.
The positions a bead can take are discretized, and interactions between amino acids are modeled according to nearest- (or next-nearest-) neighbor interactions on the lattice sites. The free energy of a given fold is derived from pairwise interactions of the amino acids, either via a simple hydrophobic-hydrophilic (HP) model, the interaction matrix derived by Miyazawa and Jernigan35, or Lennard-Jones-type potentials27. Using these coarse-grained problem representations, finding the lowest-energy configuration reduces to a discrete optimization problem (which can be mapped to a QUBO) that is, even in the simplest HP case, known to be NP-complete36.
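To make the energy model concrete, a minimal HP-model energy function for a fold on a Cartesian lattice might look as follows. This is an illustrative sketch with names of our choosing, not the implementation used for the benchmarks:

```python
from itertools import combinations

def hp_energy(sequence, coords):
    """HP-model energy of a fold on a Cartesian lattice: -1 for every
    pair of hydrophobic (H) beads that are non-bonded nearest neighbors.

    sequence: string over {'H', 'P'}; coords: one lattice point per bead.
    """
    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i == 1:
            continue  # bonded neighbors along the chain do not count
        if sequence[i] == "H" and sequence[j] == "H":
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:  # lattice nearest neighbors
                energy -= 1
    return energy
```

For example, folding the sequence `HPPH` into a square brings the two H beads into contact, lowering the energy by one unit relative to the stretched chain.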
Since the first formulation of the problem18, various model improvements have been made. For example, two different ways of encoding the positions of the amino acids on the lattice have emerged: either a direct encoding as coordinates on a finite-size lattice of size L, often denoted as coordinate-based models18, or as a set of turns the polypeptide chain has taken, denoted turn-based models19.
In this paper, we focus on the most promising near-future candidates that can be efficiently mapped onto a quantum annealer, specifically those models with bounded locality, that is, models with a limited number of qubits participating in any single interaction. The models we investigate are:
A turn-based model on a three-dimensional Cartesian grid21,22,
a turn-based model on a tetrahedral grid20,
an adaptation of the coordinate-based model on the tetrahedral grid, which to the best of our knowledge has not yet been discussed in the literature.
The novel model proposed in this work adapts the coordinate-based encoding of Babbush et al.21 and Irbäck et al.23 to a tetrahedral grid, as described in Robert et al.20. In contrast to the other encodings presented, it aims to deliver a best-of-both-worlds approach: it preserves the native 2-local encoding of the coordinate-based model while leveraging a grid with a sparser interaction structure. The model maintains the underlying structure of the coordinate-based encoding and uses two interleaved grids derived from the face-centered cubic lattice, offset by a quarter diagonal, to realize the tetrahedral grid. Compared with the approach of Brubaker et al.26, our method does not enlarge the underlying grid and does not rely on penalty terms to enforce grid conformity, making it more resource-efficient. The multi-grid strategy could be readily extended to problems such as conformational docking, enabling efficient encoding of folding processes on two grids (e.g., a reaction pocket in a larger protein). More details on the encoding are presented in Appendix A (Tetrahedral lattice). A review of all considered models, including some minor adjustments, can be found in Appendix A. In Tab. 1 we summarize the properties of the different models before the mapping to 2-local terms. Note that this mapping, as described above, increases the number of qubits.
Table 1.
Scaling properties of the models considered in this study, before the reduction to a 2-local model. Note that we do not extend the models beyond nearest-neighbor interactions in order to stay as resource-efficient as possible. For the coordinate-based approaches, the lattice size L needs to be scaled so that the whole sequence can fit onto the lattice.
Simulated annealing
Simulated annealing (SA) is a meta-heuristic algorithm that has been adapted from statistical mechanics to the field of optimization11. The algorithm employs a temperature-based Markov chain Monte Carlo (MCMC) method to sample low-energy states, mimicking the physical annealing process of metals. Its primary advantage over naive Monte Carlo approaches lies in efficient sampling through the Metropolis criterion38. Starting from an initial state with energy $E$, a new state with energy $E'$ is proposed. If the energy of the proposed state is lower than that of the current state, the transition is accepted. If not, the transition is accepted with a probability given by

$$P_{\text{accept}} = e^{-\beta\, \Delta E}, \tag{6}$$

where $\beta$ denotes the inverse temperature and $\Delta E = E' - E$ is the energy difference. This probability allows the algorithm to escape from local minima where it might otherwise become trapped. To ensure convergence to a minimum, the temperature $T$ is lowered with an exponential cooling schedule at a selected cooling rate $\alpha < 1$, such that after an attempt to flip each spin the temperature is reduced according to

$$T_{k+1} = \alpha\, T_k. \tag{7}$$
In our implementation, the start temperature $T_0$ is automatically selected from $n$ random spin flips performed on a random state vector, following the algorithm proposed by Atiqullah39, which sets $T_0$ based on the number of accepted spin flips among the $n$ tries, the sample mean, and the sample standard deviation of the energy difference per flip.
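The core loop described above, Metropolis acceptance (Eq. (6)) with exponential cooling (Eq. (7)), can be sketched as follows. This is a minimal single-run illustration with names of our choosing, not the GPU-parallelized implementation used for the benchmarks:

```python
import numpy as np

def simulated_annealing(Q, t0, alpha, n_sweeps, rng=None):
    """Minimal single-run SA for a QUBO objective x^T Q x (sketch).

    Sequential single-spin flips with the Metropolis criterion of
    Eq. (6) and the exponential cooling schedule of Eq. (7).
    """
    rng = np.random.default_rng(rng)
    n = Q.shape[0]
    S = Q + Q.T  # symmetrized couplings for the delta-E update
    x = rng.integers(0, 2, size=n)
    t = t0
    for _ in range(n_sweeps):
        for i in range(n):
            # exact energy change of flipping bit i in the current state
            delta = (1 - 2 * x[i]) * (Q[i, i] + S[i].dot(x) - S[i, i] * x[i])
            if delta <= 0 or rng.random() < np.exp(-delta / t):
                x[i] ^= 1
        t *= alpha  # Eq. (7)
    return x, float(x @ Q @ x)
```

For a small QUBO, repeated runs of this loop quickly concentrate on the global minimum; the production code differs mainly in the parallel multi-flip scheme described next.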
For parallelization, the algorithm employs a multi-flip procedure by performing the spin flips on each independent set of the QUBO matrix graph in parallel, similar to the method described by Imanaga et al. in Ref.40. The main difference is that we apply the Deveci graph coloring heuristic41 to determine independent sets in the QUBO graph. The sparser the graph, the better the algorithm can exploit GPU parallelism, as sparser graphs usually yield more independent nodes per set and therefore a higher parallelization potential.
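The independent-set construction can be illustrated with a simple greedy coloring. This is a stand-in sketch with our own naming; the Deveci heuristic41 used in our implementation is more elaborate:

```python
import numpy as np

def color_classes(Q):
    """Greedy coloring of the QUBO graph. Vertices sharing a color form
    an independent set, so their spins can be flipped in parallel."""
    n = Q.shape[0]
    adj = np.abs(Q + Q.T) > 0
    np.fill_diagonal(adj, False)
    colors = [-1] * n
    for v in range(n):
        taken = {colors[u] for u in range(n) if adj[v, u] and colors[u] >= 0}
        c = 0
        while c in taken:
            c += 1
        colors[v] = c
    groups = {}
    for v, c in enumerate(colors):
        groups.setdefault(c, []).append(v)
    return list(groups.values())
```

On a sparse chain-like QUBO graph, two colors suffice, so half of all spins can be updated simultaneously in each step.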
Since a single run of the algorithm will usually not return the global energy minimum, the algorithm is repeated several times to sample a distribution of low-energy states. To speed up the sampling, we use a GPU-parallelized SA implementation running on two NVIDIA A100 GPUs. This setup enables the sampling of 432 separate instances in parallel across all considered problem sizes. We want to highlight that this implicit parallelization speedup is accounted for in all results of this manuscript.
Quantum annealing
Simulated quantum annealing, as introduced in Refs.42,43, is a quantum-inspired algorithm for solving combinatorial optimization problems that runs on classical hardware. The actual physical implementation, i.e., quantum annealing (QA) on dedicated hardware, is a non-universal form of quantum computing aimed at solving combinatorial optimization (CO) problems that are classically difficult to tackle.
Quantum annealers solve optimization problems (quasi-)adiabatically by initializing an easy-to-prepare ground state and gradually ramping up a problem Hamiltonian while ramping down the initial Hamiltonian. The Hamiltonian can be written as

$$H(t) = A(s)\, H_{\text{init}} + B(s)\, H_{\text{problem}}, \tag{9}$$

where $s = t / t_a$ is a dimensionless parameter that characterizes the Hamiltonian at each time $t$ during the annealing process, with maximal time $t_a$. The functions $A(s)$ and $B(s)$ are amplitudes that typically satisfy the boundary conditions $A(1) = 0$ and $B(0) = 0$, ensuring that at the end of the annealing process, only the problem Hamiltonian contributes to the energy landscape. Unlike the broader concept of adiabatic quantum computing, QA only realizes stoquastic Hamiltonians44, making it a non-universal form of quantum computing. In current hardware, the problem Hamiltonian $H_{\text{problem}}$ is encoded as the Ising Hamiltonian

$$H_{\text{problem}} = \sum_{i < j} J_{ij}\, \sigma_i^z \sigma_j^z + \sum_{i} h_i\, \sigma_i^z, \tag{10}$$

which consists of the programmable parameters $J_{ij}$, denoting the inter-qubit couplings, as well as the single-qubit biases $h_i$.
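The annealing process of Eqs. (9) and (10) can be illustrated numerically for a toy two-qubit ferromagnet. The sketch below assumes, for illustration only, the linear schedule $A(s) = 1 - s$, $B(s) = s$ and the initial Hamiltonian $-\sum_i \sigma_i^x$, which differ from the hardware's actual schedule; it integrates the Schrödinger equation step by step and ends almost entirely in the degenerate ground states of the problem Hamiltonian:

```python
import numpy as np

# Pauli matrices and identity
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    out = np.array([[1.0]], dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

def anneal(h, J, t_a=30.0, steps=600):
    """Integrate the Schrodinger equation for the interpolating
    Hamiltonian H(s) = -(1 - s) * sum_i sigma_x^i + s * H_problem,
    s = t / t_a, starting from the uniform superposition (the ground
    state of the initial Hamiltonian)."""
    n = len(h)
    Hx = sum(kron_all([sx if k == i else I2 for k in range(n)]) for i in range(n))
    Hp = sum(h[i] * kron_all([sz if k == i else I2 for k in range(n)]) for i in range(n))
    for (i, j), Jij in J.items():
        Hp = Hp + Jij * kron_all([sz if k in (i, j) else I2 for k in range(n)])
    psi = np.ones(2**n, dtype=complex) / np.sqrt(2**n)
    dt = t_a / steps
    for step in range(steps):
        s = (step + 0.5) / steps
        w, V = np.linalg.eigh(-(1 - s) * Hx + s * Hp)
        psi = V @ (np.exp(-1j * w * dt) * (V.conj().T @ psi))
    return psi

# two-qubit ferromagnet: degenerate Ising ground states |00> and |11>
psi = anneal(h=[0.0, 0.0], J={(0, 1): -1.0})
probs = np.abs(psi) ** 2
```

For a sufficiently long annealing time the final measurement probability concentrates on the two aligned configurations, which are the minima of Eq. (10) for this instance.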
Parallel tempering and problem hardness
Parallel tempering (also known as replica exchange Monte Carlo) is another temperature-based heuristic for locating low-energy configurations in an Ising spin glass. Unlike simulated annealing, which cools a single system along a predefined schedule, parallel tempering runs multiple copies (replicas) of the system in parallel at fixed temperatures $T_1 < T_2 < \dots < T_M$. However, for each replica, the Monte Carlo sweeps are performed using the same spin-flip acceptance probability as described in Eq. (6) for simulated annealing, at the corresponding replica temperature. Additionally, after sweeping through all spins in all replicas, one performs a swap of the assigned temperatures between two neighboring replicas (i.e., replicas with close temperatures) with the probability

$$p_{\text{swap}} = \min\left\{ 1,\; \exp\!\left[ \left( \frac{1}{T} - \frac{1}{T'} \right) \left( E - E' \right) \right] \right\}, \tag{11}$$

where $E$, $E'$ and $T$, $T'$ are the energies and temperatures of the two replicas, respectively. Parallel tempering is parallelized in the same way, based on graph coloring, as described in Sec. Simulated annealing for SA.
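The swap criterion of Eq. (11), together with a geometric temperature ladder of the kind used for the simulations in this work, can be expressed compactly (function names are ours):

```python
import math

def p_swap(E, Ep, T, Tp):
    """Acceptance probability for exchanging two neighboring replicas
    with energies (E, Ep) and temperatures (T, Tp), following Eq. (11)."""
    return min(1.0, math.exp((1.0 / T - 1.0 / Tp) * (E - Ep)))

def geometric_ladder(t_min, t_max, m):
    """m replica temperatures distributed geometrically in [t_min, t_max]."""
    r = (t_max / t_min) ** (1.0 / (m - 1))
    return [t_min * r**k for k in range(m)]
```

Note that moving the lower-energy configuration to the lower temperature is always accepted, which is what lets well-thermalized high-temperature replicas feed good candidates down the ladder.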
We use parallel tempering to estimate the hardness of the different protein folding models by scanning the energy landscape for local minima. Specifically, to quantify the usefulness of utilizing QA for a given problem, we use the order parameter

$$q = \frac{1}{N} \sum_{i=1}^{N} s_i^{(1)} s_i^{(2)} \tag{12}$$

from spin-glass theory, as discussed in Refs.45,46. The index $i$ runs over the individual spins of the problem, while the superscripts denote the index of two replicas of the problem with the same disorder, i.e., at the same temperature. Thus, to calculate $q$, we have to run two independent parallel tempering instances in parallel.

The order parameter serves as a measure of the average thickness of the barriers between local minima in the energy landscape. If the problem has many local minima which can be reached from one another with only a few spin flips, the probability $P(q)$ of measuring an overlap close to one is high. In this scenario, Ref.46 argues that quantum annealing has an advantage, as the quantum tunneling effect can help the optimizer to jump between minima. Conversely, if local minima are far from each other, i.e., many flips are required to jump from one to another, the spin overlap is closer to 0. In this case, the problem has thick barriers and is notoriously difficult to solve, both for classical algorithms and quantum annealing.

The distribution $P(q)$ of the order parameter is especially interesting for the PSP, since it can be interpreted as a proxy for the free-energy landscape45.
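Given spin configurations sampled from two independent replicas at the same temperature, the order parameter of Eq. (12) and an estimate of $P(q)$ can be computed as follows (a minimal sketch with names of our choosing):

```python
import numpy as np

def spin_overlap(s1, s2):
    """Order parameter q of Eq. (12) for two spin configurations
    (entries in {-1, +1}) sampled from replicas with the same disorder."""
    s1, s2 = np.asarray(s1), np.asarray(s2)
    return float(np.mean(s1 * s2))

def overlap_histogram(samples1, samples2, bins=21):
    """Estimate P(q) from paired lists of sampled spin configurations."""
    qs = [spin_overlap(a, b) for a, b in zip(samples1, samples2)]
    hist, edges = np.histogram(qs, bins=bins, range=(-1.0, 1.0), density=True)
    return hist, edges
```

Identical configurations give $q = 1$, fully anti-aligned ones give $q = -1$, and uncorrelated minima produce weight near $q = 0$, the regime of thick barriers.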
Resource estimation for quantum annealing
In this section we investigate the scaling of the different protein folding models. We first compare different metrics, such as the number of qubits a quantum annealer needs to run these models, the density of the QUBO matrix, and the required resolution of the couplers (Sec. Model scaling). We then investigate the structure of the free-energy landscape to see if the models are amenable to a quantum speedup due to quantum tunneling (Sec. Spin overlap distributions). Finally, we investigate the influence of the embedding process on the models (Sec. Embeddings) and give a short summary and discussion of all results.
Model scaling
To investigate the scaling properties we generate QUBO instances for each model ranging from
to
amino acids. Since the D-Wave devices only support 2-local couplings, we reduce the locality of all HUBOs (turn-based models) using Rosenberg’s polynomial via the PyQUBO library47. As mentioned earlier, we have to choose the penalty strength for these additional variables to ensure that Rosenberg’s polynomial conserves the energy landscape of the original problem. For our analysis we consider a worst-case choice of the penalty strength, large enough to guarantee that the ancilla variables obey the constraints. In the general case this leads to large QUBO coefficients, which can be detrimental to performance. For example, a better choice could be made following the ideas of Ref.48. However, the general scaling of the magnitudes of the coefficients with the sequence length will still remain.
We evaluate which model most effectively maps onto current-generation quantum annealers based on the scaling of various relevant properties of the models. These properties include the required number of logical qubits, the density of the QUBO matrix, the average number of required couplers per qubit, and the minimal required coupler resolution. The results are presented in Fig. 2. Due to a steep increase in the computational time, we only consider the turn-based Cartesian model up to approximately 16 amino acids. Beyond this sequence length we find that the generation of the QUBO matrix, especially the reduction to a 2-local model, takes too long to be considered feasible. We find that for execution on a 2-local quantum annealer this effect alone could preclude any possible quantum advantage for the turn-based Cartesian model.
Fig. 2.
Scaling of the considered metrics. We show the required number of qubits when reducing the problem to a 2-local QUBO (top left), the density of the QUBO matrix (top right), the average required number of qubit-qubit couplings per qubit (bottom left), and the maximum coupling strength divided by the minimum coupling strength (bottom right). All results depict the model metrics before embedding onto a given hardware graph. The stepwise increases in the coordinate-based models indicate points at which the grid size was adjusted. We investigated the turn-based Cartesian model only up to a sequence length of 16 amino acids, since the reduction to a 2-local model became too time-consuming for larger sequences.
The coordinate-based model is defined on a finite grid of lattice sites (see Appendix A (Coordinate-based model)), requiring the grid size to be specified prior to generating the QUBO matrix. For simplicity, we limit our analysis to symmetric grids with equal side lengths L. However, in certain scenarios, asymmetric grids may be more advantageous. Further, to maximize resource efficiency, we restrict our analysis to the minimal lattice size. Since the size of the native fold (i.e., the minimum number of lattice sites needed to accommodate it) is not known a priori, we start with the smallest lattice capable of supporting the entire sequence. As this is often too restrictive, we increment the grid length L by 1 to provide additional degrees of freedom; the resulting minimal grid sizes differ between the Cartesian and the tetrahedral lattice.
For the considered parameters, we find a roughly equivalent scaling in the number of qubits for three out of the four models, with the turn-based model on the Cartesian grid being the outlier. Conversely, the turn-based model on the tetrahedral grid performs surprisingly well, even considering the additional resources required for the mapping of higher-order terms to 2-local terms.
The density of the QUBO matrix relates to the number of qubit-qubit interactions required, relative to the maximum possible number of interactions. Generally, it is conjectured that QA performs best for QUBOs with low density16. Our findings show that, across all considered models apart from the turn-based model on the Cartesian grid, the density decreases as the number of amino acids increases. Additionally, the data reveal that the turn-based model on the tetrahedral grid yields the sparsest QUBO matrix, making it potentially more suitable for quantum annealing.

Unlike the QUBO density, the average number of couplers per qubit directly reflects the connectivity a device needs to host the models without embedding. For every model studied, this value increases with sequence length, indicating a corresponding rise in embedding overhead.
Finally, we investigate the required coupler resolution for each of the models, which is given by the absolute value of the quotient of the largest programmable coupler strength and the lowest non-zero coupler strength. We find that, while the resolution is constant for the coordinate-based models, the resolution needs to be increasingly high for the turn-based models. As already studied in Ref.21, this effect is mostly induced by the reduction to a 2-local model, which increases the couplings due to the multiplication of penalty terms.
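For a given QUBO matrix this resolution metric is straightforward to compute (a sketch; the function name is ours):

```python
import numpy as np

def coupler_resolution(Q):
    """Ratio of the largest to the smallest non-zero coefficient
    magnitude in a QUBO matrix, i.e. the coupler resolution a device
    would need to program the problem faithfully."""
    c = np.abs(Q[np.nonzero(Q)])
    return float(c.max() / c.min())
```

A growing ratio means the weakest physical couplings must be resolved against ever larger penalty couplings, which quickly exceeds the finite programming precision of analog hardware.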
Spin overlap distributions
To determine if the given problems are suitable for quantum annealers, we estimate the distribution of the order parameter $q$, the spin overlap distribution (SOD) $P(q)$, for each problem formulation. To improve the simulation, we chose the penalty strengths for the turn-based models lower than in the scaling analysis. We provide details on the chosen penalties in Appendix A (Tetrahedral lattice). For the considered coordinate-based models we chose a constant grid size for the Cartesian grid as well as for the tetrahedral grid. The SOD is estimated using parallel tempering as described by Katzgraber et al.46. We calculate the spin overlap from two parallel runs of parallel tempering, with 400 different temperature instances distributed geometrically between a minimum and a maximum temperature (see Appendix C (Parallel Tempering) for the concrete values). The overlap distribution $P(q)$ is estimated by computing the spin overlap over a large number of Monte Carlo sweeps, performed after an initial thermalization period. This thermalization period ensures convergence to a local minimum for the lowest-temperature instances.

The spin overlap is then extracted from the replicas corresponding to the lowest temperature of both instances. More details of the PT parameter choices for all simulations are provided in Appendix C (Parallel Tempering).
To see how the SOD evolves with growing sequence length, we investigate sections of increasing length of the M-Ras protein (PDB ID: 9C1A) at three discrete lengths of 10, 16 and 22 amino acids. Since the turn-based model on the Cartesian grid required a substantially larger amount of resources, both in compute time and in QUBO matrix size, we restrict the SODs for this model to 6, 8 and 10 amino acids. The results of the estimated spin overlap distributions are presented in Fig. 3. As highlighted by the data, the overlap distributions take vastly different forms for the considered models, even though they encode the same protein.
Fig. 3.
Spin overlap distribution for sections of increasing sequence length of the 189 amino acid protein M-RAS. The choice of protein is arbitrary and serves as a mere indication of the model differences for the same protein. The results show the SOD for the coordinate-based model on the tetrahedral (top left panel) and Cartesian grid (top right panel), as well as the turn-based models on the tetrahedral (bottom left panel) and Cartesian grid (bottom right panel). Areas highlighted in red indicate the range of thick barriers, where no quantum speedup due to quantum tunneling is expected46.
We briefly analyze the measured SODs for the various models. All SODs lie predominantly in the regime of positive overlap. This results from the fact that, for the lowest-temperature replica, the system relaxes into a local minimum that satisfies the penalty terms of the original formulation. Depending on the structure of the problem, the overlap between two configurations that obey the constraints is generally larger. As shown in Fig. 3, the coordinate-based encodings produce a sharply peaked, discrete SOD. This arises from their one-hot encoded structure, where each solution vector is partitioned into blocks in which exactly one spin is in the up state while the rest are in the down state. The overlap between two such blocks can therefore take only a few discrete values, corresponding to either all spins aligned or at most two spins counter-aligned, yielding the observed discrete spikes.
The turn-based encodings display a broader, less-structured SOD. Although the turn variables also discretize the landscape, additional qubits (like the interaction qubits or those introduced by Rosenberg’s polynomial) are subject to less structured penalties. As a result of these additional degrees of freedom, the overlap spectrum flattens into the diffuse profile observed.
Following the definitions of Ref. 46, we evaluate the hardness of instances by considering the distribution of measured peaks in the SOD: instances with all peaks in the regime q > 0.5 are classified as instances with thin barriers, and instances with peaks outside this regime as instances with thick barriers. The region of thick barriers is indicated by the red shaded area.
The measured SODs for all but the turn-based model on the Cartesian grid lie in the regime of thin barriers, with most peaks located close to q = 1. It is important to note that this effect stems in part from the fact that we chose a denser encoding for this model, as explained in Appendix A (Cartesian lattice). The coordinate-based models clearly exhibit the most rugged energy landscape, as indicated by the closely spaced peaks. This suggests that the coordinate-based formulation is better suited for leveraging quantum advantage through tunneling effects than the turn-based models.
Embeddings
A further restriction of currently available quantum annealers is the limited connectivity of the qubits. In physical systems, not all 2-local interactions J_ij can be set, because some qubits do not share a physical coupling. To solve problems requiring interactions between qubits that are not adjacent in the hardware connection graph, an additional step called minor-embedding must be employed49. The (minor-)embedding process finds a mapping from a given problem graph to the hardware graph by allowing for the contraction and removal of edges of the hardware graph until it matches the problem graph. While this allows denser problems to be solved, it comes at the cost of an increased number of qubits, as a chain of several physical qubits encodes a single logical qubit. To ensure that all qubits in a chain are in the same state, they are coupled ferromagnetically with a tunable chain strength. The choice of this chain strength can have a large impact on solver performance.
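The role of the chain strength can be made concrete with a toy model (our own construction, not the paper's): a logical spin is embedded as a two-qubit chain that is frustrated by conflicting couplings to a neighboring spin; if the ferromagnetic chain coupling is too weak, the chain "breaks" in the ground state.

```python
# Toy example: logical qubit A embedded as two physical spins (a1, a2),
# each coupled with opposite sign (strength 2) to a neighboring spin fixed
# at b = +1, plus a ferromagnetic chain term -j_chain * a1 * a2. We
# enumerate all states and check whether the minimum-energy state keeps
# the chain intact (a1 == a2).
from itertools import product

def embedded_energy(a1, a2, j_chain):
    b = 1
    return 2 * a1 * b - 2 * a2 * b - j_chain * a1 * a2

def chain_intact_in_ground_state(j_chain):
    states = list(product([-1, 1], repeat=2))
    a1, a2 = min(states, key=lambda s: embedded_energy(s[0], s[1], j_chain))
    return a1 == a2

# Too weak a chain strength lets the chain break in the ground state;
# a sufficiently strong one keeps the two physical qubits aligned.
print(chain_intact_in_ground_state(1.0))  # False: chain broken
print(chain_intact_in_ground_state(3.0))  # True: chain intact
```

This is why the chain strength must be tuned: strong enough to keep chains intact, but not so strong that it dwarfs the problem couplings and degrades performance.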
Finding a graph minor is NP-hard when the goal is to minimize the number of nodes50. Consequently, practical applications rely on heuristics such as D-Wave’s minor-embedding algorithm, MinorMiner51.
A key advantage of the considered models is their uniform structure across all proteins, with only the QUBO coefficients varying. This enables the reuse of embeddings, allowing a single efficient embedding to be applied to all proteins of the same size.
To investigate the effect of the embedding on the different formulations, we generate 1000 embeddings for each protein size using D-Wave's MinorMiner for the Advantage 2 prototype, an annealer with roughly 1200 qubits based on the so-called Zephyr topology52,53.
Due to the device restrictions, we focus the analysis on shorter sequences, ranging from 6 to 9 amino acids for the tetrahedral grid and 4 to 7 amino acids for the Cartesian grid. We specifically chose this range as 4 (6) is the minimal sequence length to establish a nearest-neighbor contact between two amino acids on the Cartesian (tetrahedral) grid. All data regarding the coordinate-based models are taken with respect to the minimal grid that supports the native fold, as we found that after increasing the grid size we were not able to find a valid embedding.
Due to the steep resource costs, we omit the turn-based model on the Cartesian grid from the embedding analysis. The scaling of the embeddings for the Advantage 2 prototype is presented in Fig. 4. As shown, the embedding greatly increases the resource cost for all models. Generally, we found that sparser models require fewer physical qubits after the embedding.
Fig. 4.
Required number of qubits after the embedding for different sequence lengths and models. The data was taken over 1000 calculated embeddings; the error bars indicate best- and worst-case instances. Even for these short sequence lengths, the embeddings can vary by more than a hundred qubits.
The error bars indicate the range between the worst- and best-case instances in the number of qubits. Additional information regarding the distribution of the embeddings is presented in Appendix C (Quantum Annealing).
A further problem of the minor-embedding process is that the embedded problem is typically harder to solve than the direct problem, owing to the larger number of required qubits. To investigate whether the embedding affects the SOD, and thus the ruggedness of the free-energy landscape, we embed a protein consisting of 7 amino acids using the coordinate-based as well as the turn-based encoding on the Zephyr graph53 as an exemplary test case. We chose the chain strength as half the largest absolute value of the QUBO matrix. We found that this choice conserves the ground state energies while leading to improved performance compared with unnecessarily large chain strengths.
The results are presented in Fig. 5 and show the influence of the embedding on the SOD P(q) for the coordinate as well as turn-based models on the tetrahedral grid. Our findings highlight a general broadening of the measured SOD compared to the original problem. The increase in degrees of freedom seems to affect the SOD which can in some cases shift the model to the area associated with thicker energy barriers.
Fig. 5.
Influence of the embedding on the spin overlap for the models on the tetrahedral grid for an example protein of sequence length 7. The dashed line indicates a spin overlap of 0.5 as specified in Ref. 46. As shown, the embedding process increases the thickness of the energy barriers for the turn-based model; for the coordinate-based model this effect is less pronounced.
Discussion
To conclude this section, we discuss the obtained results and evaluate whether the problem, in its current form, is suitable for a quantum annealing approach using D-Wave’s sparsely connected hardware. During the tests, we thoroughly investigated the proposed models beyond the regimes in which they were initially tested. We identified several flaws in the models that may prohibit their use with quantum annealers. Below, we provide a brief review of the main drawbacks of each of the tested models.
Turn-based Cartesian: Throughout our analysis, we found that the turn-based model on the Cartesian grid performed the worst across nearly all considered metrics. Mapping the model to a 2-local Hamiltonian requires a large number of auxiliary qubits and results in a dense QUBO matrix, further increasing the qubit count in the embedding. Additionally, the required coupler resolution grows with problem size, spanning several orders of magnitude. As highlighted in Ref. 21, this large resolution is a consequence of reducing the 8-local model to a 2-local one. The drawback of the required resolution is twofold. First, classical (temperature-based) optimizers often struggle to traverse steep energy barriers. While this issue can be mitigated by selecting sufficiently high temperatures, other terms (such as the MJ interaction energies) have much lower magnitudes, so the height of these barriers becomes significant only in the later stages, when the temperature is sufficiently low, making it difficult to explore new folds while also optimizing their energy.
The second drawback arises from the fact that couplers in a D-Wave device are affected by integrated control errors (ICE): a coupler J_ij can only be set up to some integrated error δJ_ij. Such errors can significantly degrade the performance of the quantum annealing approach, an effect known as J-chaos54. Especially for problems that require a resolution beyond the magnitude of these errors, they can be detrimental to performance.
Turn-based tetrahedral: We found that the turn-based tetrahedral model performs surprisingly well across all considered metrics. Although originally proposed for gate-based quantum computers, the derived 2-local model results in qubit counts comparable to the natively 2-local coordinate-based models, while being considerably sparser. However, due to the necessary scaling of the penalty terms, the model shares the drawback of requiring a high coupler resolution, which can limit performance on both quantum annealers and classical temperature-based solvers.
Given the vastly different performance, let us summarize the improvements of the turn-based tetrahedral model over the turn-based Cartesian model. For a detailed discussion of the models, refer to Appendix A. First, the grid change reduces the number of possible turns from six directions on the Cartesian grid to only four on the tetrahedral grid. Consequently, the number of conformation qubits per chain length N changes from 6N (sparse encoding) or 3N (dense encoding) to 4N (sparse encoding) or 2N (dense encoding). In addition, the grid structure is considerably sparser, so fewer possible amino acid interactions have to be considered on the tetrahedral grid compared with the Cartesian grid. Apart from these grid-related advantages, improvements also arise from the penalty terms used in Ref. 20 to exclude overlapping folds. In the Cartesian model from Refs. 21, 22, for each pair of beads j and k that could overlap, a (squared) distance function D(j, k) is introduced to ensure that only configurations with D(j, k) ≥ 1 are feasible. The construction of these penalty terms is non-trivial and requires auxiliary qubits (slack variables) to transform the inequality into an equality (see Appendix A, Cartesian lattice).
In contrast, the tetrahedral model penalizes overlaps only in the direct vicinity of possible nearest-neighbor interactions. This local formulation allows the overlap constraints to be absorbed into the energy function, avoiding the need for additional slack-variable qubits. Lastly, the turn-based Cartesian model generally has higher locality than the turn-based tetrahedral model (see Appendix A, Tetrahedral lattice, and Sec. Methods). Since higher locality typically increases the qubit overhead via the reduction to a 2-local model, the reduced locality of the turn-based tetrahedral approach contributes to its improved performance relative to the turn-based Cartesian one. Together, these effects explain the superior performance of the tetrahedral model over the Cartesian formulation.
Even though the model presents considerable improvements over the Cartesian grid, we would like to highlight one issue: the model fails to adequately penalize overlaps, which can result in unphysical solutions within the feasible solution space. In some instances this includes the ground state, leading to wrong folds. The root cause lies in the mathematical structure of the encoding. Part of the model's better performance comes from its treatment of amino acid overlaps: it incorporates the overlap penalty into the interaction energy function, meaning that overlaps are only penalized near an interaction (see Appendix A, Tetrahedral lattice)
E_int(i, j) = q_ij [ ω_ij + λ_1 (D(i, j) − 1) + λ_2 ( (2 − D(i−1, j)) + (2 − D(i+1, j)) + (2 − D(i, j−1)) + (2 − D(i, j+1)) ) ],    (14)

where D(i, j) is the distance function between beads i and j, the first Lagrangian multiplier λ_1 ensures that the interacting beads are nearest neighbors, and the second multiplier λ_2 ensures that the neighboring beads are at distance 2 on the grid. In this formulation, the overlap is penalized only when two amino acids are close to a contact, and it is not penalized otherwise. While this approach scales much better than penalizing all possible overlaps, it has a significant drawback: the penalty is controlled by the interaction qubit q_ij. By turning off the interaction qubit (i.e., setting q_ij = 0), the penalty can be avoided entirely. Hence, by “sacrificing” one interaction energy ω_ij, the peptide chain can overlap. In most cases this is not an issue, as it is typically more energetically favorable to find a configuration in which the interaction energy is utilized. However, in some configurations it may be more advantageous for the chain to self-cross and establish a better interaction later in the sequence. We demonstrate the consequence of this on a minimal artificial example in Fig. 6.
Coordinate-based Cartesian/tetrahedral: In the scaling analysis of the different models, we found that the coordinate-based model performed better than the turn-based ones. The coordinate-based approach appears to be the most promising for quantum annealing. The natively 2-local problem formulation enables an efficient representation on current-generation quantum annealers and does not require additional qubits for locality reduction. Apart from requiring more qubits, the locality reduction also increases the strength of the penalty terms, which in turn demands higher coupler resolutions, something the coordinate-based approach avoids entirely. Finally, at the current stage of hardware development, dense models are more costly to embed. Because the tetrahedral grid yields a sparser interaction matrix and allows smaller grids, the coordinate-based approach on the tetrahedral grid stands out as the most promising for current and near-term quantum annealers.
Even though the coordinate-based approach appears to be the most promising, we found that the proposed models are still too dense to be efficiently embedded onto the annealer topology for peptide sizes beyond a proof-of-principle calculation of roughly 5–20 amino acids. Moreover, although the QUBO matrix becomes sparser as sequence length or lattice size increases, the number of required couplings per qubit rises, indicating that embeddings become more complex for longer chains. Since minor-embedding remains the principal computational bottleneck, these results show that future quantum annealers must offer a hardware graph with much higher connectivity for embedding these models to be feasible.
In summary, we find that none of the models currently appears suitable for large-scale implementation on quantum annealers, although the coordinate-based models are the more promising ones. Each proposed model is limited, either by an overly dense QUBO matrix or by scaling issues, such as the qubit connectivity required for longer peptide chains or the large required coupler resolution.
Quantum annealing vs. simulated annealing
We now turn our attention to a performance comparison for the four different models using simulated annealing and, due to limited access to the D-Wave hardware, compare the scaling of quantum annealing for the most promising model (coordinate-based on a tetrahedral grid) with simulated annealing.
Dataset
To perform the benchmark, we generate 100 random instances of proteins with sequence lengths of 10 residues, uniformly sampled from the 20 naturally occurring amino acids. To compare the scaling, we consider subsections of increasing length. In contrast to Ref. 31, we generate the sequences randomly without post-selecting those with a unique energy minimum. We make this choice because we do not intend to capture the expected behavior of real proteins; we merely wish to compare the performance of the different formulations.
The estimation of the time-to-solution requires the ground state energy of each protein. The energy was determined via our implementation of the parallel tempering (PT) algorithm. While parallel tempering is itself a heuristic, it is extremely unlikely that lower-energy states exist, given its fast convergence for these small problem instances. All PT simulations were performed with 400 temperatures for an increasing number of sweeps; for most instances, no new best configurations were found well before the largest number of sweeps was reached.
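The core of the parallel tempering step can be sketched as follows. This is a minimal stand-alone illustration of the standard Metropolis replica-exchange criterion, not the in-house implementation used in the paper.

```python
# Minimal sketch (assuming the standard Metropolis swap criterion) of the
# replica-exchange step used in parallel tempering: two replicas at inverse
# temperatures beta_i, beta_j with energies e_i, e_j swap configurations
# with probability min(1, exp((beta_i - beta_j) * (e_i - e_j))).
import math
import random

def swap_accepted(beta_i, beta_j, e_i, e_j, rng=random.random):
    delta = (beta_i - beta_j) * (e_i - e_j)
    return delta >= 0 or rng() < math.exp(delta)

# A cold replica (large beta) holding a high-energy state always swaps
# with a hotter replica (small beta) holding a lower-energy state.
assert swap_accepted(beta_i=1.0, beta_j=0.1, e_i=5.0, e_j=-3.0) is True
```

Swaps of this kind let low-energy configurations migrate toward the coldest replica, which is why PT converges quickly on the small instances considered here.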
Time-To-Solution metric
With the dataset defined, we shift our focus to the performance of the models using a set of selected solvers. A comparison between models is possible if they use the same lattice structure: although the models differ in formulation, they encode the same problem and thus share the same ground state energy. We benchmark the problems using a well-known performance metric for comparing quantum annealing with other heuristic solvers, the time-to-solution (TTS). The TTS is the expected time the algorithm requires to find the ground state with a selected target probability, usually chosen to be p = 0.99. It is calculated by multiplying the average runtime τ of a single iteration of the algorithm by the expected number of runs R:

TTS = τ · R,    R = ln(1 − p) / ln(1 − p_s),    (15)

where p_s is the measured probability that a single run finds the ground state.
As noted in several works46,55, the TTS suffers from one major drawback. There is generally a trade-off between increasing the probability of finding the ground state by extending the search time and increasing the total number of runs while using shorter individual run times. An observed scaling advantage can therefore be misleading if the per-run success probability is too high for a given problem. To alleviate this issue, the TTS must be optimized for each data point.
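The trade-off can be made concrete with a short sketch of the standard TTS formula; the runtime/success-probability pairs below are made up purely for illustration.

```python
# Sketch of the standard time-to-solution formula:
# TTS(tau) = tau * ln(1 - p_target) / ln(1 - p_s(tau)),
# where p_s is the per-run success probability measured at run time tau
# and p_target = 0.99 is the conventional target probability.
import math

def tts(tau, p_success, p_target=0.99):
    if p_success >= p_target:
        return tau  # a single run already suffices
    return tau * math.log(1 - p_target) / math.log(1 - p_success)

# The trade-off: a long run with high success probability is not
# necessarily better than many short runs, so TTS must be minimized
# over the run time. The values below are hypothetical.
candidates = [(1.0, 0.05), (10.0, 0.30), (100.0, 0.90)]
best = min(tts(tau, p) for tau, p in candidates)
```

In this made-up example, the many-short-runs strategy (tau = 1, p_s = 0.05) yields the smallest TTS, illustrating why an unoptimized run time can make scaling comparisons misleading.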
Simulated annealing
As a classical baseline, we investigate the scaling of the models using our in-house GPU-accelerated simulated annealing (SA) implementation. To this end, we compare the performance on the generated data set between the turn-based and coordinate-based models. Supplementary information regarding the optimized cooling rate can be found in Appendix C (Simulated Annealing).
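For reference, the single-spin-flip Metropolis loop at the heart of simulated annealing on a QUBO can be sketched as follows. This is a deliberately simple CPU toy (our own, with a geometric cooling schedule), not the GPU-accelerated in-house solver benchmarked in the paper.

```python
# CPU sketch of simulated annealing on a QUBO with single-bit-flip
# Metropolis updates and a geometric cooling schedule.
import math
import random

def sa_qubo(Q, n, sweeps=200, t_start=5.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]

    def energy(x):
        # Upper-triangular QUBO: E(x) = sum_{i<=j} Q[i,j] * x_i * x_j
        return sum(Q.get((i, j), 0.0) * x[i] * x[j]
                   for i in range(n) for j in range(i, n))

    cooling = (t_end / t_start) ** (1.0 / max(sweeps - 1, 1))
    t = t_start
    e = energy(x)
    for _ in range(sweeps):
        for i in range(n):
            x[i] ^= 1                       # propose: flip one bit
            e_new = energy(x)
            if e_new <= e or rng.random() < math.exp((e - e_new) / t):
                e = e_new                   # accept
            else:
                x[i] ^= 1                   # reject: flip back
        t *= cooling
    return x, e

# Toy QUBO: minimize x0 + x1 - 3*x0*x1 (ground state x = [1, 1], E = -1).
Q = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): -3.0}
best_x, best_e = sa_qubo(Q, n=2)
```

The GPU implementation parallelizes many such runs; the sketch only shows the logic of a single run.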
The performance results for simulated annealing are presented in Fig. 7. Note that for the runtime τ of a simulated annealing run, we only account for the time required for sampling and neglect additional timings (e.g., for the graph coloring), as these are negligible compared to the runtime of the SA heuristic itself. For visual clarity, we display the results for tetrahedral grids in the right panel and for Cartesian grids in the left panel. As described in Sec. Methods, reducing the problem to 2-local interactions with the chosen penalty strength has a substantial impact on algorithm performance. In contrast to the scaling analysis in Sec. Model scaling, here we study the turn-based tetrahedral model with a heuristically fine-tuned penalty strength. Since we were not able to find a comparable fine-tuning for the turn-based Cartesian model, we relied on the methods of Ref. 21 for near-optimal penalty strengths. The specific parameter choices and penalty strengths are reported in Appendix A at the end of each section.
Fig. 7.
TTS scaling of the proposed models under simulated annealing. The data is taken over 100 randomly generated amino acid sequences. Results are shown for the Cartesian grid (left panel) and the tetrahedral grid (right panel). Since the lattice size of the coordinate-based models can be chosen freely, they were evaluated on the minimal lattice size for which the ground state still fits on the grid, as well as on one size above it.
To investigate the effect of the underlying lattice size of the coordinate-based models, we consider two different lattice sizes for both grids. Somewhat unsurprisingly, we find that a larger grid results in a roughly constant offset in the TTS, making the problem more difficult to solve without changing the expected scaling.
The data indicates that the coordinate-based approach outperforms the turn-based approach for the TTS. Contrary to our expectation, this trend also holds for the turn-based model on the tetrahedral grid, even though it requires fewer qubits and has a less dense QUBO matrix. The most likely explanation for this effect is the significant disparity in the magnitudes of the QUBO matrix elements. At higher temperatures, the algorithm can easily traverse the energy barriers associated with the constraints. However, in this regime the temperature is too high for the interaction energies to play a crucial role in the folding process. Our findings demonstrate that, in addition to resource requirements, the overall structure of the model exerts a significant influence on its performance.
Quantum annealing
In the previous subsection, we analyzed the scaling of the proposed models under the classical simulated annealing algorithm. Here, we shift our focus to quantum annealing, specifically the scaling behavior of two generations of D-Wave quantum annealers: the Advantage 1 and the Advantage 2 prototype. As previously mentioned, the limited connectivity of quantum annealers requires embedding the problem onto the hardware graph. The two systems differ in their underlying connectivity, with the Advantage 1 using the Pegasus and the Advantage 2 prototype using the Zephyr architecture. To account for these differences, 1000 separate embeddings were computed per sequence length for each architecture. As discussed in Sec. Embeddings, embeddings can be reused; therefore, for all peptides of a given sequence length N, the embedding with the minimal number of qubits was selected.
To determine the optimal TTS for both systems, we performed an annealing-time sweep for each device, as we did not find substantial improvements beyond the ranges considered. For both devices, the TTS decreases steeply with increasing annealing time before it plateaus. The optimal annealing times found for the two devices differ, which explains the order-of-magnitude advantage. Additional details on the optimal annealing times are provided in Appendix C (Quantum Annealing). Further details on the embeddings, including an analysis of the obtained chain length distributions and measured chain break frequencies, are provided in Appendix C (Chain break analysis). We want to highlight that for the results in Fig. 9 we did not correct any chain breaks. In Appendix C (Chain break analysis) we quantify how much the TTS improves when majority voting is used to correct broken chains; we found the effect to be negligible, in particular for the scaling analysis.
Fig. 9.
Left panel: Scaling comparison of quantum annealing and simulated annealing. The blue curve shows the data obtained from simulated annealing on the problem before embedding it onto the annealer. The red curve shows the solution obtained from the quantum annealer embedded on the Zephyr hardware graph. The green curve shows the results of simulated annealing on the embedded problem. Right panel: TTS scaling for the top 5%, bottom 10% and median percentiles. For the considered data, GPU-parallelized SA outperforms QA by several orders of magnitude. When comparing performance on the embedded problem, QA seems to outperform our in-house implementation of SA.
Figure 8 illustrates the TTS scaling for both devices, focusing on the most promising model identified, the coordinate-based model on the tetrahedral grid, for the sequence lengths considered in this work. Both quantum annealers successfully solved all problem instances. Notably, the Advantage 2 prototype outperformed the Advantage 1 by roughly an order of magnitude, underscoring the performance improvements between hardware generations. However, it remains unclear whether this improvement is primarily due to the enhanced hardware connectivity, since the embeddings differ significantly in qubit requirements, or due to the reduction in error rates. Nevertheless, these results demonstrate the potential for further TTS reductions through future hardware advancements.
Fig. 8.
TTS scaling for the two tested quantum annealers: Advantage 1 (left panel) and Advantage 2 prototype (right panel). The data shows the expected TTS for the coordinate-based model on the tetrahedral grid. Results indicate that the Advantage 2 prototype achieves approximately an order of magnitude improvement over the Advantage 1.
Comparison
Finally, we conclude with a direct performance comparison of QA and SA on the chosen model. The results for all considered sequences are presented in Fig. 9 as a violin plot with additional information regarding the 90th, 5th, and median percentiles. We find that our GPU-parallelized implementation of SA significantly outperforms QA, with the performance offset being approximately proportional to the parallelization factor of 432.
While our analysis of the SOD in section Spin overlap distributions showed that the models lie in a regime where quantum annealing due to tunneling could be advantageous, we did not observe a direct scaling advantage of quantum annealing over simulated annealing. However, it is important to note that the quantum annealer solves the problem after the embedding, which can be considerably harder to solve than the direct problem.
To further assess the impact of the embedding, we also evaluate SA performance on the embedded problem, using the same QUBO matrix that was solved by QA. The simulation results reveal that when the annealer solves the exact same problem as our SA implementation, QA outperforms our implementation of SA, yielding faster solutions even when accounting for the parallelization speedup, as well as a possible scaling advantage for the sequence lengths considered in this work.
Whether QA can achieve a speedup on the problem before the embedding remains to be seen in the future. It is important to note that our results merely serve as an indication for the potential of quantum annealing and are by no means a rigorous scaling analysis. We finally address important caveats that need to be considered.
First, our results compare “off-the-shelf” versions of quantum and simulated annealing. This means that the tested algorithms do not utilize any prior knowledge of the problem, such as leveraging the one-hot encoding structure for the placement of the amino acids, which could drastically speed up the computation56. Second, we did not consider any improvements to quantum annealing such as error correction schemes57 or the reverse annealing protocol58. Notably, Ref. 59 was able to identify a scaling advantage for some optimization problems using error correction protocols. This highlights that reduced error rates can further improve solution quality as well as scaling behavior.
Finally, our results are limited to very short peptide sequences. The considered sequence lengths are too short to draw reliable conclusions about the asymptotics. As sequence lengths grow beyond the current proof of concept, the exact scaling behavior remains uncertain. We expect the time-to-solution to increase approximately exponentially with sequence length, although a super-exponential increase cannot be ruled out. While this growth is detrimental in principle, Ref. 31 notes that many clinically relevant proteins lie in the 300–1000 residue range, providing an empirical upper bound on the problem sizes of interest. Even a modest quantum speedup could render such sequences accessible beyond what is achievable with classical computing.
Conclusion and outlook
We investigated and compared several of the proposed ab initio models to solve the coarse-grained protein folding problem on classical and quantum solvers. We evaluated these models in terms of their resource requirements, potential quantum advantage, and performance using simulated annealing and quantum annealing. Our scaling investigation reveals that the coordinate-based approach seems more favorable for implementation on a quantum annealer, whereas the turn-based approach is limited by the locality reduction.
By performing the benchmark, we identified several issues, the most critical being that the turn-based tetrahedral model from Robert et al.20 produces unphysical configurations in the solution space. We further identified a pressing bottleneck affecting all models: the number of required qubit-qubit couplings, which indicates how well a problem is suited for embedding onto an annealer's hardware graph, such as the Pegasus or Zephyr graph, increases with sequence length for all considered models, making it progressively more difficult to find embeddings as the number of amino acids grows. Another issue is the required coupler resolution of the turn-based models. As the sequence length increases, the ratio between the largest and smallest coupling strength slowly grows; for larger sequences, this will necessitate an increasingly high coupler resolution, which is not supported by current-generation devices.
Additionally, we examined whether the proposed models are amenable to quantum speedup from tunneling by analyzing the spin overlap distribution, which serves as a proxy for the complexity of the free energy landscape. Our findings reveal that, to a large extent, the energy landscape is shaped by the problem encoding, particularly the constraints enforcing the qubits to represent a valid fold. While all models apart from the turn-based model on the Cartesian grid appear to operate in a regime where quantum speedup through the quantum tunneling effect is possible, we also observed that the embedding can significantly impact the spin overlap distribution.
Finally, we calculated the time-to-solution of simulated annealing for all models and compared with quantum annealing for the most promising one, the coordinate-based tetrahedral model. In terms of scaling of SA, the coordinate-based model outperformed the turn-based models when expressed as QUBO problems. However, this advantage could shift in favor of turn-based models when formulated as HUBO problems. Our results show that simulated annealing and quantum annealing exhibit similar scaling behavior, but our GPU-parallelized implementation of simulated annealing outperforms quantum annealing by several orders of magnitude. Nevertheless, quantum annealing could, in principle, also be parallelized. When comparing performance on the same problem, specifically the version embedded onto the quantum annealer, quantum annealing appears to scale better than our implementation of simulated annealing.
These findings indicate that, although there is currently no clear quantum advantage, quantum annealing could, in principle, achieve faster time-to-solutions than simulated annealing if the hardware can be improved, offering lower error rates and higher qubit connectivity.
Acknowledgements
We are grateful to Johannes Mueller-Roemer and Paul Haubenwallner for helpful discussions and comments on the manuscript. Furthermore, we would like to thank Philipp Quoss for valuable assistance with the implementation, which contributed to the technical aspects of this work.
Appendix A: Models
In this appendix, we give a brief review of the PSP models considered in this work. The review here is by no means meant to be exhaustive. Additional details on the models can be found in the respective publications, which we cite at the start of each section. Further, we adjust some of the models to make them either more comparable or to reduce the number of qubits required while mapping to a 2-local problem. While we aim to solve each model on a quantum annealer, it is important to note that some of these models were developed to be tackled with a gate-based quantum computer and might thus not perform optimally on a QA device.
Throughout this appendix, we will introduce the models in Boolean space. Although the variables are generic Boolean variables, we will refer to them as qubits q.
Turn-based models
Turn-based models encode the configuration of a protein relative to the origin of a coordinate system: the positions of the beads follow from the set of turns the polypeptide chain has taken. To prohibit unphysical configurations, such as the chain folding back on itself or beads occupying the same lattice position, additional penalty terms have to be added. The main advantage of turn-based models over coordinate-based ones is that the configuration can be stored in a number of qubits linear in the chain length. However, to model interactions and formulate the penalties, additional variables need to be introduced.
Cartesian lattice
We start this section by presenting one of the first turn-based models, which was introduced in Ref. 19 and refined in Refs. 21,22. Because the locality of the resulting QUBO matrix is bounded, a comparatively efficient embedding on current QA devices can be performed. The derivation in this appendix closely follows Ref. 22, where the model has been considered on a three-dimensional Cartesian grid.
Turn-based models encode the folding of a protein as a self-avoiding walk by encoding the direction of a turn the amino acid chain takes. For example, a peptide chain on a three-dimensional Cartesian grid can grow in six possible directions, which must be encoded into qubits.
This encoding can be performed in two ways: using either a dense or a sparse encoding. The dense encoding has the advantage of using fewer qubits, as the possible directions a turn can take are directly encoded in binary. Thus, encoding six possible spatial directions requires three (the ceiling of log2 of six)
qubits.
In this case the configuration of an amino acid is defined by the solution string
[Eq. 16 not reproduced]
Due to symmetry reasons, the first turn and most of the qubits of the second turn can be fixed. For mathematical simplicity, we label the fixed qubits as if they were not restricted.
In the sparse encoding each direction is one-hot encoded and requires as many qubits as there are directions for each turn. In this case the configuration of an amino acid is defined by the solution string
[Eq. 17 not reproduced]
In this work we consider the dense encoding, since it leads to favorable performance. The reason for this can be seen in Fig. 10: although the sparse encoding requires fewer qubits after the reduction to a 2-local model, it leads to a denser QUBO matrix and hence worse solver performance.
Fig. 10.
Scaling difference between the sparse and dense encodings. While the dense encoding leads to fewer qubits in the original problem, it requires more qubits after the reduction to a 2-local model.
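The qubit-count trade-off before the reduction to 2-local can be sketched as follows. This is our illustration, not the paper's resource formula: it assumes three qubits per turn for the dense encoding (the ceiling of log2 of six directions), six per turn for the sparse one, and N - 1 turns for an N-bead chain, ignoring the turns fixed by symmetry.

```python
import math

DIRECTIONS = 6  # +/-x, +/-y, +/-z on the Cartesian lattice

def dense_qubit_count(n_beads: int) -> int:
    """Dense (binary) encoding: ceil(log2(6)) = 3 qubits per turn."""
    return (n_beads - 1) * math.ceil(math.log2(DIRECTIONS))

def sparse_qubit_count(n_beads: int) -> int:
    """Sparse (one-hot) encoding: one qubit per direction per turn."""
    return (n_beads - 1) * DIRECTIONS
```

As the sketch shows, the sparse encoding uses twice as many qubits per turn in the original problem; the comparison reverses only after the 2-local reduction.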
Since it is not possible to directly infer information about the absolute positions of the amino acids, it is helpful to define turn-indicator functions. These Boolean functions evaluate the direction in which a specific turn along the peptide chain has been taken, and they evaluate to True if and only if the turn has been taken in the respective direction. In our case the indicators are given by
[Eq. 18 not reproduced]
where
evaluate to 0 (False) or 1 (True) and indicate whether turn j has been taken in the positive or negative x-, y-, or z-direction. Furthermore, additional turn indicators can be defined for the two qubit configurations that do not encode a valid turn
[Eq. 19 not reproduced]
To ensure that the configuration encodes a valid set of turns, an energy penalty is introduced
[Eq. 20 not reproduced]
which penalizes the two qubit states that do not encode a turn. To prohibit the peptide chain from folding back onto itself, an additional energy penalty is implemented using the turn indicators as follows:
[Eq. 21 not reproduced]
where
denotes the logical AND, which is mapped to binary multiplication in the QUBO formulation.
Apart from penalizing back folding, the main reason to introduce turn indicators is to allow the calculation of the absolute positions of the amino acids, each coordinate being given by the difference between the numbers of positive and negative turns along the corresponding axis. For example, the coordinates of the m-th amino acid are given by
[Eq. 22 not reproduced]
with the first amino acid occupying the origin (0, 0, 0).
From the positions we are able to calculate the distance between two amino acids, which we need to calculate the configuration energy. To avoid a square root in the calculation the squared distance
[Eq. 23 not reproduced]
is customarily used.
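The position bookkeeping of Eqs. 22 and 23 can be illustrated as follows, assuming the turns have already been decoded from the qubit string into explicit directions (the function and variable names are ours):

```python
# Illustrative sketch: accumulate decoded turns into absolute positions.
STEP = {"+x": (1, 0, 0), "-x": (-1, 0, 0),
        "+y": (0, 1, 0), "-y": (0, -1, 0),
        "+z": (0, 0, 1), "-z": (0, 0, -1)}

def bead_positions(turns):
    """Accumulate turns into positions; the first bead sits at the origin."""
    positions = [(0, 0, 0)]
    for turn in turns:
        x, y, z = positions[-1]
        dx, dy, dz = STEP[turn]
        positions.append((x + dx, y + dy, z + dz))
    return positions

def squared_distance(p, q):
    """Squared Euclidean distance, avoiding the square root."""
    return sum((a - b) ** 2 for a, b in zip(p, q))
```
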
To penalize nonphysical overlaps of the protein, we have to ensure that
for all possible pairs of beads (j, k). To include this inequality in the optimization problem, it is transformed into an equality via the introduction of slack variables. First, it is important to notice that
, that is, the maximum squared distance between two beads is bounded by the square of the number of turns separating them, which is attained when all turns are taken in the same direction.
To ensure that
, a slack variable
is introduced for any pair (i, j), with
[Eq. 24 not reproduced]
Using this definition of
it follows that for all possible distances
the equality
[Eq. 25 not reproduced]
can be fulfilled for a specific integer value of
.
The realization of
is achieved by introducing additional binary variables. The number of additional variables needed can be calculated from the number of bits required by the binary representation of the maximum possible distance:
[Eq. 26 not reproduced]
The second factor ensures that only additional variables are introduced if the beads are separated by an even number of turns, as beads separated by an odd number of turns cannot overlap by construction.
From this definition, the slack variable
can be defined as
[Eq. 27 not reproduced]
By construction, the slack variable
is bounded by
, in contrast to the desired relation
. Thus, to ensure that the equality can be satisfied, Eq. 25 needs to be restructured to
[Eq. 28 not reproduced]
Finally, this allows us to formulate the overlap penalty term
[Eq. 29 not reproduced]
where
is a positive constant. Since the penalty needs to be applied to each pair of beads that could possibly overlap, the full overlap Hamiltonian is given by:
[Eq. 30 not reproduced]
where the last factor ensures that an overlap penalty is applied only to beads that can possibly occupy the same lattice site.
Finally, the model needs to be able to assign correct interaction energies to adjacent amino acids. We consider only nearest-neighbor interactions, hence we wish to ensure that the interaction energy is only applied when beads are on adjacent lattice sites. To implement this, an additional interaction qubit
for each possible interaction is introduced. This qubit is in the state
if two amino acids interact and in the state
otherwise. The construction of the energy function is thus
[Eq. 31 not reproduced]
where
defines the interaction energy between the two amino acids. If the interaction energies
are chosen to be manifestly negative (as is the case for HP- and Miyazawa-Jernigan-type interactions), this formulation guarantees that for all distances
, the term becomes positive, so that by flipping the interaction qubit to the
state, no penalty is applied. This interaction term is then applied to all interacting amino acids
[Eq. 32 not reproduced]
It follows that the final Hamiltonian is given by
[Eq. 33 not reproduced]
For all calculations considered in this work, we choose
, since we found that these penalty strengths still led to correct results while being as small as possible. Apart from the resource estimates, the reduction to a 2-local form was performed using the methods of Ref. 48.
Tetrahedral lattice
We now present the turn-based model on the tetrahedral grid following the derivation in Ref. 20, where it was first introduced. As stated in the main text, this model can produce unphysical states, which in some instances include the ground state. We present one minimal example in Appendix B. In its original formulation the model incorporates next-nearest neighbor interactions as well as a side-chain component. To ensure comparability, we restrict this model to backbone folding and nearest-neighbor interactions. In contrast to the Cartesian model, we chose the sparser encoding introduced in Ref. 20.
We make this choice because, for this model, the sparser encoding leads to a lower number of qubits as well as sparser QUBO matrices, as presented in Fig. 11. In the sparser, one-hot encoding, each turn is represented by one qubit for each possible direction the polypeptide chain can take. For the tetrahedral model, this corresponds to four possible directions:
[Eq. 34 not reproduced]
Due to symmetry reasons, the first two turns can be fixed, leading to some resource reduction. To infer the positions of the amino acids, it is again practical to introduce turn indicators. For the chosen encoding, these indicators are given by
[Eq. 35 not reproduced]
With these turn indicators defined, it is possible to calculate the distance between any two beads i and j by counting the number of turns separating the beads along the chain
[Eq. 36 not reproduced]
where the factor of
keeps track of whether the turn originates from an even or odd lattice site. Finally, the total squared distance between two beads is obtained by summing the squared displacements over the four axes:
[Eq. 37 not reproduced]
With these definitions, the penalty functions of the model can be defined. Since we chose the sparser one-hot encoding, we need to ensure that the qubits will be in a state that encodes a turn. To achieve this, the first penalty term
is introduced:
[Eq. 38 not reproduced]
which ensures that only one of the qubits remains in the state
and penalizes all states in which more than one qubit of a one-hot block is in the
state.
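A common way to realize such a one-hot constraint is the squared-deviation penalty sketched below; this is our illustration, and the paper's exact penalty may differ in form.

```python
# Sketch of a one-hot penalty: (1 - sum q)^2 vanishes exactly when one qubit
# of the block is 1, and grows for empty or over-occupied blocks.
def one_hot_penalty(block, strength=1.0):
    """Penalty contribution of one block of one-hot qubits."""
    return strength * (1 - sum(block)) ** 2
```
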
Fig. 11.
Scaling difference between the sparse and dense encodings. While the dense encoding leads to fewer qubits in the original problem, it requires more qubits after the reduction to a 2-local model.
To prohibit unphysical configurations, i.e., two beads occupying the same lattice site, turns that lead to overlapping beads need to be penalized. One way two beads can overlap is back folding. To prevent two consecutive turns from growing in opposite directions, an additional growth-constraint penalty
[Eq. 39 not reproduced]
is added to restrict the growth of the chain to only include turns without back folding.
Finally, the interaction Hamiltonian, responsible for ranking the folds and for prohibiting overlaps that do not arise from back folding, is introduced. In the original model this Hamiltonian is decomposed as
[Eq. 40 not reproduced]
where the terms
correspond to n-th nearest neighbor interactions. In this work we restrict the investigation to only nearest neighbor interactions, hence
.
The nearest neighbor interaction between two beads applies the interaction energy
if and only if two beads are nearest neighbors on the lattice.
As in the turn-based model on the Cartesian grid, for each pair of beads i and j there exists one interaction qubit
, which is in the state
if and only if the two beads are in contact. To ensure that the energy is only applied if the beads are in contact, an additional penalty
is added. The penalty term
ensures that the interaction qubit is only in the state
if the term in the parentheses vanishes. Note that D(i, j) cannot be equal to 0 for two beads that are separated by an odd number of turns.
A further task of the interaction Hamiltonian is to penalize configurations in which two beads overlap. As stated in Ref. 20, an overlap can only occur in the vicinity of a nearest-neighbor interaction. Overlaps are therefore penalized as follows: if a contact between beads i and j is established, an additional penalty term ensures that the beads before and after bead i do not overlap with bead j, and that the beads neighboring bead j do not overlap with bead i
[Eq. 41 not reproduced]
with
[Eq. 42 not reproduced]
To ensure that the term in parentheses in Eq. 42 remains positive when the two beads i and j are not in contact, the penalty terms must be chosen appropriately. Since we do not consider side-chains in our implementation, it suffices to choose
. The full Hamiltonian of the system is then described by
[Eq. 43 not reproduced]
For the simulations considered, we choose
and scale the values of
as a global penalty strength
to improve the performance of the solvers. That is, we scale the coefficient size with the chain length, which we found to perform better than using constant factors. Namely, for the calculation of the SODs we select a scaling of
, which leads to correct penalization of configurations that violate penalty terms. For the TTS experiments we consider a scaling of
, which leads to smaller coefficients of the QUBO matrix but does not lead to a correct penalization of longer sequences. This scaling has the benefit that it allows us to approximately scale the parameter of Rosenberg’s polynomial with the same magnitude as the penalty terms
. We would like to stress that this improvement originates from the reduction to 2-local and we do not expect there to be any benefit of scaling the coefficients when working with the model in HUBO-form.
We found that the same procedure cannot be applied to the turn-based model on the Cartesian grid; hence we restrict this scaling to the tetrahedral model.
Coordinate-based model
Coordinate-based models describe the fold of a protein by finding a mapping of the positions of the individual amino acids onto bits/qubits. Since in a direct encoding the amino acids can take any position, unphysical configurations need to be eliminated from the feasible solution space via penalty terms. In the following, we give a brief summary of the coordinate-based models considered in this work.
Cartesian lattice
We review the coordinate-based model presented in Ref23. on a Cartesian (chessboard) lattice. In the model, an amino acid sequence
is placed on a lattice
consisting of either
sites in the 2-dimensional or
sites in the 3-dimensional case.
To encode a configuration, one qubit for each amino acid is introduced at each lattice site. The qubit is in the
-state if and only if the specific amino acid is placed at the specified lattice site. The total number of variables thus amounts to
for a 2-dimensional grid or
for a 3-dimensional grid.
By choosing the further simplification that, on the Cartesian lattice, amino acids with an even index are positioned at even lattice sites (and vice versa for odd amino acids), the total number of variables can be reduced to
(or
).
The encoding of the state of a fold then takes the following form
[Eq. 44 not reproduced]
where the first product considers the beads on even lattice sites and the second product considers the beads on the odd lattice sites and
refers to the total number of lattice sites.
Since the formulation allows for unphysical configurations, i.e., multiple occurrences of the same amino acid or multiple amino acids on the same lattice site, three additional penalty terms are added to the energy function 
[Eq. 45 not reproduced]
where
is the interaction energy of the amino acids, and the terms
are the three positive penalty terms with the factor
denoting the relative strength of the penalty.
Each of these three penalty terms ensures a different constraint that a physical configuration needs to fulfill. The first term
[Eq. 46 not reproduced]
penalizes each configuration where a bead is located on more or less than one lattice site. Here, the first sum runs over all amino acids in the chain whereas the second sum runs over all lattice sites in the lattice
. The second term
[Eq. 47 not reproduced]
is used to prohibit two different amino acids
and
from being placed on the same position. Finally, the third term
[Eq. 48 not reproduced]
is introduced to ensure that all amino acids lie on a chain. The last sum runs over all lattice sites s and
which are not nearest neighbors on the lattice
.
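The first two penalty terms can be illustrated on a toy bead-to-site assignment as follows; the variable names and unit prefactors are ours, following Eqs. 46 and 47 as we read them.

```python
# Toy sketch: x[f][s] in {0, 1} indicates that bead f occupies site s.
def placement_penalty(x):
    """Each bead must occupy exactly one lattice site."""
    return sum((1 - sum(row)) ** 2 for row in x)

def exclusion_penalty(x):
    """No two beads may share a lattice site."""
    total = 0
    n_beads, n_sites = len(x), len(x[0])
    for s in range(n_sites):
        for f1 in range(n_beads):
            for f2 in range(f1 + 1, n_beads):
                total += x[f1][s] * x[f2][s]
    return total
```

The third, chain-connectivity term follows the same pattern but additionally requires the neighbor structure of the lattice, so it is omitted from this sketch.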
Apart from the penalty terms, whenever two beads are nearest neighbors an interaction energy is applied
[Eq. 49 not reproduced]
One direct positive aspect of the model is that the locality of the interactions is bounded by 2. Further, the penalty strengths
do not scale with N, allowing for a direct implementation on a quantum annealer without having to account for hardware properties such as coupler resolution.
For all simulations considered we choose
,
,
, which is a heuristic choice inspired by the parameters chosen in Ref. 23, adapted to the 3-dimensional grid and the Miyazawa-Jernigan interaction matrix. We note that the results could likely be improved further by fine-tuning these parameters.
Tetrahedral lattice
To directly compare the model with the turn-based models presented earlier, we transition the coordinate-based model from a Cartesian lattice to a tetrahedral one. To this end, we propose a multi-grid implementation of the tetrahedral structure to adapt the coordinate-based model for this arrangement.
Specifically, we define two interleaved face-centered cubic grids, A and B, as illustrated in Fig. 12. Each lattice is parametrized by three coordinates
corresponding to the lattice vectors
[Eq. 50 not reproduced]
for sub-lattice A and the vectors
[Eq. 51 not reproduced]
for sub-lattice B. Note that the lattices are related by a shift of a quarter diagonal. From this, the coordinates of a bead can be derived as
[Eq. 52 not reproduced]
for the first and
[Eq. 53 not reproduced]
for the second sub-lattice. Taken together, these sub-lattices form the tetrahedral structure, with vertices alternating between the two grids.
Fig. 12.
Example image of the two lattices forming the tetrahedral grid. Even beads are placed on lattice A while odd beads are placed on lattice B.
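The interleaved-grid construction can be sketched as follows. The concrete lattice vectors here are assumptions (standard fcc primitive vectors plus a quarter-diagonal shift); the actual choice is fixed by Eqs. 50 to 53.

```python
# Sketch with assumed fcc lattice vectors; sub-lattice B is shifted by a
# quarter of the cube diagonal relative to sub-lattice A.
A_VECTORS = [(0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
B_SHIFT = (0.25, 0.25, 0.25)

def site_coordinates(n, sublattice="A"):
    """Map integer lattice coordinates n = (n1, n2, n3) to real space."""
    x = [sum(n[k] * A_VECTORS[k][d] for k in range(3)) for d in range(3)]
    if sublattice == "B":
        x = [xi + si for xi, si in zip(x, B_SHIFT)]
    return tuple(x)
```
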
With these definitions the PSP can be adapted from the coordinate-based model. The amino acid sequence is again split into even and odd beads with the even beads living on sub-lattice
, whereas the odd beads live on sub-lattice
. From this the penalty and interaction terms can be derived in a similar fashion as for the Cartesian grid:
[Eq. 54 not reproduced]
Using this lattice split, it is important to define when two amino acids are adjacent to one another. On the tetrahedral grid, each lattice site generally has four nearest neighbor sites. For our model, we specify that for a grid position on the sub-lattice A with coordinates
, the sites with coordinates
,
,
, and
are nearest neighbor sites.
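A neighbor rule consistent with such a diamond-type arrangement of two fcc grids can be sketched as follows; the specific index offsets are our convention and depend on the lattice-vector choice, so they should be read as an assumption rather than the paper's exact rule.

```python
# Sketch: the four B-sub-lattice sites adjacent to an A-site, expressed as
# integer offsets of the lattice coordinates (our convention).
def b_neighbors_of_a_site(n):
    """Return the four assumed nearest-neighbor sites of A-site n."""
    n1, n2, n3 = n
    return [(n1, n2, n3),
            (n1 - 1, n2, n3),
            (n1, n2 - 1, n3),
            (n1, n2, n3 - 1)]
```
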
With these definitions the interaction energy is given by
[Eq. 55 not reproduced]
This approach represents a straightforward adaptation of the coordinate-based PSP onto the tetrahedral lattice, offering potential for broader applications. Our model employs a multi-grid implementation of the coordinate-based protein folding problem, allowing for generalization to more than two grids, which could further optimize resource utilization. Additionally, the multi-grid framework extends naturally to conjoint protein folding, where one protein is confined to one lattice and another to a separate lattice. This formulation enables efficient folding while inherently restricting folding domains and preventing overlap, making it applicable also to protein docking problems. The future potential of coordinate-based models remains open for exploration.
For all simulations considered we again choose
,
,
. It is likely that the performance can be further optimized if the parameters are fine-tuned.
More efficient penalization
The penalty term
can be constructed in a more efficient form, leading to a less dense QUBO. Instead of penalizing disconnected chain configurations, we energetically favor connected ones
[Eq. 56 not reproduced]
Note that, to counteract the negative energy obtained by connecting two beads, we add a constant energy shift
to the energy function. A similar approach has also been realized in Ref. 21.
Appendix B: Example of protein with unphysical fold on tetrahedral lattice
In this appendix we show how the observation of unphysical folds with the turn-based model on the tetrahedral lattice can be reproduced. For this purpose, we utilized open-source code available in the Qiskit Community repository60. As discussed in Sec. Discussion, the smallest protein for which we observe a non-physical fold as the ground state of the model consists of 11 amino acids and is represented in the HP model by the sequence HPPPPHPPPPH. Since the code of Ref. 60 only supports MJ-type interactions, we consider the sequence LKKKKLKKKKL instead. This sequence mimics the behavior of the HP model, with a strong interaction between Leucine (L) and Leucine (L) and weaker interactions between Lysine (K) and Lysine (K) and between Lysine (K) and Leucine (L).
We calculated the Hamiltonian for this protein using the code mentioned above. We find in total 8 different states with the same ground-state energy of
. Of the eight corresponding solution vectors (the bit strings are not reproduced here), four describe a correct fold and four describe an unphysical configuration.
Appendix C: Supplementary data for the simulations
Parallel Tempering
The parameters for the parallel tempering simulation for each experiment are depicted in Tab. 2. The parameters include the number of temperatures, the minimum and maximum temperature and the number of sweeps. The number of temperatures is chosen to be 400 for all considered experiments as this is the number of replicas that can be run on the GPU without additional overhead influencing the run time. We chose a geometric spacing of the temperatures where the ith temperature is given by
[Eq. 57 not reproduced]
This spacing is customarily used to ensure a higher density of temperatures on the lower end of the range and lower density of temperatures on the higher end. For the calculation of the spin overlaps in Sec. Spin overlap distributions, all PT runs were performed with a total number of
. However, to ensure thermal equilibration, only the last
samples were used to produce the plots.
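The geometric temperature ladder can be sketched as follows (Eq. 57 in our reading): the i-th temperature is T_min times the ratio T_max/T_min raised to the power i/(n - 1).

```python
# Sketch of a geometric temperature ladder: denser at the cold end,
# sparser at the hot end.
def geometric_temperatures(t_min, t_max, n):
    """Return n geometrically spaced temperatures from t_min to t_max."""
    ratio = t_max / t_min
    return [t_min * ratio ** (i / (n - 1)) for i in range(n)]
```
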
Table 2.
Parameters for the parallel tempering runs.
| Model | T_min | T_max |
|---|---|---|
| Coordinate-based Cartesian | 1 | [not reproduced] |
| Coordinate-based tetrahedral | 1 | [not reproduced] |
| Turn-based Cartesian | 1 | [not reproduced] |
| Turn-based tetrahedral | 1 | [not reproduced] |
To produce reference solutions for the dataset considered in Sec. Quantum annealing vs. simulated annealing, we only ran PT simulations on the coordinate-based models, as the turn-based models have the same ground-state energy by design, with sweeps ranging from
to
, as mentioned earlier.
Simulated Annealing
The only free parameter for the simulated annealing runs in our in-house implementation is the cooling rate
, cf. Sec. Simulated annealing. We optimized
for each problem instance by sweeping over different values. The results are shown in Figs. 13 (coordinate-based for the tetrahedral grid), 14 (coordinate-based for the Cartesian grid) and 15 (turn-based for tetrahedral and Cartesian grid). For the numerical experiments in Sec. Quantum annealing vs. simulated annealing we used the cooling rate leading to the minimal TTS for a given sequence length.
Fig. 13.
Optimal cooling rate for the coordinate-based model on the tetrahedral grid for problem instances with varying amino acid sequence lengths N.
Fig. 14.
Optimal cooling rate for the coordinate-based models on the Cartesian grid for problem instances with varying amino acid sequence lengths N.
Fig. 15.
Optimal cooling rate for the turn-based models for problem instances with varying amino acid sequence lengths N. Since we had to choose cooling rates asymptotically close to 1, we plot 1 minus the cooling rate for the turn-based model on the Cartesian grid.
Quantum Annealing
We first benchmark the embedding process for all considered models. For each model we consider 1000 embeddings and check the number of qubits needed on the Pegasus or Zephyr topology. The results are shown in Fig. 16 for the tetrahedral grid and Fig. 17 for the Cartesian grid.
Fig. 16.
Embedding data for all models on the tetrahedral grid.
Fig. 17.
Embedding data for all models on the Cartesian grid. The last data point on the Zephyr graph comprises only 3 samples, because only 3 out of the 1000 attempted embedding processes returned a valid embedding on the prototype.
Since the coordinate-based model on the tetrahedral grid turned out to be the most promising, we only run real hardware experiments for this model. The relevant hyper-parameter to optimize the TTS for quantum annealing is the annealing time
used for a given instance. We optimized
for different instance sizes for the coordinate-based tetrahedral model. The results for the Advantage 1 (based on the Pegasus topology) as well as for the Advantage 2 prototype (based on the Zephyr topology) are shown in Fig. 18.
Fig. 18.
Optimal anneal time to minimize the TTS for the D-Wave Advantage 1 and Advantage 2 prototype systems. Since the anneal times show largely similar behavior, the optimal anneal times were chosen for all sequence lengths to be the same value of
for Advantage 1 and
for the Advantage 2 prototype.
Chain break analysis
Not all samples from the quantum annealing runs yield results corresponding to a correct embedding. In this section, we provide a short analysis of chain breaks, including the distribution of chain lengths in the embedded problems as well as the observed chain break frequency. For each sequence length, the data is averaged over all sequences considered. The analysis is shown in Fig. 19 for the Advantage 1 system and in Fig. 20 for the Advantage 2 system. Our results show that some chains are substantially more prone to breaking than others, while the correlation with chain length is only mild. We attribute this either to degraded qubit performance at specific locations on the annealer or to more intricate dynamics of the specific qubits, which render these chains more likely to break. Finally, in Fig. 21 we show the influence of different chain break corrections on the TTS. Specifically, we show that using a majority-voting scheme, as provided in the D-Wave Ocean SDK61, leads to only an insignificant improvement in the TTS compared to the uncorrected results, in which only samples with unbroken chains are counted.
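The majority-vote idea can be sketched as follows; this is our toy illustration of the same idea, not the Ocean SDK's implementation.

```python
# Sketch: map the physical qubits of one logical chain to a single bit by
# majority vote over the measured values.
def resolve_chain(physical_bits):
    """Return the majority value of a (possibly broken) chain."""
    ones = sum(physical_bits)
    # Ties (possible for even chain lengths) are broken toward 0 here;
    # the SDK breaks them randomly.
    return 1 if 2 * ones > len(physical_bits) else 0
```
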
Fig. 19.
Distribution of the chain lengths of the embedded logical qubits (left) as well as the observed chain break frequencies (right) on the Advantage 1 annealer. The data is sorted by chain length as well as by chain break frequencies.
Fig. 20.
Distribution of the chain lengths of the embedded logical qubits (left) as well as the observed chain break frequencies (right) on the Advantage 2 annealer. The data is sorted by chain length as well as by chain break frequencies.
Fig. 21.
TTS improvements obtained by correction of chain breaks in the sample for the Advantage 2 prototype. The left/blue part of the violin plot shows the obtained distribution before the majority vote, the right/orange part shows after the correction. While a slight improvement can be seen, the correction does not change the scaling in a meaningful way.
Reliability of ground-state probability of the quantum annealer
To estimate how reliably the annealer is able to identify the correct ground state among the vast configurational space of the qubits of up to
, we estimated the probability of measuring the ground state as well as measuring valid folds, i.e., solutions that do not violate the penalty terms. The results, shown in Figs. 22 and 23, indicate that while the annealer initially finds the ground state reliably, with a probability of over 10%, this probability decreases steeply for larger sequences. To obtain accurate estimates, either the ground-state probability needs to be increased, for example by increasing the anneal time, or the number of shots needs to be scaled up to resolve low ground-state probabilities. Since for D-Wave quantum annealers the anneal time is upper-bounded by around
, this means that for larger sequences the number of shots on the quantum annealer needs to be increased.
Fig. 22.
(Left panel) Relative probabilities of finding the ground state as well as finding a valid state for the Advantage 1 annealer. (Right panel) The number of ground states found per valid state. As expected, this number decreases exponentially with the growth of the configurational space of the polypeptide.
Fig. 23.
(Left panel) Relative probabilities of finding the ground state as well as finding a valid state for the Advantage 2 prototype. (Right panel) The number of ground states found per valid state. As expected, this number decreases exponentially with the growth of the configurational space of the polypeptide.
Author contributions
TS and MH designed the project. AG implemented the simulated annealing and parallel tempering code. TS implemented the protein folding model and performed the experiments and numerical simulations. TS and MH performed the analysis. All authors contributed to the writing of this manuscript.
Funding
Open Access funding enabled and organized by Projekt DEAL. This work was supported by the research project Zentrum für Angewandtes Quantencomputing (ZAQC), which is funded by the Hessian Ministry for Digital Strategy and Innovation and the Hessian Ministry of Higher Education, Research and the Arts.
Data availability
All data regarding this publication are available in the corresponding GitLab repository (https://gitlab.cc-asp.fraunhofer.de/scheiber1/exploringquantumannealing4cgproteinfolding). The data contain all plottable data, the raw measurement results from the D-Wave devices as well as all QUBO matrices used in this study.
Code availability
The underlying code regarding our implementations of simulated annealing or parallel tempering is not publicly available for proprietary reasons.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
The usage of k has a physical context and ensures equal units for temperature and energy. For the simulation we utilize units of
.
In some parts of this work the letter q is utilized to denote qubits. The meaning should be clear from context.
By logical qubits we refer to the number of qubits that would be needed if the device supported all-to-all connectivity.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Wu, M.-H., Xie, Z. & Zhi, D. A folding-docking-affinity framework for protein-ligand binding affinity prediction. Commun. Chem.8(1), 1–9 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Keskin, O., Gursoy, A., Ma, B. & Nussinov, R. Principles of protein- protein interactions: what are the preferred ways for proteins to interact?. Chem. Rev.108(4), 1225–1244 (2008). [DOI] [PubMed] [Google Scholar]
- 3.Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596(7873), 583–589 (2021). [DOI] [PMC free article] [PubMed]
- 4.Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science373(6557), 871–876 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 1–3 (2024). [DOI] [PMC free article] [PubMed]
- 6.Doga, H. et al. A perspective on protein structure prediction using quantum computers. J. Chem. Theory Comput.20(9), 3359–3378 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Outeiral, C., Nissley, D. A. & Deane, C. M. Current structure predictors are not learning the physics of protein folding. Bioinformatics38(7), 1881–1887 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Birch-Price, Z., Hardy, F. J., Lister, T. M., Kohn, A. R. & Green, A. P. Noncanonical amino acids in biocatalysis. Chem. Rev.124(14), 8740–8786 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Zhuravlev, P. I. & Papoian, G. A. Protein functional landscapes, dynamics, allostery: a tortuous path towards a universal theoretical framework. Q. Rev. Biophys.43(3), 295–332 (2010). [DOI] [PubMed] [Google Scholar]
- 10.Trebst, S., Troyer, M. & Hansmann, U. H. Optimized parallel tempering simulations of proteins. J. Chem. Phys.10.1063/1.2186639 (2006). [DOI] [PubMed] [Google Scholar]
- 11.Kirkpatrick, S., Gelatt Jr., C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220(4598), 671–680 (1983). [DOI] [PubMed]
- 12.Swendsen, R. H. & Wang, J.-S. Replica monte Carlo simulation of spin-glasses. Phys. Rev. Lett.57(21), 2607 (1986). [DOI] [PubMed] [Google Scholar]
- 13.Agostini, F. P., Soares-Pinto, D. D. O., Moret, M. A., Osthoff, C. & Pascutti, P. G. Generalized simulated annealing applied to protein folding studies. J. Comput. Chem.27(11), 1142–1155 (2006). [DOI] [PubMed] [Google Scholar]
- 14.Alford, R. F. et al. The rosetta all-atom energy function for macromolecular modeling and design. J. Chem. Theory Comput.13(6), 3031–3048 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.de Falco, D., Apolloni, B. & Cesa-Bianchi, N. A numerical implementation of quantum annealing. Stochastic Processes, Physics and Geometry324, (1988).
- 16.Kim, S. et al. Quantum annealing for combinatorial optimization: a benchmarking study. npj Quantum Inf., 11(1), 1–8 (2025).
- 17.Denchev, V. S. et al. What is the computational value of finite-range tunneling?. Phys. Rev. X6(3), 031015 (2016). [Google Scholar]
- 18.Perdomo, A., Truncik, C., Tubert-Brohman, I., Rose, G. & Aspuru-Guzik, A. Construction of model hamiltonians for adiabatic quantum computation and its application to finding low-energy conformations of lattice protein models. Phys. Rev. A78(1), 012320 (2008). [Google Scholar]
- 19.Perdomo-Ortiz, A., Dickson, N., Drew-Brook, M., Rose, G. & Aspuru-Guzik, A. Finding low-energy conformations of lattice protein models by quantum annealing. Sci. Rep.2(1), 1–7 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Robert, A., Barkoutsos, P.K., Woerner, S., & Tavernelli, I. Resource-efficient quantum algorithm for protein folding npj Quantum Inf.7(1), 38 (2021).
- 21.Babbush, R., Perdomo-Ortiz, A., O’Gorman, B., Macready, W. & Aspuru-Guzik, A. Construction of energy functions for lattice heteropolymer models: Efficient encodings for constraint satisfaction programming and quantum annealing. Adv. Chem. Phys.155, 201–244 (2014). [Google Scholar]
- 22.Babej, T. et al. Coarse-grained lattice protein folding on a quantum annealer. arXiv preprint arXiv:1811.00713 (2018).
- 23.Irbäck, A., Knuthson, L., Mohanty, S. & Peterson, C. Folding lattice proteins with quantum annealing. Phys. Rev. Res.4(4), 043013 (2022). [Google Scholar]
- 24.Pamidimukkala, J. V. et al. Protein structure prediction with high degrees of freedom in a gate-based quantum computer. J. Chem. Theory Comput.20(22), 10223–10234 (2024). [DOI] [PubMed] [Google Scholar]
- 25.Irbäck, A., Knuthson, L., Mohanty, S. & Peterson, C. Using quantum annealing to design lattice proteins. Phys. Rev. Res.6(1), 013162 (2024). [Google Scholar]
- 26.Brubaker, J.K. et al. Quadratic unconstrained binary optimization and constraint programming approaches for lattice-based cyclic peptide docking. arXiv preprint arXiv:2412.10260 (2024). [DOI] [PMC free article] [PubMed]
- 27.Boulebnane, S., Lucas, X., Meyder, A., Adaszewski, S. & Montanaro, A. Peptide conformational sampling using the quantum approximate optimization algorithm. npj Quantum Inf.9(1), 70 (2023).
- 28.Farhi, E., Goldstone, J. & Gutmann, S. A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028 (2014).
- 29.Chandarana, P., Hegade, N. N., Montalban, I., Solano, E. & Chen, X. Digitized counterdiabatic quantum algorithm for protein folding. Phys. Rev. Appl.20(1), 014024 (2023). [Google Scholar]
- 30.Romero, S.V. et al. Protein folding with an all-to-all trapped-ion quantum computer. arXiv preprint arXiv:2506.07866 (2025).
- 31.Outeiral, C. et al. Investigating the potential for a limited quantum speedup on protein lattice problems. New J. Phys.23(10), 103030 (2021). [Google Scholar]
- 32.Linn, H., Brundin, I., García-Álvarez, L. & Johansson, G. Resource analysis of quantum algorithms for coarse-grained protein folding models. Phys. Rev. Res.6(3), 033112 (2024). [Google Scholar]
- 33.Barahona, F. On the computational complexity of Ising spin glass models. J. Phys. A Math. Gen.15(10), 3241 (1982). [Google Scholar]
- 34.Rosenberg, I.G. Reduction of bivalent maximization to the quadratic case. Cahiers du Centre d’Études de Recherche Opérationnelle, Vol. 17. 71–74 (1975).
- 35.Miyazawa, S. & Jernigan, R. L. Estimation of effective interresidue contact energies from protein crystal structures: Quasi-chemical approximation. Macromolecules18(3), 534–552 (1985). [Google Scholar]
- 36.Berger, B. & Leighton, T. Protein folding in the hydrophobic-hydrophilic (hp) is np-complete. in Proceedings of the second annual international conference on Computational molecular biology, pp. 30–39, (1998). [DOI] [PubMed]
- 37.Rose, A. S. & Hildebrand, P. W. Ngl viewer: A web application for molecular visualization. Nucleic Acids Res.43(W1), W576–W579 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys.21(6), 1087–1092 (1953). [DOI] [PubMed] [Google Scholar]
- 39.Atiqullah, M.M. An efficient simple cooling schedule for simulated annealing. in Computational Science and Its Applications - ICCSA 2004 (T. Kanade, ed.), vol. 3045 of Lecture Notes in Computer Science, pp. 396–404, Berlin/Heidelberg: Springer Berlin Heidelberg, (2004).
- 40.Imanaga, T. et al. Solving the sparse qubo on multiple gpus for simulating a quantum annealer. in 2021 Ninth International Symposium on Computing and Networking (CANDAR), pp. 19–28, (2021).
- 41.Deveci, M., Boman, E.G., Devine, K.D. & Rajamanickam, S. Parallel graph coloring for manycore architectures. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 892–901, (2016).
- 42.Kadowaki, T. & Nishimori, H. Quantum annealing in the transverse Ising model. Phys. Rev. E58(5), 5355 (1998). [Google Scholar]
- 43.Santoro, G. E., Martoňák, R., Tosatti, E. & Car, R. Theory of quantum annealing of an Ising spin glass. Science295(5564), 2427–2430 (2002). [DOI] [PubMed] [Google Scholar]
- 44.Bravyi, S., Divincenzo, D.P., Oliveira, R.I. & Terhal, B.M. The complexity of stoquastic local hamiltonian problems arXiv preprint quant-ph/0606140, (2006).
- 45.Yucesoy, B., Machta, J. & Katzgraber, H. G. Correlations between the dynamics of parallel tempering and the free-energy landscape in spin glasses. Phys. Rev. E87(1), 012104 (2013). [DOI] [PubMed] [Google Scholar]
- 46.Katzgraber, H. G., Hamze, F., Zhu, Z., Ochoa, A. J. & Munoz-Bauza, H. Seeking quantum speedup through spin glasses: The good, the bad, and the ugly. Phys. Rev. X5(3), 031026 (2015). [Google Scholar]
- 47.Zaman, M., Tanahashi, K. & Tanaka, S. Pyqubo: Python library for mapping combinatorial optimization problems to qubo form. IEEE Trans. Comput.71(4), 838–850 (2021). [Google Scholar]
- 48.Babbush, R., O’Gorman, B. & Aspuru-Guzik, A. Resource efficient gadgets for compiling adiabatic quantum optimization problems. Ann. Phys. (Berl.)525(10–11), 877–888 (2013). [Google Scholar]
- 49.Choi, V. Minor-embedding in adiabatic quantum computation: I. The parameter setting problem. Quantum Inf. Process.7, 193–209 (2008). [Google Scholar]
- 50.Gomez-Tejedor, A., Osaba, E. & Villar-Rodriguez, E. Addressing the minor-embedding problem in quantum annealing and evaluating state-of-the-art algorithm performance. arXiv preprint arXiv:2504.13376 (2025).
- 51.Cai, J., Macready, W.G. & Roy, A. A practical heuristic for finding graph minors. arXiv preprint arXiv:1406.2741, (2014).
- 52.McGeoch, C., Farre, P., & Boothby, K. The D-Wave Advantage2 Prototype. https://www.dwavequantum.com/media/eixhdtpa/14-1063a-a_the_d-wave_advantage2_prototype-4.pdf.
- 53.King, A.D. et al. Zephyr: A next-generation topology for large-scale quantum annealing. arXiv preprint arXiv:2207.14786, (2022).
- 54.Pearson, A., Mishra, A., Hen, I. & Lidar, D. A. Analog errors in quantum annealing: doom and hope. npj Quantum Inf.5(1), 107. (2019). [Google Scholar]
- 55.Rønnow, T. F. et al. Defining and detecting quantum speedup. science345(6195), 420–424 (2014). [DOI] [PubMed] [Google Scholar]
- 56.Okada, S., Ohzeki, M. & Taguchi, S. Efficient partition of integer optimization problems with one-hot encoding. Sci. Rep.9(1), 13036 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Pudenz, K. L., Albash, T. & Lidar, D. A. Error-corrected quantum annealing with hundreds of qubits. Nat. Commun.5(1), 3243 (2014). [DOI] [PubMed] [Google Scholar]
- 58.Chancellor, N. Modernizing quantum annealing using local searches. New J. Phys.19(2), 023024 (2017). [Google Scholar]
- 59.Bauza, H.M. & Lidar, D.A. Scaling advantage in approximate optimization with quantum annealing. arXiv preprint arXiv:2401.07184 (2024). [DOI] [PubMed]
- 60.Sung, K.J. & Bello, L. quantum-protein-folding (GitHub repository). https://github.com/qiskit-community/quantum-protein-folding?tab=coc-ov-file (2024). Accessed: 2025-08-14.
- 61.D-Wave Ocean SDK. https://github.com/dwavesystems/dwave-ocean-sdk. Accessed: 2026-03-11.
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
All data related to this publication are available in the corresponding GitLab repository (https://gitlab.cc-asp.fraunhofer.de/scheiber1/exploringquantumannealing4cgproteinfolding). The repository contains all plotted data, the raw measurement results from the D-Wave devices, and all QUBO matrices used in this study.
The underlying code for our implementations of simulated annealing and parallel tempering is not publicly available for proprietary reasons.