Author manuscript; available in PMC: 2025 Jul 9.
Published in final edited form as: Nat Comput Sci. 2025 Apr 24;5(4):279–291. doi: 10.1038/s43588-025-00788-8

Physics-based Modeling in the New Era of Enzyme Engineering

Christopher Jurich 1,#, Qianzhen Shao 1,#, Xinchun Ran 1, Zhongyue J Yang 1,2,3,4,5,*
PMCID: PMC12239909  NIHMSID: NIHMS2093455  PMID: 40275092

Abstract

Enzyme engineering is entering a new era characterized by the integration of computational strategies. While bioinformatics and artificial intelligence (AI) methods have been extensively applied to accelerate the screening of function-enhancing mutants, physics-based modeling methods, such as molecular mechanics and quantum mechanics, are essential complements in many objectives. In this Perspective, we highlight how physics-based modeling will help the field of computational enzyme engineering reach its full potential by exploring current developments, unmet challenges, and emerging opportunities for tool development.

Introduction

Enzyme engineering concerns leveraging enzymes to suit our catalytic needs for synthesis, therapeutics, and sustainability. Industrial appetite for engineered enzymes is strong, with predicted compound annual growth rates ranging from 5 to 6% for the next decade.1 A desirable future of enzyme engineering hinges on the development of computational protocols capable of pinpointing functional raw enzymes and their engineered variants with quantitative accuracy, so that biocatalytic development can be achieved with minimal screening effort and minimal associated economic and environmental cost. Historically, directed evolution (DE)-based protocols have dominated the field and have been routinely applied to create enzymes that assist in chemical synthesis, environmental pollutant degradation or upcycling, and therapeutics.2–4

Despite clear and established success, the reliance of DE-based protocols on high throughput experimental screening precludes their application to a selection of enzyme systems and engineering objectives. When side reactions are non-negligible (as is the case for proteases5) or specialized apparatus is needed for the assay (for instance, photo-enzymes6), screening protocols cannot be easily constructed. Miniaturization, the objective of reducing an enzyme’s size while maintaining high activity, cannot be reliably achieved through the high-throughput deletions or truncations sampled with DE.7 Establishing such miniaturization strategies would improve therapeutic efficacy8, 9 and reduce production costs and resource consumption. Engineering enzymes to perform optimally in high- and low-temperature environments is another critical, recurring task in industrial biosynthesis.10 Although efficient extremophile enzymes create a clear path for reducing both cost and environmental impact, the commonplace mismatch between biological and industrial conditions (such as temperature and pH) makes extreme system engineering considerably less addressable with high throughput screening.

Beyond intractable engineering objectives, the prevalent use of bacterial expression systems in high throughput screening makes working with plant-based and mammalian enzymes challenging or impossible, despite their vast biosynthetic functionality or reduced immunogenicity.11 Finally, although often viewed as a strength in engineering, the choice of DE to treat catalysis as a black-box process introduces the potential for evolutionary dead ends that cannot be escaped without structurally or mechanistically derived detours11 and prevents the quantitative prediction of enzymatic activity. For example, when a human kynureninase was trapped in such a dead end, screening an additional 10^9 variants failed to improve its efficiency.11 The logistical and theoretical limitations associated with DE and high throughput experimental screening highlight the need for complementary techniques that overcome these limitations.

Computational approaches offer a definitive path for enzyme engineering to realize its full potential by expanding the scope of addressable enzyme properties and systems (Figure 1). Despite the growing use of bioinformatics and artificial intelligence to achieve these goals,12, 13 physics-based molecular modeling techniques remain indispensable due to the ubiquitous insufficiency in both the quantity and quality of relational enzyme sequence-structure-function data.14 Quantum mechanics (QM) and molecular mechanics (MM) can, in theory, be applied to compute experimentally relevant functions for arguably arbitrary systems with an atom-resolved, three-dimensional structure, regardless of the enzyme’s origin or preferred operational conditions. Leveraging physics-based modeling, de novo enzyme design showcased the ability of first-principles approaches to create artificial enzymes that catalyze new-to-nature reactions.15–19 Although it relies on DE protocols to optimize the designed scaffolds15, 20 and is frequently reported to hit evolutionary dead ends,11 de novo enzyme design demonstrates that virtual, physics-based design complements conventional screening-based techniques by providing artificial scaffolds uniquely available through rational design, representing a conceptual milestone for computational enzyme engineering.

Figure 1: Physics-based computational methods as an approach to realize enzyme engineering’s full potential.


Conventional enzyme engineering techniques (middle column) excel at improving enzymatic efficiency for bacterial, non-membrane enzymes. Physics-based computational enzyme engineering (right column) is a more general approach capable of avoiding common pitfalls associated with conventional methods (top) as well as addressing systems (middle) and functional objectives (bottom) that are likewise difficult to work with using DE. Conventional enzyme engineering protocols are prone to being stuck in local minima of fitness landscapes called evolutionary dead ends, whereas physics-based methods can escape these minima (Protocol Pitfalls). The reliance of conventional enzyme engineering methods on bacterial expression and in vitro assaying precludes some enzyme systems, with proteases, photoenzymes, and plant-based enzymes serving as incompatible examples owing to their autoproteolyzing nature, need for specialized assays which provide constant light without pollution, and need for heterologous expression, respectively (System Compatibility). Physics-based enzyme engineering is compatible with most enzyme systems, although full-atom structural models are typically required. Optimizing functional objectives other than catalytic rate is challenging for conventional techniques, with the need for observable activity necessitating relatively high similarity between the native and target substrates during substrate tuning, and cold adaptation and miniaturization campaigns relying on specialized assays (Functional Objectives). Using specialized metrics, physics-based enzyme engineering is capable of virtually modelling non-native substrates or removing structural regions to tune substrate specificity or miniaturize enzymes, respectively, and robust thermodynamics theory enables screening of mutants which promote cold adaptation.

Besides enabling de novo enzyme design, molecular modeling approaches have been extensively applied to elucidate an enzyme’s mechanism21–24 and interpret the origin of efficiency25, 26 and selectivity27 through TS or reaction barrier calculations.28 Thorough computational analysis of efficient systems or function-enhancing mutants often yields quantitative relations or engineering principles that score and rank candidate enzyme scaffolds and mutants to achieve desired enzymatic functions. Furthermore, the resulting engineering principles inspire the design of descriptors and architectures for building machine learning (ML) models to accurately predict enzyme efficiencies, regioselectivity, substrate affinity, and other functions,29–32 complementary to existing ML models trained solely from sequence or multiple sequence alignment (MSA) by enhancing molecular expressiveness of protein data.14 ML methods additionally enhance physics-based modelling, performing dimension reduction on complicated molecular dynamics (MD)-derived datasets and helping identify catalytically relevant modes or global conformations. Augmented with physics-based modeling, ML models may serve as an optimal pathway as the field works towards comprehensive models of catalysis.

In this Perspective, we discuss the paradigm of physics-based modelling as a means for enzyme engineering to achieve the goal of optimizing characteristics of enzyme systems. We describe principles-based design and highlight the growing influence of high throughput molecular modeling workflows, which advance principles-focused enzyme engineering by increasing the number of enzymes and mutations analyzed. We also consider the fusion of ML and physics-based enzyme engineering, detailing the mutually beneficial impact of these two approaches on each other. We conclude by highlighting the challenges in computational enzyme engineering, in particular the urgent need for high-quality benchmark datasets to critically evaluate the accuracy of existing tools.

2. The Role of Physics-Based Modelling in Enzyme Catalysis

Molecular insights into enzyme catalysis, as derived from physics-based molecular modeling, guide the identification and deployment of beneficial mutants to manipulate the activity, promiscuity, size, stability, and temperature preferences of enzymes (Figure 2). Design principles are often formulated from observing and investigating known, experimentally characterized enzymes. In this sense, creating design principles allows the field to leverage molecular evolution from nature or the laboratory to inform our enzyme engineering efforts. Just as natural enzymes owe their catalytic efficiency and selectivity to several sources, design principles survey many aspects of enzyme scaffolds, including topology, enzyme electrostatics, flexibility and residue networks, or heat capacity. Researchers increasingly elucidate correlations between these molecular simulation-derived features and experimentally characterized kinetics and binding data.33–35 These phenomenological models present an avenue for rapid scoring of mutation effects and improvement of catalytic functions. Due to the complexity of enzyme catalysis, there are yet undiscovered physical principles underlying enzymes’ extraordinary catalytic efficiency and selectivity. The investigation of these principles provides a direct path to further derive insights into catalytic activity as the field works towards a more holistic view of catalysis.

Figure 2: The life cycle of physics-based principles.


Physics-based principles are derived through observation of natural and engineered sources with desired functional profiles such as high efficiency or cold adaptation (top left rectangle). Shown are an efficient, engineered KE (top left rectangle, left, gray, PDB ID: 8usi), a cold-adapted adenylate kinase (top left rectangle, middle, blue, PDB ID: 1p3j), and an efficient, naturally occurring enzyme, human erythrocyte catalase (top left rectangle, right, gray, PDB ID: 1dgf). Individual principles and physical phenomena are identified, quantified, and better understood through physics-based computational simulations leveraging QM, MD, and QM/MM (top right rectangle). MD simulations typically model the holoenzyme complex with explicit solvent (top right rectangle, left), QM simulations are often applied to a reduced QM cluster version of the enzyme active site (top right rectangle, middle), and QM/MM simulations apply multiple levels of theory to different regions of an enzyme (top right rectangle, right). After identification in multiple systems, design principles are codified into generalized rational design rules which create definite, quantitative functional predictions (bottom right rectangle). Design rules are applied to rank beneficial mutations (shown as red spheres on a gray enzyme, bottom left rectangle, right), allowing for recommendations to achieve a given functional objective such as improved efficiency through TS stabilization or ground state destabilization (bottom left rectangle, left).

2a. Structure and Topology

Structure-informed enzyme engineering is uniquely convenient as beneficial mutants can be rationalized visually and AlphaFold2 has made it trivial to produce sufficiently accurate three-dimensional models for sequences of soluble proteins.36 Computational analysis of enzyme structures suggests that catalytic efficiency toward a specific substrate is improved when active sites show shape complementarity for that substrate. For example, conserved guanine binding sites broadly drive ribozyme selectivity,37 a single residue in catechol O-methyl transferase (COMT) positions its S-adenosyl methionine (SAM) cofactor to achieve a preferred donor-acceptor distance,38 and the active site residues of bacterial arylmalonate decarboxylase (AMDase) drive substrate specificity by tuning the size of a hydrophobic pocket to accommodate various substrates.39 Active site mutations have been deployed to improve substrate complementarity and enzymatic efficiency. Such mutations have enhanced an O-methyl transferase’s ability to synthesize the pharmaceutical pinostilbene,40 improved substrate specificity of an acyltransferase in an unfavorable aqueous environment,41 and transformed a non-enzymatic protein into a Kemp eliminase (KE).42

Topological engineering focuses on selecting mutations to favor substrate binding, or to improve tunnel accessibility for rapid diffusion of associated reactants or products. Topological engineering has been leveraged to alter substrate specificity.43, 44 By mutating residues in tunnels connecting active sites to the enzyme surface, the ability of substrates and water to travel to the active site can be reduced45, 46 or improved.47 This design principle has been widely validated experimentally48 and leveraged for the de novo design of tunnels.49 Beyond mutations within the protein interior, structure-informed insights can also guide the engineering of surface residues to tune enzyme functions. For instance, changing the number of charged surface residues can tune an enzyme’s pH optimum.50 Most enzymes evolved to prefer environments close to neutral pH, and tolerance for less biologically common conditions opens the door for working with reactions that proceed more rapidly in a basic or acidic environment. Altering optimal pH is a largely unutilized engineering strategy for improved enzymatic efficiency.

Structure-informed enzyme engineering relies on knowledge of a specific system gained by analyzing substrate-enzyme complexes across key functional conformations, particularly those where the substrate adopts a pre-reaction conformation. While AlphaFold3 can predict substrate-enzyme complexes, merely stabilizing ground-state interactions is insufficient; an enzyme must also ensure that coordinated interactions position the substrate in a reactive conformation capable of generating products.29 Although methods like idpGAN and AF-Cluster have been recently developed to create structural ensembles from sequence data alone,51–53 their inability to model holoenzymes and corresponding reactive states signals a potential area of growth for computational enzyme engineering. Beyond technical considerations, applying crowdsourcing to analyze enzyme structure remains an untapped avenue to maximize the potential of structure-informed enzyme engineering. Though low-throughput, human intuition is known to be effective and creative at optimizing various biomolecular problems, as evidenced through successful efforts to gamify protein folding, RNA folding, and drug design.54–57 Applying Web3 blockchain technology could enhance massively parallel efforts to study and design on the basis of enzyme structure using an open science approach.58 One could even envision a public ledger that stores individual structural analyses and serves as an intermediary to develop consensus amongst citizen scientists.

2b. Electrostatics

Enzyme electrostatics, such as electrostatic potential and electric field (EF), mediate chemical reactivity that involves the change of ionic states or charge separation. Linus Pauling proposed that enzymes achieve catalysis by stabilizing the transition state, while Ariel Warshel, using multiscale molecular simulations, later demonstrated that pre-organized electrostatic effects largely contribute to this stabilization.59, 60 Boxer et al. experimentally measured EF using vibrational Stark shift spectroscopy61, 62 and demonstrated that the strength of EF has a quantitative connection to transition state stabilization. EF is convenient to model using molecular simulations, which has made it popular in the computational enzyme engineering community for decades.61, 63, 64 Specifically, EF can be calculated by applying the binomial theorem to the Coulomb potential, yielding the infinite multipole series whose first term is the monopole contribution to EF, whose second term is the dipole contribution to EF, and so on. EF is typically approximated using Coulomb’s law, the first term of the multipole series, based on the atomic charges derived from fixed-charge MM, polarizable MM, or QM methods, and sometimes using higher order terms when corresponding multipoles are available.27, 65
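For reference, the leading terms of the multipole series mentioned above can be written out explicitly. This is the standard textbook expansion for a localized charge distribution viewed from a distant point, not a formula taken from the cited works. The symbols are assumptions introduced here: q is the total charge, p the dipole moment, Q_ij the traceless quadrupole tensor, and r the distance from the expansion center along the unit vector r-hat.

```latex
% Potential of a localized set of charges q_k at positions r_k, expanded for r >> |r_k|
V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}
  \left[ \frac{q}{r}
       + \frac{\mathbf{p}\cdot\hat{\mathbf{r}}}{r^{2}}
       + \frac{1}{2}\sum_{i,j}\frac{Q_{ij}\,\hat{r}_i\,\hat{r}_j}{r^{3}}
       + \cdots \right],
\qquad
q = \sum_k q_k,\quad
\mathbf{p} = \sum_k q_k\,\mathbf{r}_k,\quad
Q_{ij} = \sum_k q_k\left(3\,r_{k,i}\,r_{k,j} - r_k^{2}\,\delta_{ij}\right)

% The electric field follows from F = -\nabla V; the first two contributions are
\mathbf{F}_{\mathrm{monopole}}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\,\frac{q\,\hat{\mathbf{r}}}{r^{2}},
\qquad
\mathbf{F}_{\mathrm{dipole}}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\,
  \frac{3\,(\mathbf{p}\cdot\hat{\mathbf{r}})\,\hat{\mathbf{r}} - \mathbf{p}}{r^{3}}
```

When each atom is treated as a point charge, summing the monopole term over all atoms recovers the Coulomb's-law estimate of EF described above.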

Projecting the computed EF onto the dipole moments of reacting bonds reveals stabilization energies, which quantify an enzyme’s ability to facilitate bond breaking and formation in the transition state. This principle has been extensively validated through MD and QM/MM studies on enzymes such as ketosteroid isomerases, KEs, P450s, dihydrofolate reductase (DHFR), glycine N-methyltransferases (GNMTs), 20S proteasome, and catechol-O-methyltransferase (COMT).66 Seminal work by the Teresa Head-Gordon group converted the understanding of enzyme electrostatics into a design principle.65, 67, 68 They established that individual mutations of KE can effectively fine-tune the magnitude of EF projected onto catalytically relevant chemical bonds.65 This insight enabled the design of the highly efficient KE15.68 Beyond KE, observation of an electrostatic basis for efficient hydrolases motivated the inclusion of an aspartate residue which transformed the Bacillus subtilis esterase Bs2 into an amidase through electrostatic stabilization of the TS.69–71

Developing gold standard methods for calculating EF remains an open challenge, and providing relevant technical infrastructure stands to benefit the field at large. Simple EF calculations treat the enzyme scaffold as a collection of point charges. Polarizable force fields like AMOEBA offer enhanced accuracy over EF estimates from fixed-charge force fields and rival that of QM-derived EFs.72 However, GPU-accelerated polarizable force field calculations are primarily supported in Tinker-HP73 and are not broadly accessible in other popular MD packages such as AMBER and GROMACS. Researchers often calculate the electric stabilization energy (E_ES) by projecting an enzyme’s EF (F_enz) only along relevant bonds of the reactant and transition state and taking its dot product with the associated bond dipoles (μ_bond) (Equation 1). An alternative method to calculate E_ES is to integrate the product of the electron density, ρ(r), and the electrostatic potential of the enzyme, V_enz(r), over the entire reactant or transition state molecule (Equation 2).74

$$E_{ES} = -\vec{F}_{enz} \cdot \vec{\mu}_{bond} \tag{1}$$

$$E_{ES} = \int \rho(\vec{r})\, V_{enz}(\vec{r})\, d^{3}r \tag{2}$$
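As an illustration of Equation 1, the following is a minimal sketch that assumes the simplest treatment described above: the enzyme scaffold is a set of fixed point charges, the field is evaluated at a single probe point (for example, a bond midpoint), and the bond dipole is supplied by the user. The function names, the toy charges, and the unit choices (charges in e, distances in Å, fields in V/Å, dipoles in e·Å, energies in eV) are illustrative assumptions, not part of any package named in the text.

```python
import math

# e / (4*pi*eps0) expressed in V*Angstrom, so charges in units of e and
# distances in Angstrom give fields in V/Angstrom.
COULOMB_V_ANGSTROM = 14.3996
EV_TO_KCAL_PER_MOL = 23.0605


def electric_field(point, charges):
    """Coulomb field (V/Angstrom) at `point` from point charges.

    `charges` is an iterable of (q, (x, y, z)) with q in elementary charges.
    """
    field = [0.0, 0.0, 0.0]
    for q, position in charges:
        # Vector from the charge to the probe point.
        d = [p - c for p, c in zip(point, position)]
        r = math.sqrt(sum(x * x for x in d))
        if r == 0.0:
            raise ValueError("probe point coincides with a charge")
        scale = COULOMB_V_ANGSTROM * q / r**3
        for i in range(3):
            field[i] += scale * d[i]
    return tuple(field)


def stabilization_energy(field, bond_dipole):
    """Equation 1: E_ES = -F . mu, in eV (dipole in e*Angstrom)."""
    return -sum(f * m for f, m in zip(field, bond_dipole))


# Hypothetical toy system: two partial charges around a bond midpoint at the origin.
toy_charges = [(-0.5, (3.0, 0.0, 0.0)), (0.4, (0.0, 4.0, 0.0))]
toy_dipole = (0.3, 0.0, 0.0)  # e*Angstrom, pointing along +x
F = electric_field((0.0, 0.0, 0.0), toy_charges)
E_es = stabilization_energy(F, toy_dipole)
print(F, E_es, E_es * EV_TO_KCAL_PER_MOL)
```

In practice the field would be averaged over MD snapshots and computed for both the reactant and transition state dipoles; a negative E_ES indicates that the field stabilizes the given dipole.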

Rational engineering relies on deploying point mutations to improve EF strength or stabilization effects, but critically assumes that mutants retain the global fold and substrate positioning dynamics of the wild-type (WT) protein. The latter is especially pertinent as minor changes to substrate orientation may quickly abolish anticipated electrostatic gains. Analysis of residue coupling based on sidechain mutual information presents a potential means of identifying EF-mediating residues unlikely to perturb substrate dynamics, but these methods have not been generalized across enzymatic systems and require further refinement.67 Improving the accuracy of enzyme EF calculations through advances in computational methodology stands to promote the use of these design principles, especially when it can be predicted which EF-mediating residues are unlikely to alter local and global conformational changes.

2c. Protein Dynamics

Dynamics-inspired enzyme engineering emphasizes how conformational changes mediate catalytic activity, offering physical insights and design hypotheses that are not accessible through the analysis of static structures. Though convenient, a PDB-derived single enzyme geometry does not inform how enzyme restructuring or reorganization influences substrate positioning for barrier crossing.75 This single geometry also fails to inform the impact of mutation on catalysis via rearranging active site residues or even dynamic allostery.76, 77 Attempts have been made to overcome this limitation through improved experiments, such as room temperature X-ray crystallography which allows multiple conformations to be more easily populated,78 or post-processing of structural data, namely the creation of pseudo-ensembles from enzyme structures of the same or adjacent families.79 As a purely computational remedy, MD simulations are frequently employed to sample enzymatic conformational ensembles and measure structural or energetic features across trajectories.

Besides virtual screening for function-enhancing mutants by evaluating enzyme properties (such as binding affinity, chemical selectivity, and reaction barriers), MD simulations reveal the impact of conformational shifts on catalytic efficiency, especially when combined with QM or QM/MM methods.80 For example, earlier studies have shown that liver alcohol dehydrogenase (LADH) experiences ps-ns conformational changes near its TS to facilitate hydride transfer,81 adenylate kinase involves a broad ensemble that likely contributes to increased entropy of activation and catalytic activity,82 and ketol-acid reductoisomerase (KARI) has specific conformational regions associated with higher reactivity.83 Measuring the evolving ratio of substrate to active site solvent-accessible surface area (SASA) through the substrate positioning index (SPI) in KE mutants provides a more nuanced view: mutations can reduce catalytic activity when the resulting active site is too loose or too tight, implying a Goldilocks “sweet spot” of active site size for a given enzyme-substrate system.33
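The SPI idea described above can be sketched as a per-frame ratio averaged over a trajectory. This is only an illustration under stated assumptions: the SASA values are taken as precomputed per-frame numbers (any SASA tool could supply them), the function name and toy values are hypothetical, and the exact definition used in the cited work may differ in detail.

```python
from statistics import mean, stdev


def substrate_positioning_index(substrate_sasa, active_site_sasa):
    """Return (mean, standard deviation) of the per-frame SASA ratio.

    Both arguments are sequences with one SASA value per MD frame,
    in the same units (for example, square Angstroms).
    """
    if len(substrate_sasa) != len(active_site_sasa):
        raise ValueError("need one substrate and one active-site value per frame")
    if not substrate_sasa:
        raise ValueError("no frames supplied")
    if any(a <= 0.0 for a in active_site_sasa):
        raise ValueError("active-site SASA must be positive in every frame")
    ratios = [s / a for s, a in zip(substrate_sasa, active_site_sasa)]
    spread = stdev(ratios) if len(ratios) > 1 else 0.0
    return mean(ratios), spread


# Hypothetical per-frame values for a short trajectory.
spi_mean, spi_sd = substrate_positioning_index(
    [210.0, 198.0, 225.0, 204.0],
    [410.0, 395.0, 430.0, 402.0],
)
print(spi_mean, spi_sd)
```

Comparing the averaged ratio across mutants is one way to look for the "sweet spot" of active site size; the optimal value is system specific and has to be calibrated against measured activities.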

Furthermore, MD-derived principles help identify rate-enhancing mutants. For example, in the engineering of ancestral luciferase AncHLD-RLuc, MD analysis showed that loop mutations modify flexibility, improving ligand binding and enzymatic activity.84 The observation that rigid catalytic residues enhance enzymatic activity was leveraged to create the efficient HG4 KE, whose mutations promote both rigidity and active site organization for favorable catalysis.19 Analyzing correlated residue movement has led to the creation of the shortest path map (SPM) model, which uses MD trajectory data to identify residues that are likely instrumental to catalytically relevant conformational switches.85

Despite demonstrated success, logistical and theoretical limitations prevent universal application and adoption of dynamics-based principles in enzyme engineering campaigns. Sufficient conformational sampling, alongside QM-based reaction barrier calculation, is necessary to identify catalytically relevant near attack conformation (NAC) states which allow ground state reactants to favorably proceed to the TS.86, 87 While computational costs continue to fall, conformational sampling remains a rate-limiting step in computational enzyme engineering efforts,88, 89 particularly in cases where enhanced sampling or Markov state modeling is needed. Generative models show great potential to achieve low-cost conformational sampling directly from input sequences, but their application to enzyme engineering is impeded by insensitivity to point mutations, though AlphaFold2 demonstrates potential to predict the impact of mutations on inter-residue distance distributions in epidermal growth factor receptor (EGFR).53

Another challenge for MD-guided enzyme engineering is the lack of generalizable quantitative metrics that represent the impact of protein dynamics on chemical reactivity. Like structure-based principles, MD-based design principles are often system specific. SPI demonstrates a volcano-like piecewise linear correlation with free energy barrier in lactonase SsoPox35 and KE33. However, it remains unknown how to identify the SPI corresponding to the optimal activity a priori. Investigations of soybean lipoxygenase (SLO) have demonstrated the critical role that distal loop motions play in thermally activated enzymes, where the enthalpy of activation for hydrogen-deuterium exchange (HDX) and the activation barrier of the reaction are unexpectedly positively correlated.90 The fundamental nature of this observation paves the way for new dynamics-inspired enzyme engineering principles that are potentially generalizable across systems.75 Additionally, entropy has long been known to play a critical role in catalysis, but quantitatively factoring in the role of entropy from conformational ensembles remains an open question.91 There is also increasing interest in predicting relative populations of reactive states capable of chemical productivity. Proximity alone is not enough for a reaction to proceed, and case studies suggest that discriminating reactive and non-reactive geometries is a non-trivial task in need of further development.83, 92 The potential power of accurately predicting reactive states was recently demonstrated in a successful de novo serine hydrolase campaign where the PLACER generative algorithm was used to create ensembles of reaction intermediates and designs were filtered based on the catalytic competency of their respective ensembles.93 Treating enzymes as conformational ensembles is a fundamentally robust approach, and further refinement and application of MD-guided design principles will help maximize the potential of computational enzyme engineering.

2d. Heat Capacity

Considering heat capacity adds a unique dimension for engineering the temperature-dependent behavior of enzymes to suit our needs, such as cold adaptation and thermostability. Negative heat capacity in a highly efficient KE is associated with stabilization of the TS relative to the ground state.94 Standard Arrhenius behavior assumes that enzyme activity increases with temperature until protein degradation occurs. The incompatibility of this assumption with the non-Arrhenius behavior seen in cold-adapted α-amylase (AHA),95 ancient reconstructed adenylate kinase (ANC1),96 and others has motivated the development of a heat capacity-based framework for calculating enzyme efficiency from MD simulations.97 Heat capacity models helped elucidate the emergence of inactive substrate-enzyme conformations at higher temperatures, explaining the temperature preference of AHA. In theoretical terms, AHA keeps activation enthalpies low and activation entropies more negative by preventing conformationally competent enzyme-substrate interactions.98 This principle was applied back to AHA, leading to the identification of mutations which shifted its thermal optimum upward.99
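To make the link between heat capacity and non-Arrhenius behavior concrete, the sketch below uses one established heat-capacity formulation, macromolecular rate theory (MMRT), in which the activation free energy depends on temperature through an activation heat capacity. Whether this matches the framework cited above in every detail is an assumption, and the parameter values and function name are hypothetical, chosen only to show the curvature.

```python
import math

KB = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R = 8.314462618        # gas constant, J/(mol*K)


def mmrt_rate(T, dH_T0, dS_T0, dCp, T0=298.15):
    """Rate constant (1/s) from an Eyring expression with temperature-dependent
    activation enthalpy and entropy.

    dH_T0 (J/mol) and dS_T0 (J/(mol*K)) are the activation enthalpy and entropy
    at the reference temperature T0; dCp (J/(mol*K)) is the activation heat capacity.
    """
    dH = dH_T0 + dCp * (T - T0)
    dS = dS_T0 + dCp * math.log(T / T0)
    dG = dH - T * dS
    return (KB * T / H) * math.exp(-dG / (R * T))


# Hypothetical parameters: a negative activation heat capacity produces a rate
# optimum below any denaturation temperature, i.e. non-Arrhenius curvature.
params = dict(dH_T0=60000.0, dS_T0=-50.0, dCp=-4000.0)
for T in range(290, 335, 5):
    print(T, mmrt_rate(float(T), **params))
```

Setting `dCp` to zero recovers the ordinary Eyring equation, for which the rate increases monotonically with temperature when the activation enthalpy is positive.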

Engineering enzyme temperature preference through heat capacity prediction is promising and untapped. Because methods for predicting non-Arrhenius activity have been developed only recently,100–102 they remain largely unapplied despite potential applications in sugar, laundry, and textile manufacturing, as well as biocatalysis and sustainable production. Fundamental issues arise when trying to apply lessons from monodomain AHA to other industrial enzymes, such as amylases and cellulases, because the majority of these enzymes have two domains, a catalytic domain and a carbohydrate-binding module.103, 104 Instead, it has been demonstrated that cold adaptation can be achieved via the introduction of linkers which increase the domain separation index (DSI), an MD-derived descriptor which rigorously describes domain separation.105 Although this work provides a physical principle for engineering cold-adapted bidomain enzymes, the structural basis underlying the apparent non-Arrhenius activity requires further investigation.

2e. Complex Mechanisms that Require New Engineering Principles

Until every contributor to enzyme catalysis is catalogued and codified, there is still room for the development of new physics-based design principles. At present there is an abundance of mechanisms which enable enzymes to achieve high catalytic efficiency that have not been distilled into concise design principles. Hydrogen tunneling is critical to the rate-limiting step of soybean lipoxygenase (SLO),75 and recent MD and QM simulations are only now beginning to unveil its mechanistic details.106 Before that, multi-dimensional tunneling analysis was applied to understand how hydrogen, proton, and hydride transfer occurs in various enzyme systems, although the resulting insights have largely not been applied to further enzyme engineering efforts.107 More broadly, proton coupled electron transfer (PCET) serves as the basis of countless highly efficient enzymatic reactions, and studies have been conducted to elucidate how mutations affect PCET.106, 108 However, there has been limited understanding of how beneficial mutations can be predicted to gain desired functions such as activity or selectivity in an enzyme involving PCET.109 In addition, growing interest exists in understanding the role of femtosecond dynamical protein motions which can impact TS trajectories occurring on comparable time scales. Transition path analysis of purine nucleoside phosphorylase (PNP) highlights that a distal residue contributes one such rapid promoting vibration.110 QM/MM-based quasiclassical trajectory simulations illustrate how post-TS bifurcations determine product selectivity in SpnF-catalyzed Diels–Alder reactions, demonstrating the kinetic energy contribution of active-site hydrophobic residues in chemical activation, and how these systems coalesce to form chemical activation networks which guide reactivity.21 Fundamental understanding of enzymes’ chemical activation networks will suggest unexplored avenues to engineer biocatalysts.

Clearly, there is a diverse ecosystem of physical phenomena critical to enzyme catalysis which are not yet applied to enzyme engineering efforts. In turn, physics-based enzyme engineering is an exciting field with considerable potential to grow as computational resources continue to become cheaper and phenomena like PCET, hydrogen tunneling, rapid enzyme motions, and post-TS bifurcations are studied in greater detail.

3. Discovery of New Enzymes and Variants

Automated workflows are an emerging means of fully leveraging physics-derived design principles to evolve the field of biochemistry and enzyme engineering. While there is a robust and validated selection of design principles, manually applying them to enzyme systems limits the field by lowering throughput, reducing reproducibility, and keeping technical entry barriers high. The low throughput associated with manual system preparation hurts the field by both reducing the number of enzymes that are refined and limiting the exhaustiveness of sequence searches for those systems. Manual preparation of input files and manual analysis of simulations introduce numerous failure points, translating into an escalated risk of error and reduced reproducibility when deploying mutations to a system. Computational enzyme engineering workflows typically feature multiple software packages, introducing a high technical barrier to entry and ultimately limiting the size of the community and the diversity of perspectives (Figure 3). Consequently, computational enzyme engineering workflows should aim to address these concerns.

Figure 3: The role of high throughput workflows in enzyme engineering.

Conventional workflows for computational enzyme engineering follow a common pattern with the WT enzyme substrate complex serving as a starting point (A, top left, enzyme shown in blue, substrate in orange). A mutant library is constructed for this complex and each mutant is deployed to an individual structure (A, middle left, shown as a red dot). Conformational sampling is performed for each mutant (abbreviated Mut) and the WT using MD (A, middle left), and physics-based metrics such as RMSD, EF, and others are calculated for each conformer and averaged for each mutant and WT (A, bottom left). Mutants are ranked by conformationally averaged descriptor values and recommended (green check) or not recommended (red cross) based on comparison with WT values. Existing computational workflows are primarily focused on optimizing rate efficiency through mutations, leaving other objectives without robust workflows (B). Smart library construction (B, top left) is the task of creating a library of multiple mutations or sets of mutations (red circles) which are predicted to improve some function such as activity towards substrate (small blue lines), stability, or selectivity. Chimera enzyme fusion (B, top right) is the process of incorporating a new enzyme active site (green) to an existing enzyme (shown with white outline) to improve catalytic activity for a substrate (blue structures) or alter overall activity profiles. Engineering new to nature reactions (B, bottom left) focuses on the introduction of mutations which alter the WT reaction (change signified by yellow explosion), typically changing the substrate scope to a non-native molecule (shown in blue). Genome based enzyme discovery (B, bottom right) screens the same substrate (blue square) against multiple enzymes (shown in grid) to rank each sequence and select the most active enzyme (highlighted in yellow circle).
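The ranking step described in panel A can be sketched in a few lines. This is a minimal illustration rather than the implementation of any published workflow: the function name `rank_mutants`, the variant labels, and the descriptor values are all hypothetical, and a single scalar descriptor per MD snapshot is assumed.

```python
from statistics import mean


def rank_mutants(descriptors, wt_key="WT", higher_is_better=True):
    """Average a per-snapshot descriptor for each variant and compare it with WT.

    descriptors: dict mapping a variant name to a list of per-snapshot values.
    Returns (variant, averaged value, recommended) tuples sorted best-first.
    """
    averages = {name: mean(values) for name, values in descriptors.items()}
    wt_value = averages[wt_key]
    # Flip the sign so that "larger is better" holds for either convention.
    sign = 1.0 if higher_is_better else -1.0
    ranked = sorted(
        ((name, avg) for name, avg in averages.items() if name != wt_key),
        key=lambda item: sign * item[1],
        reverse=True,
    )
    # A variant is recommended only if its average beats the WT average.
    return [(name, avg, sign * avg > sign * wt_value) for name, avg in ranked]


# Hypothetical descriptor values (arbitrary units), three snapshots per variant.
example = {
    "WT": [1.0, 1.2, 1.1],
    "A10G": [1.5, 1.4, 1.6],
    "K25E": [0.8, 0.9, 0.7],
}

for name, avg, recommended in rank_mutants(example):
    print(f"{name}: mean={avg:.2f} recommended={recommended}")
```

Setting `higher_is_better=False` covers descriptors for which a smaller value is preferred. A practical workflow would also need to account for sampling uncertainty before recommending a mutant.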

Physics-based enzyme engineering workflows were pioneered with the release of CADEE in 2017.111 As a workflow to perform computer-aided directed evolution, CADEE was the first platform designed for the sole purpose of ranking and recommending individual mutants based on the activation energy calculated via a standard empirical valence bond (EVB) free-energy perturbation and umbrella sampling. CADEE is a clear breakthrough in the field, but there are limitations regarding its sensitivity to the EVB force field’s parameterization quality and its reliance on expert input when relevant experimental data is not available. These limitations likely impede CADEE’s accuracy and generalizability, and could be addressed by implementing support for other techniques. Unfortunately, a recurring issue in the field is that software projects for enzyme engineering often have short active development lifetimes, resulting in inadvertent specialization. The field lacks a software package with robust support for numerous simulation techniques and design principles.
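To make the link between a computed activation free energy and a mutant ranking concrete, the sketch below converts a change in barrier height into a relative rate using the standard transition state theory relation k_mut/k_WT = exp(-ΔΔG‡/RT). It assumes that the pre-exponential factor is unchanged by the mutation. The function name and the example value are illustrative and are not taken from CADEE.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def relative_rate(ddg_activation_kcal, temperature_k=298.15):
    """Return k_mut / k_WT for ddG = dG_act(mutant) - dG_act(WT) in kcal/mol.

    Assumes transition state theory with an identical prefactor for both
    variants, so only the barrier difference enters the ratio.
    """
    return math.exp(-ddg_activation_kcal / (R_KCAL * temperature_k))


# Hypothetical case: a mutation that lowers the barrier by 1 kcal/mol.
print(f"rate enhancement: {relative_rate(-1.0):.1f}x")
```

Because the dependence is exponential, an error of roughly 1 kcal/mol in the computed barrier changes the predicted rate several-fold at room temperature, which is one reason parameterization quality matters for ranking.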

EnzyHTP89 was released in 2022 and fundamentally advanced the state of the art for high throughput computational workflows. EnzyHTP was conceptualized and developed as a robust and flexible tool for general-purpose enzyme engineering. Notably, EnzyHTP automates every step of enzyme engineering including preparation, mutagenesis, geometry sampling, and post-hoc analysis. This Python-based package supports arbitrary molecular modelling tasks including MD, QM, ligand docking, trajectory analysis, and more. Exposing this functionality through modular Python functions enables the creation of flexible analysis and engineering workflows tailored to the system at hand. Ultimately, EnzyHTP serves as a modular breadboard on which other workflows are built. For example, an EnzyHTP-based workflow was recently developed to perform computational DE.88 In this workflow, in silico analysis allowed hundreds of mutants to be screened for both thermostability and their ability to stabilize the breaking bond in the rate-limiting TS.43,44 When applied to KE07, a well-characterized and optimized KE, this EnzyHTP workflow successfully identified all 4 experimentally observed rate-enhancing mutants. This work highlights several Pareto optimization concerns for such workflows: striking a balance between computational cost and mutant ranking accuracy, constructing smart libraries, and defining functional scores all remain unsolved problems.
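When mutants are scored on more than one objective, such as stability and TS stabilization, one common way to avoid collapsing the scores into a single number is to keep the non-dominated (Pareto-optimal) candidates. The sketch below is a generic illustration of that idea and does not use the EnzyHTP API. The mutant names and scores are hypothetical, and both objectives are assumed to be "higher is better".

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    candidates: dict mapping a name to a tuple of objective scores,
    where every objective is to be maximized.
    """
    front = []
    for name, score in candidates.items():
        dominated = any(
            dominates(other, score)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (stability score, TS-stabilization score) pairs.
mutants = {
    "M1": (2.0, 0.5),
    "M2": (1.0, 1.5),
    "M3": (0.5, 0.4),
    "M4": (1.5, 1.0),
}

print(pareto_front(mutants))
```

Here M3 is dropped because M1 is better on both objectives, while M1, M2, and M4 each represent a different trade-off. This pairwise comparison scales quadratically with library size, which is acceptable for hundreds of mutants.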

Narrow selection of targeted engineering objectives and insufficient incorporation of physics-based principles within workflows are two shortcomings in the field which both represent potential areas of growth. Protocols like CADEE, EnzyHTP, ASiteDesign112, and CASCO113 have demonstrated the power of high throughput workflows for the task of identifying rate-enhancing point mutations and modifying selectivity, but no analogous pipelines have been applied to solve challenges like tuning the substrate scope—a desirable task in the context of biosynthetic enzymes with applications in diversifying therapeutic peptides, industrial production of fine chemicals, and late-stage pharmaceutical functionalization. Other unaddressed functional challenges include accelerating genome-based enzyme discovery,114 engineering new-to-nature reactions in enzymes,115, 116 and assembling fused functional domains into chimera enzymes (Figure 3B).117 While these objectives are addressable with existing methods created to rank point mutations, the majority of physics-based design principles remain unimplemented in a manner which could be applied to address these tasks. Given the diversity of these functional objectives, it is almost certain that many more physics-based principles will need to be codified, quantitated, and applied in high throughput workflows to see meaningful progress.

Besides expanding the scientific relevance of high-throughput enzyme modeling workflows, a technical challenge lies in software engineering. Applying coding best practices is non-trivial and frequently neglected in the initial stage of software development, leading to extensive refactoring efforts when the inherent software architecture struggles or fails to accommodate expanded functionality. In addition, low code readability, over-simplified documentation, and poor co-development infrastructure are not uncommon, often limiting the development and application of the software to the developers’ research group. As a result, maintenance may cease as the main developer trainees move to the next career stage. The community lacks guidelines for software engineering, though analogous initiatives for data sharing, which have led to the FAIR principles and the ELIXIR infrastructure, provide a path to establish such standards for coding.118, 119 Moreover, groups like Loschmidt Lab serve as role models for the community by making 15 software tools and 3 databases fully public.120 Initiatives like the Molecular Sciences Software Institute have made a huge impact on raising awareness within the community about the importance of implementing software design principles in molecular modeling software, training a large group of early-stage computational chemists and biologists specializing in software development. Looking ahead, the challenge lies in how to make software engineering initiatives a routine part of training programs in traditional science departments, thereby creating a cohort of developers equipped to tackle interdisciplinary challenges like protein engineering.

4. Symbiosis of Physics-Based Modelling and ML

ML has been extensively used to guide enzyme engineering, but model development has been hampered by the lack of robust integrated structure-function databases.121 Physics-based modelling is uniquely poised to address this limitation by providing an abundance of microscopic descriptors that establish concrete links between catalysis and enzyme-substrate interactions. For instance, incorporating MD-derived conformational descriptors improved the prediction of mutation effects on bovine enterokinase (EKB) activity compared to sequence-only models.32 Encoding structural features, such as active site-reactant interactions, enabled EnzyKR to predict favored enantiomers in hydrolase-catalyzed kinetic resolution.31 Docking scores, QM-derived charges, and other physics-based metrics allowed a classifier to accurately predict bacterial nitrilase substrate promiscuity.30 A multivariate regressor trained on geometric, dynamic, and charge descriptors achieved high accuracy in predicting ene-reductase GluER-T36A enantioselectivity.80 Similarly, integrating Rosetta energy terms and sequence identity into a structure-aware protein graph convolutional network (PGCN) improved protease specificity predictions (Figure 4).122
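As a toy illustration of how physics-derived descriptors can serve as model inputs, the sketch below fits an ordinary least-squares model to two descriptors per variant using only the Python standard library. The data are synthetic and constructed to be exactly linear; the descriptor meanings (a distance and an electric field value), the function names, and the coefficients are assumptions made for illustration. It does not reproduce the models or data of the studies cited above.

```python
def fit_linear(features, targets):
    """Ordinary least squares with an intercept, solved via the normal equations.

    features: list of descriptor tuples, one per variant.
    targets: list of measured values, one per variant.
    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + list(x) for x in features]
    n = len(rows[0])
    # Build A = X^T X and b = X^T y.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(a[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - rest) / a[i][i]
    return coef


def predict(coef, x):
    """Evaluate the fitted linear model for one descriptor tuple."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


# Synthetic descriptors: (distance, field) per variant, arbitrary units.
X = [(3.1, 10.0), (3.5, 12.0), (2.9, 8.0), (3.8, 15.0), (3.3, 9.0)]
# Synthetic target generated from a known linear rule, so the fit can be checked.
y = [1.0 + 2.0 * d - 0.5 * f for d, f in X]

coef = fit_linear(X, y)
print([round(c, 6) for c in coef])
print(round(predict(coef, (3.0, 11.0)), 6))
```

Real descriptor sets are noisy and often correlated, so regularization and held-out validation would be needed in practice. The normal equations are used here only to keep the example dependency-free.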

Figure 4: The symbiotic relationship of physics-based and ML modelling.

Physics-based modelling (left) and ML modelling (right) are two distinct approaches which provide insights to mutually benefit each other. MD simulations, electronic structure theory, and other molecular modelling techniques (left) have been routinely applied to create descriptors like the Rosetta energy terms, binding energy, and electric stabilization energy (middle) which can be given directly to ML models as inputs (right). Physics-based approaches have also inspired ML architectures which encode structural information to improve performance (top, middle), for example by representing the backbone (Structural Encoding, shown in grey wires) or by creating ML models whose input layers correspond directly to residues in an enzyme (Structure Aware Architecture). ML modelling aids the advancement of physics-based understanding of catalytic mechanisms by deconstructing high-dimensional data that cannot be analyzed manually (bottom). Two common tasks are the aggregation of individual snapshots of a system (x0, x1, … xn) to create an overall probability (P(x)) of different geometries or collections of states, and the use of ML to identify geometric features critical to catalytic activity in a substrate or overall system (Identifying Reactive Geometries, bottom, middle).

Considering the lack of explicit substrate-enzyme interaction information in one-dimensional protein sequence and substrate SMILES strings, the notion that structural features boost ML performance may seem obvious. The practical challenge, however, is acquiring quality enzyme-substrate complexes linked to experimentally characterized activity and selectivity data.

This task is further complicated by the combinatorial explosion that results when enzyme mutations and differing substrates are considered. The staggering size of the enzyme problem space makes reliance on naïve AI-based methods impractical given the readout volume of current experimental techniques, motivating the creation of datasets with microscopic, physics-based features which capture effects observed broadly across seemingly diverse systems. Large-scale datasets like ProteinGym123 provide fitness-related values but have limited physicochemical relevance. The development of integrated sequence-structure-function databases, such as IntEnzyDB,124 emerges as a solution, but the amount of curated, high-quality data still lags largely behind community needs. Moreover, despite advances in reactive docking algorithms,125, 126 a comprehensive, large-scale database of catalytically relevant pre-reaction complexes across diverse enzymes and substrates remains undeveloped. Equally urgent is the establishment of enzymology databases with physics-based modeling-derived descriptors. An indicative example is the BioFragment Database, a repository of QM-derived protein interaction energies that provides a template for how researchers can create generalized values which can be readily adapted as features for training ML models.127 Developing analogous resources containing QM and MD descriptors for enzyme-substrate complexes would substantially improve ML efforts.

Supplying ML models with multiple feature classes improves model performance today and paves the way for the design of electrostatically or dynamically optimized sequences. The introduction of ProteinMPNN has made it possible to generate expressible, soluble, and stable de novo enzyme scaffolds.128 Despite this success, ProteinMPNN’s focus on the scaffold stability while overlooking catalytically relevant properties—such as side-chain conformations, dynamics, and electric fields—limits its effectiveness in enzyme design. As a result, enzymes generated by this approach require additional rounds of catalytic engineering before practical applications. Conditioning generative models on catalytically relevant physical features offers a direct approach to designing enzymes with enhanced initial efficiency.

While physics-based modeling enhances ML-driven enzyme design by incorporating catalytically relevant features, ML also plays a crucial role in overcoming long-standing roadblocks in physics-based modeling. Generating accurate TS geometries inside an enzyme remains a major challenge for physics-based modeling, despite their importance in elucidating reaction mechanisms, predicting TS barrier heights, and inspiring rational design strategies. Recent work has demonstrated that equivariant diffusion models can generate highly accurate gas phase TS-geometries from structures of reactants and products alone.129 Considerable further effort will be needed to extend this technology to account for interactions with both active site residues and solvent molecules, but using ML algorithms to generate accurate TS-geometries is an elegant means to make barrier calculations possible for more systems. ML methods also present opportunities to perform dynamics simulations with accuracy rivaling pure QM approaches at significantly reduced costs, as exemplified by the AI2BMD framework which uses an ML potential trained on QM calculations of protein fragments.130 Such ML acceleration stands to promote enzyme engineering, though more work is needed to allow modelling of holoenzyme systems including substrates. Distilling catalytic meaning from high-dimensional MD simulations is another major challenge aided by ML integration. Looking to individual molecular coordinates, the high number of distances, angles, and dihedrals associated with even small ligand systems makes manual analysis potentially spurious and counterproductive (Figure 4). 
In the case of KARI, ML models analyzed substrate turnover events and identified measurables strongly associated with reactivity.29 Further application of this technique to systems beyond KARI stands to grow knowledge of intrinsic reactivity and paves the way for the distillation of a more generalized understanding of how ligand geometry impacts reactivity.
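The general idea of screening many geometric measurables for association with reactivity can be illustrated with a deliberately simple statistic. The sketch below ranks features by the standardized difference between their means in reactive and non-reactive snapshots. It is a univariate stand-in and not the ML approach used in the KARI study; the feature names, values, and labels are hypothetical.

```python
from statistics import mean, pstdev


def rank_features(snapshots, labels):
    """Rank geometric features by how well they separate reactive frames.

    snapshots: list of dicts mapping a feature name to its value in one frame.
    labels: list of booleans, True where the frame led to a reactive event.
    Returns (feature, score) pairs sorted from most to least discriminating.
    """
    scores = {}
    for name in snapshots[0]:
        reactive = [s[name] for s, lab in zip(snapshots, labels) if lab]
        unreactive = [s[name] for s, lab in zip(snapshots, labels) if not lab]
        # Normalize by the overall spread; fall back to 1.0 for constant features.
        spread = pstdev([s[name] for s in snapshots]) or 1.0
        scores[name] = abs(mean(reactive) - mean(unreactive)) / spread
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical frames: a nucleophilic attack distance and a donor angle.
frames = [
    {"attack_distance": 3.0, "donor_angle": 150.0},
    {"attack_distance": 3.1, "donor_angle": 110.0},
    {"attack_distance": 4.2, "donor_angle": 152.0},
    {"attack_distance": 4.0, "donor_angle": 108.0},
]
reactive_flags = [True, True, False, False]

for feature, score in rank_features(frames, reactive_flags):
    print(f"{feature}: {score:.2f}")
```

A univariate score like this cannot capture features that matter only in combination, which is where multivariate ML models become useful. Periodic quantities such as dihedrals would also need circular statistics rather than a plain mean.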

5. Challenges in Computational Tool Development

As we have discussed, the advancement of physics-based enzyme engineering will depend on methodological and technical improvements. Enzyme engineering’s reliance on computational tools imposes technical limitations at present but also offers the potential for rapid gains as first-principles software packages see continued improvement. Computational cost is a primary bottleneck for most techniques in the field. MD and QM/MM are critical for conformational sampling and activation barrier height calculations, respectively, but both techniques can take several days to calculate values meaningful for enzyme engineering efforts. Addressing this challenge demands advancements in both computing hardware and algorithms. On the hardware front, quantum computing presents a promising engine to drive next-generation electronic structure simulations, with early applications demonstrating its potential in the structure prediction of proteins.131 The hybrid use of quantum computers alongside existing classical processing units could lead to substantial advances in modeling the rare events within enzyme systems, though the realization of a true quantum advantage remains uncertain. On the algorithmic side, the development of artificial intelligence (AI) to accelerate high-accuracy energy calculations and sampling is a thriving direction. Machine learning potentials have shown promise in facilitating QM/MM simulations, particularly for evaluating chemical barriers.132 Furthermore, generative models are increasingly being used to map path-dependent free energy changes by leveraging information from end-point states, such as in targeted free energy perturbation studies.133

Although many tools have been developed, there is limited consensus on how to compare and assess computational performance for enzyme engineering. In contrast to traditional computational chemistry tasks, such as thermochemistry predictions with well-established benchmark sets, computational enzyme engineering faces a constantly evolving target that depends on the intrinsic mutational landscape of the selected enzyme. Some model systems like KE have become de facto benchmarks, but drawing conclusions from a single enzyme is biased and prone to produce analyses that fixate on artifacts or aberrations of a specific enzyme. AlphaFold2 and its origins in the Critical Assessment of Structure Prediction (CASP) program offer a potential path to more sound methodology in enzyme engineering. CASP uses a blind process in which researchers’ predictions are validated against unreleased protein structures, with targets drawn from different protein families. Such an approach creates a robust validation environment and could inspire an analogous initiative, perhaps called the Critical Assessment of Enzyme Functional Prediction. In 2023, the Protein Engineering Tournament facilitated a milestone competition wherein 7 teams predicted enzyme activities and all data was made public.134 Recurring enzyme engineering competitions stand to critique and elevate the community at large. Notably, ProteinGym123 offers a large dataset obtained from deep mutational scanning (DMS) and clinical observation, mapping protein sequences to DMS scores. While ProteinGym is a valuable dataset for ML model development, its fitness data lacks rigorous comparability, as it reflects different physical properties across assays conducted under varying screening conditions. For instance, within the “activity” category, only six entries correspond to actual enzyme activity, while others pertain to properties like pili display. Moreover, the database lacks essential information such as substrates, reaction mechanisms, and enzyme-substrate complex structures, all of which are critical for evaluating physics-based enzyme engineering tools. A gold standard of benchmarking in the field of enzyme engineering should comprise enzyme data with a diverse range of wild-type sequences, mutations, reactions, reaction mechanisms, substrate types, and experimental hit rates, as well as evaluation metrics and algorithms which critically assess the performance of software packages in a blind, unbiased manner.

Although the abilities of directed evolution and high-throughput screening are impressive, we consider this approach only an intermediate step towards developing methods that can address any engineering objective across any enzyme system. Physics-based modeling plays an essential role in advancing the next generation of enzyme engineering methods due to its unique ability to directly predict experimental observables from first principles, elucidate molecular mechanisms, and identify key molecular descriptors as design principles. This approach is crucial for ultimately unlocking the full potential of enzyme engineering, leading us into a new era of enzyme innovation and discovery.

ACKNOWLEDGMENT

This research was supported by the startup grant from Vanderbilt University. C. Jurich is supported by Vanderbilt Chemistry-Biology Interface Training Grant (T32GM149371 and T32GM065086). Z. J. Yang is supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R35GM146982 and the Rosetta Commons Seed grant.

Footnotes

Ethics declarations

Competing interests. The authors declare no competing financial interest.

References

  • (1).Yadav V; Biswas S; Goyal A Enzymes of Industrial Significance and Their Applications. In Industrial Microbiology and Biotechnology: An Insight into Current Trends, Verma P Ed.; Springer Nature Singapore, 2024; pp 277–307. [Google Scholar]
  • (2).Liao C; Seebeck FP S-adenosylhomocysteine as a methyl transfer catalyst in biocatalytic methylation reactions. Nature Catalysis 2019, 2 (8), 696–701. DOI: 10.1038/s41929-019-0300-0. [DOI] [Google Scholar]
  • (3).Bell EL; Smithson R; Kilbride S; Foster J; Hardy FJ; Ramachandran S; Tedstone AA; Haigh SJ; Garforth AA; Day PJR; et al. Directed evolution of an efficient and thermostable PET depolymerase. Nature Catalysis 2022, 5 (8), 673–681. DOI: 10.1038/s41929-022-00821-3. [DOI] [Google Scholar]
  • (4).Rossi L; Pierigè F; Agostini M; Bigini N; Termopoli V; Cai Y; Zheng F; Zhan CG; Landry DW; Magnani M Efficient Cocaine Degradation by Cocaine Esterase-Loaded Red Blood Cells. Front Physiol 2020, 11, 573492. DOI: 10.3389/fphys.2020.573492 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (5).Ferrall-Fairbanks MC; Kieslich CA; Platt MO Reassessing enzyme kinetics: Considering protease-as-substrate interactions in proteolytic networks. Proceedings of the National Academy of Sciences 2020, 117 (6), 3307–3318. DOI: 10.1073/pnas.1912207117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (6).Nicholls BT; Oblinsky DG; Kurtoic SI; Grosheva D; Ye Y; Scholes GD; Hyster TK Engineering a Non-Natural Photoenzyme for Improved Photon Efficiency. Angewandte Chemie International Edition 2022, 61 (2), e202113842. DOI: 10.1002/anie.202113842. [DOI] [PubMed] [Google Scholar]
  • (7).Emond S; Petek M; Kay EJ; Heames B; Devenish SRA; Tokuriki N; Hollfelder F Accessing unexplored regions of sequence space in directed enzyme evolution via insertion/deletion mutagenesis. Nature Communications 2020, 11 (1), 3469. DOI: 10.1038/s41467-020-17061-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (8).Shams A; Higgins SA; Fellmann C; Laughlin TG; Oakes BL; Lew R; Kim S; Lukarska M; Arnold M; Staahl BT; et al. Comprehensive deletion landscape of CRISPR-Cas9 identifies minimal RNA-guided DNA-binding modules. Nat Commun 2021, 12 (1), 5664. DOI: 10.1038/s41467-021-25992-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (9).Doudna JA The promise and challenge of therapeutic genome editing. Nature 2020, 578 (7794), 229–236. DOI: 10.1038/s41586-020-1978-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (10).Wu H; Chen Q; Zhang W; Mu W Overview of strategies for developing high thermostability industrial enzymes: Discovery, mechanism, modification and challenges. Critical Reviews in Food Science and Nutrition 2023, 63 (14), 2057–2073. DOI: 10.1080/10408398.2021.1970508. [DOI] [PubMed] [Google Scholar]
  • (11).Blazeck J; Karamitros CS; Ford K; Somody C; Qerqez A; Murray K; Burkholder NT; Marshall N; Sivakumar A; Lu WC; et al. Bypassing evolutionary dead ends and switching the rate-limiting step of a human immunotherapeutic enzyme. Nat Catal 2022, 5 (10), 952–967. DOI: 10.1038/s41929-022-00856-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (12).Menke MJ; Ao Y-F; Bornscheuer UT Practical Machine Learning-Assisted Design Protocol for Protein Engineering: Transaminase Engineering for the Conversion of Bulky Substrates. ACS Catalysis 2024, 14 (9), 6462–6469. DOI: 10.1021/acscatal.4c00987. [DOI] [Google Scholar]
  • (13).Yu T; Boob AG; Volk MJ; Liu X; Cui H; Zhao H Machine learning-enabled retrobiosynthesis of molecules. Nature Catalysis 2023, 6 (2), 137–151. DOI: 10.1038/s41929-022-00909-w. [DOI] [Google Scholar]
  • (14).Mazurenko S; Prokop Z; Damborsky J Machine Learning in Enzyme Engineering. ACS Catalysis 2020, 10 (2), 1210–1223. DOI: 10.1021/acscatal.9b04321. [DOI] [Google Scholar]
  • (15).Röthlisberger D; Khersonsky O; Wollacott AM; Jiang L; DeChancie J; Betker J; Gallaher JL; Althoff EA; Zanghellini A; Dym O; et al. Kemp elimination catalysts by computational enzyme design. Nature 2008, 453 (7192), 190–195. DOI: 10.1038/nature06879. [DOI] [PubMed] [Google Scholar]
  • (16).Jiang L; Althoff EA; Clemente FR; Doyle L; Röthlisberger D; Zanghellini A; Gallaher JL; Betker JL; Tanaka F; Barbas CF; et al. De Novo Computational Design of Retro-Aldol Enzymes. Science 2008, 319 (5868), 1387–1391. DOI: 10.1126/science.1152692. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (17).Privett HK; Kiss G; Lee TM; Blomberg R; Chica RA; Thomas LM; Hilvert D; Houk KN; Mayo SL Iterative approach to computational enzyme design. Proceedings of the National Academy of Sciences 2012, 109 (10), 3790–3795. DOI: 10.1073/pnas.1118082108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (18).Garrabou X; Beck T; Hilvert D A Promiscuous De Novo Retro-Aldolase Catalyzes Asymmetric Michael Additions via Schiff Base Intermediates. Angewandte Chemie International Edition 2015, 54 (19), 5609–5612. DOI: 10.1002/anie.201500217. [DOI] [PubMed] [Google Scholar]
  • (19).Broom A; Rakotoharisoa RV; Thompson MC; Zarifi N; Nguyen E; Mukhametzhanov N; Liu L; Fraser JS; Chica RA Ensemble-based enzyme design can recapitulate the effects of laboratory directed evolution in silico. Nature Communications 2020, 11 (1), 4808. DOI: 10.1038/s41467-020-18619-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (20).Jiang L; Althoff EA; Clemente FR; Doyle L; Röthlisberger D; Zanghellini A; Gallaher JL; Betker JL; Tanaka F; Barbas CF 3rd; et al. De novo computational design of retro-aldol enzymes. Science 2008, 319 (5868), 1387–1391. DOI: 10.1126/science.1152692. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (21).Yang Z; Yang S; Yu P; Li Y; Doubleday C; Park J; Patel A; Jeon B.-s.; Russell WK; Liu H.-w.; et al. Influence of water and enzyme SpnF on the dynamics and energetics of the ambimodal [6+4]/[4+2] cycloaddition. Proceedings of the National Academy of Sciences 2018, 115 (5), E848–E855. DOI: 10.1073/pnas.1719368115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (22).Prejanò M; Sheng X; Himo F Computational Study of Mechanism and Enantioselectivity of Imine Reductase from Amycolatopsis orientalis. ChemistryOpen 2022, 11 (1), e202100250. DOI: 10.1002/open.202100250. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (23).Mukherjee S; Warshel A Electrostatic origin of the mechanochemical rotary mechanism and the catalytic dwell of F1-ATPase. Proceedings of the National Academy of Sciences 2011, 108 (51), 20550–20555. DOI: 10.1073/pnas.1117024108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (24).Shaik S; Cohen S; Wang Y; Chen H; Kumar D; Thiel W P450 Enzymes: Their Structure, Reactivity, and Selectivity—Modeled by QM/MM Calculations. Chemical Reviews 2010, 110 (2), 949–1017. DOI: 10.1021/cr900121s. [DOI] [PubMed] [Google Scholar]
  • (25).Warshel A; Sharma PK; Kato M; Xiang Y; Liu H; Olsson MHM Electrostatic Basis for Enzyme Catalysis. Chemical Reviews 2006, 106 (8), 3210–3235. DOI: 10.1021/cr0503106. [DOI] [PubMed] [Google Scholar]
  • (26).Jiménez-Osés G; Osuna S; Gao X; Sawaya MR; Gilson L; Collier SJ; Huisman GW; Yeates TO; Tang Y; Houk KN The role of distant mutations and allosteric regulation on LovD active site dynamics. Nature Chemical Biology 2014, 10 (6), 431–436. DOI: 10.1038/nchembio.1503. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (27).Yang Z; Liu F; Steeves AH; Kulik HJ Quantum Mechanical Description of Electrostatics Provides a Unified Picture of Catalytic Action Across Methyltransferases. The Journal of Physical Chemistry Letters 2019, 10 (13), 3779–3787. DOI: 10.1021/acs.jpclett.9b01555. [DOI] [PubMed] [Google Scholar]
  • (28).Pan Y; Gao D; Yang W; Cho H; Yang G; Tai H-H; Zhan C-G Computational redesign of human butyrylcholinesterase for anticocaine medication. Proceedings of the National Academy of Sciences 2005, 102 (46), 16656–16661. DOI: 10.1073/pnas.0507332102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (29).Karvelis E; Swanson C; Tidor B Substrate Turnover Dynamics Guide Ketol-Acid Reductoisomerase Redesign for Increased Specific Activity. ACS Catalysis 2024, 14 (14), 10491–10509. DOI: 10.1021/acscatal.4c01446. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (30).Mou Z; Eakes J; Cooper CJ; Foster CM; Standaert RF; Podar M; Doktycz MJ; Parks JM Machine learning-based prediction of enzyme substrate scope: Application to bacterial nitrilases. Proteins: Structure, Function, and Bioinformatics 2021, 89 (3), 336–347. DOI: 10.1002/prot.26019. [DOI] [PubMed] [Google Scholar]
  • (31).Ran X; Jiang Y; Shao Q; Yang ZJ EnzyKR: a chirality-aware deep learning model for predicting the outcomes of the hydrolase-catalyzed kinetic resolution. Chemical Science 2023, 14 (43), 12073–12082. DOI: 10.1039/D3SC02752J. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (32).Venanzi NAE; Basciu A; Vargiu AV; Kiparissides A; Dalby PA; Dikicioglu D Machine Learning Integrating Protein Structure, Sequence, and Dynamics to Predict the Enzyme Activity of Bovine Enterokinase Variants. Journal of Chemical Information and Modeling 2024, 64 (7), 2681–2694. DOI: 10.1021/acs.jcim.3c00999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (33).Jiang Y; Ding N; Shao Q; Stull SL; Cheng Z; Yang ZJ Substrate Positioning Dynamics Involves a Non-Electrostatic Component to Mediate Catalysis. The Journal of Physical Chemistry Letters 2023, 14 (50), 11480–11489. DOI: 10.1021/acs.jpclett.3c02444. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (34).Kari J; Schaller K; Molina GA; Borch K; Westh P The Sabatier principle as a tool for discovery and engineering of industrial enzymes. Current Opinion in Biotechnology 2022, 78, 102843. DOI: 10.1016/j.copbio.2022.102843. [DOI] [PubMed] [Google Scholar]
  • (35).Jiang Y; Yan B; Chen Y; Juarez RJ; Yang ZJ Molecular Dynamics-Derived Descriptor Informs the Impact of Mutation on the Catalytic Turnover Number in Lactonase Across Substrates. The Journal of Physical Chemistry B 2022, 126 (13), 2486–2495. DOI: 10.1021/acs.jpcb.2c00142. [DOI] [PubMed] [Google Scholar]
  • (36).Jumper J; Evans R; Pritzel A; Green T; Figurnov M; Ronneberger O; Tunyasuvunakool K; Bates R; Žídek A; Potapenko A; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596 (7873), 583–589. DOI: 10.1038/s41586-021-03819-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (37).Gaines CS; Piccirilli JA; York DM The L-platform/L-scaffold framework: a blueprint for RNA-cleaving nucleic acid enzyme design. RNA 2020, 26 (2), 111–125. DOI: 10.1261/rna.071894.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (38).Zhang J; Kulik HJ; Martinez TJ; Klinman JP Mediation of donor–acceptor distance in an enzymatic methyl transfer reaction. Proceedings of the National Academy of Sciences 2015, 112 (26), 7954–7959. DOI: 10.1073/pnas.1506792112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (39).Biler M; Crean RM; Schweiger AK; Kourist R; Kamerlin SCL Ground-State Destabilization by Active-Site Hydrophobicity Controls the Selectivity of a Cofactor-Free Decarboxylase. Journal of the American Chemical Society 2020, 142 (47), 20216–20231. DOI: 10.1021/jacs.0c10701. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (40).Herrera DP; Chánique AM; Martínez-Márquez A; Bru-Martínez R; Kourist R; Parra LP; Schüller A Rational Design of Resveratrol O-methyltransferase for the Production of Pinostilbene. International Journal of Molecular Sciences 2021, 22 (9), 4345. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (41).Staudt A; Terholsen H; Kaur J; Müller H; Godehard SP; Itabaiana I Jr.; Leal ICR; Bornscheuer UT Rational Design for Enhanced Acyltransferase Activity in Water Catalyzed by the Pyrobaculum calidifontis VA1 Esterase. Microorganisms 2021, 9 (8). DOI: 10.3390/microorganisms9081790. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (42).Moroz YS; Dunston TT; Makhlynets OV; Moroz OV; Wu Y; Yoon JH; Olsen AB; McLaughlin JM; Mack KL; Gosavi PM; et al. New Tricks for Old Proteins: Single Mutations in a Nonenzymatic Protein Give Rise to Various Enzymatic Activities. Journal of the American Chemical Society 2015, 137 (47), 14905–14911. DOI: 10.1021/jacs.5b07812. [DOI] [PubMed] [Google Scholar]
  • (43).Gora A; Brezovsky J; Damborsky J Gates of Enzymes. Chemical Reviews 2013, 113 (8), 5871–5923. DOI: 10.1021/cr300384w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (44).Kokkonen P; Bednar D; Pinto G; Prokop Z; Damborsky J Engineering enzyme access tunnels. Biotechnology Advances 2019, 37 (6), 107386. DOI: 10.1016/j.biotechadv.2019.04.008. [DOI] [PubMed] [Google Scholar]
  • (45).Kaushik S; Marques SM; Khirsariya P; Paruch K; Libichova L; Brezovsky J; Prokop Z; Chaloupkova R; Damborsky J Impact of the access tunnel engineering on catalysis is strictly ligand-specific. The FEBS Journal 2018, 285 (8), 1456–1476. DOI: 10.1111/febs.14418. [DOI] [PubMed] [Google Scholar]
  • (46).Pavlova M; Klvana M; Prokop Z; Chaloupkova R; Banas P; Otyepka M; Wade RC; Tsuda M; Nagata Y; Damborsky J Redesigning dehalogenase access tunnels as a strategy for degrading an anthropogenic substrate. Nature Chemical Biology 2009, 5 (10), 727–733. DOI: 10.1038/nchembio.205. [DOI] [PubMed] [Google Scholar]
  • (47).Prakinee K; Phintha A; Visitsatthawong S; Lawan N; Sucharitakul J; Kantiwiriyawanitch C; Damborsky J; Chitnumsub P; van Pée K-H; Chaiyen P Mechanism-guided tunnel engineering to increase the efficiency of a flavin-dependent halogenase. Nature Catalysis 2022, 5 (6), 534–544. DOI: 10.1038/s41929-022-00800-8. [DOI] [Google Scholar]
  • (48).Sykora J; Brezovsky J; Koudelakova T; Lahoda M; Fortova A; Chernovets T; Chaloupkova R; Stepankova V; Prokop Z; Smatanova IK; et al. Dynamics and hydration explain failed functional transformation in dehalogenase design. Nature Chemical Biology 2014, 10 (6), 428–430. DOI: 10.1038/nchembio.1502. [DOI] [PubMed] [Google Scholar]
  • (49).Brezovsky J; Babkova P; Degtjarik O; Fortova A; Gora A; Iermak I; Rezacova P; Dvorak P; Smatanova IK; Prokop Z; et al. Engineering a de Novo Transport Tunnel. ACS Catalysis 2016, 6 (11), 7597–7610. DOI: 10.1021/acscatal.6b02081. [DOI] [Google Scholar]
  • (50).Xu L; Nawaz MZ; Khalid HR; Waqar Ul H; Alghamdi HA; Sun J; Zhu D Modulating the pH profile of vanillin dehydrogenase enzyme from extremophile Bacillus ligniniphilus L1 through computational guided site-directed mutagenesis. Int J Biol Macromol 2024, 263 (Pt 1), 130359. DOI: 10.1016/j.ijbiomac.2024.130359. [DOI] [PubMed] [Google Scholar]
  • (51).Janson G; Valdes-Garcia G; Heo L; Feig M Direct generation of protein conformational ensembles via machine learning. Nature Communications 2023, 14 (1), 774. DOI: 10.1038/s41467-023-36443-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (52).Wayment-Steele HK; Ojoawo A; Otten R; Apitz JM; Pitsawong W; Hömberger M; Ovchinnikov S; Colwell L; Kern D Predicting multiple conformations via sequence clustering and AlphaFold2. Nature 2024, 625 (7996), 832–839. DOI: 10.1038/s41586-023-06832-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (53).Brown BP; Stein RA; Meiler J; McHaourab HS Approximating Projections of Conformational Boltzmann Distributions with AlphaFold2 Predictions: Opportunities and Limitations. Journal of Chemical Theory and Computation 2024, 20 (3), 1434–1447. DOI: 10.1021/acs.jctc.3c01081. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (54).Kleffner R; Flatten J; Leaver-Fay A; Baker D; Siegel JB; Khatib F; Cooper S Foldit Standalone: a video game-derived protein structure manipulation interface using Rosetta. Bioinformatics 2017, 33 (17), 2765–2767. DOI: 10.1093/bioinformatics/btx283 (acccessed 7/24/2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (55).Barber-Zucker S; Mindel V; Garcia-Ruiz E; Weinstein JJ; Alcalde M; Fleishman SJ Stable and Functionally Diverse Versatile Peroxidases Designed Directly from Sequences. Journal of the American Chemical Society 2022, 144 (8), 3564–3571. DOI: 10.1021/jacs.1c12433. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (56).Lee J; Kladwang W; Lee M; Cantu D; Azizyan M; Kim H; Limpaecher A; Gaikwad S; Yoon S; Treuille A; et al. RNA design rules from a massive open laboratory. Proceedings of the National Academy of Sciences 2014, 111 (6), 2122–2127. DOI: doi: 10.1073/pnas.1313039111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (57).Boby ML; Fearon D; Ferla M; Filep M; Koekemoer L; Robinson MC; Consortium‡, T. C. M.; Chodera JD; Lee AA; London N; et al. Open science discovery of potent noncovalent SARS-CoV-2 main protease inhibitors. Science 2023, 382 (6671), eabo7201. DOI: doi: 10.1126/science.abo7201. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (58).Leible S; Schlager S; Schubotz M; Gipp B A Review on Blockchain Technology and Blockchain Projects Fostering Open Science. Frontiers in Blockchain 2019, 2, Review. DOI: 10.3389/fbloc.2019.00016. [DOI] [Google Scholar]
  • (59).Pauling L Molecular Architecture and Biological Reactions. Chemical & Engineering News Archive 1946, 24 (10), 1375–1377. DOI: 10.1021/cen-v024n010.p1375. [DOI] [Google Scholar]
  • (60).Warshel A Electrostatic Origin of the Catalytic Power of Enzymes and the Role of Preorganized Active Sites *. Journal of Biological Chemistry 1998, 273 (42), 27035–27038. DOI: 10.1074/jbc.273.42.27035 (acccessed 2025/02/03). [DOI] [PubMed] [Google Scholar]
  • (61).Fried SD; Boxer SG Measuring Electric Fields and Noncovalent Interactions Using the Vibrational Stark Effect. Accounts of Chemical Research 2015, 48 (4), 998–1006. DOI: 10.1021/ar500464j. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (62).Kozuch J; Schneider SH; Zheng C; Ji Z; Bradshaw RT; Boxer SG Testing the Limitations of MD-Based Local Electric Fields Using the Vibrational Stark Effect in Solution: Penicillin G as a Test Case. The Journal of Physical Chemistry B 2021, 125 (17), 4415–4427. DOI: 10.1021/acs.jpcb.1c00578. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (63).Fried SD; Bagchi S; Boxer SG Extreme electric fields power catalysis in the active site of ketosteroid isomerase. Science 2014, 346 (6216), 1510–1514. DOI: doi: 10.1126/science.1259802. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (64).Fried SD; Boxer SG Electric Fields and Enzyme Catalysis. Annual Review of Biochemistry 2017, 86 (Volume 86, 2017), 387–415. DOI: 10.1146/annurev-biochem-061516-044432. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (65).Bhowmick A; Sharma SC; Head-Gordon T The Importance of the Scaffold for de Novo Enzymes: A Case Study with Kemp Eliminase. Journal of the American Chemical Society 2017, 139 (16), 5793–5800. DOI: 10.1021/jacs.6b12265. [DOI] [PubMed] [Google Scholar]
  • (66).Ruiz-Pernía JJ; Świderek K; Bertran J; Moliner V; Tuñón I Electrostatics as a Guiding Principle in Understanding and Designing Enzymes. Journal of Chemical Theory and Computation 2024, 20 (5), 1783–1795. DOI: 10.1021/acs.jctc.3c01395. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (67).Bhowmick A; Sharma SC; Honma H; Head-Gordon T The role of side chain entropy and mutual information for improving the de novo design of Kemp eliminases KE07 and KE70. Physical Chemistry Chemical Physics 2016, 18 (28), 19386–19396, 10.1039/C6CP03622H. DOI: 10.1039/C6CP03622H. [DOI] [PubMed] [Google Scholar]
  • (68).Vaissier V; Sharma SC; Schaettle K; Zhang T; Head-Gordon T Computational Optimization of Electric Fields for Improving Catalysis of a Designed Kemp Eliminase. ACS Catalysis 2018, 8 (1), 219–227. DOI: 10.1021/acscatal.7b03151. [DOI] [Google Scholar]
  • (69).Galmés MÀ; Nödling AR; He K; Luk LYP; Świderek K; Moliner V Computational design of an amidase by combining the best electrostatic features of two promiscuous hydrolases. Chemical Science 2022, 13 (17), 4779–4787, 10.1039/D2SC00778A. DOI: 10.1039/D2SC00778A. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (70).Jindal G; Ramachandran B; Bora RP; Warshel A Exploring the Development of Ground-State Destabilization and Transition-State Stabilization in Two Directed Evolution Paths of Kemp Eliminases. ACS Catalysis 2017, 7 (5), 3301–3305. DOI: 10.1021/acscatal.7b00171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (71).Kamerlin SCL; Sharma PK; Chu ZT; Warshel A Ketosteroid isomerase provides further support for the idea that enzymes work by electrostatic preorganization. Proceedings of the National Academy of Sciences 2010, 107 (9), 4075–4080. DOI: doi: 10.1073/pnas.0914579107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (72).Bradshaw RT; Dziedzic J; Skylaris C-K; Essex JW The Role of Electrostatics in Enzymes: Do Biomolecular Force Fields Reflect Protein Electric Fields? Journal of Chemical Information and Modeling 2020, 60 (6), 3131–3144. DOI: 10.1021/acs.jcim.0c00217. [DOI] [PubMed] [Google Scholar]
  • (73).Adjoua O; Lagardère L; Jolly L-H; Durocher A; Very T; Dupays I; Wang Z; Inizan TJ; Célerse F; Ren P; et al. Tinker-HP: Accelerating Molecular Dynamics Simulations of Large Complex Systems with Advanced Point Dipole Polarizable Force Fields Using GPUs and Multi-GPU Systems. J. Chem. Theory Comput 2021, 17 (4), 2034–2053. DOI: 10.1021/acs.jctc.0c01164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (74).Yan S; Ji X; Peng W; Wang B Evaluating the Transition State Stabilization/Destabilization Effects of the Electric Fields from Scaffold Residues by a QM/MM Approach. The Journal of Physical Chemistry B 2023, 127 (19), 4245–4253. DOI: 10.1021/acs.jpcb.3c01054. [DOI] [PubMed] [Google Scholar]
  • (75).Zaragoza JPT; Offenbacher AR; Hu S; Gee CL; Firestein ZM; Minnetian N; Deng Z; Fan F; Iavarone AT; Klinman JP Temporal and spatial resolution of distal protein motions that activate hydrogen tunneling in soybean lipoxygenase. Proceedings of the National Academy of Sciences 2023, 120 (10), e2211630120. DOI: doi: 10.1073/pnas.2211630120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (76).Osuna S The challenge of predicting distal active site mutations in computational enzyme design. WIREs Computational Molecular Science 2021, 11 (3), e1502. DOI: 10.1002/wcms.1502. [DOI] [Google Scholar]
  • (77).Corbella M; Pinto GP; Kamerlin SCL Loop dynamics and the evolution of enzyme activity. Nature Reviews Chemistry 2023, 7 (8), 536–547. DOI: 10.1038/s41570-023-00495-w. [DOI] [PubMed] [Google Scholar]
  • (78).Yabukarski F; Biel JT; Pinney MM; Doukov T; Powers AS; Fraser JS; Herschlag D Assessment of enzyme active site positioning and tests of catalytic mechanisms through X-ray–derived conformational ensembles. Proceedings of the National Academy of Sciences 2020, 117 (52), 33204–33215. DOI: doi: 10.1073/pnas.2011350117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (79).Du S; Kretsch RC; Parres-Gold J; Pieri E; Cruzeiro VWD; Zhu M; Pinney MM; Yabukarski F; Schwans JP; Martínez TJ; et al. Conformational ensembles reveal the origins of serine protease catalysis. Science 2025, 387 (6735), eado5068. DOI: doi: 10.1126/science.ado5068. [DOI] [PubMed] [Google Scholar]
  • (80).Clements HD; Flynn AR; Nicholls BT; Grosheva D; Lefave SJ; Merriman MT; Hyster TK; Sigman MS Using Data Science for Mechanistic Insights and Selectivity Predictions in a Non-Natural Biocatalytic Reaction. Journal of the American Chemical Society 2023, 145 (32), 17656–17664. DOI: 10.1021/jacs.3c03639. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (81).Billeter SR; Webb SP; Agarwal PK; Iordanov T; Hammes-Schiffer S Hydride Transfer in Liver Alcohol Dehydrogenase: Quantum Dynamics, Kinetic Isotope Effects, and Role of Enzyme Motion. Journal of the American Chemical Society 2001, 123 (45), 11262–11272. DOI: 10.1021/ja011384b. [DOI] [PubMed] [Google Scholar]
  • (82).Jara GE; Pontiggia F; Otten R; Agafonov RV; Martí MA; Kern D Wide Transition-State Ensemble as Key Component for Enzyme Catalysis. Cold Spring Harbor Laboratory: 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (83).Bonk BM; Weis JW; Tidor B Machine Learning Identifies Chemical Characteristics That Promote Enzyme Catalysis. Journal of the American Chemical Society 2019, 141 (9), 4108–4118. DOI: 10.1021/jacs.8b13879. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (84).Schenkmayerova A; Pinto GP; Toul M; Marek M; Hernychova L; Planas-Iglesias J; Daniel Liskova V; Pluskal D; Vasina M; Emond S; et al. Engineering the protein dynamics of an ancestral luciferase. Nature Communications 2021, 12 (1), 3616. DOI: 10.1038/s41467-021-23450-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (85).Duran C; Casadevall G; Osuna S Harnessing Conformational Dynamics in Enzyme Catalysis to achieve Nature-like catalytic efficiencies: The Shortest Path Map tool for computational enzyme design. Faraday Discussions 2024, 10.1039/D3FD00156C. DOI: 10.1039/D3FD00156C. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (86).Hur S; Bruice TC Comparison of Formation of Reactive Conformers (NACs) for the Claisen Rearrangement of Chorismate to Prephenate in Water and in the E. coli Mutase: The Efficiency of the Enzyme Catalysis. Journal of the American Chemical Society 2003, 125 (19), 5964–5972. DOI: 10.1021/ja0210648. [DOI] [PubMed] [Google Scholar]
  • (87).Hur S; Bruice TC The near attack conformation approach to the study of the chorismate to prephenate reaction. Proceedings of the National Academy of Sciences 2003, 100 (21), 12015–12020. DOI: doi: 10.1073/pnas.1534873100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (88).Shao Q; Jiang Y; Yang ZJ EnzyHTP Computational Directed Evolution with Adaptive Resource Allocation. Journal of Chemical Information and Modeling 2023, 63 (17), 5650–5659. DOI: 10.1021/acs.jcim.3c00618. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (89).Shao Q; Jiang Y; Yang ZJ EnzyHTP: A High-Throughput Computational Platform for Enzyme Modeling. Journal of Chemical Information and Modeling 2022, 62 (3), 647–655. DOI: 10.1021/acs.jcim.1c01424. [DOI] [PubMed] [Google Scholar]
  • (90).Offenbacher AR; Hu S; Poss EM; Carr CAM; Scouras AD; Prigozhin DM; Iavarone AT; Palla A; Alber T; Fraser JS; et al. Hydrogen–Deuterium Exchange of Lipoxygenase Uncovers a Relationship between Distal, Solvent Exposed Protein Motions and the Thermal Activation Barrier for Catalytic Proton-Coupled Electron Tunneling. ACS Central Science 2017, 3 (6), 570–579. DOI: 10.1021/acscentsci.7b00142. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (91).Jara GE; Pontiggia F; Otten R; Agafonov RV; Martí MA; Kern D Wide Transition-State Ensemble as Key Component for Enzyme Catalysis. eLife Sciences Publications, Ltd: 2024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (92).Otten R; Pádua RAP; Bunzel HA; Nguyen V; Pitsawong W; Patterson M; Sui S; Perry SL; Cohen AE; Hilvert D; et al. How directed evolution reshapes the energy landscape in an enzyme to boost catalysis. Science 2020, 370 (6523), 1442–1446. DOI: doi: 10.1126/science.abd3623. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (93).Lauko A; Pellock SJ; Sumida KH; Anishchenko I; Juergens D; Ahern W; Jeung J; Shida A; Hunt A; Kalvet I; et al. Computational design of serine hydrolases. Science 0 (0), eadu2454. DOI: doi: 10.1126/science.adu2454. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (94).Bunzel HA; Anderson JLR; Hilvert D; Arcus VL; van der Kamp MW; Mulholland AJ Evolution of dynamical networks enhances catalysis in a designer enzyme. Nature Chemistry 2021, 13 (10), 1017–1022. DOI: 10.1038/s41557-021-00763-6. [DOI] [PubMed] [Google Scholar]
  • (95).Feller G; Gerday C Psychrophilic enzymes: hot topics in cold adaptation. Nat Rev Microbiol 2003, 1 (3), 200–208. DOI: 10.1038/nrmicro773. [DOI] [PubMed] [Google Scholar]
  • (96).Nguyen V; Wilson C; Hoemberger M; Stiller JB; Agafonov RV; Kutter S; English J; Theobald DL; Kern D Evolutionary drivers of thermoadaptation in enzyme catalysis. Science 2017, 355 (6322), 289–294. DOI: doi: 10.1126/science.aah3717. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (97).Åqvist J; Sočan J; Purg M Hidden Conformational States and Strange Temperature Optima in Enzyme Catalysis. Biochemistry 2020, 59 (40), 3844–3855. DOI: 10.1021/acs.biochem.0c00705. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (98).Sočan J; Purg M; Åqvist J Computer simulations explain the anomalous temperature optimum in a cold-adapted enzyme. Nature Communications 2020, 11 (1), 2644. DOI: 10.1038/s41467-020-16341-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (99).van der Ent F; Skagseth S; Lund BA; Sočan J; Griese JJ; Brandsdal BO; Åqvist J Computational design of the temperature optimum of an enzyme reaction. Science Advances 2023, 9 (26), eadi0963. DOI: doi: 10.1126/sciadv.adi0963. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (100).Hobbs JK; Jiao W; Easter AD; Parker EJ; Schipper LA; Arcus VL Change in Heat Capacity for Enzyme Catalysis Determines Temperature Dependence of Enzyme Catalyzed Rates. ACS Chemical Biology 2013, 8 (11), 2388–2393. DOI: 10.1021/cb4005029. [DOI] [PubMed] [Google Scholar]
  • (101).Åqvist J; Isaksen GV; Brandsdal BO Computation of enzyme cold adaptation. Nature Reviews Chemistry 2017, 1 (7), 0051. DOI: 10.1038/s41570-017-0051. [DOI] [Google Scholar]
  • (102).Carvalho-Silva VH; Coutinho ND; Aquilanti V Temperature Dependence of Rate Processes Beyond Arrhenius and Eyring: Activation and Transitivity. Frontiers in Chemistry 2019, 7, Original Research. DOI: 10.3389/fchem.2019.00380. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (103).Souza PM d. Application of microbial α-amylase in industry-A review. Brazilian journal of microbiology 2010, 41, 850–861. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (104).Apic G; Gough J; Teichmann SA Domain combinations in archaeal, eubacterial and eukaryotic proteomes. Journal of molecular biology 2001, 310 (2), 311–325. [DOI] [PubMed] [Google Scholar]
  • (105).Ge R; Ding N; Jiang Y; Yang Z Linker-Mediated Domain Separation Enhances Cold Adaptation in Cellulases. ChemRxiv. 2024. DOI: doi: 10.26434/chemrxiv-2024-4m1f0 [DOI] [Google Scholar]
  • (106).Li P; Soudackov AV; Hammes-Schiffer S Fundamental Insights into Proton-Coupled Electron Transfer in Soybean Lipoxygenase from Quantum Mechanical/Molecular Mechanical Free Energy Simulations. Journal of the American Chemical Society 2018, 140 (8), 3068–3076. DOI: 10.1021/jacs.7b13642. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (107).Masgrau L; Truhlar DG The Importance of Ensemble Averaging in Enzyme Kinetics. Accounts of Chemical Research 2015, 48 (2), 431–438. DOI: 10.1021/ar500319e. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (108).Zhong J; Reinhardt CR; Hammes-Schiffer S Role of Water in Proton-Coupled Electron Transfer between Tyrosine and Cysteine in Ribonucleotide Reductase. Journal of the American Chemical Society 2022, 144 (16), 7208–7214. DOI: 10.1021/jacs.1c13455. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (109).Warburton RE; Soudackov AV; Hammes-Schiffer S Theoretical Modeling of Electrochemical Proton-Coupled Electron Transfer. Chemical Reviews 2022, 122 (12), 10599–10650. DOI: 10.1021/acs.chemrev.1c00929. [DOI] [PubMed] [Google Scholar]
  • (110).Frost CF; Balasubramani SG; Antoniou D; Schwartz SD Connecting Conformational Motions to Rapid Dynamics in Human Purine Nucleoside Phosphorylase. The Journal of Physical Chemistry B 2023, 127 (1), 144–150. DOI: 10.1021/acs.jpcb.2c07243. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (111).Amrein BA; Steffen-Munsberg F; Szeler I; Purg M; Kulkarni Y; Kamerlin SCL CADEE: Computer-Aided Directed Evolution of Enzymes. IUCrJ 2017, 4 (1), 50–64. DOI: doi: 10.1107/S2052252516018017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (112).Roda S; Terholsen H; Meyer JRH; Cañellas-Solé A; Guallar V; Bornscheuer U; Kazemi M AsiteDesign: a Semirational Algorithm for an Automated Enzyme Design. The Journal of Physical Chemistry B 2023, 127 (12), 2661–2670. DOI: 10.1021/acs.jpcb.2c07091. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (113).Wijma HJ; Floor RJ; Bjelic S; Marrink SJ; Baker D; Janssen DB Enantioselective Enzymes by Computational Design and In Silico Screening. Angew. Chem. Int. Ed 2015, 54 (12), 3726–3730. DOI: 10.1002/anie.201411415. [DOI] [PubMed] [Google Scholar]
  • (114).Hon J; Borko S; Stourac J; Prokop Z; Zendulka J; Bednar D; Martinek T; Damborsky J EnzymeMiner: automated mining of soluble enzymes with diverse structures, catalytic properties and stabilities. Nucleic Acids Research 2020, 48 (W1), W104–W109. DOI: 10.1093/nar/gkaa372. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (115).Zhou Q; Chin M; Fu Y; Liu P; Yang Y Stereodivergent atom-transfer radical cyclization by engineered cytochromes P450. Science 2021, 374 (6575), 1612–1616. DOI: doi: 10.1126/science.abk1603. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (116).Chen K; Arnold FH Engineering new catalytic activities in enzymes. Nature Catalysis 2020, 3 (3), 203–213. DOI: 10.1038/s41929-019-0385-5. [DOI] [Google Scholar]
  • (117).García-Paz FDM; Del Moral S; Morales-Arrieta S; Ayala M; Treviño-Quintanilla LG; Olvera-Carranza C Multidomain chimeric enzymes as a promising alternative for biocatalysts improvement: a minireview. Molecular Biology Reports 2024, 51 (1). DOI: 10.1007/s11033-024-09332-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (118).Wilkinson MD; Dumontier M; Aalbersberg IJ; Appleton G; Axton M; Baak A; Blomberg N; Boiten J-W; da Silva Santos LB; Bourne PE; et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 2016, 3 (1), 160018. DOI: 10.1038/sdata.2016.18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (119).Harrow J; Drysdale R; Smith A; Repo S; Lanfear J; Blomberg N ELIXIR: providing a sustainable infrastructure for life science data at European scale. Bioinformatics 2021, 37 (16), 2506–2511. DOI: 10.1093/bioinformatics/btab481 (acccessed 1/23/2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (120).Loschmidt Laboratories Protein Engineering Portal. 2025. https://loschmidt.chemi.muni.cz/portal/ (accessed.
  • (121).Prešern U; Goličnik M Enzyme Databases in the Era of Omics and Artificial Intelligence. International Journal of Molecular Sciences 2023, 24 (23), 16918. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (122).Lu C; Lubin JH; Sarma VV; Stentz SZ; Wang G; Wang S; Khare SD Prediction and design of protease enzyme specificity using a structure-aware graph convolutional network. Proceedings of the National Academy of Sciences 2023, 120 (39), e2303590120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (123).Notin P; Kollasch A; Ritter D; Van Niekerk L; Paul S; Spinner H; Rollins N; Shaw A; Orenbuch R; Weitzman R Proteingym: Large-scale benchmarks for protein fitness prediction and design. Advances in Neural Information Processing Systems 2024, 36. [Google Scholar]
  • (124).Yan B; Ran X; Gollu A; Cheng Z; Zhou X; Chen Y; Yang ZJ IntEnzyDB: an Integrated Structure–Kinetics Enzymology Database. Journal of chemical information and modeling 2022, 62 (22), 5841–5848. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (125).Das S; Shimshi M; Raz K; Nitoker Eliaz N; Mhashal AR; Ansbacher T; Major DT EnzyDock: Protein–Ligand Docking of Multiple Reactive States along a Reaction Coordinate in Enzymes. Journal of Chemical Theory and Computation 2019, 15 (9), 5116–5134. DOI: 10.1021/acs.jctc.9b00366. [DOI] [PubMed] [Google Scholar]
  • (126).O’Brien TE; Bertolani SJ; Zhang Y; Siegel JB; Tantillo DJ Predicting productive binding modes for substrates and carbocation intermediates in terpene synthases—bornyl diphosphate synthase as a representative case. ACS catalysis 2018, 8 (4), 3322–3330. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (127).Burns LA; Faver JC; Zheng Z; Marshall MS; Smith DGA; Vanommeslaeghe K; MacKerell AD Jr.; Merz KM Jr.; Sherrill CD The BioFragment Database (BFDb): An open-data platform for computational chemistry analysis of noncovalent interactions. J Chem Phys 2017, 147 (16), 161727. DOI: 10.1063/1.5001028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (128).Dauparas J; Anishchenko I; Bennett N; Bai H; Ragotte RJ; Milles LF; Wicky BI; Courbet A; de Haas RJ; Bethel N Robust deep learning–based protein sequence design using ProteinMPNN. Science 2022, 378 (6615), 49–56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (129).Duan C; Du Y; Jia H; Kulik HJ Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model. Nature Computational Science 2023, 3 (12), 1045–1055. DOI: 10.1038/s43588-023-00563-7. [DOI] [PubMed] [Google Scholar]
  • (130).Wang T; He X; Li M; Li Y; Bi R; Wang Y; Cheng C; Shen X; Meng J; Zhang H; et al. Ab initio characterization of protein molecular dynamics with AI2BMD. Nature 2024, 635 (8040), 1019–1027. DOI: 10.1038/s41586-024-08127-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (131).Doga H; Raubenolt B; Cumbo F; Joshi J; Difilippo FP; Qin J; Blankenberg D; Shehab O A Perspective on Protein Structure Prediction Using Quantum Computers. J. Chem. Theory Comput 2024, 20 (9), 3359–3378. DOI: 10.1021/acs.jctc.4c00067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (132).Pan X; Yang J; Van R; Epifanovsky E; Ho J; Huang J; Pu J; Mei Y; Nam K; Shao Y Machine-Learning-Assisted Free Energy Simulation of Solution-Phase and Enzyme Reactions. J. Chem. Theory Comput 2021, 17 (9), 5745–5758. DOI: 10.1021/acs.jctc.1c00565. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (133).Olehnovics E; Liu YM; Mehio N; Sheikh AY; Shirts MR; Salvalaglio M Assessing the Accuracy and Efficiency of Free Energy Differences Obtained from Reweighted Flow-Based Probabilistic Generative Models. J. Chem. Theory Comput 2024, 20 (14), 5913–5922. DOI: 10.1021/acs.jctc.4c00520. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (134).Armer C; Kane H; Cortade DL; Redestig H; Estell DA; Yusuf A; Rollins N; Spinner H; Marks D; Brunette T; et al. Results of the Protein Engineering Tournament: An Open Science Benchmark for Protein Modeling and Design. bioRxiv 2024, 2024.2008.2012.606135. DOI: 10.1101/2024.08.12.606135. [DOI] [PMC free article] [PubMed] [Google Scholar]
