Abstract
Over a century ago, physicists started broadly relying on theoretical models to guide new experiments. Soon thereafter, chemists began doing the same. Now, biological research is entering a new era in which experiment and theory walk hand in hand. Novel software and specialized hardware have become essential to understand experimental data and propose new models. In fact, current petascale computing resources already allow researchers to reach unprecedented levels of simulation throughput to connect in silico and in vitro experiments. The reduction in cost and improved access have allowed a large number of research groups to adopt supercomputing resources and techniques. Here, we outline how large-scale computing has evolved to expand decades-old research, spark new research efforts, and continuously connect simulation and observation. For instance, multiple publicly and privately funded groups have dedicated extensive resources to develop artificial intelligence tools for computational biophysics, from accelerating quantum chemistry calculations to proposing protein structure models. Moreover, advances in computer hardware have accelerated data processing from single-molecule experimental observations and simulations of chemical reactions occurring throughout entire cells. The combination of software and hardware has opened the way for exascale computing and the production of the first public exascale supercomputer, Frontier, inaugurated by the Oak Ridge National Laboratory in 2022. Ultimately, the popularization and development of computational techniques and the training of researchers to use them will only accelerate the diversification of tools and learning resources for future generations.
Significance
From the integration of artificial intelligence tools to the development of specialized hardware, computational biophysics has evolved to encompass state-of-the-art technologies in every stage of its scientific research efforts. The development of new computational tools has had implications across all fields, leading public funding agencies to support the creation and use of supercomputers for biological research and private sector investments to support numerous in silico drug development startups. The next generation of exascale supercomputers is already creating new opportunities for integrating computational modeling and experiments, pushing computational biophysics beyond explaining experimental results and fostering discoveries with an unprecedented level of detail.
Introduction
The history of biophysics is paved with cross-disciplinary innovation. From Erwin Schrödinger’s take on genetics using an information transfer perspective (1) to Hodgkin and Huxley’s description of electric currents in cells (2), biology and mathematics, as well as physiology and physics, have continuously worked together. With the advent of computers, new opportunities emerged for combining disciplines, as exemplified by the Nobel Prize awarded in 2013 to Karplus, Levitt, and Warshel for their contributions to theoretical chemistry. Software and hardware have become essential to many aspects of biophysical research, not only to make sense of experimental data but also to propose and validate new models.
In this perspective, we will outline how large-scale computing has provided novel insights to classic research efforts and can continue to open new roads for the exploration of biophysical phenomena. While we attempt to cover important computational techniques, detailing their many differences would be outside the scope of this perspective. For readers interested in the details of these techniques, we suggest recent reviews on quantum mechanics/molecular mechanics, coarse-grained simulations, and Brownian dynamics, among other techniques that benefit from exascale computing (3,4,5,6,7,8). Here, we will focus on the interplay between methodological advances and biological breakthroughs.
From molecules to cells
Protein structure prediction, one of the oldest biochemistry challenges tackled by computational biophysics, was also the focus of recent widespread media attention due to the performance of artificial intelligence (AI)-based algorithms. The notion that a primary sequence of amino acid residues contained all the necessary information to determine a protein’s three-dimensional structure captivated researchers for decades. Many attempted to use computers to impose physical and chemical laws onto unsuspecting bits representing atoms and bonds, and despite documented examples of sequence-similar and structure-dissimilar proteins in public databases (9), homology-based protein structure prediction enjoyed continuous successes over the years, with prominent examples being Modeller (10), Rosetta (11), and SWISS-MODEL (12). The new AI-era models such as AlphaFold (13), RoseTTAFold (14), and, more recently, OmegaFold (15) also rely heavily on previously determined primary and tertiary structures of known proteins, either explicitly using sequence alignments or implicitly storing that information in millions of neural network parameters. They have now reached the point where almost 200 million structures, including the entire human proteome, have been computationally predicted (16). The unifying thread in this diverse set of approaches is the extensive use of state-of-the-art computer hardware and software to optimize models, execute algorithms, and analyze results. Specifically, the development of new graphics processing units (GPUs) and tensor processing units over the past decade, along with specialized machine-learning packages such as Keras, TensorFlow, and PyTorch, among others, now provides unprecedented training potential for new neural network models and applications.
Beyond predicting structures, the study of protein dynamics and properties has benefited from large-scale computing in many ways. Dedicated hardware built exclusively for molecular dynamics (MD) simulations has allowed unprecedented insight into small proteins’ folding pathways (17). With the advance of GPUs and tensor processing units, training neural networks is becoming faster and easier, opening new doors for academic research. For example, AI models trained on protein sequences to produce learned representations can now be used to predict features such as fold stability and mutation impact on protein function (18). The same infrastructure powers efforts to integrate AI into analysis of protein dynamics (19), such as the dimensionality reduction of an MD trajectory using autoencoders (20). These and other developments were only possible due to the creation of advanced neural network architectures, including case-specific autoencoders and recurrent neural networks. This type of novel insight into protein sequence and dynamics can significantly help AI-based drug development pipelines (21) and computer-aided protein design and evolution efforts (22).
A particularly interesting research endeavor where computational power has bridged the gap between theory and experiments is single-molecule force spectroscopy (SMFS), where unbinding or unfolding forces are used as probes for molecular interactions and energy landscapes (25,26). The computational study of mechanostable molecular interactions started decades ago. The streptavidin:biotin complex, for example, widely used in experimental research, was simulated 25 years ago (27) using the most advanced hardware available at the time to achieve picosecond-level sampling of eight replicas per system. Almost 20 years later, the same molecular complex was revisited to elucidate the origins of conflicting experimental observations: In 2018 (28), a monovalent streptavidin tetramer was simulated to reveal different unbinding pathways for biotin when pulling the complex from the C- or N-terminus. In 2020 (29), in a study that demanded 400 80-ns-long replicas for a total of 32 μs of simulation time, the four-subunit streptavidin complex bound to a single biotin molecule was simulated to reveal how the tethering geometry affects experimental measurements. Exploring more complicated interfaces, in silico SMFS reached hundreds of replicas totaling dozens of μs of MD simulated time when modeling protein complexes, providing an unprecedented level of detail and explaining the origins of mechanostability in protein:protein interactions (23,30,31,32). More recently, combining this in silico SMFS approach with AI-based structure prediction has allowed for the first screening of mechanical properties of dozens of Staphylococcus adhesins during initial steps of infection (24) (see Fig. 1). These efforts were made possible largely by the use of National Science Foundation-funded supercomputers, which allowed parallel broad sampling of in silico SMFS experiments at multiple pulling speeds, tethering geometries, and interface variants.
Despite technological advances, there is still a large gap between timescales accessible through experimental and in silico SMFS. While state-of-the-art resources have significantly broadened researchers’ access to all-atom MD simulations that span millisecond timescales, few groups can reliably obtain enough replicas of independent simulations to fit the models used to explain experimental observations. A recent example of this limitation was explored by using coarse-grained MD simulations to connect all-atom and experimental SMFS data for Staphylococcus adhesins (33).
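The aggregate sampling quoted for the 2020 streptavidin:biotin study follows directly from the number of replicas and the length of each one. The short sketch below reproduces that arithmetic; the function and variable names are illustrative and do not come from the cited work.

```python
def aggregate_sampling_us(n_replicas, replica_length_ns):
    """Total simulated time in microseconds for independent replicas."""
    total_ns = n_replicas * replica_length_ns
    return total_ns / 1000.0  # 1 microsecond = 1000 nanoseconds


# Values taken from the text: 400 replicas of 80 ns each.
total_us = aggregate_sampling_us(400, 80)
print(f"Aggregate sampling: {total_us:g} microseconds")
```

The same helper can be used to compare sampling budgets across pulling speeds or tethering geometries, since each replica is an independent simulation.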
Figure 1.
Modern supercomputers are powering discoveries with an in silico SMFS approach. Staphylococcus aureus is a pathogen that can form biofilms on implants and medical devices. A critical step in biofilm formation is the establishment of a tight interaction between microbial surface proteins called adhesins and components of the extracellular matrix of the host. A combination of in silico and in vitro single-molecule techniques has revealed how the bond between staphylococcal adhesins and their human targets can withstand forces of the same order as those sustained by covalent bonds (23,24).
In recent decades, supercomputers have been used to simulate increasingly complicated molecular complexes (34). The first all-atom simulation of the entire HIV-1 viral capsid (35), for example, relied on cryoelectron microscopy (cryo-EM) data to build a 64-million-atom system. In 2013, simulating this system required 128,000 cores of the National Science Foundation-funded Blue Waters supercomputer to achieve 200 ns of total simulation time for the entire capsid. Almost 10 years later, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) viral envelope was simulated using a combination of AI-powered methods and high-performance computing resources from the Department of Energy-funded Summit supercomputer. This research led to the creation of a 305-million-atom simulation that elucidated how elements of the SARS-CoV-2 viral envelope interact (36), which still achieved 68 ns per day of simulation performance. Large-scale simulation efforts have also targeted realistic compositions of cell membranes (37), and the electron transfer process in mitochondrial complexes (38).
More than broadening the reach of simulations to increasingly larger systems, high-performance computing has also been used to extend the range of enhanced sampling and free-energy calculation in MD simulations (39). Many macromolecular complexes would require seconds of simulated time in order to spontaneously visit relevant biological states multiple times. That is why the continuous improvement of enhanced sampling methods is crucial for computational biophysics. For example, researchers used a weighted ensemble strategy to simulate the spike protein trimer of SARS-CoV-2 in over 130 μs of trajectories, obtaining hundreds of spontaneous opening and closing events (40) that showed how glycans control the motion of this macromolecular complex. One particular challenge for the exascale era that is already being addressed today is the integration of quantum mechanical/molecular mechanical (QM/MM) calculations into classical simulations, which will require efficient communication as both simulation size and computer clusters grow over time (41,42). QM/MM simulations have been used not only to investigate chemical reactions in biological molecules (43) but also to understand how these molecules behave in highly polarizable environments, such as biological membranes (44). In a hybrid QM/MM approach, the charging of a tRNA molecule was examined using a massively parallel computation strategy that included a string method with swarms of trajectories to enhance the sampling of possible chemical reaction paths (45).
While exascale computing achieved by large supercomputer clusters will revolutionize computational biophysics research in the coming years, petascale research is already accessible today. The combination of a drastic increase in computing power within similar price ranges for central processing units, the development of new architectures for GPUs, and improvements in supercomputer infrastructure and usability allows researchers to reach unprecedented levels of simulation throughput to connect in silico and in vitro experiments.
Improvements made to speed up MD software packages (46,47), coupled with novel hardware, allowed researchers to accumulate collections of simulation data of unprecedented size. In fact, the development of specialized GPU kernels and the accelerated rate of inter-GPU communication have greatly accelerated MD packages, both in small and large systems. Now, a major part of computational biophysics efforts is focused on the development of analysis methods that can accelerate data processing and extraction of information from simulations. Researchers are tackling this analysis challenge on many fronts, from graphical user interface-based packages like QwikMD (48,49) to coding-oriented packages such as MDAnalysis (50), PyTraj (51,52), MDtraj (53), and Bio3D (54). Specialized analysis methods have also been developed to detect large-scale, low-frequency movements in macromolecular complexes (55,56) or to represent biomolecules as a graph and detect cooperative motions (57,58). The common theme among these and other methods is their ability to represent macromolecular dynamics in low-dimensional spaces that are amenable to both visualization and analysis. Thankfully, the origin of this problem also provides solutions. Computer clusters are regularly used to accelerate the analysis of results they created and can even help visualize systems too large to leave their massive storage devices (59,60).
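To make the weighted ensemble idea mentioned above more concrete, the sketch below applies it to a toy one-dimensional random walk rather than to a molecular system. Walkers carry statistical weights, the coordinate is divided into bins, and after every propagation step each occupied bin is resampled to a fixed number of walkers by splitting heavy walkers and merging light ones, so that the total weight is conserved. All names, parameters, and the toy dynamics are assumptions made for illustration; this is not the protocol used in the cited study.

```python
import random


def resample_bin(members, target, rng):
    """Split or merge walkers in one bin until it holds `target` walkers.

    Each walker is a [position, weight] pair; total weight is conserved.
    """
    walkers = [list(w) for w in members]
    # Merge: combine the two lightest walkers; the survivor is chosen with
    # probability proportional to its weight and inherits the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        a, b = walkers[0], walkers[1]
        total = a[1] + b[1]
        keep = a if rng.random() < a[1] / total else b
        walkers = [[keep[0], total]] + walkers[2:]
    # Split: replace the heaviest walker by two copies with half its weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        pos, weight = walkers.pop()
        walkers += [[pos, weight / 2.0], [pos, weight / 2.0]]
    return walkers


def weighted_ensemble(n_iter=50, n_bins=10, per_bin=4, step=0.05, seed=0):
    """Run a toy weighted ensemble on a random walk confined to [0, 1]."""
    rng = random.Random(seed)
    walkers = [[0.0, 1.0 / per_bin] for _ in range(per_bin)]
    for _ in range(n_iter):
        # Propagate: unbiased Gaussian step, clamped to the interval [0, 1].
        for w in walkers:
            w[0] = min(1.0, max(0.0, w[0] + rng.gauss(0.0, step)))
        # Assign walkers to bins along the coordinate.
        bins = {}
        for w in walkers:
            index = min(int(w[0] * n_bins), n_bins - 1)
            bins.setdefault(index, []).append(w)
        # Resample every occupied bin independently.
        walkers = []
        for members in bins.values():
            walkers.extend(resample_bin(members, per_bin, rng))
    return walkers


final_walkers = weighted_ensemble()
total_weight = sum(weight for _, weight in final_walkers)
print(f"{len(final_walkers)} walkers, total weight {total_weight:.6f}")
```

Because every occupied bin keeps the same number of walkers, rarely visited regions of the coordinate receive as much computational effort as common ones, while the weights keep the statistics unbiased. Each walker is independent between resampling steps, which is what makes the approach a natural fit for massively parallel machines.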
Experimental biology advances have been equally important in pushing the boundaries of computational biophysics research. Cryo-EM has occupied a prominent role in such studies given its ability to provide large-scale architectures for macromolecular complexes and even detailed structures for their building blocks. From the structure of the eukaryotic translational initiation complex (61) to the structure of the mitochondrial complex I (62,63) or the organization of the entire nuclear pore complex of Saccharomyces cerevisiae (64), cryo-EM has continued to provide a macroscopic view for many research efforts, opening the doors for the detailed examination of dynamics that all-atom simulations can provide. A notable example was the determination of the water-mediated proton transport through the proton channel of a yeast ATPase (65), which revealed how the transport mechanism relies on specific protonation states of amino acid side chains throughout the channel. The cryo-EM and MD techniques outlined here can be combined for even more synergistic research, providing hybrid approaches for structure determination (66,67), such as MD flexible fitting (68) and damped-dynamics flexible fitting (69). The exploration of macromolecular dynamics can also be accelerated by combining experiments and simulations (70), including methods such as CryoFold (71) and Metainference (72), where short-lived intermediate states can be computationally inferred to describe large-scale molecular rearrangements.
Moving toward cell-scale models, different simulation and analysis techniques are combined to reach length scales and timescales that would be impossible to achieve only a few years before. Already in 2008, researchers were exploring how crowding in the Escherichia coli cytoplasm affected molecular diffusion and bimolecular association reactions using coarse-grained reaction-diffusion simulations (73). In 2010, a landmark study used the same biological target, the E. coli cytoplasm, and its 50 most abundant macromolecules to perform a Brownian dynamics simulation that assessed crowding effects on protein folding, molecular association, and thermodynamic properties in the cytoplasm (74). Recently, another group used an all-atom approach to build a model of the E. coli cytoplasm composed of 1.5 million atoms and submitted it to a total of 3 μs of MD simulation (75), focusing on the technical aspects of preparing and validating all-atom models for future larger-scale models of bacterial cytoplasm.
A tour de force that exemplifies the union of experimental and computational biophysics was the exploration of principles of cellular energy metabolism in Rhodobacter sphaeroides (76) (see Fig. 2). While all individual molecules and protein complexes involved in the photosynthetic energy conversion mechanism of the bacteria’s chromatophore were previously resolved, their structures and pairwise interactions were not sufficient to provide a unified model for this mechanism. It was necessary to combine atomic force microscopy, electron microscopy, crystallography, mass spectrometry, proteomics, and optical spectroscopy data to build the model for the chromatophore. Beyond protein structures and the lipid composition of the membrane in which proteins were immersed, the very arrangement of all components was mapped to build a 100-million-atom model for the organelle. With all this information, researchers could explore multiple aspects of the photosynthetic energy conversion mechanism. Simulations ranged from QM/MM, to obtain excitation energies of pigments, to all-atom, coarse-grained, and Brownian dynamics, reaching hundreds of microseconds of sampling to track the internal movements of the electron carrier cytochrome c2. Evidently, the effort relied on supercomputing resources and a variety of specialized algorithms and software packages made available to the community.
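As a minimal illustration of the Brownian dynamics technique mentioned above, the sketch below propagates non-interacting particles with an Euler-Maruyama step of the overdamped Langevin equation and checks that the mean squared displacement approaches the free-diffusion value 6Dt in three dimensions. The diffusion coefficient, time step, and unit choices are arbitrary assumptions, and the crowding and intermolecular forces that are central to the cited cytoplasm studies are deliberately left out.

```python
import math
import random


def brownian_step(pos, force, D, kT, dt, rng):
    """One Euler-Maruyama step of overdamped (Brownian) dynamics in 3D.

    Drift is (D / kT) * force * dt; noise has variance 2 * D * dt per axis.
    """
    sigma = math.sqrt(2.0 * D * dt)
    return tuple(
        x + (D / kT) * f * dt + sigma * rng.gauss(0.0, 1.0)
        for x, f in zip(pos, force)
    )


def mean_squared_displacement(n_particles, n_steps, D, kT, dt, seed=0):
    """Average squared displacement of free (force-free) particles."""
    rng = random.Random(seed)
    no_force = (0.0, 0.0, 0.0)
    total = 0.0
    for _ in range(n_particles):
        pos = (0.0, 0.0, 0.0)
        for _ in range(n_steps):
            pos = brownian_step(pos, no_force, D, kT, dt, rng)
        total += sum(x * x for x in pos)
    return total / n_particles


# Illustrative parameters in arbitrary but consistent units.
D = 0.1        # diffusion coefficient (length^2 / time)
kT = 1.0       # thermal energy
dt = 0.01      # time step
n_steps = 100  # total simulated time = n_steps * dt

msd = mean_squared_displacement(2000, n_steps, D, kT, dt)
expected = 6.0 * D * n_steps * dt
print(f"simulated MSD = {msd:.3f}, free-diffusion estimate = {expected:.3f}")
```

In a crowded-cytoplasm model the `force` argument would carry the interactions between macromolecules, which is where most of the computational cost of such simulations arises.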
Figure 2.
Model of a 100-million-atom chromatophore, an organelle responsible for light harvesting in Rhodobacter sphaeroides. The model was built using state-of-the-art supercomputers to combine atomic force microscopy, electron microscopy, crystallography, mass spectrometry, proteomics, and optical spectroscopy data. Figure adapted from Singharoy et al. (76).
Much of the work described so far focuses on studying a protein or complex within its local biological environment. Similarly, the study of chemical reactions can benefit from contextualizing them within the greater network to which they belong. The field of chemical kinetics, and in particular the discreteness and stochasticity inherent to chemical reactions within living cells, has seen great advances with the development of theory and computational methods for time-resolved simulations of reaction networks (77). Such simulations allow researchers to explore a variety of behaviors observed during cell growth, proliferation, and differentiation, for example with the modeling of self-regulated feedback loops in gene expression circuits (78).
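One widely used way to run such time-resolved stochastic simulations is Gillespie's stochastic simulation algorithm, which draws the waiting time to the next reaction from an exponential distribution and then picks which reaction fires in proportion to its propensity. The sketch below applies it to a single gene product that is produced and degraded, with an optional negative feedback term in which the product represses its own synthesis. The rate constants, the repression form k / (1 + n / K), and all function names are assumptions chosen for illustration and are not taken from the cited studies.

```python
import random


def gillespie_gene_expression(k_prod, k_deg, K_rep=None, t_end=100.0, n0=0, seed=0):
    """Simulate production and degradation of one species with Gillespie's SSA.

    If K_rep is given, production is repressed by the product itself
    (negative autoregulation): rate = k_prod / (1 + n / K_rep).
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod if K_rep is None else k_prod / (1.0 + n / K_rep)
        a_deg = k_deg * n
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break  # no reaction can fire
        t += rng.expovariate(a_total)  # waiting time to the next event
        if t >= t_end:
            break
        # Choose the reaction in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def time_average(times, counts, t_end):
    """Time-weighted mean copy number of a piecewise-constant trajectory."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / t_end


T_END = 2000.0
times_u, counts_u = gillespie_gene_expression(20.0, 1.0, t_end=T_END, seed=1)
times_r, counts_r = gillespie_gene_expression(20.0, 1.0, K_rep=10.0, t_end=T_END, seed=2)
mean_unreg = time_average(times_u, counts_u, T_END)
mean_reg = time_average(times_r, counts_r, T_END)
print(f"mean copies without feedback: {mean_unreg:.1f}, with feedback: {mean_reg:.1f}")
```

Without feedback, the mean copy number should settle near k_prod / k_deg, while the self-repressed circuit settles at a lower level. Because each trajectory is independent, many replicas with different seeds can be run in parallel to sample rare events.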
While many reactions in a cell can be approximated using deterministic models, such as ordinary differential equations or flux balance analysis, in order to reach whole-cell coverage, one must integrate biochemical systems that rely on species with low copy numbers, such as mRNAs or transcription factors. In such cases, stochasticity is unavoidable, and researchers have developed software packages to describe reaction kinetics that can even incorporate spatial inhomogeneity. A prominent example was the recent whole-cell model of the model organism JCVI-syn3A, a minimal cell with an artificially reduced genome that allowed for an unprecedented model coverage of a living organism (79). This effort relied on a variety of modeling approaches and integrated ordinary differential equations with reaction-diffusion master equation models to describe an entire cell cycle of this cell, including energy metabolism, DNA replication, and gene expression. The same research also drew on an extensive array of experimental input, from qPCR to cryoelectron tomograms.
In the context of whole-cell modeling, the development of large-scale computing resources was doubly important. First, it allowed for rare events to be extensively sampled, which is essential for the study of stochastic systems, and for validating models using experimental observations. Second, it allowed researchers to meet the parameter optimization demands from numeric approximations of complex reaction networks. Cell-scale models are inevitably underinformed, and researchers depend on parameter estimation and careful sensitivity analysis to properly analyze model results. StochSS (80), for example, is a stochastic simulation package already integrated into cloud computing environments, allowing researchers to access extensive computational power with ease. Lattice Microbes (81), on the other hand, focused on GPU-specialized code to extend the reach of reaction-diffusion master equation simulations into larger spatial coverage of a cell and longer simulation times, leveraging the impressive advances seen in GPU hardware over the past decade.
Conclusion
Since all software packages described here are freely available for academic use, and many are easily accessible through web interfaces, it is easy to forget about the costs associated with creating and providing such services. AlphaFold, for example, reportedly took many days of training using the most advanced hardware to reach the published model and is distributed with over two TB of databases. Training the final version of this model (and ignoring intermediate versions) was estimated to require tens of thousands of dollars in computing costs, resources largely unavailable to the vast majority of academic researchers in the world. Thinking about the big picture, the need for extensive computing resources in many fields justifies investments of hundreds of millions of dollars made by governments to build the fastest supercomputers in the world. Universities also recognize the need for computational infrastructure by funding new computing-oriented faculty with larger start-up packages that approach the ones offered to experimental counterparts. This hidden cost of large-scale computing comes with an environmental impact: Australian astronomers estimated that their supercomputer usage is responsible for emitting the equivalent of approximately 15 kilotons of CO2 per year (82). This is prompting supercomputing centers and cloud computing providers to be more transparent about energy expenditure, helping researchers to balance computational power and environmental impact (83). Hopefully, a combination of more efficient software and renewable energy sources will soon power all our research needs.
Between state-of-the-art high-performance computing infrastructure easily accessible through cloud instances and breakthroughs in quantum computing already being used in drug development efforts (84), computational biophysics continues to benefit from private sector investments. NVIDIA routinely makes its hardware available for universities, so academic researchers can develop new algorithms using the latest advances in GPU architectures, and AMD’s GPUs currently power some of the largest supercomputers in the world. Microsoft, Amazon, and other cloud service providers offer free resources for students and academic rates for laboratories, not to mention tech giants such as Google and Facebook who are continuously funding algorithmic developments in their AI-specialized divisions. In fact, both basic and applied research in computational biophysics now enjoy the neural network architectures originally designed for computer vision or text processing. While there are notable case-specific developments, such as AlphaFold, recurrent neural networks originally designed for language processing now power DNA sequence analysis, and facial recognition algorithms now process microscopy images.
While computational research in biophysics has always been driven by improvements in experimental methodologies, much of the past research has focused on establishing computational methods themselves. Now, with the advances made in computer hardware, in silico models can capitalize on the algorithms and theories developed over the past decades to literally reach the time- and length scales observed experimentally. As a result, computational research can routinely and accurately create new proteins, propose atomic-level modifications to small molecules, and design regulatory circuits using molecular biology building blocks. More than simply recapitulating experimental observations, the current range of possibilities in computational biophysics allows us to bring detail and provide missing information that is out of reach for in vitro models. Due to these and other advances, in silico research has gained both the attention and trust of experimentalists, which is reflected in major combined computational-experimental efforts now frequently seen in high-impact publications. Our path forward will include new combinations of computational and experimental efforts, with unprecedented precision and reach, ultimately allowing us to challenge biological assumptions and propose new models for biophysical phenomena.
Author contributions
All authors wrote and reviewed the article.
Acknowledgments
This work was supported by the National Science Foundation under grant MCB-2143787 (CAREER: In Silico Single-Molecule Force Spectroscopy).
Declaration of interests
The authors declare no competing interests.
Editor: Meyer Jackson.
References
- 1.Schrodinger E. At the University Press; 1951. What Is Life? the Physical Aspect of the Living Cell. [Google Scholar]
- 2.Hodgkin A.L., Huxley A.F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 1952;117:500–544. doi: 10.1113/jphysiol.1952.sp004764. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Hollingsworth S.A., Dror R.O. Molecular dynamics simulation for all. Neuron. 2018;99:1129–1143. doi: 10.1016/j.neuron.2018.08.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Venable R.M., Krämer A., Pastor R.W. Molecular dynamics simulations of membrane permeability. Chem. Rev. 2019;119:5954–5997. doi: 10.1021/acs.chemrev.8b00486. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Salo-Ahen O.M.H., Alanko I., et al. Vanmeert M. Molecular dynamics simulations in drug discovery and pharmaceutical development. Processes. 2020;9:71. [Google Scholar]
- 6.Joshi S.Y., Deshmukh S.A. A review of advancements in coarse-grained molecular dynamics simulations. Mol. Simulat. 2021;47:786–803. [Google Scholar]
- 7.Huber G.A., McCammon J.A. Brownian dynamics simulations of biological molecules. Trends Chem. 2019;1:727–738. doi: 10.1016/j.trechm.2019.07.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Gupta C., Sarkar D., et al. Singharoy A. The ugly, bad, and good stories of large-scale biomolecular simulations. Curr. Opin. Struct. Biol. 2022;73:102338. doi: 10.1016/j.sbi.2022.102338. [DOI] [PubMed] [Google Scholar]
- 9.Kosloff M., Kolodny R. Sequence-similar, structure-dissimilar protein pairs in the PDB. Proteins. 2008;71:891–902. doi: 10.1002/prot.21770. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Webb B., Sali A. Comparative protein structure modeling using MODELLER. Curr. Protoc. Bioinformatics. 2016;54:5–6. doi: 10.1002/cpbi.3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Leman J.K., Weitzner B.D., et al. Bonneau R. Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nat. Methods. 2020;17:665–680. doi: 10.1038/s41592-020-0848-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Waterhouse A., Bertoni M., et al. Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018;46:W296–W303. doi: 10.1093/nar/gky427. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Jumper J., Evans R., et al. Hassabis D. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Baek M., DiMaio F., et al. Baker D. Accurate prediction of protein structures and interactions using a three-track neural network. Science. 2021;373:871–876. doi: 10.1126/science.abj8754. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Wu R., Ding F., et al. Ma J. High-resolution de novo structure prediction from primary sequence. BioRxiv. 2022 doi: 10.1101/2022.07.21.500999. Preprint at. [DOI] [Google Scholar]
- 16.Callaway E. ‘The entire protein universe’: AI predicts shape of nearly every known protein. Nature. 2022;608:15–16. doi: 10.1038/d41586-022-02083-2. [DOI] [PubMed] [Google Scholar]
- 17.Shaw D.E., Grossman J., et al. Fenton C.H. In SC’14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE; 2014. Anton 2: raising the bar for performance and programmability in a special-purpose molecular dynamics supercomputer; pp. 41–53. [Google Scholar]
- 18.Alley E.C., Khimulya G., et al. Church G.M. Unified rational protein engineering with sequence-based deep representation learning. Nat. Methods. 2019;16:1315–1322. doi: 10.1038/s41592-019-0598-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Noé F., De Fabritiis G., Clementi C. Machine learning for protein folding and dynamics. Curr. Opin. Struct. Biol. 2020;60:77–84. doi: 10.1016/j.sbi.2019.12.005.
- 20. Wehmeyer C., Noé F. Time-lagged autoencoders: deep learning of slow collective variables for molecular kinetics. J. Chem. Phys. 2018;148:241703. doi: 10.1063/1.5011399.
- 21. Melo M.C.R., Maasch J.R.M.A., de la Fuente-Nunez C. Accelerating antibiotic discovery through artificial intelligence. Commun. Biol. 2021;4. doi: 10.1038/s42003-021-02586-0.
- 22. Wittmann B.J., Yue Y., Arnold F.H. Informed training set design enables efficient machine learning-assisted directed protein evolution. Cell Syst. 2021;12:1026–1045.e7. doi: 10.1016/j.cels.2021.07.008.
- 23. Milles L.F., Schulten K., et al. Bernardi R.C. Molecular mechanism of extreme mechanostability in a pathogen adhesin. Science. 2018;359:1527–1533. doi: 10.1126/science.aar2094.
- 24. Gomes P.S.F.C., Gomes D.E.B., Bernardi R.C. Protein structure prediction in the era of AI: challenges and limitations when applying to in silico force spectroscopy. Front. Bioinform. 2022;2:983306. doi: 10.3389/fbinf.2022.983306.
- 25. Zoldák G., Rief M. Force as a single molecule probe of multidimensional protein energy landscapes. Curr. Opin. Struct. Biol. 2013;23:48–57. doi: 10.1016/j.sbi.2012.11.007.
- 26. Dudko O.K., Hummer G., Szabo A. Intrinsic rates and activation free energies from single-molecule pulling experiments. Phys. Rev. Lett. 2006;96:108101. doi: 10.1103/PhysRevLett.96.108101.
- 27. Izrailev S., Stepaniants S., et al. Schulten K. Molecular dynamics study of unbinding of the avidin-biotin complex. Biophys. J. 1997;72:1568–1581. doi: 10.1016/S0006-3495(97)78804-0.
- 28. Sedlak S.M., Schendel L.C., et al. Bernardi R.C. Direction matters: monovalent streptavidin/biotin complex under load. Nano Lett. 2019;19:3415–3421. doi: 10.1021/acs.nanolett.8b04045.
- 29. Sedlak S.M., Schendel L.C., et al. Bernardi R.C. Streptavidin/biotin: tethering geometry defines unbinding mechanics. Sci. Adv. 2020;6:eaay5999. doi: 10.1126/sciadv.aay5999.
- 30. Bernardi R.C., Durner E., et al. Nash M.A. Mechanisms of nanonewton mechanostability in a protein complex revealed by molecular dynamics simulations and single-molecule force spectroscopy. J. Am. Chem. Soc. 2019;141:14752–14763. doi: 10.1021/jacs.9b06776.
- 31. Liu Z., Liu H., Vera A.M., Nash M.A., et al. High force catch bond mechanism of bacterial adhesion in the human gut. Nat. Commun. 2020;11. doi: 10.1038/s41467-020-18063-x.
- 32. Bauer M.S., Gruber S., et al. Lipfert J. A tethered ligand assay to probe SARS-CoV-2:ACE2 interactions. Proc. Natl. Acad. Sci. USA. 2022;119:e2114397119. doi: 10.1073/pnas.2114397119.
- 33. Melo M.C.R., Gomes D.E.B., Bernardi R.C. Molecular origins of force-dependent protein complex stabilization during bacterial infections. J. Am. Chem. Soc. 2023;145:70–77. doi: 10.1021/jacs.2c07674.
- 34. Perilla J.R., Goh B.C., et al. Schulten K. Molecular dynamics simulations of large macromolecular complexes. Curr. Opin. Struct. Biol. 2015;31:64–74. doi: 10.1016/j.sbi.2015.03.007.
- 35. Zhao G., Perilla J.R., et al. Zhang P. Mature HIV-1 capsid structure by cryo-electron microscopy and all-atom molecular dynamics. Nature. 2013;497:643–646. doi: 10.1038/nature12162.
- 36. Casalino L., Dommer A.C., et al. Amaro R.E. AI-driven multiscale simulations illuminate mechanisms of SARS-CoV-2 spike dynamics. Int. J. High Perform. Comput. Appl. 2021;35:432–451. doi: 10.1177/10943420211006452.
- 37. Ingólfsson H.I., Melo M.N., et al. Marrink S.J. Lipid organization of the plasma membrane. J. Am. Chem. Soc. 2014;136:14554–14559. doi: 10.1021/ja507832e.
- 38. Galemou Yoga E., Parey K., et al. Angerer H. Essential role of accessory subunit LYRM6 in the mechanism of mitochondrial complex I. Nat. Commun. 2020;11:6008. doi: 10.1038/s41467-020-19778-7.
- 39. Bernardi R.C., Melo M.C.R., Schulten K. Enhanced sampling techniques in molecular dynamics simulations of biological systems. Biochim. Biophys. Acta. 2015;1850:872–877. doi: 10.1016/j.bbagen.2014.10.019.
- 40. Sztain T., Ahn S.-H., et al. Amaro R.E. A glycan gate controls opening of the SARS-CoV-2 spike protein. Nat. Chem. 2021;13:963–968. doi: 10.1038/s41557-021-00758-3.
- 41. Lee C.T., Amaro R.E. Exascale computing: a new dawn for computational biology. Comput. Sci. Eng. 2018;20:18–25. doi: 10.1109/MCSE.2018.05329812.
- 42. Keal T.W., Elena A.-M., et al. Woodley S.M. Materials and molecular modeling at the exascale. Comput. Sci. Eng. 2022;24:36–45.
- 43. Vennelakanti V., Nazemi A., et al. Kulik H.J. Harder, better, faster, stronger: large-scale QM and QM/MM for predictive modeling in enzymes and proteins. Curr. Opin. Struct. Biol. 2022;72:9–17. doi: 10.1016/j.sbi.2021.07.004.
- 44. Bernardi R.C., Pascutti P.G. Hybrid QM/MM molecular dynamics study of benzocaine in a membrane environment: how does a quantum mechanical treatment of both anesthetic and lipids affect their interaction. J. Chem. Theory Comput. 2012;8:2197–2203. doi: 10.1021/ct300213u.
- 45. Melo M.C.R., Bernardi R.C., et al. Luthey-Schulten Z. NAMD goes quantum: an integrative suite for hybrid simulations. Nat. Methods. 2018;15:351–354. doi: 10.1038/nmeth.4638.
- 46. Phillips J.C., Hardy D.J., et al. Tajkhorshid E. Scalable molecular dynamics on CPU and GPU architectures with NAMD. J. Chem. Phys. 2020;153:044130. doi: 10.1063/5.0014475.
- 47. Abraham M.J., Murtola T., et al. Lindahl E. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1–2:19–25.
- 48. Ribeiro J.V., Bernardi R.C., Schulten K., et al. QwikMD—integrative molecular dynamics toolkit for novices and experts. Sci. Rep. 2016;6. doi: 10.1038/srep26536.
- 49. Humphrey W., Dalke A., Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. 1996;14:33–38, 27–28. doi: 10.1016/0263-7855(96)00018-5.
- 50. Gowers R.J., Linke M., et al. Kenney I.M. MDAnalysis: a Python package for the rapid analysis of molecular dynamics simulations. In Proceedings of the 15th Python in Science Conference. volume 98. SciPy, Austin, TX; 2016. p. 105.
- 51. Roe D.R., Cheatham T.E., III. PTRAJ and CPPTRAJ: software for processing and analysis of molecular dynamics trajectory data. J. Chem. Theory Comput. 2013;9:3084–3095. doi: 10.1021/ct400341p.
- 52. Nguyen H., Roe D.R., et al. Case D.A. PYTRAJ: Interactive Data Analysis for Molecular Dynamics Simulations. New Brunswick, NJ: Rutgers University; 2016.
- 53. McGibbon R.T., Beauchamp K.A., et al. Pande V.S. MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys. J. 2015;109:1528–1532. doi: 10.1016/j.bpj.2015.08.015.
- 54. Grant B.J., Skjærven L., Yao X.-Q. The Bio3D packages for structural bioinformatics. Protein Sci. 2021;30:20–30. doi: 10.1002/pro.3923.
- 55. Cui Q., Bahar I. Normal Mode Analysis: Theory and Applications to Biological and Chemical Systems. CRC Press; 2005.
- 56. Louet M., Casiraghi M., et al. Banères J.L. Concerted conformational dynamics and water movements in the ghrelin G protein-coupled receptor. Elife. 2021;10:e63201. doi: 10.7554/eLife.63201.
- 57. Melo M.C.R., Bernardi R.C., et al. Luthey-Schulten Z. Generalized correlation-based dynamical network analysis: a new high-performance approach for identifying allosteric communications in molecular dynamics trajectories. J. Chem. Phys. 2020;153:134104. doi: 10.1063/5.0018980.
- 58. Gheeraert A., Pacini L., et al. Rivalta I. Exploring allosteric pathways of a v-type enzyme with dynamical perturbation networks. J. Phys. Chem. B. 2019;123:3452–3461. doi: 10.1021/acs.jpcb.9b01294.
- 59. Sener M., Levy S., et al. Cox D. Multiscale modeling and cinematic visualization of photosynthetic energy conversion processes from electronic to cell scales. Parallel Comput. 2020;102:102698. doi: 10.1016/j.parco.2020.102698.
- 60. Stone J.E., Vandivort K.L., Schulten K. GPU-accelerated molecular visualization on petascale supercomputing platforms. In Proceedings of the 8th International Workshop on Ultrascale Visualization. 2013; pp. 1–8.
- 61. Fernández I.S., Bai X.-C., et al. Scheres S.H.W. Molecular architecture of a eukaryotic translational initiation complex. Science. 2013;342:1240585. doi: 10.1126/science.1240585.
- 62. Bridges H.R., Fedor J.G., Hirst J., et al. Structure of inhibitor-bound mammalian complex I. Nat. Commun. 2020;11. doi: 10.1038/s41467-020-18950-3.
- 63. Gupta C., Khaniya U., et al. Singharoy A. Charge transfer and chemo-mechanical coupling in respiratory complex I. J. Am. Chem. Soc. 2020;142:9220–9230. doi: 10.1021/jacs.9b13450.
- 64. Akey C.W., Singh D., et al. Rout M.P. Comprehensive structure and functional adaptations of the yeast nuclear pore complex. Cell. 2022;185:361–378.e25. doi: 10.1016/j.cell.2021.12.015.
- 65. Roh S.-H., Shekhar M., et al. Chiu W. Cryo-EM and MD infer water-mediated proton transport and autoinhibition mechanisms of Vo complex. Sci. Adv. 2020;6:eabb9605. doi: 10.1126/sciadv.abb9605.
- 66. Goh B.C., Hadden J.A., et al. Schulten K. Computational methodologies for real-space structural refinement of large macromolecular complexes. Annu. Rev. Biophys. 2016;45:253–278. doi: 10.1146/annurev-biophys-062215-011113.
- 67. Cassidy C.K., Himes B.A., et al. Zhang P. CryoEM-based hybrid modeling approaches for structure determination. Curr. Opin. Microbiol. 2018;43:14–23. doi: 10.1016/j.mib.2017.10.002.
- 68. Trabuco L.G., Villa E., et al. Schulten K. Molecular dynamics flexible fitting: a practical guide to combine cryo-electron microscopy and X-ray crystallography. Methods. 2009;49:174–180. doi: 10.1016/j.ymeth.2009.04.005.
- 69. Kovacs J.A., Galkin V.E., Wriggers W. Accurate flexible refinement of atomic models against medium-resolution cryo-EM maps using damped dynamics. BMC Struct. Biol. 2018;18:11–12. doi: 10.1186/s12900-018-0089-0.
- 70. Nierzwicki Ł., Palermo G. Molecular dynamics to predict cryo-EM: capturing transitions and short-lived conformational states of biomolecules. Front. Mol. Biosci. 2021;8:641208. doi: 10.3389/fmolb.2021.641208.
- 71. Shekhar M., Terashi G., et al. Singharoy A. CryoFold: determining protein structures and data-guided ensembles from cryo-EM density maps. Matter. 2021;4:3195–3216. doi: 10.1016/j.matt.2021.09.004.
- 72. Bonomi M., Camilloni C., et al. Vendruscolo M. Metainference: a Bayesian inference method for heterogeneous systems. Sci. Adv. 2016;2:e1501177. doi: 10.1126/sciadv.1501177.
- 73. Ridgway D., Broderick G., et al. Ellison M.J. Coarse-grained molecular simulation of diffusion and reaction kinetics in a crowded virtual cytoplasm. Biophys. J. 2008;94:3748–3759. doi: 10.1529/biophysj.107.116053.
- 74. McGuffee S.R., Elcock A.H. Diffusion, crowding & protein stability in a dynamic molecular model of the bacterial cytoplasm. PLoS Comput. Biol. 2010;6:e1000694. doi: 10.1371/journal.pcbi.1000694.
- 75. Oliveira Bortot L., Bashardanesh Z., Van der Spoel D. Making soup: preparing and validating models of the bacterial cytoplasm for molecular simulation. J. Chem. Inf. Model. 2020;60:322–331. doi: 10.1021/acs.jcim.9b00971.
- 76. Singharoy A., Maffeo C., et al. Schulten K. Atoms to phenotypes: molecular design principles of cellular energy metabolism. Cell. 2019;179:1098–1111.e23. doi: 10.1016/j.cell.2019.10.021.
- 77. Gillespie D.T., Hellander A., Petzold L.R. Perspective: stochastic algorithms for chemical kinetics. J. Chem. Phys. 2013;138:170901. doi: 10.1063/1.4801941.
- 78. Holehouse J., Cao Z., Grima R. Stochastic modeling of autoregulatory genetic feedback loops: a review and comparative study. Biophys. J. 2020;118:1517–1525. doi: 10.1016/j.bpj.2020.02.016.
- 79. Thornburg Z.R., Bianchi D.M., et al. Luthey-Schulten Z. Fundamental behaviors emerge from simulations of a living minimal cell. Cell. 2022;185:345–360.e28. doi: 10.1016/j.cell.2021.12.025.
- 80. Drawert B., Hellander A., et al. Petzold L.R. Stochastic simulation service: bridging the gap between the computational expert and the biologist. PLoS Comput. Biol. 2016;12:e1005220. doi: 10.1371/journal.pcbi.1005220.
- 81. Roberts E., Stone J.E., Luthey-Schulten Z. Lattice microbes: high-performance stochastic simulation method for the reaction-diffusion master equation. J. Comput. Chem. 2013;34:245–255. doi: 10.1002/jcc.23130.
- 82. Stevens A.R.H., Bellstedt S., et al. Murphy M.T. The imperative to reduce carbon emissions in astronomy. Nat. Astron. 2020;4:843–851.
- 83. Gibney E. How to shrink AI’s ballooning carbon footprint. Nature. 2022;607:648. doi: 10.1038/d41586-022-01983-7.
- 84. Outeiral C., Strahm M., et al. Deane C.M. The prospects of quantum computing in computational molecular biology. WIREs Comput. Mol. Sci. 2021;11:e1481.