Scientific Reports. 2024 May 23;14:11791. doi: 10.1038/s41598-024-62242-5

RETRACTED ARTICLE: Comparing ANI-2x, ANI-1ccx neural networks, force field, and DFT methods for predicting conformational potential energy of organic molecules

Mozafar Rezaee 1, Saeid Ekrami 2, Seyed Majid Hashemianzadeh 1
PMCID: PMC11116541  PMID: 38783010

Abstract

In this study, the conformational potential energy surfaces of Amylmetacresol, Benzocaine, Dopamine, Betazole, and Betahistine were scanned and analyzed using the neural network potentials ANI-2x and ANI-1ccx, the OPLS force field, and density functional theory with the B3LYP exchange-correlation functional and the 6-31G(d) basis set. The ANI-1ccx and ANI-2x methods demonstrated the highest accuracy in predicting torsional energy profiles, effectively capturing the minimum and maximum values of these profiles. Conformational potential energy values calculated by B3LYP and the OPLS force field differ from those calculated by ANI-1ccx and ANI-2x, which account for non-bonded intramolecular interactions, since the B3LYP functional and the OPLS force field only weakly capture van der Waals and other intramolecular forces in torsional energy profiles. For a more comprehensive analysis, electronic parameters such as the dipole moment and the HOMO and LUMO energies were calculated for different torsional angles at two levels of theory, B3LYP/6-31G(d) and ωB97X/6-31G(d). These calculations confirmed that ANI predictions are more accurate than density functional theory calculations with the B3LYP functional and the OPLS force field for determining potential energy surfaces. This research addressed the challenges in determining conformational potential energy levels and shows how machine learning and deep neural networks offer a more accurate, cost-effective, and rapid alternative for predicting torsional energy profiles.

Subject terms: Biochemistry, Computational biology and bioinformatics, Chemistry, Materials science, Mathematics and computing

Introduction

Machine learning methodologies have recently made significant inroads into various scientific fields, including computational chemistry, where applications include predicting chemical properties and reactions, predicting electronic structures and various molecular properties, in silico drug design, and more1–14. One of the interesting applications of machine learning in computational chemistry is reducing the computational cost of quantum mechanics calculations. In this approach, an accurate dataset obtained from quantum mechanics calculations is used to train significantly faster ML models that compute a variety of properties.

Neural networks (NN) are algorithms designed to mimic the function and structure of the human brain. Given a set of known function values at specific points, the parameters of the NN are optimized to replicate the input data during a 'training' process and then used to evaluate the function at other points. Neural networks are widely used in a variety of applications, including the construction of interatomic potentials from discrete samples for building molecular potential energy surfaces (PES) in internal coordinates. A potential energy surface can be used to describe the possible conformations of a molecular system, or the spatial positions of interacting molecules in a system.

Studying torsional energy profiles in molecular systems is crucial because of their significant role in the loading and release of drugs in drug delivery systems. By definition, a torsion angle is determined by four consecutively bonded atoms in a molecule. Studying torsional angles is computationally demanding, and even advanced platforms, such as Q-MD simulations, are limited by time constraints, especially when applied to large-scale molecular modeling. This is why empirical potentials remain appealing, as they provide rapid access to intramolecular energies and forces. On the other hand, the development of consistent empirical potentials is a complex and protracted process that typically involves parameter fitting for interaction potentials. Machine learning and deep learning methods, specifically neural network potential energy surfaces (NNPESs) trained on accurate datasets, appear well suited to predicting torsional energy profiles, as these models approach the accuracy of quantum mechanics methods while being nearly as efficient as classical methods. In addition, they can be transferred and employed to predict potential energy surfaces and diverse conformational energies15–25.
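The definition of a torsion angle from four consecutively bonded atoms can be sketched in a few lines. The following is a minimal illustration using only the Python standard library; the function name `dihedral_degrees` and the test coordinates are hypothetical and not taken from the study.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle (degrees) defined by four bonded atoms p0-p1-p2-p3."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)              # normal of the plane p0-p1-p2
    n2 = _cross(b2, b3)              # normal of the plane p1-p2-p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: an eclipsed (cis) and an anti (trans) arrangement.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
```

The sign convention of the returned angle depends on the atom ordering, so a scan should use the same four indices throughout.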

In 2007, Behler and Parrinello introduced a neural network designed to calculate the potential energy surfaces of atomic or molecular systems as a function of their geometry. They converted the Cartesian coordinates of the atoms into symmetry functions and used these as input for training a neural network to predict potential energy surfaces. Initially, Behler and Parrinello trained a neural network with only bulk silicon data. Unfortunately, their network was not transferable and did not make good predictions for other systems. Afterward, they introduced a new type of symmetry function in which the symmetry function values for each atom describe the local environment that determines its energy. They constructed two types of symmetry functions, radial and angular, for each atom26. Typically, neural network potentials developed in these studies are non-transferable. To date, little research has used the Behler and Parrinello symmetry functions (SFs) to provide a neural network potential that is genuinely transferable across complex chemical environments. The exceptions are limited to cases such as all-trans alkanes, bulk materials, and water27,28, where non-equilibrium structures and potential surface smoothness are not considered. The SFs have limitations in differentiating atomic numbers, making training in complex chemical environments empirically challenging. Consequently, the original SFs are predominantly applicable to specific studies. In 2017, Roitberg et al. constructed a neural network called Accurate NeurAl networK engINe for Molecular Energies (ANAKIN-ME), or ANI for short. They employed modified symmetry functions from Behler and Parrinello to create Atomic Environment Vectors (AEVs) and used these AEVs as inputs for the neural network training, resulting in potential energy predictions. They trained the neural network using density functional theory (DFT) data from the GDB database29,30.
Additionally, they introduced a random Normal Mode Sampling (NMS) method for generating molecular conformations. ANI-1, a deep neural network (NN) trained on DFT calculations at the ωB97X/6-31G(d) level of theory, is able to predict accurate and transferable potentials for organic molecules. Single-atom atomic environment vectors (AEVs) as a molecular representation make it possible to train neural networks on data spanning both configurational and structural space. This type of input had not previously been used to train neural network potentials. ANI-1 is trained on molecules with up to 8 heavy atoms containing four types of atoms (Hydrogen, Carbon, Nitrogen, and Oxygen) to predict the total energy of organic molecules. AEVs address the transferability problems of SFs. In ANI-1, major corrections were applied to Behler's symmetry functions, and more diverse data were used to train the neural network, which made this network more transferable than Behler's original network. In this process, almost 57,000 molecular structures were selected, and different conformers of these structures were generated by normal mode sampling, which in the end produced about 20 million conformers. Thus, it was clear that atomic environment vectors were a better molecular representation and, as a result, a better input than the Behler symmetry functions31,32. The next ANI model, ANI-1x, was trained on a dataset containing DFT calculations for approximately 5 million diverse molecular conformations, which were obtained using an active learning algorithm33–35. Active learning diversifies the data automatically. Sufficient diversity in the training data makes machine learning predictions for unseen cases more generalizable and more transferable. Achieving this diversity is a challenge when training supervised machine learning and deep learning models, and active learning methods address it.
In the ANI-1x model, potentials are iteratively trained by active learning to determine what new data should be included in the next stage of the training dataset. When molecules are selected from datasets such as GDB-11 and ChEMBL, one of four active learning sampling techniques (molecular dynamics sampling, normal mode sampling, dimer sampling, and torsion sampling36) is applied to each molecule. Active learning methods search the conformational and configurational spaces of the molecules and employ an uncertainty estimate (ρ) to choose what new data should be generated, then include the new data in the next stage of the training cycle. This uncertainty estimate (ρ) is measured for different structures, and if a suitable structure is found, it is added to the dataset. This process continues iteratively, and a diverse set of data is produced for training the ANI-1x network. A different ANI model, ANI-1ccx, used only 10% of the ANI-1x data because coupled cluster methods are expensive and time-consuming. The ANI-1ccx dataset comprises 500,000 data points obtained through an accurate CCSD(T)/CBS extrapolation. This means that dataset ANI-1ccx 1 is an intelligently selected subsample of the ANI-1x dataset that has been recalculated at an approximate coupled cluster level of theory (approx. CCSD(T)/CBS). Coupled cluster (CC) methods are among the most accurate computational chemistry methods for calculating the chemical quantities of molecules. To perform coupled cluster calculations within chemical accuracy, the complete basis set (CBS) limit and single, double, and perturbative triple excitations (CCSD(T)) are used. Both requirements, however, increase the calculation time. With such long calculation times, it is not possible to perform these coupled cluster calculations for all 5 million conformers selected by ANI-1x. An extrapolation method was implemented to retain sufficient accuracy with less CPU time.
Since the CCSD(T)/CBS scheme is significantly more computationally expensive than the original ANI-1x DFT reference calculations, only a subset of the original data could be computed using the extrapolation scheme. A subset sampling technique similar to the original active learning method was used to select which data to compute at the CCSD(T)/CBS level of theory19,36–40.
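The role of the uncertainty estimate ρ in active learning can be illustrated with a small sketch. It assumes, as is commonly described for ensemble-based active learning, that ρ is the standard deviation of the energies predicted by an ensemble of networks divided by the square root of the number of atoms. The function names, the threshold, and the numbers are hypothetical and are not taken from the study.

```python
import math
import statistics

def ensemble_uncertainty(ensemble_energies, n_atoms):
    """Assumed form of rho: spread of ensemble predictions, scaled by sqrt(N atoms)."""
    sigma = statistics.pstdev(ensemble_energies)
    return sigma / math.sqrt(n_atoms)

def select_for_labeling(candidates, threshold):
    """Return names of structures whose ensemble disagreement exceeds the threshold.

    candidates: list of (name, ensemble_energies, n_atoms) tuples.
    """
    selected = []
    for name, energies, n_atoms in candidates:
        if ensemble_uncertainty(energies, n_atoms) > threshold:
            selected.append(name)
    return selected

# Hypothetical ensemble predictions for two conformers (arbitrary energy units).
candidates = [
    ("conformer_a", [1.00, 1.00, 1.00, 1.00], 16),   # networks agree
    ("conformer_b", [0.00, 2.00, 0.00, 2.00], 16),   # networks disagree
]
picked = select_for_labeling(candidates, threshold=0.1)
```

Structures picked this way would then be sent for new reference calculations and added to the training set in the next cycle.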

The potential models ANI-1ccx, ANI-1x, and ANI-1 focused on molecules containing four types of atoms: Carbon, Oxygen, Hydrogen, and Nitrogen. In 2020, the same group introduced the ANI-2x network, which added F, Cl, and S atoms to the four mentioned atoms. This network also used active learning sampling techniques. It relied on the ANI-1 approach to create atomic environment vectors but introduced new modifications to accommodate the three additional chemical elements41. Datasets such as GDB-11 and ChEMBL42,43 were used to construct the ANI-2x potential. All reference calculations for this network were performed using density functional theory at the ωB97X/6-31G(d) level of theory. The network was trained with 8.9 million molecular conformations. The significance of machine learning and deep learning methods lies in their ability to reduce calculation time while being trained on accurate datasets.
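The atomic environment vectors mentioned above are built from symmetry functions of the Behler and Parrinello type. As a rough illustration, the sketch below evaluates one radial symmetry function with a cosine cutoff for a single atom, given the distances to its neighbors. The parameter values (`eta`, `r_shift`, `r_cut`) are hypothetical and do not correspond to the parameters of any ANI model.

```python
import math

def cosine_cutoff(r, r_cut):
    """Smoothly switch a neighbor's contribution to zero at the cutoff radius."""
    if r > r_cut:
        return 0.0
    return 0.5 * math.cos(math.pi * r / r_cut) + 0.5

def radial_symmetry_function(distances, eta, r_shift, r_cut):
    """Sum of Gaussians centered at r_shift over all neighbor distances."""
    return sum(
        math.exp(-eta * (r - r_shift) ** 2) * cosine_cutoff(r, r_cut)
        for r in distances
    )

# Hypothetical neighbor distances (angstrom) around one atom.
neighbors = [1.0, 1.1, 2.4, 5.9]
g_radial = radial_symmetry_function(neighbors, eta=4.0, r_shift=1.0, r_cut=5.2)
```

A full environment vector would stack many such values for different `eta` and `r_shift` settings, together with angular terms, separately for each neighbor element.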

In this research, the pre-trained ANI-1ccx and ANI-2x networks were employed to predict the torsional energy profiles of five molecules: Amylmetacresol, Benzocaine, Dopamine, Betazole, and Betahistine. The molecular structures were selected to allow comparison between torsional profiles in which van der Waals forces are present and profiles in which they are not. In addition, calculations using the OPLS force field and density functional theory with the B3LYP exchange-correlation functional were performed for all the mentioned molecules.

Computational details

In this study, pre-trained ANI-1ccx and ANI-2x neural network models were used to predict the torsional energy profiles of five molecules: Amylmetacresol, Benzocaine, Dopamine, Betazole, and Betahistine. The pre-trained neural networks are available in the TorchANI Python library, which is freely accessible under the MIT license44. The process begins by loading the mol file of the desired molecule and then determining the indices of the atoms involved in the desired dihedral. After establishing the initial coordinates and the rotation increment at each stage, which is 5 degrees in our research, the diagram of energy versus dihedral angle is generated from the ANI model predictions.
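The scan procedure described above (pick the dihedral atoms, rotate in 5-degree increments, evaluate the energy at each step) can be sketched as follows. This is an illustrative outline only: `energy_fn` is a hypothetical placeholder for whatever model supplies the energy (in the study, a pre-trained ANI model), and the four-atom geometry and toy energy function are invented for the example.

```python
import math

def rotate_about_axis(point, origin, axis, angle_deg):
    """Rotate a point about an axis through `origin` (Rodrigues' rotation formula)."""
    theta = math.radians(angle_deg)
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    v = [p - o for p, o in zip(point, origin)]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1 - cos_t)
               for i in range(3)]
    return tuple(r + o for r, o in zip(rotated, origin))

def rigid_torsion_scan(coords, j, k, moving, energy_fn, step_deg=5.0):
    """Rotate the `moving` atoms about the j-k bond and record the energy at each step."""
    axis = [b - a for a, b in zip(coords[j], coords[k])]
    n_steps = int(round(360.0 / step_deg))
    profile = []
    for step in range(n_steps + 1):
        angle = step * step_deg
        geometry = list(coords)
        for idx in moving:
            geometry[idx] = rotate_about_axis(coords[idx], coords[k], axis, angle)
        profile.append((angle, energy_fn(geometry)))
    return profile

# Hypothetical four-atom chain; the "energy" is just the 0-3 distance (toy stand-in).
coords = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
toy_energy = lambda g: math.dist(g[0], g[3])
profile = rigid_torsion_scan(coords, j=1, k=2, moving=[3], energy_fn=toy_energy)
```

In a real scan, `moving` would contain every atom on the far side of the rotated bond, and `energy_fn` would wrap the energy call of the chosen model.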

Apart from the ANI model predictions, the torsional energy profiles of the specified dihedrals were also computed by density functional theory (DFT) with the B3LYP exchange–correlation functional and the 6-31G(d) basis set. All DFT calculations were carried out using the Gaussian09 software45. To calculate the torsional energy profile with density functional theory, we used an unrelaxed (rigid) scan, because the torsion energy should be isolated from the rest of the structure, unlike a relaxed scan, in which coupling with other parts of the molecule contributes to the torsion energy. To obtain molecular energies, we supplied our optimized structures to the neural networks mentioned above. Additionally, we conducted force field calculations using the OPLS method to determine the torsional energy profiles of our target molecules in the GROMACS simulation package46. To calculate torsional energy profiles with the OPLS force field, we extracted the torsion parameters of this force field for the different molecules. For each angle of the 5-degree scan, we inserted the angle and the torsion parameters into the force field expression to obtain the torsion energy. The results were then compared with the outcomes of DFT (B3LYP) and the ANI-2x and ANI-1ccx neural networks.
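Evaluating a force field torsion term over a 5-degree scan can be illustrated with the Fourier series commonly published for OPLS torsions. Whether the authors evaluated exactly this functional form is an assumption, and the coefficients below are hypothetical rather than parameters of any of the studied molecules.

```python
import math

def opls_torsion_energy(phi_deg, v1, v2, v3, v4=0.0):
    """Commonly published OPLS Fourier form (assumed here):
    E = V1/2 (1 + cos phi) + V2/2 (1 - cos 2phi)
      + V3/2 (1 + cos 3phi) + V4/2 (1 - cos 4phi)
    """
    phi = math.radians(phi_deg)
    return 0.5 * (v1 * (1.0 + math.cos(phi))
                  + v2 * (1.0 - math.cos(2.0 * phi))
                  + v3 * (1.0 + math.cos(3.0 * phi))
                  + v4 * (1.0 - math.cos(4.0 * phi)))

# Hypothetical coefficients (energy units follow the coefficients): pure threefold term.
V1, V2, V3 = 0.0, 0.0, 2.0
scan = [(angle, opls_torsion_energy(angle, V1, V2, V3)) for angle in range(0, 361, 5)]
```

With only a threefold term, the profile peaks at the eclipsed angles (0, 120, 240 degrees) and vanishes at the staggered ones, which is the classical picture that a torsion-only term can reproduce.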

To gain a comprehensive understanding of the level of accuracy of potential energy networks and to compare them with DFT results, we calculated and evaluated electronic properties such as the dipole moment and the energy of the HOMO and LUMO orbitals using density functional theory at both B3LYP/6-31G(d) and ωB97X/6-31G(d) levels of theory. Single-point calculations were performed at intervals of 40 degrees for five structures: Amylmetacresol and Betahistine in dihedral γ1, Betazole, Benzocaine, and Dopamine in dihedral γ2.

Results and discussion

The molecular structures of Amylmetacresol and Benzocaine, along with Newman projections and torsional energy profiles for the dihedral rotations γ1 and γ2 as a function of rotation angle, are shown in Fig. 1. The carbon atoms attached to the benzene ring in Amylmetacresol lie in the same plane and do not carry functional groups that cause hydrogen bonding. At angles of zero and 360 degrees, the Newman projection shows an eclipsed form, leading to spatial repulsion and an increase in torsional energy. At 180 degrees, the hydrogens are in the Anti conformer, which is fully staggered and results in the lowest torsional energy. At angles of 60 and 300 degrees, the Gauche conformer is predominant. In Amylmetacresol, only steric van der Waals repulsion is observed in the specified torsion profiles. Because substantial intramolecular forces are absent at the specified torsion angles of Amylmetacresol, there is no significant discrepancy in the potential energy calculated by the different methods. For this molecule, the ANI-2x and ANI-1ccx neural networks were able to predict the maxima and minima of the rotational energy profile within the accuracy of density functional theory (B3LYP) and the OPLS force field. For the dihedral γ1 of Benzocaine, the torsional energy is higher at angles of zero and 360 degrees because of the substantial steric hindrance between the benzene ring and the carbonyl group with methyl. At angles of 120 and 240 degrees, where van der Waals repulsion forces are more pronounced, the B3LYP quantum calculations and OPLS do not effectively capture the influence of these forces, resulting in notable errors in their predictions. In contrast, the ANI-1ccx profiles exhibit the highest accuracy compared to the other methods. The dihedral γ2 for Benzocaine differs from the previous three examples.
At angles of zero and 360 degrees, the potential energy is at its lowest value, which is associated with the resonance form of the carbonyl group. This resonance results in the oxygen attached to the carbonyl group becoming positively charged, and the electron cloud attracts the hydrogens of the carbon atom towards it, leading to the interaction of positive and negative charges. As the angle approaches intermediate values, spatial repulsion effects become prominent, and the electrostatic attraction interaction diminishes.
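Locating the minima and maxima of a torsional profile, as discussed above for the eclipsed, Gauche, and Anti arrangements, can be done with a simple neighbor comparison on the scanned points. The sketch below uses a synthetic threefold cosine profile on a 5-degree grid as a stand-in for real data; the function names are hypothetical.

```python
import math

def relative_profile(profile):
    """Shift energies so that the lowest point of the scan is zero."""
    e_min = min(energy for _, energy in profile)
    return [(angle, energy - e_min) for angle, energy in profile]

def local_extrema(profile):
    """Find local minima and maxima of a periodic profile sampled over 0..355 degrees."""
    n = len(profile)
    minima, maxima = [], []
    for i, (angle, energy) in enumerate(profile):
        prev_e = profile[(i - 1) % n][1]   # wrap around: the scan is periodic
        next_e = profile[(i + 1) % n][1]
        if energy < prev_e and energy < next_e:
            minima.append(angle)
        elif energy > prev_e and energy > next_e:
            maxima.append(angle)
    return minima, maxima

# Synthetic threefold profile (arbitrary units), sampled every 5 degrees.
profile = [(a, 1.0 + math.cos(math.radians(3 * a))) for a in range(0, 360, 5)]
minima, maxima = local_extrema(relative_profile(profile))
```

For this synthetic profile the maxima fall at the eclipsed angles and the minima at the staggered ones; profiles with hydrogen bonding or strong steric effects would deviate from this regular pattern.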

Figure 1. Dihedral angles γ1 and γ2 for Amylmetacresol and Benzocaine, along with Newman projections and torsional energy profiles generated using four different models: ANI-2x, DFT, OPLS, and ANI-1ccx.

The B3LYP exchange–correlation functional accounts for van der Waals forces more weakly than the ωB97X functional. Given that the ANI-1ccx network is trained on coupled cluster data, which capture intramolecular forces effectively, it can be considered the most precise model for torsional energy profile prediction in this research. The ANI-2x network, on the other hand, is trained on density functional theory calculations with the ωB97X exchange–correlation functional, which describes intramolecular forces better owing to its formalism and includes spatial effects and other forces such as hydrogen bonding. In contrast, OPLS calculations are grounded in classical methods and capture intramolecular interactions only to a limited extent, so the OPLS force field has a diminished capacity to predict torsional energy profiles in Anti and Gauche conformers.

The molecular structures of Dopamine, Betazole, and Betahistine, along with Newman projections and torsional energy profiles for the dihedral rotations γ1 and γ2 as a function of rotation angle, are shown in Fig. 2. For the Dopamine molecule in dihedral γ1, the rotational potential energy at 120 and 240 degrees is higher than that of the Amylmetacresol and Benzocaine molecules because of steric repulsion between the benzene ring and the amine group. This is also confirmed by ANI-1ccx and ANI-2x, and the results of these two models are consistent with those of DFT (B3LYP) and the OPLS force field. For the Dopamine molecule in dihedral γ2, a hydrogen bond forms between the amino group and the hydroxyl group of the benzene ring at angles of 0 and 360 degrees. Notably, this hydrogen bond was not detected by the B3LYP functional or the OPLS force field. As the dihedral angle shifts away from zero and approaches 90 and 270 degrees, the amine and hydroxyl groups align in the same plane, and the impact of the hydrogen bonding in reducing the potential energy becomes evident. At 180 degrees, the distance between the groups involved in the hydrogen bonding is at its maximum, but some hydrogen bonding persists and is observable in the potential energy diagrams provided by the ANI-1ccx and ANI-2x methods. The results for this molecule show that when van der Waals forces, hydrogen bonding, or strong repulsive effects are involved in the torsional energy profile, the predictions of the ANI-1ccx and ANI-2x neural networks are more accurate than the results of the OPLS force field and density functional theory with the B3LYP functional, and the accuracy of these networks is within that of their reference data. For Betazole in dihedral γ1, the rotational potential energy reaches its maximum at angles of 0, 120, 240, and 360 degrees because of steric repulsion between the amino group and the ring. At angles of 60, 180, and 300 degrees, this spatial repulsion is minimized.
It is worth noting that the OPLS force field behaves differently from the other three methods in this case. This discrepancy indicates that its results may not always offer reliable predictions for torsion constants, and caution should be exercised when using the OPLS force field to examine the torsional properties of molecules. For the Betazole molecule in dihedral γ2, hydrogen bonds form between the amine group and the hydrogen attached to the nitrogen within the ring at angles of 0 and 360 degrees. At 180 degrees, a hydrogen bond still exists, but it weakens because of the greater distance, which is not captured by the B3LYP functional or the OPLS force field. At angles of 90 and 275 degrees, the hydrogen bond is once again strengthened. The different behavior of the B3LYP functional and the OPLS force field, compared with the ANI-1ccx and ANI-2x models, is evident in the torsional energy profiles. These results show that the ANI-2x and ANI-1ccx neural networks are successful in predicting the energy minima and maxima. The Betahistine molecule in dihedral γ1 exhibits behavior similar to Amylmetacresol, with a noticeable increase in rotation energy at 240 degrees compared to 120 degrees, likely due to steric repulsion between the nitrogen within the ring and the methyl group, which is absent at 120 degrees. For the Betahistine molecule in dihedral γ2, the OPLS force field behaves differently from the other three methods and does not accurately describe the behavior of the rotational potential energy.

Figure 2. Dihedral angles γ1 and γ2 for three molecules, Dopamine, Betazole, and Betahistine, along with their respective Newman projections and torsional energy profiles generated using four different models: ANI-2x, DFT (with the B3LYP functional), OPLS, and ANI-1ccx.

In Table 1, dipole moment values and HOMO and LUMO energies obtained from DFT calculations with two functionals, B3LYP and ωB97X, are reported for the structures of Amylmetacresol in dihedral γ1, as well as Betazole and Dopamine in dihedral γ2. Torsions were considered at intervals of 40 degrees, resulting in 10 torsion angles ranging from 180 degrees to −180 degrees for each structure. The gradient of the dipole moment with respect to the torsion angle appears to be an indicator of variations in van der Waals interactions and in interatomic and intramolecular forces. The changes in the dipole moment with respect to angle variations are illustrated in Fig. 3. Additionally, energy changes in the HOMO and LUMO orbitals can serve as an indicator of conformational energy changes. The changes in the energy of the HOMO orbitals with respect to angle variations are illustrated in Fig. 4. These three structures were chosen to investigate the changes in dipole moment and HOMO energy with rotation angle for structures that have van der Waals interactions, hydrogen bonding, and steric effects. The results show that the calculated values of the dipole moment and the HOMO and LUMO energies differ between the two functionals. The changes of these quantities with rotation angle are relatively larger for the ωB97X functional, which shows more detail. This suggests that the ωB97X functional records changes in interaction energies better than the B3LYP functional. Since the ANI-2x potential is trained on density functional calculations with the ωB97X functional, it is expected, given this level of theory as well as the amount and diversity of the data included through active learning, to be more successful in determining the torsional energy profile than density functional theory calculations with the B3LYP functional.
How the accuracy of this potential network compares with other density functional theory functionals remains to be checked. The ANI-1ccx potential is trained on coupled cluster calculations, which have been shown to give very accurate results in chemical calculations.
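The gradient of the dipole moment with respect to the torsion angle, mentioned above, can be estimated from the tabulated points by finite differences. The sketch below uses the Dopamine B3LYP dipole moments reported in Table 1; the function name is hypothetical, and the result is expressed in the table's dipole units per degree.

```python
def finite_difference_gradient(xs, ys):
    """Slope between each pair of consecutive sampled points."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# Torsion angles (degrees) and Dopamine B3LYP dipole moments as listed in Table 1.
angles = [180, 140, 100, 60, 20, -20, -60, -100, -140, -180]
dipole_b3lyp = [2.2880, 2.9374, 3.4630, 3.7079, 3.5732,
                3.0778, 2.3225, 1.6186, 1.6051, 2.2880]

gradient = finite_difference_gradient(angles, dipole_b3lyp)

# The interval with the steepest change in dipole moment.
steepest_index = max(range(len(gradient)), key=lambda i: abs(gradient[i]))
steepest_interval = (angles[steepest_index], angles[steepest_index + 1])
```

With a 40-degree spacing these slopes are coarse; a finer scan would be needed to resolve sharp features in the dipole moment curve.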

Table 1. Dipole moment, HOMO, and LUMO orbital energies for the structures of Amylmetacresol in dihedral γ1, as well as Dopamine and Betazole in dihedral γ2, using two different functionals: B3LYP and ωB97X. Columns give the torsion angle in degrees.

System          +180     +140     +100     +60      +20      −20      −60      −100     −140     −180
Amylmetacresol B3LYP
Dipole moment   1.6735   1.6810   1.6883   1.6985   1.7195   1.7195   1.6985   1.6883   1.6810   1.6735
HOMO (eV)       −5.6903  −5.6892  −5.6900  −5.6900  −5.6841  −5.6841  −5.6900  −5.6900  −5.6892  −5.6903
LUMO (eV)       0.2255   0.2266   0.2258   0.2258   0.2312   0.2312   0.2258   0.2258   0.2266   0.2255
Amylmetacresol ωB97X
Dipole moment   1.6784   1.6875   1.6927   1.7065   1.7282   1.7235   1.7024   1.6913   1.6852   1.6784
HOMO (eV)       −8.1096  −8.1058  −8.1094  −8.1091  −8.1028  −8.1034  −8.1096  −8.1094  −8.1083  −8.1096
LUMO (eV)       2.4816   2.4827   2.4813   2.4816   2.4873   2.4870   2.4810   2.4813   2.4824   2.4816
Betazole B3LYP
Dipole moment   3.1402   3.0945   3.0074   2.9996   3.0119   2.9698   2.9103   2.8955   3.0243   3.1402
HOMO (eV)       −6.2057  −6.2658  −6.2903  −6.2601  −6.2490  −6.2639  −6.2653  −6.2658  −6.2378  −6.2057
LUMO (eV)       0.8489   0.7542   0.7025   0.7738   0.7428   0.7545   0.7812   0.7014   0.7591   0.8489
Betazole ωB97X
Dipole moment   3.1564   3.1243   3.0382   3.0181   3.0240   2.9912   2.9369   2.9180   3.0306   3.1564
HOMO (eV)       −8.7965  −8.8506  −8.8775  −8.8547  −8.8487  −8.8574  −8.8547  −8.8585  −8.8294  −8.7965
LUMO (eV)       3.3583   3.2566   3.2027   3.2794   3.2402   3.2577   3.2938   3.2008   3.2566   3.3583
Dopamine B3LYP
Dipole moment   2.2880   2.9374   3.4630   3.7079   3.5732   3.0778   2.3225   1.6186   1.6051   2.2880
HOMO (eV)       −5.4620  −5.4743  −5.4851  −5.4879  −5.4397  −5.4457  −5.4979  −5.4871  −5.4658  −5.4620
LUMO (eV)       0.2214   0.2415   0.2497   0.1975   0.2780   0.2794   0.1961   0.2538   0.2506   0.2214
Dopamine ωB97X
Dipole moment   2.3630   3.0168   3.5359   3.7676   3.6308   3.1389   2.3882   1.6945   1.6807   2.3630
HOMO (eV)       −7.8588  −7.8672  −7.8767  −7.8797  −7.8337  −7.8389  −7.8884  −7.879   −7.8612  −7.8588
LUMO (eV)       2.4587   2.4824   2.4955   2.4321   2.5181   2.5197   2.4334   2.4996   2.4900   2.4587

Figure 3. Changes in dipole moment with respect to torsion angle for the structures of Amylmetacresol in dihedral γ1 and Dopamine and Betazole in dihedral γ2 at two levels of theory: B3LYP/6-31G(d) and ωB97X/6-31G(d).

Figure 4. Changes in HOMO orbital energy as a function of torsion angle for the structures of Amylmetacresol in dihedral γ1 and Dopamine and Betazole in dihedral γ2 at two levels of theory: B3LYP/6-31G(d) and ωB97X/6-31G(d).

In Table 2, dipole moment values and HOMO and LUMO energies obtained from DFT calculations with two functionals, B3LYP and ωB97X, are reported for the structures of Betahistine in dihedral γ1 and Benzocaine in dihedral γ2. The changes in the dipole moment are illustrated in Fig. 5, and the changes in the energy of the HOMO orbitals with angle are illustrated in Fig. 6. The difference in dipole moment and in HOMO and LUMO energies between the B3LYP/6-31G(d) and ωB97X/6-31G(d) levels of theory is very noticeable for the Betahistine molecule. The results show that the ωB97X functional is more accurate in predicting electronic effects than the B3LYP functional. As a result, the ANI-2x potential, which is trained on ωB97X data, is more accurate in capturing electronic effects than the B3LYP functional, requires minimal CPU time, and can be used to predict electronic effects.
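One simple way to see how noticeably the two levels of theory differ is to compare a derived quantity such as the HOMO–LUMO gap. The sketch below is purely illustrative arithmetic on the Betahistine values at +180 degrees reported in Table 2; the gap itself is not a quantity analyzed in the study, and the variable names are hypothetical.

```python
def homo_lumo_gap(homo_ev, lumo_ev):
    """Energy gap between the frontier orbitals, in eV."""
    return lumo_ev - homo_ev

# Betahistine at +180 degrees, values as listed in Table 2 (eV).
betahistine = {
    "B3LYP": {"homo": -5.8925, "lumo": -0.7439},
    "wB97X": {"homo": -8.3570, "lumo": 1.4990},
}

gaps = {name: homo_lumo_gap(v["homo"], v["lumo"]) for name, v in betahistine.items()}
gap_difference = gaps["wB97X"] - gaps["B3LYP"]
```

The same function can be applied column by column to follow how the gap changes along the torsion scan for each functional.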

Table 2. Dipole moment, HOMO, and LUMO orbital energies for the structures of Benzocaine in dihedral γ2, as well as Betahistine in dihedral γ1, using two different functionals: B3LYP and ωB97X. Columns give the torsion angle in degrees.

System          +180     +140     +100     +60      +20      −20      −60      −100     −140     −180
Benzocaine B3LYP
Dipole moment   6.2572   6.3025   5.2186   4.2417   3.5546   3.838    4.8092   5.7339   6.5582   6.2572
HOMO (eV)       −5.7981  −5.9045  −5.8481  −5.7967  −5.7257  −5.7260  −5.7973  −5.8490  −5.9170  −5.7981
LUMO (eV)       −0.7031  −1.0215  −1.0095  −0.9224  −0.7191  −0.7202  −0.9246  −1.0119  −1.0204  −0.7031
Benzocaine ωB97X
Dipole moment   6.0759   6.0819   4.8898   3.8863   3.2555   3.5888   4.5547   5.4905   6.3699   6.0759
HOMO (eV)       −8.1507  −8.2593  −8.1853  −8.1322  −8.0740  −8.0745  −8.1325  −8.1861  −8.2721  −8.1507
LUMO (eV)       1.51831  1.19016  1.2249   1.3139   1.5194   1.5183   1.3115   1.2225   1.1907   1.5183
Betahistine B3LYP
Dipole moment   2.2696   1.6427   1.5608   2.1637   2.9594   3.4583   3.5598   3.4599   3.0046   2.2696
HOMO (eV)       −5.8925  −5.8724  −5.7399  −5.8164  −5.8044  −5.7592  −5.8164  −5.8202  −5.8354  −5.8925
LUMO (eV)       −0.7439  −0.7798  −0.7725  −0.7431  −0.7303  −0.7409  −0.7431  −0.6821  −0.6922  −0.7439
Betahistine ωB97X
Dipole moment   1.4946   1.9608   2.5638   3.0777   3.5291   3.4584   2.9202   2.2626   1.6439   1.4946
HOMO (eV)       −8.3570  −8.4166  −8.4413  −8.3847  −8.3075  −8.3186  −8.3970  −8.3589  −8.3219  −8.3570
LUMO (eV)       1.4990   1.4478   1.4574   1.5159   1.5289   1.5058   1.5235   1.5608   1.5570   1.4990

Figure 5. Changes in dipole moment with respect to torsion angle for the structures of Betahistine in dihedral γ1 and Benzocaine in dihedral γ2 at two levels of theory: B3LYP/6-31G(d) and ωB97X/6-31G(d).

Figure 6. Changes in HOMO orbital energy as a function of torsion angle for the structures of Betahistine in dihedral γ1 and Benzocaine in dihedral γ2 at two levels of theory: B3LYP/6-31G(d) and ωB97X/6-31G(d).

As discussed above, the OPLS method has the lowest CPU time for predicting torsional energy profiles but a high error margin. On the other hand, DFT calculations provide a more accurate representation of intramolecular interactions, including van der Waals forces, hydrogen bonding, and spatial effects47,48, and could be used to predict torsional energy profiles or to correct the OPLS calculations. However, DFT calculations are computationally intensive, requiring significant CPU time. As an alternative, ANI models were used here to predict torsional energy profiles, which align well with DFT calculations but with considerably lower computational overhead.
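Agreement between two torsional profiles, such as the alignment between methods described above, can be quantified by a root-mean-square deviation after each profile is shifted to its own minimum. This metric is offered only as an illustrative sketch and is not stated to be the one used in the study; the function names and sample energies are hypothetical.

```python
import math

def shift_to_minimum(energies):
    """Express a profile relative to its lowest-energy point."""
    e_min = min(energies)
    return [e - e_min for e in energies]

def profile_rmse(energies_a, energies_b):
    """RMS deviation between two profiles sampled at the same torsion angles."""
    if len(energies_a) != len(energies_b):
        raise ValueError("profiles must be sampled at the same angles")
    a = shift_to_minimum(energies_a)
    b = shift_to_minimum(energies_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical relative energies (same units) from two methods on the same scan grid.
method_a = [0.0, 1.2, 3.1, 1.0, 0.2]
method_b = [0.1, 1.0, 2.6, 1.3, 0.1]
deviation = profile_rmse(method_a, method_b)
```

Shifting each profile to its own minimum removes the arbitrary offset between absolute energies from different methods, so only differences in the shape of the profile contribute.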

Conclusions

This research compared four methods for predicting torsional energy profiles. ANI-1ccx and ANI-2x, trained on coupled-cluster and ωB97X density functional reference data, respectively, accurately captured the interactions that arise during rotation and gave an appropriate explanation of torsional profile behavior. In contrast, the OPLS force field and the B3LYP functional were less effective in describing the potential energy barriers of molecules with special intramolecular interactions. DFT with the B3LYP functional appears to be more effective than the OPLS force field in determining torsional energy profiles, as demonstrated for the Betazole molecule. Among the four methods, agreement was highest for the Amylmetacresol molecule and lowest for the Betazole molecule. This research successfully addressed the challenge of determining conformational potential energy levels, which has traditionally relied on time-consuming ab initio, semi-empirical, and molecular mechanics methods. Machine learning techniques and deep neural networks offer a more accurate, cost-effective, and rapid alternative for predicting torsional energy profiles. This study also indicated that ANI-1ccx was somewhat more accurate than ANI-2x, although in generating torsional energy profiles the two networks showed almost the same level of accuracy. The B3LYP functional follows with some degree of deviation and is superior to the OPLS force field. In terms of computational cost and accuracy, ANI methods can efficiently reproduce DFT results and calculate molecular potential energy surfaces and other molecular quantities.

Moreover, the B3LYP functional, one of the most commonly used methods for calculating molecular properties, was compared with the ANI models and in some cases was less accurate than the neural network models. The calculation and assessment of electronic effects in this research indicated that ANI-2x is more precise than the B3LYP functional. The results indicate that using neural networks to determine torsional energy profiles for molecules affected by steric or van der Waals forces allows a better estimation of the force field parameters related to torsional contributions. This approach has the potential to address the challenges that force fields face in describing rotational behavior.
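
To illustrate how a reference torsional profile might feed back into force field torsional parameters, the sketch below fits the four coefficients of the standard OPLS-style Fourier torsion term, V(φ) = ½[V1(1 + cos φ) + V2(1 − cos 2φ) + V3(1 + cos 3φ) + V4(1 − cos 4φ)], to a scan by discrete cosine projection. This is a minimal example under stated assumptions, not the workflow of this study: the scan must lie on a uniform grid covering the full 360° without repeating the end point, it needs at least nine points, and the profile is treated as symmetric in φ (any sine components are ignored). The function names and the coefficient values are hypothetical and serve only as a self-check.

```python
import math


def opls_torsion(phi_deg, v):
    """OPLS-style four-term Fourier torsion energy; v = (V1, V2, V3, V4)."""
    phi = math.radians(phi_deg)
    v1, v2, v3, v4 = v
    return 0.5 * (
        v1 * (1 + math.cos(phi))
        + v2 * (1 - math.cos(2 * phi))
        + v3 * (1 + math.cos(3 * phi))
        + v4 * (1 - math.cos(4 * phi))
    )


def fit_opls_coefficients(angles_deg, energies):
    """Estimate (V1, V2, V3, V4) by projecting the scan onto cos(n*phi).

    Assumes a uniform grid over the full circle (end point not repeated)
    with at least nine points, so the cosine terms up to n = 4 are
    orthogonal on the grid. A constant energy offset does not affect
    the result.
    """
    n_pts = len(angles_deg)
    if n_pts != len(energies):
        raise ValueError("angles and energies must have the same length")
    if n_pts < 9:
        raise ValueError("need at least nine uniformly spaced scan points")
    coeffs = []
    for n in range(1, 5):
        # Cosine amplitude a_n of the n-th harmonic.
        a_n = (2.0 / n_pts) * sum(
            e * math.cos(n * math.radians(a))
            for a, e in zip(angles_deg, energies)
        )
        # V1 and V3 enter with +cos, V2 and V4 with -cos, each with a factor 1/2.
        sign = 1.0 if n % 2 == 1 else -1.0
        coeffs.append(2.0 * sign * a_n)
    return tuple(coeffs)


# Synthetic self-check: hypothetical coefficients in kcal/mol, NOT fitted
# to any molecule in the study. The 3.0 offset mimics an arbitrary zero.
true_v = (1.0, 2.5, 0.4, 0.0)
scan_angles = list(range(0, 360, 30))
scan_energies = [opls_torsion(a, true_v) + 3.0 for a in scan_angles]

fitted = fit_opls_coefficients(scan_angles, scan_energies)
print("Fitted (V1, V2, V3, V4):", fitted)
```

In practice the torsion term is usually fitted to the difference between the reference profile and the force field energy computed with that torsion switched off, and a general least-squares fit would be needed for non-uniform grids or asymmetric profiles.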

Acknowledgements

The authors thank Iran University of Science and Technology for providing computing facilities in the Molecular Simulation Research Laboratory.

Author contributions

M. R. carried out the computer calculations, wrote the manuscript, and developed the theory. S. M. H. and S. E. contributed to the final version of the manuscript, designed and directed the project, and contributed to interpreting the results. The authors give their consent to the publication of identifiable details, which can include photographs, plots, and details within the text (computer calculations) to be published in the above Journal and Article. I confirm that I have seen and been allowed to read both the Material and the Article to be published.

Data availability

The computational data supporting the findings of this study are available from the corresponding author upon reasonable request.

Competing interests

The authors declare no competing interests.

Footnotes

This article has been retracted. Please see the retraction notice for more detail: https://doi.org/10.1038/s41598-026-42379-1

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Change history

3/2/2026

This article has been retracted. Please see the Retraction Notice for more detail: 10.1038/s41598-026-42379-1

References

  • 1. Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
  • 2. Popova, M., Isayev, O. & Tropsha, A. Deep reinforcement learning for de novo drug design. Sci. Adv. 4 (2018).
  • 3. Granda, J. M., Donina, L., Dragone, V., Long, D. L. & Cronin, L. Controlling an organic synthesis robot with machine learning to search for new reactivity. Nature 559, 377–381 (2018).
  • 4. Meuwly, M. Machine learning for chemical reactions. Chem. Rev. 121, 10218–10239 (2021).
  • 5. Stocker, S., Csanyi, G., Reuter, K. & Margraf, J. T. Machine learning in chemical reaction space. Nat. Commun. 5505 (2020).
  • 6. Tu, Z., Stuyver, T. & Coley, C. W. Predictive chemistry: machine learning for reaction deployment, reaction development, and reaction discovery. Chem. Sci. 14, 226–244 (2023).
  • 7. Wen, M. et al. Chemical reaction networks and opportunities for machine learning. Nat. Comput. Sci. 3, 12–24 (2023).
  • 8. Schutt, K. T., Sauceda, H. E., Kindermans, P. J., Tkatchenko, A. & Muller, K. R. SchNet - A deep learning architecture for molecules and materials. J. Chem. Phys. 148 (2018).
  • 9. Fedik, N. et al. Extending machine learning beyond interatomic potentials for predicting molecular properties. Nat. Rev. Chem. 6, 653–672 (2022).
  • 10. Li, Zh., Jiang, M., Wang, Sh. & Zhang, Sh. Deep learning methods for molecular representation and property prediction. Drug Discov. Today 27 (2022).
  • 11. Fiedler, L. et al. Predicting electronic structures at any length scale with machine learning. npj Comput. Mater. 115 (2023).
  • 12. Unke, O. T. & Meuwly, M. PhysNet: A neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).
  • 13. Welborn, M., Cheng, L. & Miller, T. F. Transferability in machine learning for electronic structure via the molecular orbital basis. J. Chem. Theory Comput. 14, 4772–4779 (2018).
  • 14. Collins, C. R., Gordon, G. J., von Lilienfeld, O. A. & Yaron, D. J. Constant size descriptors for accurate machine learning models of molecular properties. J. Chem. Phys. 241718 (2018).
  • 15. Blank, T. B., Brown, S. D., Calhoun, A. W. & Doren, D. J. Neural network models of potential energy surfaces. J. Chem. Phys. 103, 4129–4137 (1995).
  • 16. Lahey, S. L. J., Phuc, T. N. T. & Rowley, C. N. Benchmarking force field and the ANI neural network potentials for the torsional potential energy surface of biaryl drug fragments. J. Chem. Inf. Model. 60, 6238–6268 (2020).
  • 17. Behler, J. Neural network potential-energy surfaces in chemistry: a tool for large-scale simulations. Phys. Chem. Chem. Phys. 13, 17930–17955 (2011).
  • 18. Behler, J. Constructing high-dimensional neural network potentials: A tutorial review. Int. J. Quantum Chem. 115, 1032–1050 (2015).
  • 19. Smith, J. S. et al. Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning. Nat. Commun. 2903 (2019).
  • 20. Behler, J. Representing potential energy surfaces by high-dimensional neural network potentials. J. Phys. Condens. Matter 26 (2014).
  • 21. Lubbers, N., Smith, J. S. & Barros, K. Hierarchical modeling of molecular energies using a deep neural network. J. Chem. Phys. 148 (2018).
  • 22. Artrith, N., Morawietz, T. & Behler, J. High-dimensional neural-network potentials for multicomponent systems: applications to zinc oxide. J. Phys. Condens. Matter 26 (2014).
  • 23. Manna, S. et al. Learning in continuous action space for developing high dimensional potential energy models. Nat. Commun. 368 (2022).
  • 24. Kushwaha, A. & Kumar, T. J. D. Benchmarking PES-Learn's machine learning models predicting accurate potential energy surface for quantum scattering. Int. J. Quantum Chem. 123 (2023).
  • 25. Arab, F., Nazari, F. & Illas, F. Artificial neural network-derived unified six-dimensional potential energy surface for tetra atomic isomers of the biogenic [H, C, N, O] system. J. Chem. Theory Comput. 19, 1186–1196 (2023).
  • 26. Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98 (2007).
  • 27. Morawietz, T., Sharma, V. & Behler, J. A neural network potential-energy surface for the water dimer based on environment-dependent atomic energies and charges. J. Chem. Phys. 136 (2012).
  • 28. Gastegger, M., Kauffmann, C., Behler, J. & Marquetand, Ph. Comparing the accuracy of high-dimensional neural network potentials and the systematic molecular fragmentation method: A benchmark study for all-trans alkanes. J. Chem. Phys. 194110 (2016).
  • 29. Fink, T. & Reymond, J. L. Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: Assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. J. Chem. Inf. Model. 47, 342–353 (2007).
  • 30. Fink, T., Bruggesser, H. & Reymond, J. L. Virtual exploration of the small-molecule chemical universe below 160 daltons. Angew. Chem. Int. Ed. 44, 1504–1508 (2005).
  • 31. Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).
  • 32. Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules. Sci. Data 170193 (2017).
  • 33. Settles, B. Active learning. Synth. Lect. Artif. Intell. Mach. Learn. 18, 1–111 (2012).
  • 34. Smith, J. S., Nebgen, B., Lubbers, N., Isayev, O. & Roitberg, A. E. Less is more: Sampling chemical space with active learning. J. Chem. Phys. 148, 241733 (2018). https://doi.org/10.1063/1.5023802
  • 35. Prince, M. Does active learning work? A review of the research. J. Eng. Educ. 93 (2004).
  • 36. Smith, J. S. et al. The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. Sci. Data 134 (2020).
  • 37. Hobza, P. & Sponer, J. Toward true DNA base-stacking energies: MP2, CCSD(T), and complete basis set calculations. J. Am. Chem. Soc. 124, 11802–11808 (2002).
  • 38. Halkier, A., Helgaker, T., Jørgensen, P., Klopper, W. & Olsen, J. Basis-set convergence of the energy in molecular Hartree-Fock calculations. Chem. Phys. Lett. 302, 437–446 (1999).
  • 39. Helgaker, T., Klopper, W., Koch, H. & Noga, J. Basis-set convergence of correlated calculations on water. Chem. Phys. 106, 9639–9646 (1997).
  • 40. Neese, F. & Valeev, E. F. Revisiting the atomic natural orbital approach for basis sets: Robust systematic basis sets for explicitly correlated and conventional correlated ab initio methods? J. Chem. Theory Comput. 7, 33–43 (2011).
  • 41. Devereux, Ch. et al. Extending the applicability of the ANI deep learning molecular potential to Sulfur and Halogens. J. Chem. Theory Comput. 16, 4192–4202 (2020).
  • 42. Davies, M. et al. MyChEMBL: A virtual platform for distributing cheminformatics tools and open data. Challenges 5, 334–337 (2014).
  • 43. Bento, A. P. et al. The ChEMBL bioactivity database: An update. Nucleic Acids Res. 42, D1083–D1090 (2014).
  • 44. Gao, X., Ramezanghorbani, F., Isayev, O., Smith, J. S. & Roitberg, A. E. TorchANI: A free and open source PyTorch based deep learning implementation of the ANI neural network potentials. J. Chem. Inf. Model. 60, 3408–3415 (2020).
  • 45. Frisch, M. J. et al. Gaussian 09, Revision A.02 (Gaussian, Inc., Wallingford, CT, 2016).
  • 46. Bekker, H. et al. Gromacs: A parallel computer for molecular dynamics simulations. In Physics Computing, Vol. 92 (eds de Groot, R. A. & Nadrchal, J.) 252–256 (World Scientific, Singapore, 1993).
  • 47. Hertwig, R. H. & Koch, W. On the parameterization of the local correlation functional. What is Becke-3-LYP? Chem. Phys. Lett. 268, 345–351 (1997).
  • 48. Jorgensen, W. L., Maxwell, D. S. & Tirado-Rives, J. Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids. J. Am. Chem. Soc. 45, 11225–11236 (1996).


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group