Scientific Reports. 2024 Jan 11;14:1098. doi: 10.1038/s41598-023-47698-1

Characterizing conformational states in GPCR structures using machine learning

Ilya Buyanov, Petr Popov
PMCID: PMC10784458  PMID: 38212515

Abstract

G protein-coupled receptors (GPCRs) play a pivotal role in signal transduction and represent attractive targets for drug development. Recent advances in structural biology have provided insights into GPCR conformational states, which are critical for understanding their signaling pathways and facilitating structure-based drug discovery. In this study, we introduce a machine learning approach for conformational state annotation of GPCRs. We represent GPCR conformations as high-dimensional feature vectors, incorporating information about amino acid residue pairs involved in the activation pathway. Using a dataset of GPCR conformations in inactive and active states obtained through molecular dynamics simulations, we trained machine learning models to distinguish between inactive-like and active-like conformations. The developed model provides interpretable predictions and can be used for the large-scale analysis of molecular dynamics trajectories of GPCRs.

Subject terms: Protein analysis, Protein function predictions

Introduction

G protein-coupled receptors (GPCRs) represent a large transmembrane protein family with more than 800 human genes discovered [1]. Their main function is signal transduction through different cellular pathways, mediated by heterotrimeric G proteins [2] and β-arrestins [3], that propagate downstream signal cascades [4–9]. Dysregulation of GPCR signaling may lead to the development of various pathologies, including cancer [10], cardiovascular diseases [11], neurodegenerative diseases [12], and many others. Therefore, GPCRs are among the most important pharmacological targets, and nearly 35% of approved drugs target the GPCR protein family [13]. Recent progress in crystallography [14] and cryo-electron microscopy [15] has allowed the determination of three-dimensional structures of GPCRs in complex with drugs or drug-like molecules, providing insight for structure-based drug discovery. Structural studies revealed that GPCRs presumably reside in different conformational states that can be roughly divided into active and inactive with respect to the signaling pathway [16]. Further analysis of ~230 structures of 45 class A GPCRs identified a group of 34 amino acid residue pairs that contribute to a common activation pathway [17]. These amino acid residues mainly correspond to the well-studied structural motifs across the transmembrane (TM) bundle, such as CWxP [18], PIF [19], the sodium pocket [20], NPxxY [21], and DRY [22]. Remarkably, the structural differences between the active and inactive states can exceed several Angstroms in terms of the root-mean-square deviation (RMSD). These findings are relevant to computer-aided structure-based drug discovery, where even subtle changes in the input structures may significantly affect the virtual ligand screening results [23]. Thus, the input conformations are of crucial importance and should be selected considering the target drug type, that is, agonist or antagonist with respect to the signaling pathway.
A possible solution is to use one or several experimental structures corresponding to the target drug type [24]; however, GPCR structure determination is a very difficult and resource-consuming problem [25]. Alternatively, one may use modeled structures, which are becoming increasingly reliable with the progress in machine learning applied to protein folding [26,27]. However, state-of-the-art modeling approaches do not allow direct modeling of a GPCR in the target conformational state. Moreover, one typically annotates the conformational state of a GPCR structure by visual inspection or using simple heuristics, such as measuring the outward movement of the TM6 helix with respect to the TM core [28]. With the increasing number of GPCR structures corresponding to the active and inactive states, it became possible to apply machine learning approaches to the classification of such GPCR structures [29]. Further, deep learning methods were developed for the classification of molecular dynamics trajectories of GPCRs [30]. To the best of our knowledge, the developed machine learning-based methods lack interpretability with respect to the known GPCR activation mechanism [17]. In this study, we present a machine learning approach for conformational state annotation of GPCRs, dubbed STAGS (STate Annotation of Gpcr Structures). We represent a GPCR conformation as a high-dimensional feature vector comprising information about pairs of amino acid residues contributing to the established GPCR activation pathway. Using 38 GPCR structures in inactive and active states, we constructed an annotated dataset of several thousand conformations obtained with molecular dynamics simulations. Next, we developed machine learning models to discriminate between inactive-like and active-like GPCR conformations. Finally, we demonstrated the use of the developed model in a downstream task, namely, the analysis of conformational ensembles obtained with molecular dynamics simulations.

Results

The STAGS workflow

The developed machine learning model takes a high-dimensional representation of a GPCR structure as the input and outputs the probability that this structure belongs to the active or the inactive state; Figure 1 gives an overview of the STAGS workflow. The high-dimensional representation comprises 38 pair-wise distances between the centers of the side chains of amino acid residues contributing to the common activation mechanism of class A GPCRs [17]. To train the STAGS model, we collected a dataset of 10 active and 28 inactive crystallographic structures of class A GPCRs (see Table S1) and ran short (20 ns) all-atom molecular dynamics simulations for each structure. We then retrieved 200 conformations evenly distributed across each trajectory, resulting in 7,600 structural models (see Methods), and calculated their high-dimensional representations, thus obtaining the 7,600×38 training matrix. Note that longer simulations may result in active-to-inactive conformational transitions [31]. Next, we derived machine learning models using stratified cluster-based splits for training and validation sets; we obtained the final model with nearly perfect accuracy for binary classification (see Table S3). Note that decision trees commonly show state-of-the-art performance among classical machine learning approaches for tabular data, which is the case for structural motifs and microswitches in a GPCR [32]; moreover, decision trees belong to the class of interpretable machine learning models [33].

Figure 1.


Overview of the STAGS workflow comprising the dataset collection, feature engineering, machine learning model training and application. Data Collection represents a composition of a dataset of active and inactive conformations of class A GPCRs. Feature Engineering illustrates a calculation of the descriptor corresponding to the pair-wise distances between amino acid residues from the molecular dynamics simulations. ML Model illustrates a decision tree from the random forest. Scoring demonstrates the calculated scores for active- and inactive-like conformations of a GPCR.

The interpretability of STAGS makes it possible to understand its decision-making by traversing the model’s estimators. To demonstrate this, we considered the structures of the μ-type opioid receptor in the active-like (PDB ID: 6DDE) and inactive-like (PDB ID: 4DKL) states (see Figure 2). For the active-like state, a decision tree considers distances for residue pairs 5×62:6×37, 7×45:7×49, 3×46:6×37, 7×54:8×51, and 2×50:3×39, while for the inactive-like state, the decision tree considers distances for residue pairs 5×62:6×37, 2×50:7×49, 5×58:6×40, 6×40:7×49, 5×55:6×41, and 5×51:6×44, according to the GPCRdb numbering scheme (see Methods).

Figure 2.


(a) A flare-plot highlighting pairs of residues (connected lines), that are used to calculate the features for the training set. Line colours correspond to the relation of a residue pair to the GPCR activation pathway. Three examples of pair-wise distances are shown for the μ-opioid receptor. (b) Illustration of a decision tree for the inactive-like (PDB ID:4DKL) and active-like (PDB ID:6DDE) structures of the μ-opioid receptor.

These residue pairs relate to microswitches (5×62:6×37, 3×46:6×37), the hydrophobic lock (5×58:6×40, 5×55:6×41), the sodium pocket (2×50:3×39, 2×50:7×49), and the proximity to Y7×53 (7×54:8×51) [17]. The corresponding tree paths diverge at the 5×62:6×37 residue pair because of the different distances between those residues (6.07 Å and 15.09 Å for the active- and inactive-like structures, respectively), which corresponds to the mutual orientation of TM5 and TM6 that is important for activation [34]. Interestingly, we observed that the features at the root of each estimator typically involve residues of transmembrane helix 6, for example 5×58:6×40, 3×46:6×37, and 5×62:6×37 (see Figure S4). Therefore, the STAGS model might capture the relative placement of transmembrane helix 6 as one of the main characteristics of active/inactive GPCR states.
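
To make the idea of traversing an estimator concrete, the following is a minimal sketch of reading out the decision path of a single tree. The tree itself is hand-written for illustration: its thresholds, its second split, and its leaf labels are hypothetical and are not taken from the trained STAGS model, which is a random forest with many such estimators. Only the two 5×62:6×37 distances (6.07 Å and 15.09 Å) come from the text.

```python
# A hand-written toy tree. Thresholds and leaf labels are hypothetical,
# chosen only so that the two quoted 5x62:6x37 distances fall on opposite sides.
# Residue pairs are written with an ASCII "x" instead of the multiplication sign.
TREE = {
    "feature": "5x62:6x37", "threshold": 10.0,
    "left": {
        "feature": "3x46:6x37", "threshold": 8.0,
        "left": {"label": "active"},
        "right": {"label": "inactive"},
    },
    "right": {"label": "inactive"},
}


def explain(tree, features):
    """Walk the tree and record every distance test applied to `features`."""
    path = []
    node = tree
    while "label" not in node:
        name, thr = node["feature"], node["threshold"]
        value = features[name]
        go_left = value <= thr
        sign = "<=" if go_left else ">"
        path.append(f"{name} = {value:.2f} A {sign} {thr:.2f} A")
        node = node["left"] if go_left else node["right"]
    return node["label"], path


# The 3x46:6x37 values below are made up; the 5x62:6x37 values are from the text.
active_like = {"5x62:6x37": 6.07, "3x46:6x37": 7.0}
inactive_like = {"5x62:6x37": 15.09, "3x46:6x37": 7.0}

label_a, path_a = explain(TREE, active_like)
label_i, path_i = explain(TREE, inactive_like)
```

The two inputs separate at the very first test, which mirrors the divergence on the 5×62:6×37 pair described above; the recorded `path` is the human-readable explanation of each prediction.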

Analysis of conformational ensembles

For the first case study, we considered the molecular dynamics trajectory of the A2a receptor in complex with the antagonist ZM241385 retrieved from GPCRmd [35] (starting structure PDB ID: 4EIY). Figure 3b shows the trajectory profile with respect to the active-state probability score obtained with STAGS. Most of the time the score is close to zero, indicating inactive-like conformations, as expected for an antagonist-bound receptor. However, during ~0.1 μs of the simulation, we observed a score increase, and for some frames the score exceeds the active-like score threshold. A closer investigation of the MD trajectory revealed substantial changes in both the receptor conformation and the ligand binding pose (see Figure 3a). Indeed, when considering conformations A and B corresponding to the minimum and maximum STAGS scores (0.0 and 0.65), the RMSD between the two ligand poses is 8.5 Å, so the receptor-ligand interactions differ substantially. For example, a non-polar contact between the furan ring of ZM241385 and ILE80(3×28) is present in conformation A but not in conformation B (4.0 Å vs. 8.4 Å); another example is a polar contact between the phenol group of ZM241385 and the oxygen atom of GLU169(5×30), which is stronger in conformation B than in conformation A (1.7 Å vs. 3.1 Å). However, STAGS does not ’see’ the ligands and operates on the receptor conformations only; from this perspective, the model discriminates differences in the pair-wise distances of residues related to the position of TM6 with respect to TM5 and TM7 (6×41:5×55 and 6×40:7×49). More precisely, the 6×41:5×55 distances are 8.78 Å and 4.9 Å and the 6×40:7×49 distances are 6.16 Å and 7.35 Å for conformations A and B, respectively. Therefore, this case study demonstrates that STAGS can be used to detect important events in the molecular dynamics trajectories of GPCRs.

Figure 3.


STAGS captures changes in receptor-ligand interactions across the molecular dynamics trajectory (GPCRmd ID: 4410464; starting structure PDB ID: 4EIY). (a) Snapshots of two conformations and ligand binding poses corresponding to the minimal and maximal STAGS scores. (b) STAGS score distribution across the molecular dynamics trajectory. The dots represent the score values, and the solid curve represents the score smoothed over five consecutive frames. The red solid line corresponds to the STAGS classification threshold.

To further explore STAGS applied to MD simulations, we considered a more complex trajectory started from an intermediate-state structure (GPCRmd ID: 31-10356; starting structure PDB ID: 2YDO). Accordingly, we observed a wide range of STAGS scores with a 4:6 ratio of active-to-inactive classified structures, given the score threshold of 0.375 (see Figure 4b). Interestingly, although the starting structure was characterized as intermediate [36], it is in complex with the agonist adenosine, and the first part of the trajectory corresponds to the highest probability of active-like conformations according to STAGS. Therefore, we hypothesized that STAGS might capture the binding site characteristics of the active/inactive structures of GPCRs. To test this hypothesis, we measured the binding site RMSDs between the MD frames and experimentally determined active GPCR structures, as follows. We defined the ligand binding site as the residues within 8 Å of the ligands in a set of active-like A2a structures (PDB IDs: 2YDO, 5WF5, 4UG2, 5G53, 3QAK, 4UHR, 2YDV, 6GDG, 5WF6, 7EZC). In addition, we defined the G protein binding site as the residues within 8 Å of the alpha helix of the Gα subunit in the active A2a structure (PDB ID: 5G53) (see Figure 4a). Then, we calculated the RMSD of the binding sites between each frame and the selected structures after superimposition; Figure 4c shows the obtained results. For both the ligand and the G protein binding sites, the frames classified as active show smaller RMSD values than the frames classified as inactive. Indeed, the Mann-Whitney p-values for the null hypothesis that there is no difference between the RMSD values of frames classified as active and inactive are 2.5e-7 and 9.2e-47 for the ligand and G protein binding sites, respectively (3.7e-12 and 3.4e-73 for Student's t-test).
It is important to note that the median RMSD values are close to each other: 2.16 Å vs. 2.22 Å and 2.6 Å vs. 2.89 Å for the ligand and G protein binding sites, respectively. This can be explained by the fact that the common activation pathway of GPCRs covers not only the binding site residues; hence, other residues can play a decisive role in the classification of a frame. Overall, this case study demonstrates that STAGS can distinguish between active- and inactive-like GPCR structures and can be applied to trace conformational changes in the receptors.
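
As a reminder of what the Mann-Whitney comparison measures, here is a minimal sketch of the U statistic computed by direct pair counting. The RMSD values are invented toy numbers, not data from the trajectory, and the sketch stops at the statistic itself; in practice the p-value would come from a statistics library (for example `scipy.stats.mannwhitneyu`), which is an assumption about tooling and not something stated in the text.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: the number of (xi, yj) pairs with xi > yj.

    Tied pairs contribute 0.5 each.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Toy binding-site RMSD values in angstroms, invented for illustration only.
rmsd_active = [2.0, 2.1, 2.2, 2.3]
rmsd_inactive = [2.2, 2.4, 2.5, 2.6]

u_active = mann_whitney_u(rmsd_active, rmsd_inactive)
u_inactive = mann_whitney_u(rmsd_inactive, rmsd_active)
```

The two statistics always sum to the number of pairs (here 4 × 4 = 16). A small `u_active` means that frames classified as active rarely have a larger RMSD than frames classified as inactive, which is a statement about ranks and can hold even when the medians are close.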

Figure 4.


STAGS applied to the binding site analysis of the molecular dynamics trajectory of the GPCR intermediate state. (a) The structure of the A2a receptor (PDB ID: 5G53) with highlighted ligand (red) and G protein (magenta) binding sites. (b) The calculated STAGS’s score across the trajectory (GPCRmd ID: 3110356; starting structure PDB ID: 2YDO). (c) The distribution of the mean RMSD values for the subsets of trajectory frames classified as active (red) and inactive (blue).

With the progress in long-timescale molecular dynamics simulations, it is important that the derived models are feasible for large-scale calculations. To test STAGS in a high-throughput setup, we applied it to the class A GPCR MD simulations retrieved from GPCRmd. In the first step, we retrieved 1417 trajectories from GPCRmd (as of 11.07.2022) for 502 GPCRmd entries comprising 254 unique PDB IDs. Filtering out entries with distorted structures or without information about the receptor’s state resulted in 468, 98, and 115 trajectories annotated as inactive, active, and intermediate, respectively (see Table S5). In total, the 566 trajectories annotated as inactive or active comprised 2,000,000 conformations, and it took STAGS 1 day on a desktop computer (NVIDIA GeForce GTX 1650, AMD® Ryzen 7 3800x 8-core processor × 16) to process them. As GPCRmd has a human-based annotation of the MD trajectories, we were interested in whether STAGS can be used to classify a whole trajectory as well. However, as we demonstrated above, a trajectory may contain both active-like and inactive-like frames; therefore, its classification is not straightforward. To classify a trajectory as active or inactive, we introduced a threshold δ ∈ [0.0, 1.0]: if the ratio of frames classified as active is larger than δ, then the trajectory is classified as active (otherwise, inactive). We observed that for δ ∈ [0.2, 0.8] the classification accuracy is 0.95, and it decreases for other values (see Figure S1). Therefore, although in general the human-based classification matches the STAGS classification of the GPCR trajectories, many trajectories comprise frames that STAGS assigned to a different state, indicating potentially important events in the simulation.
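
The trajectory-level rule with the threshold δ can be sketched in a few lines. The function name, the per-frame scores, and the frame-level score threshold of 0.5 below are illustrative assumptions and are not taken from the STAGS source code.

```python
def classify_trajectory(scores, score_threshold, delta):
    """Label a trajectory from per-frame active-state scores.

    A frame counts as active when its score exceeds `score_threshold`;
    the trajectory is labeled active when the fraction of active frames
    exceeds `delta`, and inactive otherwise.
    """
    if not scores:
        raise ValueError("trajectory has no frames")
    n_active = sum(1 for s in scores if s > score_threshold)
    ratio = n_active / len(scores)
    label = "active" if ratio > delta else "inactive"
    return label, ratio


# Invented per-frame scores; 0.5 is a placeholder, not the STAGS threshold.
scores = [0.05, 0.10, 0.70, 0.80, 0.20, 0.90, 0.15, 0.05, 0.60, 0.10]

label_low, ratio = classify_trajectory(scores, score_threshold=0.5, delta=0.2)
label_high, _ = classify_trajectory(scores, score_threshold=0.5, delta=0.8)
```

In this toy case 4 of 10 frames are active, so the label flips between δ = 0.2 and δ = 0.8. Such mixed trajectories are exactly the ones where a single trajectory-level label hides frame-level events.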

Conclusion

In this study, we presented STAGS (State Annotation of GPCR Structures), a novel machine learning approach for conformational state annotation of G protein-coupled receptors (GPCRs). By constructing a dataset comprising several thousand GPCR conformations obtained through molecular dynamics simulations, we trained machine learning models to discriminate between inactive-like and active-like GPCR states. Leveraging information about amino acid residue pairs contributing to the established GPCR activation pathway, our models provide interpretable predictions, enhancing our understanding of GPCR conformational dynamics. We demonstrated the efficacy of STAGS in analyzing molecular dynamics trajectories and identifying events related to the ligand binding pose as well as active-inactive conformational changes. The STAGS source code is available at https://github.com/i-Molecule/gpcr-3D-annotation.

Methods

Dataset

To create a dataset of class A GPCR molecular dynamics trajectories, we assembled a set of 38 structures, including 10 active-state and 28 inactive-state structures (Table S1), with the state assigned according to the publications in which the crystallographic structures were reported. For each of the 38 structural models, we ran a short molecular dynamics simulation (20 nanoseconds) and stored 200 frames, one snapshot every 100 picoseconds. Thus, we obtained a dataset of 38×200=7,600 structural models with known conformational states. Finally, for each frame, we calculated a descriptor consisting of 38 pair-wise distances between the residues involved in the common activation mechanism of class A GPCRs (see Table S2). It is important to note that during a molecular dynamics simulation, a receptor could shift from one state to another. It has been reported that conformational transitions between the different states may occur on a scale of several hundred nanoseconds [37]. Therefore, we performed only short simulations of 20 nanoseconds to minimize the risk of degrading the quality of the training set. To verify that there are no apparent conformational transitions, we measured the RMSD profile of each simulation trajectory with respect to the active- and inactive-like GPCR structures. Figure S3 shows the obtained results: (i) the RMSD profiles for active- and inactive-like structures are clearly separated, and (ii) the lower RMSD profiles correspond to the state of the starting GPCR structure.

Machine learning

As machine learning algorithms, we considered the SVM and random forest approaches from the Python sklearn package [38], as well as the XGBoost approach from the xgboost package [39]. It is important to note that random splitting of the dataset into train, validation, and test sets would result in biased performance estimates, because very similar frames of the same receptor would likely be shared between the splits. Therefore, we divided the dataset into 38 folds, such that each molecular dynamics trajectory belongs entirely to a single fold. Then we split the dataset into train and test partitions of 28 and 10 folds, respectively, and used 5-fold cross-validation on the train partition. We used precision, recall, F1-score, accuracy, and the Matthews correlation coefficient (MCC) as the performance metrics:
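
The trajectory-level split described above can be sketched as follows. This is a simplified illustration that selects held-out trajectories at random; it does not reproduce the stratified cluster-based selection mentioned in the Results, and the function name and seed are assumptions. Libraries such as scikit-learn offer equivalent group-aware splitters (for example `GroupKFold`).

```python
import random


def split_by_trajectory(frame_groups, n_test_groups, seed=0):
    """Split frame indices so that every trajectory lands wholly in one partition.

    `frame_groups[i]` is the trajectory ID of frame i.
    """
    groups = sorted(set(frame_groups))
    rng = random.Random(seed)
    rng.shuffle(groups)
    test_groups = set(groups[:n_test_groups])
    train_idx = [i for i, g in enumerate(frame_groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(frame_groups) if g in test_groups]
    return train_idx, test_idx


# 38 trajectories with 200 frames each, matching the dataset size in the text.
frame_groups = [traj for traj in range(38) for _ in range(200)]
train_idx, test_idx = split_by_trajectory(frame_groups, n_test_groups=10)

# No trajectory contributes frames to both partitions.
train_groups = {frame_groups[i] for i in train_idx}
test_groups = {frame_groups[i] for i in test_idx}
```

Because whole trajectories are held out, near-duplicate consecutive frames of one receptor cannot leak from the train partition into the test partition.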

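$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{1}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{2}$$
$$\text{F-score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3}$$
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$
$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{5}$$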

where TN, TP, FP, and FN represent true negatives, true positives, false positives, and false negatives, respectively. We observed high performance in terms of all metrics (see Table S3).
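
For reference, the five metrics can be computed directly from the confusion-matrix counts, as in the minimal sketch below. The counts are invented and do not correspond to the results in Table S3, and the function name is an assumption.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Precision, recall, F-score, accuracy, and MCC from confusion-matrix counts.

    Assumes all denominators are non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Invented counts, for illustration only.
m = binary_metrics(tp=90, tn=80, fp=20, fn=10)
```

Unlike accuracy, MCC uses all four counts symmetrically, which makes it informative for an imbalanced dataset such as one with 10 active and 28 inactive starting structures.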

Molecular dynamics

Prior to MD system assembly, the membrane for simulations was built once, for all the protein structures, using the Membrane Builder tool from the CHARMM-GUI web server [40]. The membrane consisted of a homogeneous bilayer of 210 phosphatidylcholine (POPC) lipids. Receptors were placed into the membrane using the InflateGRO methodology [41]. Note that because different GPCRs have different sizes, the number of lipid molecules replaced by the receptor varied. All molecular dynamics simulations were carried out in the GROMACS 2020.3 software package [42] using the leap-frog integration algorithm with a 2-fs time step. The system was built under periodic boundary conditions (PBC) with the dimensions of the computational cell set to 9.06 × 9.1 × 12.37 nm. The CHARMM36 force field [43] was used with the TIP3P water model [44]. The system was neutralized, and Na+ and Cl− ions were added up to an ionic strength of 0.15 mol/L. Energy minimization was performed using the steepest descent algorithm for up to 50,000 steps until the maximum force was less than 1000.0 kJ/mol/nm. Equilibration consisted of two subsequent simulations: a 100-ps simulation in the NVT ensemble (constant Number of particles, Volume, and Temperature) with the V-rescale thermostat [45], with fixed positions of the protein backbone atoms to relax the environment, and a 1000-ps simulation in the NPT ensemble (constant Number of particles, Pressure, and Temperature) with the Parrinello-Rahman barostat [46]. The LINCS (LINear Constraint Solver) algorithm [47] was used to constrain all bonds and heavy atoms in the NVT and NPT stages. Following the equilibration, the 20-ns MD trajectory was calculated using the Parrinello-Rahman barostat with preliminary annealing of individual parts of the system (the lipid bilayer, the protein, and water with dissolved ions) through two points from a temperature of 5 K to 315 K (Table 1). Pressure coupling was performed using a time constant of 10.0 ps.
Temperature coupling was performed using a time constant of 0.1 ps. The Coulomb interactions were considered explicitly at small distances (up to 1.2 nm), and the long-range part of the potential was approximated by the Ewald summation method. For van der Waals interactions, a cutoff of 1.2 nm was used. For each system, one MD trajectory was calculated.

Table 1.

Molecular dynamics simulations parameters.

Step                | Time         | Tcouple     | Pcouple           | Integrator
Energy minimization | 50 000 steps | -           | -                 | Steepest descent minimization
NVT                 | 100 ps       | V-rescale   | -                 | Leap-frog
NPT                 | 1000 ps      | Nose-Hoover | Parrinello-Rahman | Leap-frog
Molecular dynamics  | 20 ns        | V-rescale   | Parrinello-Rahman | Leap-frog

Features

We considered pairs of interacting amino acid residues involved in the common GPCR activation pathway [17] as features suitable for classification of the conformational states of GPCR structures (see Table S2). More precisely, we calculated the distance between two residues as the distance between the geometric centers of their side chains. We performed standard feature analysis, including feature importance and cross-correlation; the most important features appeared to be associated with the G protein coupling region and microswitch residues (3×43:6×41, 3×50:7×53, 3×46:6×37). The cross-correlation analysis showed that there were no strongly correlated features in our dataset (Pearson correlation coefficient |r| < 0.9). We used the GPCRdb numbering scheme to denote the residues [48], which is an improved version of the sequence-based Ballesteros-Weinstein numbering scheme [49]. For each transmembrane helix, the most conserved residue is assigned the number 50, and the other residues within the helix are numbered with respect to this residue; for example, residue 3×46 is four residues before the most conserved residue of transmembrane helix 3.
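
The distance feature can be sketched as below. The sketch assumes standard PDB atom names for the backbone (N, CA, C, O), falls back to the CA atom for glycine, and uses invented coordinates; none of these details are specified in the text, and the function names are illustrative.

```python
import math

# Backbone atom names excluded from the side-chain center
# (assumption: standard PDB atom naming).
BACKBONE = {"N", "CA", "C", "O"}


def side_chain_center(atoms):
    """Geometric center of the side-chain atoms of one residue.

    `atoms` is a list of (atom_name, (x, y, z)) tuples in angstroms.
    """
    coords = [xyz for name, xyz in atoms if name not in BACKBONE]
    if not coords:
        # Glycine has no side chain; using CA instead is an assumption of this sketch.
        coords = [xyz for name, xyz in atoms if name == "CA"]
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def feature_vector(residues, pairs):
    """Distances between side-chain centers for each residue pair.

    `residues` maps a GPCRdb label to its atom list; `pairs` lists label pairs.
    """
    return [
        math.dist(side_chain_center(residues[a]), side_chain_center(residues[b]))
        for a, b in pairs
    ]


# Toy coordinates, not taken from any real structure.
# Labels use an ASCII "x" instead of the multiplication sign.
residues = {
    "5x62": [("N", (0, 0, 0)), ("CA", (1, 0, 0)), ("CB", (2, 0, 0)), ("CG", (4, 0, 0))],
    "6x37": [("N", (0, 5, 0)), ("CA", (1, 5, 0)), ("CB", (3, 4, 0))],
}
features = feature_vector(residues, [("5x62", "6x37")])
```

Applying `feature_vector` to every frame with the full list of residue pairs would give one row of the training matrix per conformation.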


Acknowledgements

This work was supported by the Russian Science Foundation, project No. 22-74-10098. The authors thank Natalia Sivitskaia for thoughtful suggestions and remarks.

Author contributions

I.B. and P.P. constructed the training, validation, and test sets, processed protein structures, formulated the machine learning problem, developed STAGS, conducted numerical experiments, performed data analysis and wrote the manuscript. P.P. organized and managed the project implementation, and supervised the research.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Data availability

The source code to derive and apply the STAGS model is freely available at https://github.com/i-Molecule/gpcr-3D-annotation.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-023-47698-1.

References

  • 1. Insel PA, et al. Gpcromics: an approach to discover gpcr drug targets. Trends Pharmacol. Sci. 2019;40:378–387. doi: 10.1016/j.tips.2019.04.001.
  • 2. Oldham WM, Hamm HE. Heterotrimeric g protein activation by g-protein-coupled receptors. Nat. Rev. Mol. Cell Biol. 2008;9:60–71. doi: 10.1038/nrm2299.
  • 3. Eichel K, von Zastrow M. Subcellular organization of gpcr signaling. Trends Pharmacol. Sci. 2018;39:200–208. doi: 10.1016/j.tips.2017.11.009.
  • 4. Simon MI, Strathmann MP, Gautam N. Diversity of g proteins in signal transduction. Science. 1991;252:802–808. doi: 10.1126/science.1902986.
  • 5. Krumins AM, Gilman AG. Targeted knockdown of g protein subunits selectively prevents receptor-mediated modulation of effectors and reveals complex changes in non-targeted signaling proteins. J. Biol. Chem. 2006;281:10250–10262. doi: 10.1074/jbc.M511551200.
  • 6. Kristiansen K. Molecular mechanisms of ligand binding, signaling, and regulation within the superfamily of g-protein-coupled receptors: molecular modeling and mutagenesis approaches to receptor structure and function. Pharmacol. Therap. 2004;103:21–80. doi: 10.1016/j.pharmthera.2004.05.002.
  • 7. Milligan G, Kostenis E. Heterotrimeric g-proteins: a short history. Brit. J. Pharmacol. 2006;147:S46–S55. doi: 10.1038/sj.bjp.0706405.
  • 8. Smrcka A. G protein βγ subunits: central mediators of g protein-coupled receptor signaling. Cell. Mol. Life Sci. 2008;65:2191–2214. doi: 10.1007/s00018-008-8006-5.
  • 9. Khan SM, et al. The expanding roles of gβγ subunits in g protein-coupled receptor signaling and drug action. Pharmacol. Rev. 2013;65:545–577. doi: 10.1124/pr.111.005603.
  • 10. Lappano R, Maggiolini M. G protein-coupled receptors: novel targets for drug discovery in cancer. Nat. Rev. Drug Discov. 2011;10:47–60. doi: 10.1038/nrd3320.
  • 11. Wang J, Gareri C, Rockman HA. G-protein-coupled receptors in heart disease. Circ. Res. 2018;123:716–735. doi: 10.1161/CIRCRESAHA.118.311403.
  • 12. Huang Y, Todd N, Thathiah A. The role of gpcrs in neurodegenerative diseases: avenues for therapeutic intervention. Curr. Opin. Pharmacol. 2017;32:96–110. doi: 10.1016/j.coph.2017.02.001.
  • 13. Sriram K, Insel PA. G protein-coupled receptors as targets for approved drugs: how many targets and how many drugs? Mol. Pharmacol. 2018;93:251–258. doi: 10.1124/mol.117.111062.
  • 14. Cherezov V, Abola E, Stevens RC. Recent progress in the structure determination of gpcrs, a membrane protein family with high potential as pharmaceutical targets. Membr. Protein Struct. Determ. 2010;12:141–168. doi: 10.1007/978-1-60761-762-4_8.
  • 15. Zhang X, et al. Evolving cryo-em structural approaches for gpcr drug discovery. Structure. 2021;11:5498962. doi: 10.1016/j.str.2021.04.008.
  • 16. Thal DM, Glukhova A, Sexton PM, Christopoulos A. Structural insights into g-protein-coupled receptor allostery. Nature. 2018;559:45–53. doi: 10.1038/s41586-018-0259-z.
  • 17. Zhou Q, et al. Common activation mechanism of class a gpcrs. Elife. 2019;8:e50279. doi: 10.7554/eLife.50279.
  • 18. Olivella, M., Caltabiano, G. & Cordomi, A. The role of cysteine 6.47 in class a gpcrs. BMC Struct. Biol.13, 1–11 (2013).
  • 19. Martí-Solano M, Sanz F, Pastor M, Selent J. A dynamic view of molecular switch behavior at serotonin receptors: implications for functional selectivity. PLoS One. 2014;9:e109312. doi: 10.1371/journal.pone.0109312.
  • 20. Massink A, et al. Sodium ion binding pocket mutations and adenosine a2a receptor function. Mol. Pharmacol. 2015;87:305–313. doi: 10.1124/mol.114.095737.
  • 21. Nygaard R, Frimurer TM, Holst B, Rosenkilde MM, Schwartz TW. Ligand binding and micro-switches in 7tm receptor structures. Trends Pharmacol. Sci. 2009;30:249–259. doi: 10.1016/j.tips.2009.02.006.
  • 22. Alewijnse AE, et al. The effect of mutations in the dry motif on the constitutive activity and structural instability of the histamine h2 receptor. Mol. Pharmacol. 2000;57:890–898.
  • 23. Lee Y, Basith S, Choi S. Recent advances in structure-based drug design targeting class a g protein-coupled receptors utilizing crystal structures and computational simulations. J. Med. Chem. 2018;61:1–46. doi: 10.1021/acs.jmedchem.6b01453.
  • 24. Wang J, et al. The structural study of mutation-induced inactivation of human muscarinic receptor m4. IUCrJ. 2020;7:294–305. doi: 10.1107/S2052252520000597.
  • 25. Maeda S, Schertler GF. Production of gpcr and gpcr complexes for structure determination. Curr. Opin. Struct. Biol. 2013;23:381–392. doi: 10.1016/j.sbi.2013.04.006.
  • 26. Jumper J, et al. Highly accurate protein structure prediction with alphafold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
  • 27. Baek, M. et al. Accurate prediction of protein structures and interactions using a 3-track network. bioRxiv (2021).
  • 28. Shi L, et al. β2 adrenergic receptor activation: Modulation of the proline kink in transmembrane 6 by a rotamer toggle switch. J. Biol. Chem. 2002;277:40989–40996. doi: 10.1074/jbc.M206801200.
  • 29. Yadav P, Mollaei P, Cao Z, Wang Y, Farimani AB. Prediction of gpcr activity using machine learning. Comput. Struct. Biotechnol. J. 2022;20:2564–2573. doi: 10.1016/j.csbj.2022.05.016.
  • 30. Li C, et al. An interpretable convolutional neural network framework for analyzing molecular dynamics trajectories: A case study on functional states for g-protein-coupled receptors. J. Chem. Inf. Model. 2022;62:1399–1410. doi: 10.1021/acs.jcim.2c00085.
  • 31. Dror RO, et al. Activation mechanism of the β2-adrenergic receptor. Proc. Natl. Acad. Sci. 2011;108:18684–18689. doi: 10.1073/pnas.1110499108.
  • 32. Borisov V, et al. Deep neural networks and tabular data: A survey. IEEE Trans. Neural Netw. Learn. Syst. 2022;14:8598. doi: 10.1109/TNNLS.2022.3229161.
  • 33. Lundberg, S. M. et al. Explainable ai for trees: From local explanations to global understanding. arXiv preprint arXiv:1905.04610 (2019).
  • 34. Hulme EC. Gpcr activation: a mutagenic spotlight on crystal structures. Trends Pharmacol. Sci. 2013;34:67–84. doi: 10.1016/j.tips.2012.11.002.
  • 35. Rodríguez-Espigares I, et al. Gpcrmd uncovers the dynamics of the 3d-gpcrome. Nat. Methods. 2020;17:777–787. doi: 10.1038/s41592-020-0884-y.
  • 36. Lebon G, et al. Agonist-bound adenosine a2a receptor structures reveal common features of gpcr activation. Nature. 2011;474:521–525. doi: 10.1038/nature10136.
  • 37. Latorraca NR, Venkatakrishnan A, Dror RO. Gpcr dynamics: structures in motion. Chem. Rev. 2017;117:139–155. doi: 10.1021/acs.chemrev.6b00177.
  • 38. Pedregosa F, et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
  • 39. Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, 785–794, 10.1145/2939672.2939785 (ACM, New York, NY, USA, 2016).
  • 40. Jo S, Kim T, Iyer VG, Im W. Charmm-gui: a web-based graphical user interface for charmm. J. Comput. Chem. 2008;29:1859–1865. doi: 10.1002/jcc.20945.
  • 41. Schmidt TH, Kandt C. Lambada and inflategro2: efficient membrane alignment and insertion of membrane proteins for molecular dynamics simulations. J. Chem. Inf. Model. 2012;52:2657–2669. doi: 10.1021/ci3000453.
  • 42. Van Der Spoel D, et al. Gromacs: fast, flexible, and free. J. Comput. Chem. 2005;26:1701–1718. doi: 10.1002/jcc.20291.
  • 43. Huang J, et al. Charmm36m: an improved force field for folded and intrinsically disordered proteins. Nat. Methods. 2017;14:71–73. doi: 10.1038/nmeth.4067.
  • 44. Mark P, Nilsson L. Structure and dynamics of the tip3p, spc, and spc/e water models at 298 k. J. Phys. Chem. A. 2001;105:9954–9960. doi: 10.1021/jp003020w.
  • 45. Bussi G, Donadio D, Parrinello M. Canonical sampling through velocity rescaling. J. Chem. Phys. 2007;126:014101. doi: 10.1063/1.2408420.
  • 46. Melchionna S, Ciccotti G, Lee Holian B. Hoover npt dynamics for systems varying in shape and size. Mol. Phys. 1993;78:533–544. doi: 10.1080/00268979300100371.
  • 47. Hess B, Bekker H, Berendsen HJ, Fraaije JG. Lincs: A linear constraint solver for molecular simulations. J. Comput. Chem. 1997;18:1463–1472. doi: 10.1002/(SICI)1096-987X(199709)18:12<1463::AID-JCC4>3.0.CO;2-H.
  • 48. Isberg V, et al. Gpcrdb: an information system for g protein-coupled receptors. Nucleic Acids Res. 2016;44:D356–D364. doi: 10.1093/nar/gkv1178.
  • 49. Ballesteros, J. A. & Weinstein, H. [19] integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in g protein-coupled receptors. In Methods in Neurosciences, vol. 25, 366–428 (Elsevier, 1995).


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
