Abstract
GPCRs are master regulators of cell signaling, transducing extracellular stimuli into the cell via selective coupling to intracellular G-proteins. Here we present a computational analysis of the structural determinants of the G-protein-coupling repertoire of experimental and predicted 3D GPCR-G-protein complexes. Interface contact analysis recapitulates structural hallmarks associated with G-protein-coupling specificity, including TM5, TM6 and ICLs. We employ interface contacts as fingerprints to cluster Gs vs Gi complexes in an unsupervised fashion, suggesting that interface residues contribute to selective coupling. We experimentally confirm on a promiscuous receptor (CCKAR) that mutations of some of these specificity-determining positions bias the coupling selectivity. Interestingly, Gs-GPCR complexes have more conserved interfaces, while Gi/o proteins adopt a wider range of alternative docking poses, as assessed via structural alignments of representative 3D complexes. Binding energy calculations demonstrate that distinct structural properties of the complexes are associated with higher stability of Gs than Gi/o complexes. AlphaFold2 predictions of experimental binary complexes confirm several of these structural features and allow us to augment the structural coverage of poorly characterized complexes such as G12/13.
Subject terms: Cellular signalling networks, Molecular modelling, Computational biophysics, G protein-coupled receptors
Selective GPCR-G protein complex formation is critical for the regulation of signal transduction. Here, the authors use a data-driven approach to show that the structures of experimental and predicted complex interfaces inform, at least partially, on G protein binding preferences.
Introduction
G-protein-coupled receptors (GPCRs) constitute the largest family of cell-surface receptors, making them a primary pharmacological class that is targeted by approximately one-third of the marketed drugs1. They transduce extracellular physico-chemical stimuli to intracellular signaling pathways by coupling to one or more heterotrimeric G-proteins2,3, which are grouped into four major families: Gs, Gi/o, Gq/11 and G12/13 based on homology of their α-subunits4. GPCRs’ downstream activity is controlled by β-arrestins, which desensitize GPCRs’ activity and provide an additional layer of signaling modulation via ERK5. Upon ligand binding, the receptor undergoes a conformational change that leads to recognition and activation of intracellular G-proteins. Every mammalian GPCR displays a unique repertoire of G-protein-coupling preferences, ranging from highly selective to promiscuous profiles, which orchestrate specific cellular responses6. Aberrant signal transduction is linked to many pathological states, including cancer7–12. A mechanistic understanding of the signal transduction processes, integrated with multi-modal data associated with a disease state, can inform targeted therapies and personalized medicine procedures (e.g.13).
The experimental profiling of specific coupling preferences is critical to understanding GPCR biology and pharmacology. The binding activities of GPCRs for transducer proteins are being quantitatively screened via medium-throughput methodologies14–17. Based on binding profiling from these large-scale experimental assays, sequence-based machine learning predictors of coupling specificity have been proposed18,19. Phylogenetic analyses of co-evolutionary patterns inferred from sequence alignments of GPCRs and G-proteins have also provided insights into the sequence determinants of coupling specificity, for the entire GPCR family20 as well as for specific subfamilies21. The abundance of experimental structures for GPCRs in ligand-dependent, alternative functional states has also shed light on the structural hallmarks controlling receptor activation for class A GPCRs22 as well as across classes23.
The determination of receptor-G-protein complex structures is also progressing rapidly, with over 360 complex structures deposited in the PDB (as of March 2023). The determination of the first structures of Gi-coupled receptor complexes allowed for initial comparisons with Gs counterparts and highlighted a role of TM5 and TM6 as a selectivity filter24–28. As a complement to the release of the MT1-Gi complex, we also systematically compared available Gs and Gi/o complexes with Class A receptors in terms of interface contact networks and G-protein docking mode similarity assessed via structural alignment29. The recent determination of four structures of the serotonin receptors (i.e. 5-HT4, 5-HT6 and 5-HT7 with Gs, and 5-HT4 with Gi1) confirmed the role of TM5 and TM6, and in particular their variable length, as a selectivity filter for G-protein binding30. The authors also showed via bioinformatics analysis that this macro-switch is conserved among other class A GPCRs30. Yet, a comprehensive picture of the structural hallmarks of coupling specificity remains elusive.
In this work, we use structural bioinformatics to analyze experimental as well as predicted GPCR-G-protein 3D complexes, shedding further light on the structural basis of coupling specificity through the analysis of interaction interface contact networks, G-protein docking modes and binding energies (Fig. 1A).
Results
Different G-protein complexes are characterized by alternative contact network topologies
We considered a total of 362 3D experimental GPCR-G-protein complexes, comprising 166 Gs, 184 Gi/o, 9 Gq/11 and 3 G12/13 complexes, corresponding to 93, 17, 10, 3, and 2 unique receptors from Class A, B1, B2, C and Frizzled, respectively, and entailing 9 different G-proteins (i.e. GNAS, GNAI1, GNAI2, GNAI3, GNAT1, GNAO1, GNAQ, GNA11, GNA13) (Fig. 1B; Supplementary Data 1). To avoid any bias due to redundant structures solved for the same GPCR-G-protein complex, we derived a set of 125 non-redundant 3D complexes by considering representative structures for each receptor-G-protein pair, using resolution and canonical sequence coverage as selection criteria (Fig. 1C; see Methods). We first identified the residues that are in spatial contact at the GPCR-G-protein interaction interface (see Methods). We then mapped contacting residues to consensus numbering through GPCRdb31 (Supplementary Data 2) and the common G-protein numbering (CGN)32 schemes (Supplementary Data 3), respectively for GPCRs and G-proteins. We aggregated contacts based on secondary structure elements (SSEs; Fig. 2A), to yield a network of interacting SSEs at GPCR-G-protein interfaces (Fig. 1A and Fig. 2A). For the most abundant coupling groups (i.e. Gs and Gi/o), we derived specific SSE contact networks by pooling contacts on the basis of the bound G-protein. SSE contact networks highlight structural signatures specific to each coupling group. Certain SSEs are invariably central within the interface network, such as TM5, ICL2, or ICL3 for GPCRs (Fig. 2B–D) or H5 for the G-protein (Fig. 2B, C, E). Other elements vary their connectivity based on the bound G-protein. In particular, TM5 has a higher degree of interacting SSEs in Gs complexes as well as a higher overall number of contacts, while Gi/o complexes are instead characterized by higher interconnectivity at ICL1, TM3, TM6, and ICL3 (Fig. 2B–D).
Differences in the overall network topology also emerged when we measured the information flow, quantified as the number of shortest paths passing through each node (i.e. betweenness centrality; see Methods). Indeed, TM5, ICL2, and H8 have a higher betweenness centrality in Gs complexes, while TM3, TM6, and ICL1 prevail in Gi/o ones (Fig. 2F). Overall, Gs contact graphs are significantly different from Gi/o ones, as assessed by comparing distances, computed as the Frobenius norm of the difference between the adjacency matrices of the interface contact graphs (permanova P = 1E-06; see Methods).
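The two graph measures used above can be illustrated with a minimal Python sketch. It assumes an unweighted, undirected SSE graph and a shared node ordering for the adjacency matrices; the edge lists are toy examples rather than data from this study, and all function names are hypothetical. The weighting and normalization actually used in the analysis are described in Methods and may differ.

```python
import math
from collections import deque


def build_adjacency(edges):
    """Return {node: set(neighbours)} for an undirected, unweighted graph."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm with BFS)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:               # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:    # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                          # accumulate dependencies
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}


def adjacency_matrix(adj, nodes):
    """Binary adjacency matrix following the given node order."""
    return [[1 if v in adj.get(u, ()) else 0 for v in nodes] for u in nodes]


def frobenius_distance(m1, m2):
    """Frobenius norm of the element-wise difference of two matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for r1, r2 in zip(m1, m2)
                         for x, y in zip(r1, r2)))


# Toy SSE-level edges (illustrative only, not taken from the paper's data).
gs_edges = [("TM5", "H5"), ("ICL2", "H5"), ("TM5", "h4s6"), ("H8", "H5")]
gio_edges = [("TM3", "H5"), ("TM6", "H5"), ("ICL1", "H5"), ("TM6", "h4s6")]
nodes = sorted({n for e in gs_edges + gio_edges for n in e})
gs_adj, gio_adj = build_adjacency(gs_edges), build_adjacency(gio_edges)
distance = frobenius_distance(adjacency_matrix(gs_adj, nodes),
                              adjacency_matrix(gio_adj, nodes))
```

Both matrices are built over the union of nodes so that a missing SSE contributes zeros rather than shifting the rows.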
Contact interface fingerprints imprint coupling specificity
We employed interface contacts to build interaction fingerprints, which are vectors that numerically encode the presence or absence of contact, and which can be used to compare in an unsupervised way GPCR-G-protein complexes based on their interface’s structural features (Fig. 1A). We have generated interface fingerprints by mapping either residue pairs at each vector position (Complex fingerprints, or CF), or contact positions separately for the receptor and G-protein (Receptor and G-protein fingerprints, respectively RF and GF; see Methods). We also estimated the contact positions that are more frequent than expected in each coupling group through log-odds ratio statistics (see Methods), and we used this information to filter the most informative contacts for Gs and Gi/o couplings (Fig. 3). CF clustering identifies two main clusters: the largest one (cluster 1) is enriched with Gs complexes from both class A and Class B, as well as some Gi/o complexes involving class A, class C and F receptors (Fig. 3 and Fig. 4A). The second cluster (i.e. cluster 2) is enriched almost exclusively with Gi/o complexes (Fig. 3 and Fig. 4A). In cluster 2, more receptors show promiscuous couplings towards other G-proteins, in particular towards G12/13 (35% vs 15% in Cluster 1) and Gq/11 (55% vs 45% in Cluster 1) (Fig. 4A). The higher promiscuity between Gi/o and G12/13 couplings is also observed when considering all couplings, including those with no available structure, reported in the Universal Coupling Map (UCM33; Supplementary Fig. 1). When relaxing the criteria to perform CF clustering by removing the log-odds ratio filter and by including all unique complexes, the clustering is mainly guided by the receptor class, with the largest cluster comprising Class A complexes and the smallest one all the other classes (Supplementary Fig. 2). Within each cluster, subclusters enriched in Gs or Gi/o complexes can be identified.
Available G12/13 and most of the Gq/11 complexes cluster within the Gi/o-enriched subcluster (Supplementary Fig. 2). In detail, the S1PR2-GNA13 structure (i.e., PDB ID: 7T6B) is clustered within the Class A, Gi/o subgroup along with other G12/13 binders, such as LPAR1 and S1PR5. Similarly, ADGRG1- and ADGRL3-GNA13 complexes are found within the Gi/o subcluster of the second, class B enriched, cluster (Supplementary Fig. 2). This suggests a structural (likely evolutionarily imprinted - see Discussion) connection between Gi/o, Gq/11 and G12/13 proteins.
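The fingerprint encoding and the enrichment filter described above can be sketched as follows. This is a minimal illustration, not the procedure from Methods: the log-odds ratio here uses a 0.5 pseudocount (Haldane-Anscombe correction), which is an assumption, and the contact sets and function names are toy examples.

```python
import math


def fingerprint(contacts, vocabulary):
    """Binary vector: 1 if the complex has the contact at that position, else 0."""
    present = set(contacts)
    return [1 if c in present else 0 for c in vocabulary]


def log_odds(contact, group_a, group_b, pseudo=0.5):
    """Log odds ratio of seeing `contact` in group_a versus group_b.

    Each group is a list of contact sets, one per complex. The pseudocount
    (an assumption of this sketch) avoids division by zero.
    """
    a_with = sum(contact in c for c in group_a)
    b_with = sum(contact in c for c in group_b)
    a_without = len(group_a) - a_with
    b_without = len(group_b) - b_with
    return math.log(((a_with + pseudo) * (b_without + pseudo)) /
                    ((a_without + pseudo) * (b_with + pseudo)))


# Toy complexes: each is a set of (receptor position, G-protein position) pairs.
gs = [{("5.65", "G.H5.26"), ("3.50", "G.H5.23")}, {("5.65", "G.H5.26")}]
gio = [{("3.49", "G.H5.23"), ("3.50", "G.H5.23")}, {("3.49", "G.H5.23")}]

vocabulary = sorted(set().union(*gs, *gio))
complex_fingerprints = [fingerprint(c, vocabulary) for c in gs + gio]
scores = {c: log_odds(c, gs, gio) for c in vocabulary}
```

Positive scores mark contacts over-represented in the first group and negative scores those over-represented in the second; a contact seen equally often in both scores zero.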
Overall, Gs couplings are characterized by a significantly higher number of enriched contacts with respect to Gi/o ones (Pmann-whitney = 4.69E-14; Fig. 4B). We also performed clustering and enrichment with fingerprints of receptors (RF) and G-proteins (GF) separately. The RF clustering chiefly points to inter-class differences, separating complexes formed by Class A receptors from those involving other classes (Supplementary Fig. 3A), while not showing particular contact enrichment differences between Gs and Gi/o complexes (Pmann-whitney = 4.4E-1; Fig. 4C). The GF clustering better separates these groups (Supplementary Fig. 3B) and displays marginally significant differences in contact enrichment distributions (Pmann-whitney = 4.4E-2; Fig. 4C). This suggests that the combination of G-protein and receptor residues provides maximum fine-tuning to the recognition process (Fig. 4B).
Complex fingerprint clustering and heatmaps helped visualize the contact positions that are characteristic of certain G-protein couplings (Figs. 3A and 4D, E). For instance, several contacts are observed more frequently in Gs complexes, including ICL2-s2s3.1, ICL2.52-G.S3.1, 5.65-G.H5.26, 5.68-G.H5.26, 5.72-G.h4s6.20, ICL3-h4s6.3, 6.39-G.H5.24, 6.39-G.H5.25, 6.40-G.H5.25. The latter positions (i.e. 6.39 and 6.40) are examples of Gs, class B-specific contacts (Fig. 3 and Fig. 4D, F). These are favored by the TM6 break characteristic of Class B receptors34, which allows residues on the N-terminal half of TM6 to approach the C-terminus of the G-protein H5. On the other hand, the following contacts are exclusively enriched in Gi/o complexes: 2.39-G.H5.24, 3.49-G.H5.23, 3.50-G.H5.25, 3.53-G.H5.22 (or G.H5.23). Particularly striking is the enrichment of the contact involving the highly conserved D3.49, which is found exclusively in Gi/o complexes (Fig. 3 and Fig. 4E, G). Contacts mediated by the DRY R3.50 or 3.53 positions, while enriched in Gi/o complexes (Fig. 3), also mediate contacts in Gs complexes (e.g. 3.50-G.H5.23, Fig. 4D, E). Other contacts specifically enriched in Gi/o complexes are: ICL2.51-G.hns1.3, ICL2.51-G.s2s3.2, ICL2.54-G.S3.1, 5.71-G.H5.9, 6.25-G.h4s6.9, 6.28-G.h4s6.9, 6.29-G.H5.17, 6.29-G.H5.26, 6.33-G.H5.20 (G.H5.25) as well as all the contacts mediated by positions 8.47, 8.48, 8.49, 8.50 (Figs. 3A and 4E).
Overall, GPCR positions such as 5.61, 5.64, 5.65, 5.69, 5.76, 6.39, and 6.40 are enriched in Gs complexes (Fig. 4D), while positions 2.37, 2.39, 3.49, ICL2.50-2.55, 6.25, 6.29, 6.33, 6.34, 6.37, 7.53, or 8.50 are enriched in Gi/o complexes (Fig. 4E). Likewise, the G-protein contact positions specifically enriched in Gs complexes are h4s6.20, h4s6.3, H4.26, H5.11 (Fig. 4D), while positions h4s6.9, h4s6.12, s6.1, H5.9, H5.21, H5.22, H5.26 are enriched in Gi/o complexes (Fig. 4E).
Notably, certain GPCR positions show switch-like behavior: some of the contacts that they mediate are enriched in Gs and others in Gi/o, depending on the partner residues. For example, the contact of 5.65 with G.H5.16 is enriched in Gi/o, while the ones with G.H5.25 and G.H5.26 are enriched in Gs. Similar patterns are observed for distinct contacts mediated by positions 5.68, 5.69 and 5.72 (Fig. 3 and Fig. 4D, E).
Switching G-protein selectivity through contact interface mutation
To demonstrate the effect of the identified contact fingerprints on G-protein coupling, we employed a multistate-design computational protocol35 to design receptor mutants specific for a selected G-protein and with reduced affinity for the others, which we then validated through the NanoBiT G-protein dissociation assay14 (Fig. 5A; Methods). We chose as starting templates for the design the structures of CCKAR, which has been solved in complex with both Gs (PDB ID: 7EZK) and Gi/o (PDB ID: 7EZH), and we focused the mutagenesis on the receptor positions forming the contact pairs most enriched in Gs or Gi/o complexes (Supplementary Data 4). We carried out two sets of designs: on the one hand, we sought to retain Gs while removing Gi/o couplings, and on the other, we maintained Gi/o while reducing binding to Gs (Fig. 5A). We found that certain mutations were more recurrent in top-designed sequences (Fig. 5A; see Methods). In particular, mutations A3036.25K, V3116.33H, K3758.48R and R3768.49L were predicted to reduce Gi/o while retaining Gs binding, whereas mutations S149ICL2.53A, V151ICL2.55K, K3086.30R, and K3758.48P were recurrently predicted to reduce Gs while retaining Gi/o binding (Fig. 5A–D). These were subsequently tested through the NanoBiT G-protein dissociation assay using the NanoBiT-Gs and NanoBiT-Gi1 sensors. Among the eight designed mutations, two (V3116.33H and R3768.49V) and one (K3086.30R) were found to be Gs-over-Gi biased and Gi-over-Gs biased, respectively, and the remaining five had no effect on the Gs-vs-Gi balance (Fig. 5E–G and Supplementary Fig. 4). We note that the expression level of V3116.33H was equivalent to that of WT (1:4). These data confirmed the importance of the identified contacts in switching coupling preferences between these two G-proteins.
Different repertoires of G proteins docking modes
We assessed the overall 3D similarity of GPCR-G-protein complexes via structural alignment, with a particular focus on the docking mode similarity of the G-protein α-subunits with respect to the receptor. To this end, we first superimposed the Cα-atoms of the most conserved positions within the 7TM bundle (i.e. that are present in all the solved structures) and then calculated the Root Mean Squared Deviation (RMSD) of the Cα-atoms of conserved positions of the Gα subunit (Fig. 1A; see Methods). The clustering of 3D complexes based on their RMSD shows that Gs complexes tend to cluster separately from Gi/o ones (Pmann-whitney = 2.5E-14; P-permanova = 1E-6; Fig. 6, Fig. 7A and Supplementary Fig. 5). The largest cluster comprises only Class A receptors, the vast majority bound to Gi/o proteins, with the only exception of a few Gs complexes (i.e. MC2R, GALR2), as well as the BDKRB2-GNAQ and S1PR2-GNA13 complexes (Fig. 6). The second largest cluster involves the vast majority of Gs complexes, including Class A and the totality of class B receptors. This cluster also comprises several Gq/11 (i.e. CCKAR, CCKBR, CRHR2, HRH1), Gi/o (i.e. OPRK1, OPRD1, OPRL1, OPRM1, CX3CR1, ADGRF1 and ADGRG3) as well as two G12/13 complexes (i.e. ADGRG1 and ADGRL3). Finally, a third, smaller outgroup cluster contains complexes involving class A, C, and F receptors most deviating from the other structures. Also in this case, the receptors in the largest, Gi/o enriched cluster show a more promiscuous tendency, with Gq/11 and G12/13 as most recurrent secondary couplings and Gs as the least recurrent one (Fig. 6 and Fig. 7A). When considering only class A receptors, the RMSD distributions are no longer significantly different (Pmann-whitney = 6.7E-1; Supplementary Fig. 6), although characterized by significantly different centroids (P-permanova = 1E-6). 
We have also estimated residue-level deviations of the Gα subunits of the fitted complexes by calculating Root Mean Square Fluctuations (RMSF; see Methods) and compared the profiles obtained for Gs and Gi/o complexes, which highlighted significantly higher fluctuations for Gi/o complexes with respect to Gs ones (Wilcoxon P = 1.18E-13), for the whole Ras GTPase domain and in particular for regions such as H2, H3 as well as the C-terminal lobe of the Ras domain (Fig. 7C). Overall, Gs complexes display less variability of the terminals, with H5 appearing more conformationally restrained and bent towards TM3 and ICL2, while Gi/o complexes display greater conformational variability for αN and H5 (Fig. 7D left). A comparison of representative structures of the four coupling groups shows slight differences in the docking mode of each representative, which are nevertheless smaller for Gi/o, Gq, and G12/13 (Fig. 7E).
We also explored the potential conformational bias of the nanobodies (i.e. Fab16 or Nb35) used to stabilize the bound G-protein on the observed G-protein docking modes. First, we annotated the presence/absence of the nanobody for each complex subjected to RMSD clustering. We observed no correlation between RMSD clusters and the presence or absence of nanobodies (Fig. 6 and Supplementary Fig. 5). Second, we relaxed the GPCR-heterotrimeric G-protein complex without such nanobodies, using state-of-the-art methods for structural refinement (Rosetta relax; see Methods). Both RMSD and RMSF analyses performed on relaxed structures showed even larger statistically significant differences between Gs and Gi/o complexes (Pmann-whitney = 1.1E-36; Supplementary Fig. 7). Notably, the difference between the Gs and Gi/o RMSD distributions is still significant when considering only class A receptors (Pmann-whitney = 6.5E-6; Supplementary Fig. 8).
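As a minimal sketch of the two deviation measures used in this section, the Python functions below compute the RMSD between two sets of Gα Cα coordinates and the per-residue RMSF across several structures. They assume that all complexes have already been superimposed on the conserved 7TM positions, so no further fitting is applied, and that coordinates are listed for the same conserved positions in the same order. The RMSF is taken around the mean position across structures, which is an assumption of this sketch; coordinates and function names are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, without refitting."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def rmsf(structures):
    """Per-residue RMSF across structures.

    `structures` is a list of coordinate lists; entry i of every list must
    refer to the same residue. Fluctuations are measured around the mean
    position of each residue across all structures.
    """
    n = len(structures)
    profile = []
    for i in range(len(structures[0])):
        points = [s[i] for s in structures]
        mean = tuple(sum(p[k] for p in points) / n for k in range(3))
        profile.append(math.sqrt(sum(math.dist(p, mean) ** 2 for p in points) / n))
    return profile


# Toy, pre-superimposed Gα Cα coordinates for two complexes (two residues each).
complex_1 = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
complex_2 = [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)]

pair_rmsd = rmsd(complex_1, complex_2)
profile = rmsf([complex_1, complex_2])
```

Computing the RMSD without refitting the Gα atoms is what makes the value report on the docking mode relative to the receptor rather than on internal Gα differences.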
Different energies characterize specific GPCR-G-protein interfaces
We exploited the relaxed GPCR-heterotrimeric G-protein complexes to further characterize the binding interface energy of the complex using Rosetta InterfaceAnalyzer36 (see Methods). By considering all available GPCR Gα-subunit pairs with an experimentally resolved complex, we showed that the ∆G of binding of Gs complexes is significantly lower than Gi/o complexes (Pmann-whitney = 7.2E-6; Fig. 8A) and it partially correlates with the slightly higher ∆SASA observed for Gs complexes compared to Gi/o ones (Pmann-whitney = 1.4E-3; Fig. 8B). When considering class A receptors only, the difference in binding energy distribution between Gs and Gi/o complexes is even larger (Pmann-whitney = 3.4E-6; Supplementary Fig. 9). Intriguingly, we observed that Gs is bound less strongly to class B1 than class A receptors (Pmann-whitney = 2.3E-3; Fig. 8C), suggesting that receptors from different classes might bind to the same G-protein with different affinities due to different structural and functional requirements. On the other hand, the same receptor always binds with higher affinity to Gs than Gi/o. Indeed, we have compared the binding energies of complexes of the same receptor (e.g. GCGR, CCKAR, and HTR4) with both Gs and Gi/o proteins. Notably, the ∆G of binding for Gs is always lower and is characterized by higher ∆SASA compared to Gi/o irrespective of the slight docking modes variations observed in Gs complex structures of the same receptor (Fig. 8D–F).
AlphaFold2 predictions extend our understanding of the structural basis of coupling diversity
To further our understanding of the structural basis of GPCR-G-protein recognition, we predicted through AlphaFold-multimer (v2.3)37 996 GPCR - G-protein alpha subunit pairs from UCM binary interactions (see Methods). We first benchmarked AF-multimer by assessing the prediction performances with or without 3D templates for 125 representative GPCR-G-protein complexes with available experimental structures. Using 3D templates in the prediction slightly improves the performances, assessed by measuring the deviation of predicted vs. experimental interfaces via either the DockQ metric (Wilcoxon P = 0.021; Fig. 9A) or the fraction of native contacts (Wilcoxon P = 0.004; Fig. 9B). It must be emphasized that even without templates multimer’s predictions achieve almost comparable performances, yielding an average DockQ score of 0.645 vs. 0.664 obtained for predictions with templates (Fig. 9A). We, therefore, ran multimer predictions with available templates for receptor-heterotrimeric G-protein complexes whose receptor-α subunit binary interactions are reported in the UCM dataset. Since some predicted complexes showed unrealistic docking topology, we created a composite filter based on structural, topological, and statistical scoring metrics (e.g. pDockQ) to remove predicted structures with unrealistic complex topologies as well as regions predicted with low confidence (based on pLDDT; see Methods). Such a filtering scheme improves the correlation between prediction scores (pDockQ) and distance from experimental structures (DockQ) in the benchmark (Fig. 9C, D). We applied this filter to the multimer’s predictions, yielding a set of 825 complexes for downstream processing (Supplementary Data 5). Contact analysis performed on predicted complexes revealed patterns similar to those observed experimentally, in particular, the models recapitulated most of the experimental contacts of Gs and Gi/o complexes (Fig. 9E). 
As expected, the agreement of contact patterns from experimental and predicted Gq/11 and G12/13 complexes is lower due to fewer structural templates available for these coupling groups (Fig. 9E). Notably, contact heatmaps derived from each G-protein group display highly specific patterns, which could potentially illuminate signaling mechanisms for poorly studied G-proteins such as G12/13. For instance, contact frequency heatmaps for G12/13 complexes display peculiar patterns mediated by TM5, ICL3, and TM6 with G.H5. Several of the contacts observed for other G-protein complexes at positions G.H5.25 and G.H5.26 are missing (Fig. 9E). On the other hand, contacts mediated by positions G.H5.23 and G.H5.24 with 6.33, as well as 6.29, appear as highly specific for G12/13. We also confirmed on predicted structures the differences in binding energies observed between Gs and Gi/o complexes (Pmann-whitney = 9.8E-4; Fig. 9F). We also found that Gq/11 complexes are as stable as Gs, while G12/13 ones have lower affinity, similarly to Gi/o (Pmann-whitney = 9.8E-4; Fig. 9F). Such differences in binding energies are anti-correlated with differences in Interface Area (Fig. 9H). Additionally, we found interclass significant differences for GsPCRs, with ClassA being the most and ClassC the least stable (Pmann-whitney = 5.8E-3; Fig. 9H).
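The fraction of native contacts used in the benchmark above can be written down in a few lines. The sketch below is illustrative: it takes contacts as already-computed sets of residue pairs, whereas the reference DockQ implementation derives them from atomic coordinates with its own distance cutoff. The contact labels and function name are toy examples.

```python
def fraction_native_contacts(predicted, native):
    """Share of experimental (native) interface contacts found in the model."""
    native = set(native)
    if not native:
        raise ValueError("the native interface has no contacts")
    return len(native & set(predicted)) / len(native)


# Toy interface contacts as (receptor position, G-protein position) pairs.
native = {("3.50", "G.H5.23"), ("5.65", "G.H5.26"),
          ("6.33", "G.H5.25"), ("ICL2.51", "G.s2s3.2")}
predicted = {("3.50", "G.H5.23"), ("5.65", "G.H5.26"), ("6.29", "G.H5.26")}

fnat = fraction_native_contacts(predicted, native)
```

Contacts that appear only in the model do not lower this score, which is why it is usually read together with a measure such as DockQ that also accounts for interface geometry.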
Discussion
In the present study, we have performed a computational, comparative analysis of 3D GPCR-G-protein complexes in their nucleotide-free state, to identify structural hallmarks of the interaction interfaces that might be linked to coupling specificity.
Complexes involving different G-proteins are characterized by distinctive structural signatures, like contact networks displaying a different engagement of secondary structural elements such as TM5, TM6, and ICLs. This notion is in line with earlier comparative analyses highlighting a selectivity filter operated by TM5 and TM6 (commented on28). More recently, structural determination of serotonin receptors in complex with either Gs or Gi proteins, accompanied by bioinformatics analysis of a representative set of class A complexes, supported the role of these secondary structure elements by highlighting a macro-switch, operated by TM5 and TM6 terminals’ variable length, which dictates selectivity towards Gs vs Gi/o30. Our unsupervised analysis of interface contacts entails complementary interactions between key positions on both the receptor and G-protein sides. Gs complexes are characterized by significantly higher fractions of enriched contacts, which are mainly imposed by contacting positions on the G-protein’s side. However, the combination of GPCR and G-protein residues leads to maximum discrimination between the Gs and Gi/o groups. These results are reminiscent of the G-protein barcode model for coupling specificity, which emerges from the presentation of an evolutionarily more rigid G-protein barcode to a more flexible receptor counterpart20. At the same time, Gs-bound receptors are characterized by less promiscuous binding, suggesting that the structural requirements for Gs-specific binding overall impose discrimination against secondary couplings.
Certain SSEs are exclusively characterized by contact enrichments for a specific G-protein (e.g. TM2, TM3 or H8 for Gi/o). Notably, the DRY motif-mediated contacts, particularly those of D3.49, can be considered one of the main structural hallmarks of Gi/o vs Gs complexes. The latter is indeed characterized by a greater bending of the G-protein’s H5 towards the C-terminus of TM3 and ICL2, which function as hinges to concomitantly detach the G-protein C-terminus from the DRY motif. Other regions are characterized by a switch-like character, meaning that certain positions form contacts enriched in Gi/o and others in Gs, including: ICL2, which encompasses the known ICL2.51 specificity determinant position38, TM5, ICL3 and TM6. We noted that the switching behavior might also pertain to individual positions and depend on the specific contact partner. These residue-level features likely work in coordination with the macro-switch described for the serotonin receptors30. We leveraged our findings to implement a computational multi-state design procedure to predict mutations biasing a given coupling, starting from a promiscuous WT sequence (e.g. CCKAR), which we experimentally verified via NanoBiT G-protein dissociation assays. Experiments identified two mutations (V3116.33H and R3768.49V) to bias Gs and one (K3086.30R) to bias Gi signaling, confirming the importance of the identified contacts in switching coupling preferences between these two G-proteins.
Through our comprehensive analysis, we show that the few Gq/11 and G13 complexes available display certain structural characteristics similar to Gi/o complexes. These correlate with recent phylogenetic analysis showing that Gi and Gq family members share a common ancestor, and that G12/13’s ancestor is likely a retro-gene originated by retroposition from a pre-Gq gene39.
The use of a state-of-the-art AI model (i.e. AlphaFold-multimer37) for structural prediction also allowed us to expand the structural repertoire of GPCR-G-protein complexes. This is particularly valuable for poorly characterized groups, such as G12/13 ones. In particular, we predicted peculiar contact patterns at TM5 and ICL3 that are characteristic of this group and might suggest unique structural requirements. The critical importance of these regions also emerged in our previous effort to engineer a G12-DREADD, which was achieved by swapping shorter ICL3 loops from GPR183 or GPR132 on hM3D and experimentally validated to be functional14.
The observed structural differences are linked with different predicted binding affinities for the distinct coupling group complexes, with Gs complexes being more stable than Gi/o ones. Slight differences in interface contacts and docking mode might reflect the difference in binding affinities for Gs observed for class A vs. class B receptors. We speculate that the higher binding energy of class B receptors, suggestive of less stable binding to Gs, might be related to slower rates for G-protein activation observed for a representative class B receptor (GCGR) compared to a class A one (ADRB2)34. On the other hand, a comparison of the binding energies of Gs and Gi/o complexes also revealed significant differences, which hold true even when considering the complexes with the same receptor, further stressing the major contribution from the G-protein residues in priming the binding. Such differences in binding energies are also confirmed in AF-multimer predicted models, which also allow us to characterize the members of other families such as Gq/11 and G12/13. We speculate that Gs has a higher energetic barrier for activation due to its peculiar biological role of AC activation, which should be tightly regulated. The conformational restriction in Gs complexes might contribute to spatio-temporally fine-tune AC activation. On the other hand, looser Gi/o binding to cognate receptors could explain their reported faster nucleotide turnover40. Moreover, the lower structural conservation and lower affinity of Gi/o complexes are likely connected to the success of this coupling14,33, which is instrumental in providing a redundant mechanism for AC inhibition. Structural features and energetics are certainly not the only factors governing the evolutionary success of a certain coupling: indeed G12/13, which form complexes predicted to be as weak as the Gi/o ones, are the least successful couplings.
Higher-order spatio-temporal dynamics, such as the more recent evolution of G12/1339, might explain these patterns.
A limitation of this study is that most of the structures considered are in the nucleotide-free state. Indeed, previous studies have suggested that intermediate states other than the nucleotide-free complexes influence G-protein-coupling selectivity41. Nucleotide-decoupled G-protein mutants, which bypass the intermediate-state complexes, are characterized by a degradation of coupling selectivity, indirectly highlighting the importance of the conformational dynamics of these intermediate-state complexes in guiding selectivity42. Future structural studies systematically targeting the intermediate states of GPCR-G-protein complexes will improve the understanding of coupling selectivity.
The greater availability of experimental complex structures and increasingly accurate predicted models, including context-aware and diffusion-based deep learning techniques to predict alternative conformational states, will allow us to better understand the structural basis of G-protein-coupling specificity in the future. This knowledge will be key to designing better-biased drugs, able to modulate only certain transducers, and will be leveraged to improve the design of novel chemogenetic probes such as DREADDs.
Methods
Data sets
We used the mapping between PDB43 and Pfam44 provided by SIFTS45 (2023/01/28 update) to retrieve all the structures containing both a GPCR and a G-protein α subunit. We used the Pfam entry PF00503 to identify structures of G-α subunits, and Pfam entries PF00001 (rhodopsin receptor family - class A), PF00002 (secretin receptor family - class B), PF00003 (class C receptors), PF01534 (Frizzled/Smoothened family) to identify GPCRs. We found 376 structures that met this criterion. If more than one GPCR or Gα chain were in the structure, we considered the pair of chains with the highest number of contacts between them (Supplementary Data 1).
We considered as contacts all the pairs of amino acids with less than 8 Å between their Cβ atoms (Cɑ for glycine - see below), following standard practices employed for contact analysis in structural predictions46. Only the structures in which all the contacts between the GPCR and Gα chains were mapped to the same pair of UniProt accessions (according to SIFTS XML residue-level mappings) were kept for further analysis. The remaining structures were considered chimeric and excluded. In this way, we obtained 362 structures.
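The contact criterion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Residue` record and the `interface_contacts` function are hypothetical names, coordinates are assumed to be already parsed from the structure file, and this is not the script used in the study.

```python
import math
from typing import NamedTuple, Optional, Tuple

Coord = Tuple[float, float, float]

class Residue(NamedTuple):
    """Hypothetical minimal residue record (not a Biopython type)."""
    chain: str
    number: int
    name: str                   # three-letter residue name
    ca: Coord                   # C-alpha coordinates in angstrom
    cb: Optional[Coord] = None  # C-beta coordinates; absent for glycine

def reference_atom(res: Residue) -> Coord:
    # C-beta is the reference atom, except for glycine where C-alpha is used.
    return res.ca if res.name == "GLY" or res.cb is None else res.cb

def interface_contacts(gpcr, galpha, cutoff=8.0):
    """Return (GPCR residue number, G-alpha residue number) pairs whose
    reference atoms are closer than `cutoff` angstrom."""
    pairs = []
    for r in gpcr:
        for g in galpha:
            if math.dist(reference_atom(r), reference_atom(g)) < cutoff:
                pairs.append((r.number, g.number))
    return pairs
```

The all-against-all loop is quadratic in the number of residues, which is acceptable for a single receptor/Gα chain pair; a spatial index would be preferable for larger screens.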
Whenever we found more than one structure representing the same GPCR-G-protein pair, we considered the one with the highest resolution to be the representative for that complex. When more than one structure had the same resolution, we chose the structure that covered the largest portion of the corresponding G-protein according to SIFTS. We found 125 GPCR-G-protein pairs with at least one solved structure (Supplementary Data 1).
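The selection rule for representative structures amounts to a sort with a tie-breaker. A minimal sketch follows, assuming each structure is described by a dictionary with hypothetical keys (`pdb`, `gpcr`, `gprotein`, `resolution` in Å, and `coverage` as the fraction of the G-protein sequence covered); these names are illustrative and not taken from the authors' code.

```python
def pick_representative(structures):
    """Pick the best structure of one GPCR-G-protein pair.

    A smaller resolution value means a higher resolution; ties are broken
    by the larger G-protein coverage.
    """
    return min(structures, key=lambda s: (s["resolution"], -s["coverage"]))

def representatives(structures):
    """Group structures by (GPCR, G-protein) pair and pick one per pair."""
    by_pair = {}
    for s in structures:
        by_pair.setdefault((s["gpcr"], s["gprotein"]), []).append(s)
    return {pair: pick_representative(group) for pair, group in by_pair.items()}
```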
GPCR-Gα complexes prediction via AlphaFold-Multimer
We used Alphafold-Multimer v2.3.13732 to generate 3D models of GPCR-heterotrimeric G-protein complexes. The human canonical protein sequences of GNB1 and GNG1 were used as the β and γ subunits, respectively. We trimmed the N-terminal part of the GPCRs up to 50 residues before the start of TM1, to avoid folding the long extracellular portion of some receptors. The databases required to run AlphaFold-Multimer were downloaded on 12 January 2023. Among the 5 models generated for each GPCR-G-protein complex, only the best one was considered for further analysis. The score used to evaluate the models was the default one used by AlphaFold-Multimer (0.2*pTM + 0.8*ipTM).
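The model-selection step can be illustrated with the ranking score quoted above (0.2*pTM + 0.8*ipTM). The sketch below assumes the pTM and ipTM values of the five models have already been read from the prediction output; the dictionary layout and function names are hypothetical.

```python
def ranking_confidence(ptm, iptm):
    """Default AlphaFold-Multimer ranking score, as quoted in the text."""
    return 0.2 * ptm + 0.8 * iptm

def best_model(models):
    """Return the model with the highest ranking score.

    `models` is a list of dicts with hypothetical keys 'name', 'ptm', 'iptm'.
    """
    return max(models, key=lambda m: ranking_confidence(m["ptm"], m["iptm"]))
```

Because ipTM carries 80% of the weight, a model with a well-predicted interface is preferred over one with a slightly better overall fold but a poorer interface.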
To assess the reliability of AF-multimer models, we predicted the structure of the 125 GPCR-G-protein complexes with available experimental structures. We used the human sequences of the corresponding GPCR-Gα pairs in all predictions. We generated 3D models “using templates”, that is allowing AlphaFold-Multimer to use the 3D structural models available in the PDB (they are used only to predict the structure of single chains), and “without templates”, running the predictions setting --max_template_date=01-01-1900 instead, to avoid the usage of any available experimental template.
A total of 996 GPCR-Gɑ pairs, reported to bind in UCM33, were considered, corresponding to 164 human GPCRs and 13 Gα proteins (the three members of the transducin family were not considered). We removed 21 predicted complexes (975 structures left) with unrealistic complex topologies, i.e. those in which none of the G-protein H5 positions that we found to most recurrently mediate contacts with GPCR residues in experimental structures (H5.16, H5.19, H5.20, H5.23, H5.24 and H5.25) was in contact with the receptor. To further remove low-quality models produced by AlphaFold, we used the pDockQ metric47, which is fine-tuned to predict the DockQ of a predicted complex with respect to experimental complexes. We kept only the structures with pDockQ ≥ 0.2348 (118 out of 975 structures removed)36.
We then employed AlphaFold’s confidence score (predicted Local Distance Difference Test, pLDDT) to remove protein termini predicted with low confidence, which might lead to artefactual contacts. Hence, we removed all the residues with pLDDT < 70. We then performed the interface analysis on these trimmed sequences using the same procedure used for experimental complexes. As a last filter to remove low-quality structures, we used the output of Rosetta InterfaceAnalyzer36 (https://www.rosettacommons.org/docs/latest/application_documentation/analysis/interface-analyzer; see the paragraph “Analysis of GPCR-Gα subunit binding energy with Rosetta” below) and we kept only models with ∆SASA ≥ 1500 Å², which is slightly less than the minimum ∆SASA found in all experimental structures. In this way we removed 32 more structures, leading to a total of 825 high-quality complex structures for downstream analysis (Supplementary Data 5).
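The successive quality filters can be summarized as a small set of predicates. The thresholds below are the ones stated in the text; the data layout (dictionaries with `plddt`, `pdockq`, `dsasa` and `galpha_contact_positions` keys) and the function names are assumptions made for illustration only.

```python
PDOCKQ_MIN = 0.23   # pDockQ threshold from the text
PLDDT_MIN = 70.0    # residues below this pLDDT are trimmed
DSASA_MIN = 1500.0  # minimum buried surface area, in square angstrom
H5_KEY_POSITIONS = {"H5.16", "H5.19", "H5.20", "H5.23", "H5.24", "H5.25"}

def trim_low_confidence(residues):
    """Keep only residues predicted with pLDDT >= 70."""
    return [r for r in residues if r["plddt"] >= PLDDT_MIN]

def has_realistic_topology(contact_positions):
    """True if at least one recurrent H5 position is in contact with the receptor."""
    return bool(H5_KEY_POSITIONS & set(contact_positions))

def passes_filters(model):
    """Apply the topology, pDockQ and delta-SASA filters to one model."""
    return (
        has_realistic_topology(model["galpha_contact_positions"])
        and model["pdockq"] >= PDOCKQ_MIN
        and model["dsasa"] >= DSASA_MIN
    )
```

As a bookkeeping check on the numbers reported above, 996 - 21 - 118 - 32 = 825 models remain after the three removal steps.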
Contact analysis
We considered residue-residue contacts mediating GPCR-G-protein interfaces as those having the Cβ atoms spatially closer than 8 Å (Cɑ for glycine) (as in46). We analyzed 362 solved PDB GPCR-G-protein complexes, which according to G-protein family classification comprised 184 Gi/o, 166 Gs, 9 Gq/11, and 3 G12/13 structures. We mapped interface PDB residues to UniProt canonical sequence residues by using SIFTS residue-level mappings from individual PDB XML files. The interface sequence positions were then mapped to GPCRdb numbering31 (Supplementary Data 2) and to the Common G-protein Numbering (CGN) scheme32 (Supplementary Data 3). We aggregated contacts of different structures referring to the same GPCR-G-protein complex and considered equivalent residue pairs from different structures only once to avoid redundancy. We compared Gi/o and Gs complexes by creating a consensus list of GPCR and G-protein positions found in contact with at least one of the members of the two groups. For each pair of GPCR-G-protein consensus positions, we calculated the fraction of GPCRs displaying a contact in each G-protein family group (either Gi/o or Gs) over the total number of GPCRs in contact. Such a fraction reflects the conservation of the contact in each G-protein group. We constructed an interface contact network by considering the consensus list of GPCR-G-protein contacts from all complexes. Contacting positions were projected to the secondary structure elements (SSEs) of both GPCRs and G-proteins, represented as nodes of the networks. Node diameter is proportional to the total number of contacts mediated by the positions of that secondary structure element. Edge width is proportional to the number of unique GPCRs mediating contacts between the two linked SSEs. Edge coloring (bright to dark red) is proportional to the average of the contact fraction of individual position pairs, and it reflects the overall contact conservation between two SSEs. Dashed lines indicate contacts not formed in a given G-protein class but present in the other. We calculated network statistics such as node degree and betweenness centrality distributions through Cytoscape (https://cytoscape.org/)49 and customized Python scripts. All the analyses were performed through customized Python scripts, using the Biopython library (version 1.78). Network drawings were generated through Cytoscape.
To analyze the statistical significance of the difference between Gs and Gi/o contact fingerprints, we generated an inter-chain contact graph for each structure. We then used the Frobenius norm of the difference between the adjacency matrices of the inter-chain contact graphs as a distance metric for the structures. The resulting distance matrix was used to perform a PERMANOVA test50 to determine if the graphs generated by interaction with the Gs and the Gi/o family were significantly different from each other. This test compares the difference in docking modes within the same G-protein family to the difference in docking modes between different G-protein families. To evaluate the difference in variance between the distributions of intra-family pairwise distances, we used the PERMDISP test. Both tests were performed using the scikit-bio library (version 0.5.4) in Python (https://cran.r-project.org/web/packages/vegan/index.html). We performed 10⁶ and 10⁴ permutations for PERMANOVA and PERMDISP, respectively.
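To make the distance metric and the permutation test concrete, the following is a plain-Python sketch of a Frobenius-norm distance matrix followed by a one-way PERMANOVA pseudo-F test. It is a simplified re-implementation for illustration, not the scikit-bio code used in the study; function names are hypothetical, and the adjacency matrices are assumed to share the same consensus rows and columns.

```python
import math
import random

def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equally shaped matrices."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

def distance_matrix(matrices):
    n = len(matrices)
    return [[frobenius_distance(matrices[i], matrices[j]) for j in range(n)] for i in range(n)]

def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F statistic computed from a distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(dist, labels, permutations=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

The sketch assumes non-zero within-group variation (otherwise the pseudo-F denominator vanishes). For production use, a maintained implementation such as the one in scikit-bio is the safer choice.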
Analysis was also done on AlphaFold-predicted complexes, considering 430 Gi/o, 265 Gq/11, 87 Gs and 43 G12/13 structures according to UCM. Networks were created for all position pairs as well as for the subset of position pairs present in a fraction of complexes greater than 0.2.
Fingerprint analysis
We generated interface fingerprints as binary position vectors encoding the contacts (1 if present, 0 otherwise) of either residue pairs (Complex fingerprints, or CF) or individual positions, separately for the receptor and the G-protein (Receptor and G-protein fingerprints, RF and GF respectively). We performed unsupervised, hierarchical clustering on rows (unique GPCR-G-protein complexes) using the “Ward” method and Euclidean distance as the metric, employing the clustermap function from the seaborn library (version 0.11.1). Rows are color annotated to indicate the G-protein bound in the experimental structure, the GPCR class, and the experimentally reported couplings (according to UCM). The top plot indicates the enrichment of the contacts observed at each position.
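The construction of the binary fingerprints can be sketched as follows. The input layout (a dictionary mapping a complex identifier to its set of contacting position pairs) and the function names are assumptions for illustration; the clustering itself, done in the study with seaborn's clustermap, is not reproduced here.

```python
def build_fingerprints(contacts_by_complex):
    """Build complex fingerprints (CF).

    contacts_by_complex: {complex_id: set of (gpcr_position, gprotein_position)}
    Returns the sorted list of consensus columns and one 0/1 vector per complex.
    """
    columns = sorted(set().union(*contacts_by_complex.values()))
    fingerprints = {
        cid: [1 if pair in pairs else 0 for pair in columns]
        for cid, pairs in contacts_by_complex.items()
    }
    return columns, fingerprints

def project(pairs, side):
    """Project contact pairs onto individual positions.

    side=0 gives receptor positions (for RF), side=1 gives G-protein positions (for GF).
    """
    return {pair[side] for pair in pairs}
```

The resulting rows can be stacked into a matrix and passed to any hierarchical clustering routine using Ward linkage and Euclidean distance.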
For each consensus position of GPCR and G-protein we calculated the log-odds ratio (LOR) from contingency Table 1 using the following equation:

$$\mathrm{LOR} = \log\left(\frac{CC \times NN}{CN \times NC}\right) \qquad (1)$$
Table 1.
Contact pair/G-protein | Contact | No contact |
---|---|---|
Coupled | CC | CN |
Not coupled | NC | NN |
CC and CN terms represent the number of GPCRs coupled to a specific G-protein group (Gi/o or Gs) that are or are not, respectively, in contact at that position (either individual GPCR or G-protein positions or residue pairs). NC and NN terms represent the number of non-coupled GPCRs for a specific G-protein that are or are not in contact, respectively, at a given position (either individual GPCR or G-protein positions or residue pairs). Contacts contributed from the loops, N-termini and C-termini of the GPCR were aggregated. We calculated the binning statistics of the log-odds ratio of contacts.
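A worked example of the log-odds ratio computed from the four cells of Table 1 is given below. The use of the natural logarithm and the optional `pseudocount` argument (which avoids division by zero when a cell is empty) are assumptions of this sketch; the text does not state the logarithm base or how empty cells were handled.

```python
import math

def log_odds_ratio(cc, cn, nc, nn, pseudocount=0.0):
    """Log-odds ratio of contact vs. coupling from a 2x2 contingency table.

    cc: coupled GPCRs in contact        cn: coupled GPCRs not in contact
    nc: non-coupled GPCRs in contact    nn: non-coupled GPCRs not in contact
    `pseudocount` is a hypothetical option added to every cell; with the
    default of 0.0 an empty cell raises an error.
    """
    a, b, c, d = (x + pseudocount for x in (cc, cn, nc, nn))
    return math.log((a * d) / (b * c))
```

For instance, with 10 coupled receptors in contact, 5 coupled receptors not in contact, 5 non-coupled receptors in contact and 10 non-coupled receptors not in contact, the odds ratio is (10 × 10)/(5 × 5) = 4, so the LOR is log 4 ≈ 1.39, indicating a contact enriched among coupled receptors.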
Rosetta multistate design predictions of CCKAR coupling switching mutations
To bias Gs over Gi/o coupling, or vice versa, we designed mutations through the multistate design protocol of Rosetta35,51, as available in the RosettaCommon software suite (version 2021.16.61629). As a starting template for the design, we chose CCKAR (UniProt Accession: P32238), a cholecystokinin receptor that has been solved in complex with Gs (PDB ID: 7EZK), Gi/o (PDB ID: 7EZH) and Gq/11 (PDB ID: 7EZM)52. The CCKAR positions selected for design were either involved in specific contacts, based on comparative contact analysis of the Gs (7EZK) and Gi/o (7EZH) experimental complexes, or those with the highest log-odds ratios from GPCRome-wide contact pair statistics. To switch Gs coupling, we chose to mutate the following positions: 3.54, ICL2.51, ICL2.52, ICL2.53, ICL2.55, ICL3(299), 6.26, 6.30, 6.32, 6.33, 6.36, 7.56, 8.47, 8.48, and 8.51. To switch Gi/o coupling we chose the following positions: 2.39, 3.49, 3.50, 3.53, ICL2.51, 6.25, 6.26, 6.29, 6.33, 8.48, 8.49, and 8.50. Next, we designated positive-state and negative-state complexes (7EZK or 7EZH), the positive state being the one whose structure and binding energy are preserved by the mutations and the negative state being the one that is destabilized by them. We used a custom fitness function to describe the energy states of the binding interfaces of both complexes with 12 different binding weights (1–12), obtaining for each weight the best-fitted structure with a different sequence of mutations destabilizing either Gs or Gi/o coupling. We calculated the binding energy of the interface with InterfaceAnalyzer (https://www.rosettacommons.org/docs/latest/application_documentation/analysis/interface-analyzer) and redocked the structures with RosettaDock using local refinement of the interface between GPCR and G-protein structures53. We chose the best redocked structures based on the total score of the redocking tool and calculated the interface binding energy (REU) with InterfaceAnalyzer.
To redock the mutated GPCR-G-protein complexes we used the Rosetta docking protocol with the docking:docking_local_refine flag to refine the GPCR-Gα interface. We generated the top 20 redocked structures for each of the complexes created with the multistate design binding weights (1–12) and chose the best redocked structure for each of them based on the total score provided by the protocol53. Then we ran Rosetta InterfaceAnalyzer on the best redocked structures and assessed the binding interface energy.
NanoBiT-G-protein-dissociation assay for designed CCKAR mutants
Ligand-induced G-protein dissociation was measured by the NanoBiT-G-protein dissociation assay14, in which the interaction between a Gα subunit and a Gβγ subunit was monitored by the NanoBiT system (Promega). Specifically, a NanoBiT-G-protein consisting of the Gα subunit fused with a large fragment (LgBiT) at the α-helical domain (Gα-LgBiT) and an N-terminally small fragment (SmBiT)-fused Gγ2 subunit with a C68S mutation (SmBiT-Gγ2-CS) was expressed along with untagged Gβ1 subunit and a test CCKAR construct. The full-length human CCKAR was inserted into the pCAGGS expression vector with an N-terminal fusion of the hemagglutinin-derived signal sequence (ssHA), FLAG epitope tag and a flexible linker (MKTIIALSYIFCLVFADYKDDDDKGGSGGGGSGGSSSGGG; the FLAG epitope tag corresponds to DYKDDDDK). The resulting construct was named ssHA-FLAG-CCKAR. HEK293A cells (Thermo Fisher Scientific, cat no. R70507) were seeded in a 6-well culture plate at a concentration of 2 × 10⁵ cells ml⁻¹ (2 ml per well in DMEM (Nissui) supplemented with 5% fetal bovine serum (Gibco), glutamine, penicillin and streptomycin), one day before transfection. Transfection solution was prepared by combining 5 µL (per dish hereafter) of polyethylenimine (PEI) Max solution (1 mg ml⁻¹; Polysciences), 200 µL of Opti-MEM (Thermo Fisher Scientific) and a plasmid mixture consisting of 200 ng ssHA-FLAG-CCKAR (or an empty plasmid for mock transfection), 100 ng Gα-LgBiT subunit (Gαs-LgBiT or Gαi1-LgBiT), 500 ng Gβ1 subunit and 500 ng SmBiT-Gγ2-CS subunit. For Gαs-LgBiT, to enhance expression of the NanoBiT-Gs sensor, 100 ng RIC8B plasmid was co-transfected. After incubation for 1 day, the transfected cells were harvested with 0.5 mM EDTA-containing Dulbecco’s PBS, centrifuged and suspended in 2 ml of HBSS containing 0.01% bovine serum albumin (BSA; fatty acid-free grade; SERVA) and 5 mM HEPES (pH 7.4) (assay buffer).
The cell suspension was dispensed in a white 96-well plate at a volume of 80 µL per well and loaded with 20 µL of 50 µM coelenterazine (Angene) diluted in the assay buffer. After a 2 h incubation at room temperature, the plate was measured for baseline luminescence (SpectraMax L, Molecular Devices) and titrated concentrations of sulfated CCK-octapeptide (Peptide Institute, cat no. 4100-v; 20 µL; 6X of final concentrations) were manually added. The plate was immediately read for the second measurement in kinetics mode, and luminescence counts recorded from 5 min to 10 min after compound addition were averaged and normalized to the initial counts. The fold-change values were further normalized to those of vehicle-treated samples and used to plot the G-protein dissociation response. Using the Prism 9 software (GraphPad Prism), the G-protein dissociation signals were fitted to a four-parameter sigmoidal concentration-response curve with the HillSlope constrained to absolute values less than 2. For each replicate experiment, the parameter Span (= Top – Bottom) of the individual CCKAR mutants was normalized to that of WT CCKAR performed in parallel and the resulting Emax values were used to calculate the ligand response activity of the mutants.
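The signal normalization and the curve model described above can be written out explicitly. The sketch below uses the standard four-parameter logistic form on log-transformed concentrations; it only evaluates the model and the normalizations and does not perform the fit, which was done in Prism. Function names and argument layouts are hypothetical.

```python
def fold_change(counts_5_to_10_min, baseline_count):
    """Average the post-addition luminescence counts and normalize to baseline."""
    return sum(counts_5_to_10_min) / len(counts_5_to_10_min) / baseline_count

def vehicle_normalized(sample_fold_change, vehicle_fold_change):
    """Normalize a sample's fold change to that of the vehicle-treated sample."""
    return sample_fold_change / vehicle_fold_change

def four_param_logistic(log_conc, bottom, top, log_ec50, hill_slope):
    """Four-parameter sigmoidal concentration-response curve."""
    return bottom + (top - bottom) / (1.0 + 10 ** ((log_ec50 - log_conc) * hill_slope))

def relative_emax(mutant_top, mutant_bottom, wt_top, wt_bottom):
    """Span (Top - Bottom) of a mutant relative to WT measured in parallel."""
    return (mutant_top - mutant_bottom) / (wt_top - wt_bottom)
```

In a dissociation assay the luminescence decreases upon stimulation, so Span is negative for both mutant and WT; their ratio is still positive and directly comparable across replicates.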
Flow cytometry
Plasmid transfection for the ssHA-FLAG-CCKAR and the NanoBiT-Gi sensor was performed according to the same procedure as described in the “NanoBiT-G-protein-dissociation assay”. One day after transfection, the cells were collected by adding 200 μl of 0.53 mM EDTA-containing Dulbecco’s PBS (D-PBS), followed by 200 μl of 5 mM HEPES (pH 7.4)-containing Hank’s Balanced Salt Solution (HBSS). The cell suspension was transferred to a 96-well V-bottom plate in duplicate and fluorescently labeled with an anti-FLAG epitope (DYKDDDDK) tag monoclonal antibody (Clone 1E6, FujiFilm Wako Pure Chemicals, cat no. 012-22384; 10 μg per ml diluted in 2% goat serum- and 2 mM EDTA-containing D-PBS (blocking buffer)) and a goat anti-mouse IgG secondary antibody conjugated with Alexa Fluor 488 (Thermo Fisher Scientific, cat no. A11001; 10 μg per ml diluted in the blocking buffer). After washing with D-PBS, the cells were resuspended in 200 μl of 2 mM EDTA-containing D-PBS and filtered through a 40-μm filter. The fluorescence intensity of single cells was quantified by an EC800 flow cytometer equipped with a 488 nm laser (Sony). The fluorescent signal derived from Alexa Fluor 488 was recorded in the FL1 channel, and the flow cytometry data were analyzed with the FlowJo software (FlowJo). Live cells were gated with a forward scatter (FS-Peak-Lin) cutoff at the 390 setting, with a gain value of 1.7. Values of mean fluorescence intensity (MFI) from approximately 20,000 cells per sample were used for analysis. Typically, we obtained an MFI value of 2000 (arbitrary unit) for WT CCKAR and 20 for the mock transfection. For each experiment, we normalized the MFI value of each mutant to that of WT measured in parallel and reported it as the relative level.
Clustering of G-proteins complex conformations
We compared Gi/o and Gs complexes by performing Root Mean Square Deviation (RMSD)-based clustering. To calculate RMSD, we created a list of consensus positions based on all the GPCR and G-protein sequences in the 362 complexes, by first mapping PDB residues to UniProt canonical sequences via SIFTS45 and then to GPCRdb consensus numbers31. We considered 141 GPCR and 73 G-protein consensus positions defining respectively the consensus core of the 7TM domain and the Ras GTPase domain solved in all experimental structures. The complexes were fitted using the Cα atoms of the GPCR core, while the Cα atoms of the G-protein core were used to calculate the RMSD after superimposition. Calculations were performed using the Superimposer class of the Biopython Bio.PDB module54 (version 1.78) through customized scripts. We performed hierarchical clustering on RMSD using the Ward method with Euclidean distance as the metric, using the clustermap function from the seaborn library (version 0.11.1). We compared the distributions of the RMSD calculated among complexes of the Gi/o and Gs groups using a Wilcoxon rank-sum test. Results were displayed through the matplotlib (https://matplotlib.org/) and seaborn (https://seaborn.pydata.org/) libraries using customized Python scripts. We also calculated the root mean squared fluctuations (RMSF) of the G-protein consensus positions using the following equation:
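$$\mathrm{RMSF}_i = \sqrt{\left\langle \left( \mathbf{x}_i - \mathbf{x}_i^{\mathrm{ref}} \right)^2 \right\rangle} \qquad (2)$$

where $\mathbf{x}_i$ is the coordinate of particle i, $\mathbf{x}_i^{\mathrm{ref}}$ is the coordinate of particle i in the reference structure, and the angle brackets denote the average over the complexes of a group. The reference structure is the complex with the least RMSD deviation from the other complexes (i.e. centroid) in the Gs (PDB: 8E3X) and Gi/o (PDB: 8F7S) groups.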
We compared Gs and Gi/o groups RMSFs by performing a Wilcoxon test and we plotted each position and its standard deviation.
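The per-position RMSF and the choice of the centroid reference can be sketched as follows. The code assumes that all complexes have already been superimposed on the GPCR core and that coordinates are given as one (x, y, z) tuple per G-protein consensus position; the superposition step (done in the study with Biopython) is not reproduced, and the function names are hypothetical.

```python
import math

def centroid_index(rmsd_matrix):
    """Index of the complex with the smallest summed RMSD to all others."""
    return min(range(len(rmsd_matrix)), key=lambda i: sum(rmsd_matrix[i]))

def rmsf_per_position(structures, reference):
    """Root mean squared fluctuation of each consensus position.

    structures: list of superimposed structures, each a list of (x, y, z)
                tuples ordered by consensus position.
    reference:  coordinates of the reference (centroid) structure.
    """
    n = len(structures)
    values = []
    for i, ref in enumerate(reference):
        mean_sq_dev = sum(math.dist(s[i], ref) ** 2 for s in structures) / n
        values.append(math.sqrt(mean_sq_dev))
    return values
```

Whether the reference structure itself is included in the average is a choice left open here; including it adds a zero deviation and slightly lowers every RMSF value.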
Analysis of GPCR-Gα subunit binding energy with Rosetta
To analyze the GPCR-Gα interface in a 3D structural model, we first relaxed the structure using the Rosetta relax application55, using backbone constraints. Then we ran Rosetta InterfaceAnalyzer36 (https://www.rosettacommons.org/docs/latest/application_documentation/analysis/interface-analyzer), from the RosettaCommon software suite (version 2021.16.61629), specifying the chains of the GPCR and the Gα which are interacting in the complex. This protocol takes a multichain complex as input and computes a new structure in which the two chains of interest are separated. The interface energy and the ∆SASA are calculated as the difference in energy and SASA between the bound and unbound structures. We ran InterfaceAnalyzer with the “-pack_input” and “-pack_separated” flags to optimize the side chain configuration before and after separating the chains. If a nanobody was present in a structure, we removed it before the relaxation step, to limit its influence on the analysis.
The interface energy is computed according to the Rosetta energy function, which includes physics-based terms that represent electrostatic and van der Waals’ interactions, as well as statistical terms representing the probability of finding the torsion angles in the Ramachandran plots. This score is indicated in Rosetta Energy Units (REU) and cannot be converted into the actual binding energy, but it gives a reasonable estimation of the stability of the complex56.
Software
We employed PyMOL (v2.4.1) and ChimeraX (v1.5) to generate 3D cartoon representations. We employed customized scripts in Python (version 3.8.11), using the matplotlib (v3.6.0), seaborn (v0.11.1), and Biopython (v1.78) libraries. We calculated residue-residue contacts by using a customized script derived from the CIFPARSE-OBJ C++ library (https://mmcif.wwpdb.org/docs/sw-examples/cpp/html/index.html).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
F.R. was supported by the Italian Ministry of University and Research through the Department of Excellence “Faculty of Sciences” of Scuola Normale Superiore. The research leading to these results also received funding from the Italian Association for Cancer Research (AIRC) under My First AIRC Grant (MFAG) 2020 - ID. 24317 project – P.I. Raimondi Francesco. A.I. was funded by KAKENHI JP21H04791, JP21H05113, and JPJSBP120213501 from the Japan Society for the Promotion of Science (JSPS); JP20gm0010004, JP20am0101095, JP22ama121038, and JP22zf0127007 from the Japan Agency for Medical Research and Development (AMED); JPMJFR215T and JPMJMS2023 from the Japan Science and Technology Agency (JST); Daiichi Sankyo Foundation of Life Science; Takeda Science Foundation; The Uehara Memorial Foundation. M.T. received JSPS KAKENHI 22J10475. We gratefully acknowledge the CINECA award, in collaboration with AIRC, for the availability of high-performance computing resources and support. We gratefully acknowledge the computational resources of the Center for High-Performance Computing (CHPC) at Scuola Normale Superiore. We thank Kayo Sato, Shigeko Nakano, and Ayumi Inoue at Tohoku University for their assistance in plasmid preparation and the cell-based GPCR assays. We also thank Tatsuya Ikuta for helpful discussion. We are grateful to Natalia De Oliveira Rosa for having uploaded the AF-predicted structures on Precogx (https://precogx.bioinfolab.sns.it/).
Author contributions
M.M. and P.M. performed the computational analysis and wrote the manuscript; M.T. performed the in-vitro assays; A.I. designed the study and wrote the manuscript; F.R. designed the study, performed the computational analysis, and wrote the manuscript.
Peer review
Peer review information
Nature Communications thanks Andreas Bock, Roland Dunbrack and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Data availability
Source data are provided with this paper. Data generated for this study are available at https://github.com/raimondilab/GPCR_structure_analysis and 10.5281/zenodo.8067369. GPCR-Heterotrimeric G protein complexes predicted with AF-multimer are also available via Precogx webserver (https://precogx.bioinfolab.sns.it/). The raw data used in this study are available in the Zenodo database under the accession code 10.5281/zenodo.8063796. The PDB accession codes used in this study can be found in Supplementary Data 1, corresponding to a total of 362 experimental structures analyzed.
Code availability
Code used for this study is available at https://github.com/raimondilab/GPCR_structure_analysis and 10.5281/zenodo.8067369.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Marin Matic, Pasquale Miglionico, Manae Tatsumi.
Contributor Information
Asuka Inoue, Email: iaska@tohoku.ac.jp.
Francesco Raimondi, Email: francesco.raimondi@sns.it.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-023-40045-y.
References
- 1. Hauser AS, Attwood MM, Rask-Andersen M, Schiöth HB, Gloriam DE. Trends in GPCR drug discovery: new agents, targets and indications. Nat. Rev. Drug Discov. 2017;16:829–842. doi: 10.1038/nrd.2017.178.
- 2. Oldham WM, Hamm HE. Heterotrimeric G protein activation by G-protein-coupled receptors. Nat. Rev. Mol. Cell Biol. 2008;9:60–71. doi: 10.1038/nrm2299.
- 3. Hilger D, Masureel M, Kobilka BK. Structure and dynamics of GPCR signaling complexes. Nat. Struct. Mol. Biol. 2018;25:4–12. doi: 10.1038/s41594-017-0011-7.
- 4. Wettschureck N, Offermanns S. Mammalian G proteins and their cell type specific functions. Physiol. Rev. 2005;85:1159–1204. doi: 10.1152/physrev.00003.2005.
- 5. Smith JS, Lefkowitz RJ, Rajagopal S. Biased signalling: from simple switches to allosteric microprocessors. Nat. Rev. Drug Discov. 2018. doi: 10.1038/nrd.2017.229.
- 6. Wootten D, Christopoulos A, Marti-Solano M, Babu MM, Sexton PM. Mechanisms of signalling and biased agonism in G protein-coupled receptors. Nat. Rev. Mol. Cell Biol. 2018. doi: 10.1038/s41580-018-0049-3.
- 7. Insel PA, et al. GPCRomics: an approach to discover GPCR drug targets. Trends Pharm. Sci. 2019;40:378–387. doi: 10.1016/j.tips.2019.04.001.
- 8. Ramms DJ, et al. Gαs–protein kinase A (PKA) pathway signalopathies: the emerging genetic landscape and therapeutic potential of human diseases driven by aberrant Gαs-PKA signaling. Pharm. Rev. 2021;73:155–197. doi: 10.1124/pharmrev.120.000269.
- 9. Wu V, et al. Illuminating the Onco-GPCRome: Novel G protein-coupled receptor-driven oncocrine networks and targets for cancer immunotherapy. J. Biol. Chem. 2019;294:11062–11086. doi: 10.1074/jbc.REV119.005601.
- 10. Raimondi F, et al. Rare, functional, somatic variants in gene families linked to cancer genes: GPCR signaling as a paradigm. Oncogene. 2019;38:6491–6506. doi: 10.1038/s41388-019-0895-2.
- 11. Huh E, et al. Recurrent high-impact mutations at cognate structural positions in class A G protein-coupled receptors expressed in tumors. Proc. Natl Acad. Sci. USA. 2021;118:e2113373118. doi: 10.1073/pnas.2113373118.
- 12. Arang N, Gutkind JS. G protein-coupled receptors and heterotrimeric G proteins as cancer drivers. FEBS Lett. 2020;594:4201–4232. doi: 10.1002/1873-3468.14017.
- 13. Vukotic R, et al. The effect of liraglutide on β-blockade for preventing variceal bleeding: a case series. Ann. Intern. Med. 2020. doi: 10.7326/L20-0041.
- 14. Inoue A, et al. Illuminating G-protein-coupling selectivity of GPCRs. Cell. 2019. doi: 10.1016/j.cell.2019.04.044.
- 15. Inoue A, et al. TGFα shedding assay: an accurate and versatile method for detecting GPCR activation. Nat. Methods. 2012. doi: 10.1038/nmeth.2172.
- 16. Avet C, et al. Effector membrane translocation biosensors reveal G protein and βarrestin coupling profiles of 100 therapeutically relevant GPCRs. Elife. 2022;11:1–34. doi: 10.7554/eLife.74101.
- 17. Olsen RHJ, et al. TRUPATH, an open-source biosensor platform for interrogating the GPCR transducerome. Nat. Chem. Biol. 2020;16:841–849. doi: 10.1038/s41589-020-0535-8.
- 18. Singh G, Inoue A, Gutkind JS, Russell RB, Raimondi F. PRECOG: PREdicting COupling probabilities of G-protein coupled receptors. Nucleic Acids Res. 2019. doi: 10.1093/nar/gkz392.
- 19. Matic M, et al. PRECOGx: exploring GPCR signaling mechanisms with deep protein representations. Nucleic Acids Res. 2022;50:W598–W610. doi: 10.1093/nar/gkac426.
- 20. Flock T, et al. Selectivity determinants of GPCR-G-protein binding. Nature. 2017;545:317–322. doi: 10.1038/nature22070.
- 21. Seo MJ, Heo J, Kim K, Chung KY, Yu W. Coevolution underlies GPCR-G protein selectivity and functionality. Sci. Rep. 2021;11:7858. doi: 10.1038/s41598-021-87251-6.
- 22. Zhou Q, et al. Common activation mechanism of class A GPCRs. Elife. 2019;8:e50279. doi: 10.7554/eLife.50279.
- 23. Hauser AS, et al. GPCR activation mechanisms across classes and macro/microscales. Nat. Struct. Mol. Biol. 2021;28:879–888. doi: 10.1038/s41594-021-00674-7.
- 24. Draper-Joyce CJ, et al. Structure of the adenosine-bound human adenosine A1 receptor–Gi complex. 2018. doi: 10.1038/s41586-018-0236-6.
- 25. García-Nafría J, Nehmé R, Edwards PC, Tate CG. Cryo-EM structure of the serotonin 5-HT1B receptor coupled to heterotrimeric Go. Nature. 2018;558:620–623. doi: 10.1038/s41586-018-0241-9.
- 26. Kang Y, et al. Cryo-EM structure of human rhodopsin bound to an inhibitory G protein. Nature. 2018;558:553–558. doi: 10.1038/s41586-018-0215-y.
- 27. Koehl A, et al. Structure of the µ Opioid Receptor-Gi Protein Complex. Nature. 2018:1–23. doi: 10.1038/s41586-018-0219-7.
- 28. Capper MJ, Wacker D. Structural Biology: a complex story of receptor signalling. Nature. 2018;558:529–530. doi: 10.1038/d41586-018-05503-4.
- 29. Okamoto HH, et al. Cryo-EM structure of the human MT1–Gi signaling complex. Nat. Struct. Mol. Biol. 2021;28:694–701. doi: 10.1038/s41594-021-00634-1.
- 30. Huang S, et al. GPCRs steer Gi and Gs selectivity via TM5-TM6 switches as revealed by structures of serotonin receptors. Mol. Cell. 2022;82:2681–2695.e6. doi: 10.1016/j.molcel.2022.05.031.
- 31. Kooistra AJ, et al. GPCRdb in 2021: integrating GPCR sequence, structure and function. Nucleic Acids Res. 2021;49:D335–D343. doi: 10.1093/nar/gkaa1080.
- 32. Flock T, et al. Universal allosteric mechanism for Gα activation by GPCRs. Nature. 2015;524:173–179. doi: 10.1038/nature14663.
- 33. Hauser AS, et al. Common coupling map advances GPCR-G protein selectivity. Elife. 2022;11:e74107. doi: 10.7554/eLife.74107.
- 34. Hilger D, et al. Structural insights into differences in G protein activation by family A and family B GPCRs. Science. 2020;369:eaba3373. doi: 10.1126/science.aba3373.
- 35. Leaver-Fay A, Jacak R, Stranges PB, Kuhlman B. A generic program for multistate protein design. PLoS One. 2011;6:e20937. doi: 10.1371/journal.pone.0020937.
- 36. Stranges PB, Kuhlman B. A comparison of successful and failed protein interface designs highlights the challenges of designing buried hydrogen bonds. Protein Sci. 2013;22:74–82. doi: 10.1002/pro.2187.
- 37. Evans R, et al. Protein complex prediction with AlphaFold-Multimer. Preprint at bioRxiv 2021.10.04.463034. 2022. doi: 10.1101/2021.10.04.463034.
- 38. Kim HR, et al. Structural mechanism underlying primary and secondary coupling between GPCRs and the Gi/o family. Nat. Commun. 2020;11:1–12. doi: 10.1038/s41467-020-16975-2.
- 39. Lokits AD, Indrischek H, Meiler J, Hamm HE, Stadler PF. Tracing the evolution of the heterotrimeric G protein α subunit in Metazoa. BMC Evolut. Biol. 2018;18:1–27. doi: 10.1186/s12862-018-1147-8.
- 40. Sprang SR. Invited review: activation of G proteins by GTP and the mechanism of Gα-catalyzed GTP hydrolysis. Biopolymers. 2016;105:449–462. doi: 10.1002/bip.22836.
- 41. Du Y, et al. Assembly of a GPCR-G protein complex. Cell. 2019;177:1232–1242.e11. doi: 10.1016/j.cell.2019.04.022.
- 42. Jang W, Lu S, Xu X, Wu G, Lambert NA. The role of G protein conformation in receptor–G protein selectivity. Nat. Chem. Biol. 2023. doi: 10.1038/s41589-022-01231-z.
- 43. Berman HM, et al. The protein data bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
- 44. Mistry J, et al. Pfam: The protein families database in 2021. Nucleic Acids Res. 2021;49:D412–D419. doi: 10.1093/nar/gkaa913.
- 45. Velankar S, et al. SIFTS: structure integration with function, taxonomy and sequences resource. Nucleic Acids Res. 2012;41:D483–D489. doi: 10.1093/nar/gks1258.
- 46. Monastyrskyy B, Fidelis K, Tramontano A, Kryshtafovych A. Evaluation of residue-residue contact predictions in CASP9. Proteins. 2011;79:119. doi: 10.1002/prot.23160.
- 47. Bryant P, Pozzati G, Elofsson A. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 2022;13:1265. doi: 10.1038/s41467-022-28865-w.
- 48. Burke DF, et al. Towards a structurally resolved human protein interaction network. Nat. Struct. Mol. Biol. 2023;30:216–225. doi: 10.1038/s41594-022-00910-8.
- 49. Lotia S, Montojo J, Dong Y, Bader GD, Pico AR. Cytoscape app store. Bioinformatics. 2013;29:1350–1351. doi: 10.1093/bioinformatics/btt138.
- 50.Anderson MJ. A new method for non-parametric multivariate analysis of variance. Austral Ecol. 2001;26:32–46. [Google Scholar]
- 51.Sevy AM, et al. Multistate design of influenza antibodies improves affinity and breadth against seasonal viruses. Proc. Natl Acad. Sci. USA. 2019;116:1597–1602. doi: 10.1073/pnas.1806004116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Liu Q, et al. Ligand recognition and G-protein coupling selectivity of cholecystokinin A receptor. Nat. Chem. Biol. 2021;17:1238–1244. doi: 10.1038/s41589-021-00841-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Gray JJ, et al. Protein-protein docking with simultaneous optimization of rigid-body displacement and sidE-chain conformations. J. Mol. Biol. 2003;331:281–299. doi: 10.1016/S0022-2836(03)00670-3. [DOI] [PubMed] [Google Scholar]
- 54.Cock PJA, et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009;25:1422–1423. doi: 10.1093/bioinformatics/btp163. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Nivón LG, Moretti R, Baker D. A pareto-optimal refinement method for protein design scaffolds. PLoS One. 2013;8:e59004. doi: 10.1371/journal.pone.0059004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Alford RF, et al. The rosetta all-atom energy function for macromolecular modeling and design. J. Chem. Theory Comput. 2017;13:3031–3048. doi: 10.1021/acs.jctc.7b00125. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
Data Availability Statement
Source data are provided with this paper. Data generated for this study are available at https://github.com/raimondilab/GPCR_structure_analysis and on Zenodo (doi: 10.5281/zenodo.8067369). GPCR-heterotrimeric G protein complexes predicted with AF-multimer are also available via the Precogx webserver (https://precogx.bioinfolab.sns.it/). The raw data used in this study are available in the Zenodo database under accession code 10.5281/zenodo.8063796. The PDB accession codes used in this study are listed in Supplementary Data 1, for a total of 362 experimental structures analyzed.
Code used for this study is available at https://github.com/raimondilab/GPCR_structure_analysis and on Zenodo (doi: 10.5281/zenodo.8067369).