Keywords: GPCRs, Coevolution, Interaction network, Conformational states, Functionally relevant residues
Abstract
We present an approach that, by integrating structural data with Direct Coupling Analysis, is able to pinpoint most of the interaction hotspots (i.e. key residues for the biological activity) across very sparse protein families in a single run. An application to the Class A G-protein coupled receptors (GPCRs), both in their active and inactive states, demonstrates the predictive power of our approach. The latter can be easily extended to any other kind of protein family, where it is expected to highlight most key sites involved in their functional activity.
1. Introduction
The discovery of correlated mutations in protein families across different organisms has been shown to provide valuable information on the functional role of residues [1]. These mutations arise from evolutionary pressure, which drives changes that enhance stability and/or biological function. Starting from the correlation between mutations in sequence alignments (i.e., coevolutionary analysis, CA), one can infer that the destabilization induced by a single-point mutation can be attenuated or even counterbalanced by a corresponding mutation in a different portion of the sequence. CA approaches include DCA [2], [3], [4], plmDCA [5], GREMLIN [6], PSICOV [7], and many others [8], [9]. These methods have been used for different goals [10]: from the definition of coarse-grained force fields for molecular simulation [11], to the prediction of mutation energetics [12], [13], the direct inference of 3D structures [3], the refinement of structure prediction [14], [15], and the investigation of protein–protein interactions [16], [17].
Here we introduce a structure-based CA that identifies, in a single run, residues that are highly relevant structurally and/or functionally across a protein family. This is achieved by integrating structural information into a modified version of Lui and Tiana’s approach [12]. The latter uses internal interaction networks to uncover frustrated interactions and mutation free energy differences for a specific protein. Our protocol has several advantages. Unlike previous approaches, which are mainly based on single protein domains (usually obtained from Pfam [18]), our protocol includes entire proteins in the calculations. In addition, it provides insight into different conformational states of proteins, such as receptor active/inactive states or ion channel closed/open states (unlike “classical” DCA [2], which is based on pure sequence information). Finally, and most importantly, it can be applied to large sparse families, i.e. families with very large sequence variability due to evolution. An application to the sparse and fairly well structurally characterized [19] “class A” subfamily of the human G-protein coupled receptor (hGPCR) superfamily shows the predictive power of the approach: in a single run, it identifies coevolution-related hotspots previously pinpointed by techniques other than CA [20], [21], while also integrating structural information to highlight differences between distinct conformational states. These hotspots are fundamental structural/functional residues or residues correlated with diseases. The protocol is totally general and can easily be extended to other subfamilies of GPCRs, to GPCRs from organisms other than Homo sapiens, as well as to other large receptor families with large intrinsic variability, like the pentameric ligand-gated ion channels (pLGICs) or the voltage-gated ion channels.
2. Results
The approach
In the following sections, we briefly describe our multistep strategy (Fig. 1). Our approach uses structural information at two different stages of the protocol, i.e. during the generation of the multiple sequence alignment (MSA) and in the construction of the interaction matrix. This is in contrast to most previously used coevolutionary approaches, which usually do not consider this information.
Below we report details of each of these steps.
2.1. Experimental data
Let us consider the general case in which sequences across a given family are highly diverse. In this case several bottlenecks can arise, starting from the MSA generation, and state-of-the-art methodologies can be applied to overcome these difficulties. We consider here an alignment formed by M sequences, each composed of L residues $\sigma_i$ (where $\sigma_i$ is the kind of amino acid at position i), obtained from curated databases (Uniprot [22], Pfam [18], or GPCRdb [23]). The MSA can be generated using an algorithm, Promals3D [24], that utilizes structural data and can thus alleviate the difficulty of aligning families that display sparse sequences.
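As an illustration of the input described above, the following minimal Python sketch encodes an already aligned FASTA text into an M × L integer matrix with 21 states (20 amino acids plus the gap). The alphabet ordering, the function names and the toy alignment are assumptions made for this example; they are not taken from the code used in this work.

```python
import numpy as np

# 21 states: the 20 standard amino acids plus the gap symbol (index 20).
# The ordering is an arbitrary choice made for this sketch.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
STATE = {aa: k for k, aa in enumerate(ALPHABET)}


def read_aligned_fasta(text):
    """Return (names, sequences) parsed from FASTA-formatted text."""
    names, seqs = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            names.append(line[1:])
            seqs.append("")
        else:
            # Sequences may be wrapped over several lines.
            seqs[-1] += line.upper()
    return names, seqs


def encode_msa(seqs):
    """Encode M aligned sequences of length L as an (M, L) integer array.

    Symbols outside the 20 standard amino acids (e.g. X or '.') are mapped
    to the gap state, which is a simplifying assumption of this sketch.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    gap = STATE["-"]
    return np.array([[STATE.get(c, gap) for c in s] for s in seqs], dtype=int)


# Made-up toy alignment (3 sequences, 4 columns).
toy = """>seq1
ACD-
>seq2
AC
DE
>seq3
GCDX
"""
names, seqs = read_aligned_fasta(toy)
msa = encode_msa(seqs)
```

Here `msa` has one row per sequence and one column per alignment position, which is the form assumed by frequency-counting steps in DCA-like analyses.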
2.2. Interaction analysis – DCA and Sequence-specific interaction matrix
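Our coevolutionary analysis protocol is based on Direct Coupling Analysis [2], [3]. The basic assumption of this technique is that the interaction between different residues can be written as
$$H(\{\sigma\}) = \sum_{i<j} e_{ij}(\sigma_i, \sigma_j) + \sum_{i} h_i(\sigma_i) \tag{1}$$
where $\sigma_i$ is the amino acid at the i-th position of the sequence, $e_{ij}(\sigma_i, \sigma_j)$ is the two-body term (analogous to the interaction term of a Potts model) that contains the interaction energy between residues $\sigma_i$ and $\sigma_j$ at positions i and j of the alignment, respectively, and $h_i(\sigma_i)$ is a one-body potential that acts on the residues (analogous to the field term of a Potts model).
The key quantity here is the two-body term $e_{ij}$, which contains the information needed to build the interaction graph, leading eventually to the identification of the hotspots (see below). The frequency counts of single residues and of pairs of residues of a sequence with n amino acids can be seen as marginals of a probability distribution
$$f_i(\sigma) = \sum_{\{\sigma_k\},\, k \neq i} p(\sigma_1, \dots, \sigma_i = \sigma, \dots, \sigma_n) \tag{2}$$
$$f_{ij}(\sigma, \tau) = \sum_{\{\sigma_k\},\, k \neq i, j} p(\sigma_1, \dots, \sigma_i = \sigma, \dots, \sigma_j = \tau, \dots, \sigma_n) \tag{3}$$
where the probability p is defined as
$$p(\{\sigma\}) = \frac{1}{Z} \exp\left[-H(\{\sigma\})\right], \qquad Z = \sum_{\{\sigma\}} \exp\left[-H(\{\sigma\})\right] \tag{4}$$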
Starting from our alignment with n residues and M sequences of length L, we can compute the empirical single-site and pair frequencies, $\hat{f}_i(\sigma)$ and $\hat{f}_{ij}(\sigma, \tau)$. For $M \to \infty$, the empirical frequencies will match the theoretical distributions $f_i(\sigma)$ and $f_{ij}(\sigma, \tau)$. In a realistic situation, the empirical distribution is affected by finite-size effects, due to the finite number of sequences available. To overcome this issue, we reweight the empirical frequencies with appropriate pseudocounts [25]. These pseudocounts are weighted on the distribution of residue types (via the weight x), on the distribution of residue types in the considered alignment (via the weight y), and on the distribution of residue types in a specific pair of positions in the alignment (via the weight z), namely
(5) |
where q is the number of residue types (here we have 21 types: the 20 amino acids and the gap), is an effective number of sequences and is the number of sequences with similarity larger than 70%. In this work we fixed , and following Contini and Tiana [13], that tested these parameters for both cytosolic and membrane proteins.
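The role of the 70% similarity criterion can be illustrated with the sequence reweighting commonly used in DCA, in which each sequence is down-weighted by the number of its close homologs and the effective number of sequences is the sum of the weights. The sketch below assumes that convention; the function name, the use of fractional identity as the similarity measure and the toy alignment are assumptions of this example, and the pseudocount scheme with the weights x, y and z is not reproduced here.

```python
import numpy as np


def sequence_weights(msa, similarity_threshold=0.7):
    """Down-weight redundant sequences of an (M, L) integer-encoded alignment.

    Sequence m receives weight 1 / n_m, where n_m is the number of sequences
    (m itself included) whose fractional identity to m exceeds the threshold.
    The all-against-all comparison is only practical for small alignments.
    """
    msa = np.asarray(msa)
    # identity[a, b] = fraction of aligned positions where a and b coincide
    identity = (msa[:, None, :] == msa[None, :, :]).mean(axis=2)
    n_similar = (identity > similarity_threshold).sum(axis=1)
    return 1.0 / n_similar


# Made-up toy alignment: the first three sequences are near-duplicates.
toy_msa = np.array([
    [0, 1, 2, 3],
    [0, 1, 2, 3],
    [0, 1, 2, 4],
    [5, 6, 7, 8],
])
weights = sequence_weights(toy_msa)
m_eff = weights.sum()  # effective number of sequences
```

In this toy case the three similar sequences share one unit of weight and the fourth counts fully, so the effective number of sequences is 2 rather than 4.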
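After the reweighting, applying a mean-field approximation yields the associated correlation matrix of the frequencies, defined as
$$C_{ij}(\sigma, \tau) = f_{ij}(\sigma, \tau) - f_i(\sigma)\, f_j(\tau) \tag{6}$$
from which we finally obtain the two-body interaction energy
$$e_{ij}(\sigma, \tau) = \left(C^{-1}\right)_{ij}(\sigma, \tau) \tag{7}$$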
Table 2.
This two-body term (which has the form of a 4-dimensional tensor) contains the two-body interactions of all possible residue pairs. Choosing our sequence of interest, we can extract from this tensor the interaction matrix $E$, with elements $E_{ij} = e_{ij}(\sigma_i, \sigma_j)$ evaluated on the amino acids $\sigma_i$ and $\sigma_j$ of that sequence, which describes the interaction between all possible amino acid pairs in the system.
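The mean-field step and the extraction of a sequence-specific matrix can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used in this work: it assumes the convention p ∝ exp(−H), so that the couplings are taken as the inverse of the correlation matrix, it replaces the x, y, z pseudocount scheme by a single uniform pseudocount, and it uses the last state as reference state to make the correlation matrix invertible. All function and parameter names are hypothetical.

```python
import numpy as np


def mean_field_couplings(msa, q=21, pseudocount=0.5, weights=None):
    """Mean-field estimate of e_ij(a, b) from an (M, L) integer alignment.

    Returns an (L, L, q, q) array. The last state (q - 1) is the reference
    state and its couplings are left at zero.
    """
    msa = np.asarray(msa)
    M, L = msa.shape
    w = np.ones(M) if weights is None else np.asarray(weights, dtype=float)
    onehot = np.eye(q)[msa]                                   # (M, L, q)
    fi = np.einsum("m,mia->ia", w, onehot) / w.sum()
    fij = np.einsum("m,mia,mjb->ijab", w, onehot, onehot) / w.sum()
    # Uniform pseudocount: an assumption of this sketch, not the x, y, z scheme.
    fi = (1.0 - pseudocount) * fi + pseudocount / q
    fij = (1.0 - pseudocount) * fij + pseudocount / q**2
    for i in range(L):
        fij[i, i] = np.diag(fi[i])            # same-site pairs: delta_ab * f_i(a)
    # Correlation matrix C_ij(a, b) = f_ij(a, b) - f_i(a) f_j(b)
    C = fij - fi[:, None, :, None] * fi[None, :, None, :]
    # Drop the reference state and flatten (i, a) and (j, b) into matrix indices.
    n = L * (q - 1)
    Cr = C[:, :, :q - 1, :q - 1].transpose(0, 2, 1, 3).reshape(n, n)
    inv = np.linalg.inv(Cr).reshape(L, q - 1, L, q - 1).transpose(0, 2, 1, 3)
    e = np.zeros((L, L, q, q))
    e[:, :, :q - 1, :q - 1] = inv
    return e


def sequence_interaction_matrix(e, seq):
    """E_ij = e_ij(seq_i, seq_j) for one integer-encoded sequence (diagonal = 0)."""
    seq = np.asarray(seq)
    idx = np.arange(len(seq))
    E = e[idx[:, None], idx[None, :], seq[:, None], seq[None, :]]
    np.fill_diagonal(E, 0.0)
    return E


# Made-up toy data: 50 random sequences, 5 positions, 4 states.
rng = np.random.default_rng(0)
toy_msa = rng.integers(0, 4, size=(50, 5))
e = mean_field_couplings(toy_msa, q=4)
E = sequence_interaction_matrix(e, toy_msa[0])
```

The full tensor has L × L × q × q entries, so for real alignments memory grows quickly with the sequence length; the sketch is meant only to make the indexing explicit.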
2.3. Network analysis
The interaction matrix $E$ contains information that is complementary to 3D structural data, because it can discriminate the most important energetic contributions among residues that are located in spatial proximity. To integrate spatial and energetic information in a single object, we computed a residue-residue contact map $\Delta$ for every protein structure:
(8) |
where $d_{ij}$ is the distance between the Cβ atoms of the i-th and the j-th residues (in the case of glycine, the hydrogen atom in the analogous position).
To insert structural information in the interaction matrix $E$, we follow Scarabelli et al. [26]. There, the authors perform a Hadamard product (an element-by-element multiplication of two matrices) between the contact map $\Delta$ and the interaction matrix, obtaining the local interaction matrix $E^{\mathrm{loc}}$, namely
$$E^{\mathrm{loc}}_{ij} = \Delta_{ij}\, E_{ij} \tag{9}$$
If the experimental structure considered contains unresolved residues, one can remove the part of the local interaction matrix involving such residues.
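The combination of structure and energetics described above can be sketched as follows. The example assumes a binary contact map built from a distance cutoff on Cβ coordinates; the cutoff value, the function names and the toy coordinates are hypothetical and are used only to show the element-by-element product and the removal of unresolved residues.

```python
import numpy as np


def contact_map(coords, cutoff):
    """Binary contact map from an (L, 3) array of C-beta coordinates.

    Rows of NaN stand for unresolved residues; they never form contacts
    because comparisons with NaN evaluate to False.
    """
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    delta = (d < cutoff).astype(float)
    np.fill_diagonal(delta, 0.0)
    return delta


def local_interaction_matrix(E, delta, resolved=None):
    """Hadamard product of interaction matrix and contact map.

    If a boolean mask `resolved` is given, rows and columns of unresolved
    residues are removed from the result.
    """
    E_loc = np.asarray(E, dtype=float) * np.asarray(delta, dtype=float)
    if resolved is not None:
        keep = np.flatnonzero(resolved)
        E_loc = E_loc[np.ix_(keep, keep)]
    return E_loc


# Made-up toy structure: four residues, the last one unresolved.
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 0.0, 0.0],
    [10.0, 0.0, 0.0],
    [np.nan, np.nan, np.nan],
])
resolved = np.isfinite(coords).all(axis=1)
delta = contact_map(coords, cutoff=5.0)      # cutoff chosen arbitrarily here
E = np.full((4, 4), -2.0)                    # placeholder interaction energies
E_loc = local_interaction_matrix(E, delta, resolved)
```

Only the pair of residues closer than the cutoff keeps its interaction energy; all other entries of the local matrix are zero.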
Now, let us consider each residue of the protein family as a node of a weighted graph. In this graph, we connect two nodes if the modulus of the interaction obtained from DCA between the respective residues is larger than a threshold value, which is defined iteratively: we start building a network considering the maximum energy of the matrix (in modulus), obtaining an unconnected graph (i.e., a network of isolated nodes except for the two residues with the strongest interaction).1 At this point, we iteratively lower the threshold value until we obtain a connected graph. The maximum value of the threshold that still returns a connected graph defines the final interaction network, i.e. the connected graph itself.
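The iterative lowering of the threshold can be sketched by adding edges in order of decreasing interaction strength until all nodes belong to one component. This is an illustrative implementation under the assumption that zero entries of the local interaction matrix never form edges; the union-find bookkeeping and all names are choices made for this example.

```python
import numpy as np


def connectivity_threshold(E_loc):
    """Largest threshold t such that edges with |E_loc[i, j]| >= t connect all nodes.

    Returns (t, edges), where edges is a list of (i, j) pairs with i < j.
    Raises ValueError if the nonzero interactions never connect every node.
    """
    A = np.abs(np.asarray(E_loc, dtype=float))
    n = A.shape[0]
    iu, ju = np.triu_indices(n, k=1)
    strengths = A[iu, ju]
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for k in np.argsort(-strengths):          # strongest interaction first
        if strengths[k] == 0.0:
            break                             # only non-interacting pairs remain
        ri, rj = find(int(iu[k])), find(int(ju[k]))
        if ri == rj:
            continue
        parent[ri] = rj
        components -= 1
        if components == 1:
            threshold = float(strengths[k])
            keep = strengths >= threshold
            edges = [(int(i), int(j)) for i, j in zip(iu[keep], ju[keep])]
            return threshold, edges
    raise ValueError("nonzero interactions do not connect all residues")


# Made-up toy local interaction matrix for four residues.
E_loc = np.array([
    [0.0, -5.0, 1.0, 0.0],
    [-5.0, 0.0, 2.0, 0.0],
    [1.0, 2.0, 0.0, -3.0],
    [0.0, 0.0, -3.0, 0.0],
])
threshold, edges = connectivity_threshold(E_loc)
```

For the toy matrix the graph becomes connected when the threshold reaches 2, so the weakest interaction (modulus 1) is left out of the final network.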
2.4. Hotspots
The betweenness centrality of a node k in the connected graph reads [27]:
$$b(k) = \sum_{i \neq j \neq k} \frac{n_{ij}(k)}{n_{ij}} \tag{10}$$
where $n_{ij}(k)$ is the number of shortest paths in the graph that connect i and j passing through k, and $n_{ij}$ is the number of shortest paths in the graph that connect i and j.
If the considered node is “central” in the network (i.e., the information flow passes through it to connect different portions of the protein), its betweenness centrality will be large. We considered residues as hotspots if the betweenness centrality of their associated node is larger than half of the maximum betweenness centrality over all the nodes.
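The hotspot criterion can be made concrete with a short pure-Python sketch of Brandes' algorithm followed by the half-maximum selection. It assumes that shortest paths are counted on the unweighted connected graph and that each unordered pair of nodes contributes once; since the criterion is relative to the maximum, a different normalisation would select the same nodes. Function names and the toy graph are made up for this example.

```python
from collections import deque


def betweenness_centrality(n_nodes, edges):
    """Betweenness of every node of an undirected, unweighted graph.

    b(k) sums, over unordered pairs {i, j} not containing k, the fraction of
    shortest i-j paths that pass through k (Brandes' accumulation scheme).
    """
    adj = [[] for _ in range(n_nodes)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    b = [0.0] * n_nodes
    for s in range(n_nodes):
        order = []                               # nodes in order of distance from s
        preds = [[] for _ in range(n_nodes)]     # predecessors on shortest paths
        n_paths = [0] * n_nodes
        n_paths[s] = 1
        dist = [-1] * n_nodes
        dist[s] = 0
        queue = deque([s])
        while queue:                             # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dep = [0.0] * n_nodes
        while order:                             # back-propagate dependencies
            w = order.pop()
            for v in preds[w]:
                dep[v] += n_paths[v] / n_paths[w] * (1.0 + dep[w])
            if w != s:
                b[w] += dep[w]
    return [x / 2.0 for x in b]                  # each pair was visited twice


def select_hotspots(b, fraction=0.5):
    """Nodes whose betweenness exceeds `fraction` of the maximum value."""
    cutoff = fraction * max(b)
    return [k for k, x in enumerate(b) if x > cutoff]


# Made-up toy graph: a short chain 0-1-2 with three leaves attached to node 2.
toy_edges = [(0, 1), (1, 2), (2, 3), (2, 4), (2, 5)]
b = betweenness_centrality(6, toy_edges)
hot = select_hotspots(b)
```

In the toy graph node 2 lies on most shortest paths and is the only node above half of the maximum betweenness, so it is the only one selected.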
2.5. Application to class A hGPCRs
hGPCRs, with more than 800 members [19], form the largest family of cell-surface receptors. This family translates external signals into cell stimuli. A widely used classification system of hGPCRs is the A-F system, which is mainly based on amino acid sequence and functional similarities and identifies six classes, labeled A-F. Class A, also known as the “rhodopsin-like family”, is the largest group of hGPCRs [28]: it includes hormone, neurotransmitter, and light receptors and accounts for around 80% of hGPCRs. These proteins share a common topological signature, namely seven α-helical transmembrane (TM) domains [29]. The members of the family also share the positions of residues directly involved in ligand binding and receptor activation. These include, for example, positions 3.28, 3.32, 3.33, 3.36, 3.37, 5.39, 6.44, 6.48, 6.55, 7.35 and 7.39 [30], [31], [32], functionally conserved along the entire class A [32], [20].2 Such a common structural organization contrasts strongly with the structural diversity of the agonists, which range from subatomic particles (a photon) to small molecules, peptides and even proteins [29]. Agonist binding to class A hGPCRs triggers receptor activation. The residues involved in the activation extend from the binding cavity [20], [34], [35], [36], [37] to the intracellular side of the receptor. Activation leads to the binding of cognate proteins, e.g. G-proteins and β-arrestins, and finally to downstream signaling pathways. However, these proteins do not simply switch between alternative agonist-bound and inactive forms in this process. Rather, they adopt a series of intermediate states (likely represented by an ensemble of conformations [38]) influenced not only by agonist binding, but also by other receptors, signaling and regulatory proteins, post-translational modifications, and environmental cues [39].
The input of the workflow (Fig. 1) consists of sequence alignments and of experimentally solved (PDB) structures of vertebrate class A GPCRs.3 We considered vertebrate GPCRs for building up the evolutionary history of the family because, outside vertebrate species, the classification in subfamilies is more difficult and not always accurate [19].
The subclass sequences were downloaded from the Uniprot database [22]. The reviewed sequences were chosen first (2514 sequences). New sequences from the unreviewed data set were then manually added. All the resulting curated sequences (5,000) were aligned using the Promals3D web-server [24] with its default parameters. The MSA obtained with this program satisfies all the class A hGPCR features, i.e. a set of highly conserved residues in each of the transmembrane helices [30], which gives rise to the Ballesteros-Weinstein nomenclature [33] (see footnote 2). The alignment was aided by 50 experimental or predicted structures belonging to 28 different human class A GPCRs, both in the active and inactive states (see Table 1). In all cases, the structure was not resolved in its entirety: parts of the sequence (typically the intracellular loop and the C- and N-termini of the chain) were missing in the experimental structure. To match structural and sequence information (and matrix dimensions), we removed the parts of the interaction matrix that involved unresolved residues. Next, we built the local interaction network and computed the betweenness centrality of every residue. As mentioned in the description of the method, in this phase we again used the structural information of Table 1. Several hotspot positions have undergone site-directed mutagenesis experiments (see Table 2 and Table 1 SI, and references therein). Many mutants show lower ligand activity, prevent activation (https://gpcrdb.org [23]), or are linked to disease. Residues identified as hotspots across 30% or more hGPCRs have a documented biological function (Table 2), such as: (i) belonging to the ligand binding site, (ii) belonging to the micro-switch network of activation, or (iii) being located within the allosteric Na+ binding cavity. Not all the hotspots listed in Table 2 are present in all the structures in Table 1 (the number of hotspots per single structure ranges from 2 to 26, see Table 2 SI).
In particular, for some identified hotspots we can infer the functional role via a direct comparison with other members of the same family. Some examples are briefly discussed below.
Table 1.
Name | Species | UNIPROT | Active | Inactive |
---|---|---|---|---|
Rhodopsin | Human | OPSD_HUMAN | 6CMO | (98%) |
Cannabinoid-1 | Human | CNR1_HUMAN | 6N4B | 5U09 |
Cannabinoid-2 | Human | CNR2_HUMAN | (66%) | 5ZTY |
Muscarinic M1 | Human | ACM1_HUMAN | 6OIJ | 5CXV |
Muscarinic M2 | Human | ACM2_HUMAN | 4MQS | 3UON |
Muscarinic M4 | Human | ACM4_HUMAN | (91%) | 5DSG |
β2-Adrenoreceptor | Human | ADRB2_HUMAN | 4LDE | 2RH1 |
Adenosine A1 | Human | AA1R_HUMAN | 6D9H | 5UEN |
Adenosine A2A | Human | AA2AR_HUMAN | 5G53 | 5NM4 |
δ-Opioid | Human | OPRD_HUMAN | (82%) | 4N6H |
μ-Opioid | Human | OPRM_HUMAN | 5C1M | 4DKL |
κ-Opioid | Human | OPRK_HUMAN | 6B73 | 4DJK |
NOP Receptor | Human | OPRX_HUMAN | (77%) | 5DHH |
Serotonin 1B | Human | 5HT1B_HUMAN | 6G79 | (60%) |
Serotonin 2A | Human | 5HT2A_HUMAN | (83%) | 6A94 |
Serotonin 2B | Human | 5HT2B_HUMAN | 5TUD | (78%) |
Serotonin 2C | Human | 5HT2C_HUMAN | 6BQG | 6BQH |
Dopamine 2 Receptor | Human | DRD2_HUMAN | (60%) | 6CM4 |
Dopamine 3 Receptor | Human | DRD3_HUMAN | (57%) | 3PBL |
Dopamine 4 Receptor | Human | DRD4_HUMAN | (57%) | 5WIU |
Angiotensin 1 | Human | AGTR1_HUMAN | 6DO1 | 4YAY |
Apelin Receptor | Human | APJ_HUMAN | (54%) | 5VBL |
C–C Chemokine 2 | Human | CCR2_HUMAN | – | 6GPX |
C–C Chemokine 5 | Human | CCR5_HUMAN | – | 5UIX |
C–C Chemokine 9 | Human | CCR9_HUMAN | – | 5LWE |
C–C Chemokine 1 | Human | CCR1_HUMAN | – | (78%) |
C–C Chemokine 3 | Human | CCR3_HUMAN | – | (77%) |
C–C Chemokine-Like 2 | Human | CCRL2_HUMAN | – | (62%) |
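The recurrence criterion mentioned before Table 1 (hotspots found in 30% or more of the receptors) amounts to a simple counting step once the hotspots of each receptor are expressed in generic numbering. The sketch below illustrates it; the function name, the receptor labels and the hotspot lists are made-up toy input and do not reproduce the results of this work.

```python
from collections import Counter


def recurrent_hotspots(hotspots_by_receptor, min_fraction=0.3):
    """Fraction of receptors in which each position is a hotspot.

    Only positions reaching `min_fraction` are returned. Positions are
    expected as generic (Ballesteros-Weinstein) labels such as "6.48".
    """
    n = len(hotspots_by_receptor)
    counts = Counter(
        pos
        for positions in hotspots_by_receptor.values()
        for pos in set(positions)              # count each receptor at most once
    )
    return {pos: c / n for pos, c in counts.items() if c / n >= min_fraction}


# Made-up toy input: four hypothetical receptors and their hotspot positions.
toy = {
    "receptor_1": ["3.32", "6.48", "2.50"],
    "receptor_2": ["3.32", "6.48"],
    "receptor_3": ["6.48", "7.49"],
    "receptor_4": ["6.48"],
}
recurrent = recurrent_hotspots(toy)
```

With this toy input, positions present in only one of the four receptors fall below the 30% level and are discarded.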
2.5.1. Ligand binding
Our protocol is able to capture residues with a fundamental role in selectivity, in ligand binding affinity [30] and in the dynamical events underlying hGPCR activation (see Table 2; [31], [32], [20]). For example, it identifies the conserved hotspots 3.32, 3.37 and 6.48, which are involved in ligand binding [40], [30], [32], [20] (see Table 2 and Table 1 SI for references). The first residue plays a role in ligand charge detection [32]. It is an aspartic acid in 22% of the class A hGPCRs, where it interacts with amines or with other positively charged groups [32]. The second residue is involved not only in ligand recognition but also in receptor activation [41]. The last one is a tryptophan in more than 77% of the class A subfamily. This position is well known in the literature, since it is a hub involved both in ligand detection and in forming the “toggle switch” that takes part in receptor activation (see below) [42].
2.5.2. Micro-switches
The so-called “micro-switches” are small groups of residues that undergo conformational changes during receptor activation and are mechanistically involved in the activation of GPCRs [43], [42], [44]. They include: (i) D[E]R3.50Y in helix III, a common motif that forms the “ionic lock” during inactivation, and (ii) NP7.50xxY in helix VII, which plays an important role in the conformational changes of the protein upon activation [42]. The link between this region and the binding cavity is the “toggle switch” formed by positions 6.44, 6.48, 3.40 and 5.50: upon ligand binding, position 3.40 rotates and locates between 6.48 and 6.44. This induces a conformational change of the “hydrophobic barrier”, which is located below the “toggle switch” and includes positions 2.43, 2.46, 3.43, 3.46, 6.36, 6.37 and 6.40 [45], [42]. The conformational change is important for receptor activation [30], [31]. All the residues involved in this complex mechanism were detected as hotspots in one single run of our protocol (Table 2).
2.5.3. Na+ binding cavity
An allosteric binding cavity for a partially hydrated Na+ ion is conserved across class A hGPCRs, excluding the opsins [46]. The hydrated Na+ is bound in the middle of the 7TM helix bundle, where it stabilizes the inactive state and reduces basal G-protein activity [46]. D2.50 (90% conserved as Asp) directly coordinates the Na+ ion, while N1.50 (97% Asn), S3.39 (75% Ser), N7.45 (70% Asn), S7.46 (66% Ser), N7.49 (75% Asn), and finally Y7.53 (89% Tyr) complete the coordination of the ion [46]. The protocol identifies D2.50, S3.39 and N7.49 as hotspots across class A subfamily members, except for the opsins, consistent with experiments [46] (Table 2 SI).
3. Conclusions
We have presented an approach able to capture most of the coevolution-related events relevant for the function of a very sparse protein family or subfamily (Fig. 1). The protocol provides information not only on residues spanning the full length of the protein, but also on the different activation states of the receptors. In contrast to previous approaches, we make use of structural information.
Application to human class A GPCR structures shows that, from the multiple sequence alignment of a sparse family, we were able to extract all the residues known to be involved in the different aspects of receptor activity that were previously identified [20], [21]. These include (i) all the positions within the binding cavity with a conserved functional role; (ii) residues forming the activation micro-switches; and (iii) residues forming the Na+ allosteric binding cavity, as well as residues found to be mutated in correlation with disease. Importantly, the method was able to capture the functional role of all these residues in a single run.
Our approach is totally general and can easily be extended to other subfamilies of GPCRs for which experimental structures are available,4 as well as to other receptor families. As an example, we cite here the pentameric ligand-gated ion channel (pLGIC) protein family, which mediates fast neurotransmission in the nervous system [47], [48], [49]. Its members are evolutionarily related, share a common architecture that consists of three distinct domains, and exist in at least three distinct functional states [47], [48], [50]. By exploiting the available structural information [47], [48], the approach might be able to identify all hotspots across the family in a single run. As soon as a statistically significant number of structures becomes available, a more specific analysis of single subfamilies of class A hGPCRs can readily be performed.
One of the present limitations of the protocol concerns the study of oligomers, because the integration of the structural data removes the interactions between residues that are far apart in the single monomer. In the future, we plan to remove this restriction by integrating oligomer data coming from experiments.
CRediT authorship contribution statement
Filippo Baldessari: Investigation. Riccardo Capelli: Conceptualization, Investigation. Paolo Carloni: Conceptualization. Alejandro Giorgetti: Conceptualization.
Acknowledgments
The authors thank Guido Tiana for providing the code to compute interaction matrices from sequence alignments. This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement 785907 (HBP SGA2).
Footnotes
1. In graph theory, a connected graph is a network where, choosing any possible pair of nodes i and j, it is possible to go from i to j via a path defined by the edges of the graph.
2. From here on, we will use the Ballesteros-Weinstein (or generic) numbering scheme [33] commonly used for class A GPCRs. Within this framework, the first number indicates the helix and the second number indicates the residue position with respect to the most conserved residue in that helix (x.50). For example, position 3.52 refers to a residue in helix 3, two positions after the most conserved residue, the 3.50.
3. We consider only non-olfactory class A GPCRs as in [19].
4. In its current version, the approach cannot deal with the olfactory receptors as they are a huge disperse subfamily without a crystal structure.
Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.csbj.2020.05.003.
References
- 1. Ovchinnikov S., Park H., Varghese N., Huang P.-S., Pavlopoulos G.A., Kim D.E., Kamisetty H., Kyrpides N.C., Baker D. Protein structure determination using metagenome sequence data. Science. 2017;355(6322):294–298. doi: 10.1126/science.aah4043.
- 2. Weigt M., White R.A., Szurmant H., Hoch J.A., Hwa T. Identification of direct residue contacts in protein–protein interaction by message passing. Proc Nat Acad Sci. 2009;106(1):67–72. doi: 10.1073/pnas.0805923106.
- 3. Morcos F., Pagnani A., Lunt B., Bertolino A., Marks D.S., Sander C., Zecchina R., Onuchic J.N., Hwa T., Weigt M. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc Nat Acad Sci. 2011;108(49):E1293–E1301. doi: 10.1073/pnas.1111471108.
- 4. Baldassi C., Zamparo M., Feinauer C., Procaccini A., Zecchina R., Weigt M., Pagnani A. Fast and accurate multivariate gaussian modeling of protein families: predicting residue contacts and protein-interaction partners. PloS One. 2014;9(3) doi: 10.1371/journal.pone.0092721.
- 5. Ekeberg M., Hartonen T., Aurell E. Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences. J Comput Phys. 2014;276:341–356.
- 6. Kamisetty H., Ovchinnikov S., Baker D. Assessing the utility of coevolution-based residue–residue contact predictions in a sequence-and structure-rich era. Proc Nat Acad Sci. 2013;110(39):15674–15679. doi: 10.1073/pnas.1314045110.
- 7. Jones D.T., Buchan D.W., Cozzetto D., Pontil M. Psicov: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics. 2011;28(2):184–190. doi: 10.1093/bioinformatics/btr638.
- 8. Skwark M.J., Abdel-Rehim A., Elofsson A. Pconsc: combination of direct information methods and alignments improves contact prediction. Bioinformatics. 2013;29(14):1815–1816. doi: 10.1093/bioinformatics/btt259.
- 9. Burger L., Van Nimwegen E. Disentangling direct from indirect co-evolution of residues in protein alignments. PLoS Computat Biol. 2010;6(1) doi: 10.1371/journal.pcbi.1000633.
- 10. Morcos F., Onuchic J.N. The role of coevolutionary signatures in protein interaction dynamics, complex inference, molecular recognition, and mutational landscapes. Curr Opin Struct Biol. 2019;56:179–186. doi: 10.1016/j.sbi.2019.03.024.
- 11. Sutto L., Marsili S., Valencia A., Gervasio F.L. From residue coevolution to protein conformational ensembles and functional dynamics. Proc Nat Acad Sci. 2015;112(44):13567–13572. doi: 10.1073/pnas.1508584112.
- 12. Lui S., Tiana G. The network of stabilizing contacts in proteins studied by coevolutionary data. J Chem Phys. 2013;139(15):10B618_1. doi: 10.1063/1.4826096.
- 13. Contini A., Tiana G. A many-body term improves the accuracy of effective potentials based on protein coevolutionary data. J Chem Phys. 2015;143(2):07B608_1. doi: 10.1063/1.4926665.
- 14. Ovchinnikov S., Kim D.E., Wang R.Y.-R., Liu Y., DiMaio F., Baker D. Improved de novo structure prediction in CASP 11 by incorporating coevolution information into Rosetta. Proteins: Struct Funct Bioinf. 2016;84:67–75. doi: 10.1002/prot.24974.
- 15. Evans R., Jumper J., Kirkpatrick J., Sifre L., Green T., Qin C., Zidek A., Nelson A., Bridgland A., Penedones H. De novo structure prediction with deeplearning based scoring. Annu Rev Biochem. 2018;77(363–382):6.
- 16. Ovchinnikov S., Kamisetty H., Baker D. Robust and accurate prediction of residue–residue interactions across protein interfaces using evolutionary information. Elife. 2014;3 doi: 10.7554/eLife.02030.
- 17. Marchetti F., Capelli R., Rizzato F., Laio A., Colombo G. The subtle trade-off between evolutionary and energetic constraints in protein–protein interactions. J Phys Chem Lett. 2019;10(7):1489–1497. doi: 10.1021/acs.jpclett.9b00191.
- 18. El-Gebali S., Mistry J., Bateman A., Eddy S.R., Luciani A., Potter S.C., Qureshi M., Richardson L.J., Salazar G.A., Smart A. The pfam protein families database in 2019. Nucl Acids Res. 2019;47(D1):D427–D432. doi: 10.1093/nar/gky995.
- 19. Fredriksson R., Lagerström M.C., Lundin L.-G., Schiöth H.B. The g-protein-coupled receptors in the human genome form five main families. phylogenetic analysis, paralogon groups, and fingerprints. Molecul Pharmacol. 2003;63(6):1256–1272. doi: 10.1124/mol.63.6.1256.
- 20. Zhou Q, Yang D, Wu M, Guo Y, Guo W, Zhong L, Cai X, Dai A, Jang W, Shakhnovich EI, et al. Common activation mechanism of class A GPCRs, eLife 8.
- 21. van Westen G.J., Wegner J.K., Bender A., IJzerman A.P., van Vlijmen H.W. Mining protein dynamics from sets of crystal structures using ”consensus structures”. Protein Sci. 2010;19(4):742–752. doi: 10.1002/pro.350.
- 22. Consortium U. Uniprot: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47(D1):D506–D515. doi: 10.1093/nar/gky1049.
- 23. Pándy-Szekeres G., Munk C., Tsonkov T.M., Mordalski S., Harpsøe K., Hauser A.S., Bojarski A.J., Gloriam D.E. GPCRdb in 2018: adding GPCR structure models and ligands. Nucleic Acids Res. 2017;46(D1):D440–D446. doi: 10.1093/nar/gkx1109.
- 24. Pei J., Kim B.-H., Grishin N.V. Promals3d: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 2008;36(7):2295–2300. doi: 10.1093/nar/gkn072.
- 25. Altschul S.F., Gertz E.M., Agarwala R., Schäffer A.A., Yu Y.-K. Psi-blast pseudocounts and the minimum description length principle. Nucleic Acids Res. 2008;37(3):815–824. doi: 10.1093/nar/gkn981.
- 26. Scarabelli G., Morra G., Colombo G. Predicting interaction sites from the energetics of isolated proteins: a new approach to epitope mapping. Biophys J. 2010;98(9):1966–1975. doi: 10.1016/j.bpj.2010.01.014.
- 27. Brandes U. On variants of shortest-path betweenness centrality and their generic computation. Social Networks. 2008;30(2):136–145.
- 28. Attwood T., Findlay J. Fingerprinting g-protein-coupled receptors. Protein Eng, Des Selection. 1994;7(2):195–203. doi: 10.1093/protein/7.2.195.
- 29. Kobilka B.K. G protein coupled receptor structure and activation. Biochimica et Biophysica Acta (BBA)-Biomembranes. 2007;1768(4):794–807. doi: 10.1016/j.bbamem.2006.10.021.
- 30. Venkatakrishnan A., Deupi X., Lebon G., Tate C.G., Schertler G.F., Babu M.M. Molecular signatures of G-protein-coupled receptors. Nature. 2013;494(7436):185. doi: 10.1038/nature11896.
- 31. Veprintsev D, Venkatakrishnan A, Deupi X, Lebon G, Heydenreich FM, Flock T, Miljus T, Balaji S, Bouvier M, Tate CG, et al. Diverse activation pathways in class a gpcrs converge near the g-protein-coupling region.
- 32. Suku E., Giorgetti A. Common evolutionary binding mode of rhodopsin-like GPCRs: Insights from structural bioinformatics. AIMS. Biophysics. 2017;4:543–556. AIMS Press.
- 33. Ballesteros J.A., Weinstein H. Integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in g protein-coupled receptors. In: Methods in neurosciences, vol. 25, Elsevier, 1995, pp. 366–428.
- 34. Granier S., Kobilka B. A new era of GPCR structural and chemical biology. Nature Chem Biol. 2012;8(8):670–673. doi: 10.1038/nchembio.1025.
- 35. Dalton J.A., Lans I., Giraldo J. Quantifying conformational changes in GPCRs: glimpse of a common functional mechanism. BMC Bioinf. 2015;16(1):124. doi: 10.1186/s12859-015-0567-3.
- 36. Dror R.O., Arlow D.H., Maragakis P., Mildorf T.J., Pan A.C., Xu H., Borhani D.W., Shaw D.E. Activation mechanism of the β2-adrenergic receptor. Proc Nat Acad Sci. 2011;108(46):18684–18689. doi: 10.1073/pnas.1110499108.
- 37. Kruse A.C., Hu J., Pan A.C., Arlow D.H., Rosenbaum D.M., Rosemond E., Green H.F., Liu T., Chae P.S., Dror R.O. Structure and dynamics of the m3 muscarinic acetylcholine receptor. Nature. 2012;482(7386):552. doi: 10.1038/nature10867.
- 38. Gumbart J., Khalili-Araghi F., Sotomayor M., Roux B. Constant electric field simulations of the membrane potential illustrated with simple systems. Biochimica et Biophysica Acta (BBA)-Biomembranes. 2012;1818(2):294–302. doi: 10.1016/j.bbamem.2011.09.030.
- 39. Geppetti P., Veldhuis N.A., Lieu T., Bunnett N.W. G protein-coupled receptors: dynamic machines for signaling pain and itch. Neuron. 2015;88(4):635–649. doi: 10.1016/j.neuron.2015.11.001.
- 40. Sandal M., Behrens M., Brockhoff A., Musiani F., Giorgetti A., Carloni P., Meyerhof W. Evidence for a transient additional ligand binding site in the tas2r46 bitter taste receptor. J Chem Theory Comput. 2015;11(9):4439–4449. doi: 10.1021/acs.jctc.5b00472.
- 41. Ponzoni L., Rossetti G., Maggi L., Giorgetti A., Carloni P., Micheletti C. Unifying view of mechanical and functional hotspots across class A GPCRs. PLoS Computat Biol. 2017;13(2) doi: 10.1371/journal.pcbi.1005381.
- 42. Trzaskowski B., Latek D., Yuan S., Ghoshdastider U., Debinski A., Filipek S. Action of molecular switches in GPCRs-theoretical and experimental studies. Curr Medicinal Chem. 2012;19(8):1090–1109. doi: 10.2174/092986712799320556.
- 43. Nygaard R., Frimurer T.M., Holst B., Rosenkilde M.M., Schwartz T.W. Ligand binding and micro-switches in 7tm receptor structures. Trends Pharmacol Sci. 2009;30(5):249–259. doi: 10.1016/j.tips.2009.02.006.
- 44. Schönegge A.-M., Gallion J., Picard L.-P., Wilkins A.D., Le Gouill C., Audet M., Stallaert W., Lohse M.J., Kimmel M., Lichtarge O. Evolutionary action and structural basis of the allosteric switch controlling β2AR functional selectivity. Nat Commun. 2017;8(1):1–12. doi: 10.1038/s41467-017-02257-x.
- 45. Tehan B.G., Bortolato A., Blaney F.E., Weir M.P., Mason J.S. Unifying family a GPCR theories of activation. Pharmacol Therapeutics. 2014;143(1):51–60. doi: 10.1016/j.pharmthera.2014.02.004.
- 46. Katritch V., Fenalti G., Abola E.E., Roth B.L., Cherezov V., Stevens R.C. Allosteric sodium in class A GPCR signaling. Trends Biochem Sci. 2014;39(5):233–244. doi: 10.1016/j.tibs.2014.03.002.
- 47. Amundarain M.J., Ribeiro R.P., Costabel M.D., Giorgetti A. Gabaa receptor family: overview on structural characterization. Future Medicinal Chem. 2019;11(3):229–245. doi: 10.4155/fmc-2018-0336.
- 48. Amundarain M.J., Viso J.F., Zamarreño F., Giorgetti A., Costabel M. Orthosteric and benzodiazepine cavities of the α1β2γ2 gabaa receptor: insights from experimentally validated in silico methods. J Biomol Struct Dyn. 2019;37(6):1597–1615. doi: 10.1080/07391102.2018.1462733.
- 49. Jaiteh M, Taly A, Hénin J. Evolution of pentameric ligand-gated ion channels: pro-loop receptors, PloS one 11 (3).
- 50. Miller P.S., Smart T.G. Binding, activation and modulation of cys-loop receptors. Trends Pharmacol Sci. 2010;31(4):161–174. doi: 10.1016/j.tips.2009.12.005.