eLife. 2022 Mar 3;11:e75751. doi: 10.7554/eLife.75751

Sampling alternative conformational states of transporters and receptors with AlphaFold2

Diego del Alamo 1,2, Davide Sala 3, Hassane S Mchaourab 1, Jens Meiler 2,3
Editors: Janice L Robertson 4, Kenton J Swartz 5
PMCID: PMC9023059  PMID: 35238773

Abstract

Equilibrium fluctuations and triggered conformational changes often underlie the functional cycles of membrane proteins. For example, transporters mediate the passage of molecules across cell membranes by alternating between inward- and outward-facing states, while receptors undergo intracellular structural rearrangements that initiate signaling cascades. Although the conformational plasticity of these proteins has historically posed a challenge for traditional de novo protein structure prediction pipelines, the recent success of AlphaFold2 (AF2) in CASP14 culminated in the modeling of a transporter in multiple conformations to high accuracy. Given that AF2 was designed to predict static structures of proteins, it remains unclear if this result represents an underexplored capability to accurately predict multiple conformations and/or structural heterogeneity. Here, we present an approach to drive AF2 to sample alternative conformations of topologically diverse transporters and G-protein-coupled receptors that are absent from the AF2 training set. Whereas models of most proteins generated using the default AF2 pipeline are conformationally homogeneous and nearly identical to one another, reducing the depth of the input multiple sequence alignments by stochastic subsampling led to the generation of accurate models in multiple conformations. In our benchmark, these conformations spanned the range between two experimental structures of interest, with models at the extremes of these conformational distributions observed to be among the most accurate (average template modeling score of 0.94). These results suggest a straightforward approach to identifying native-like alternative states, while also highlighting the need for the next generation of deep learning algorithms to be designed to predict ensembles of biophysically relevant states.

Research organism: None

Introduction

Dynamic interconversion between multiple conformations underpins the functions of integral membrane proteins in all domains of life (Campbell et al., 2016; Cournia et al., 2015; Shaw et al., 2010; Boehr et al., 2009). For example, the vectorial translocation of substrates by transporters is mediated by movements that open and close extra- and intracellular gates (Drew and Boudker, 2016; Kazmier et al., 2017). For G-protein-coupled receptors (GPCRs), ligand binding on the extracellular side triggers structural rearrangements on the intracellular side that initiate downstream signaling (Wang et al., 2020; Gusach et al., 2020). Traditional computational prediction pipelines reliant on inter-residue distance restraints calculated from deep multiple sequence alignments (MSAs) have historically struggled to accurately predict the structures of these proteins and their movements. The resulting models are unnaturally compact and frequently distorted, preventing critical questions about ligand and/or drug binding modes from being addressed (Ovchinnikov et al., 2015; Nicoludis and Gaudet, 2018).

A performance breakthrough was unveiled during CASP14 by AlphaFold2 (AF2) (Jumper et al., 2021a; Tunyasuvunakool et al., 2021; Pereira et al., 2021), which achieved remarkably accurate de novo structure prediction. Upon examining the list of CASP14 targets and corresponding models, we found that AF2 modeled the multidrug transporter LmrP (target T1024) in multiple conformations, two of which were individually consistent with published experimental data (Jumper et al., 2021b; Del Alamo et al., 2021a; Debruycker et al., 2020; Martens et al., 2016; Masureel et al., 2014). This observation stimulated the question of whether such performance can be duplicated for other membrane proteins. At its essence, this question centers on whether AF2 can sample the conformational landscape in the minimum energy basin. Here, we investigate this hypothesis using a benchmark set of topologically diverse transporters and GPCRs. Our results demonstrate that reducing the depth of the input MSAs is often conducive to the generation of accurate models in multiple conformations by AF2, suggesting that the algorithm’s outstanding predictive performance can be extended to sample alternative structures of the same target. For most proteins considered, we report a striking correlation between the breadth of structures predicted by AF2 and the corresponding cryo-EM and/or X-ray crystal structures. Finally, we propose a modeling pipeline for researchers interested in sampling alternative conformations of specific membrane proteins, which we apply to the structurally unknown GPR114/ADGRG5 adhesion GPCR as an example.

Results and discussion

The default three-stage AF2 pipeline consists of (1) querying of sequence databases and generation of an MSA, (2) inference of structure via a neural network using a randomly resampled subset of this MSA containing up to 5120 sequences, which is repeated three times (a process termed ‘recycling’), and (3) resolution of steric clashes and bond geometry using a constrained all-atom molecular dynamics simulation. The neural networks used for prediction were trained on all structures deposited in the Protein Data Bank (PDB) on or before April 30, 2018 (Jumper et al., 2021a). Therefore, by necessity, this study is restricted to proteins whose structures were absent from the PDB before this date and have since been determined at atomic resolution in two or more conformations. We selected five transporters that not only met these criteria but also reflected a range of transport mechanisms characterized in the literature (Drew and Boudker, 2016), including rocking-bundle (LAT1, Yan et al., 2019; Yan et al., 2021; ZnT8, Xue et al., 2020), rocker-switch (MCT1, Wang et al., 2021; STP10, Bavnhøj et al., 2021), and elevator (ASCT2, Garibsingh et al., 2021; Garaeva et al., 2019). We also included three GPCRs, which were distributed across classes A (CGRPR, Liang et al., 2020; Josephs et al., 2021), B1 (PTH1R, Ehrenmann et al., 2018; Zhao et al., 2019), and F (FZD7, Xu et al., 2021; to serve as points of comparison, we used the active conformation of FZD7 and the inactive conformation of the nearly identical FZD4, Yang et al., 2018).

AF2 generates multiple conformations of all eight target proteins

The sequences of all targets were truncated at the N- and C-termini to remove large soluble and/or intrinsically disordered regions which represent a challenge for AF2 (see Methods). The structures were then predicted using the default AF2 structure prediction pipeline in the absence of templates. However, the resulting models were largely identical to one another and failed to shed light on the target protein’s conformational space. To diversify the models generated by AF2, we reduced the number of recycles to one and restricted the depth of the randomly subsampled MSAs to contain as few as 16 sequences. To sample the conformational landscape more exhaustively, we generated 50 models of each protein for each MSA size, while skipping the final MD simulation to reduce the pipeline’s total computational cost. For the targets, each model’s similarity to the experimental structures was quantified using template modeling (TM) score (Zhang and Skolnick, 2004; Zhang and Skolnick, 2005; Xu and Zhang, 2010), a metric ranging from 0 to 1, which indicates how well the two backbone (Cα atoms) structures superimpose over one another (higher values corresponding to greater similarity; Figure 1A).
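
For readers unfamiliar with the metric, the TM score of Zhang and Skolnick (2004) can be sketched as follows. This is a minimal illustration, not the TM-align program used in this study: it evaluates the score for Cα coordinates that are already superimposed and matched residue by residue, whereas TM-align also searches for the alignment and superposition that maximize the score. The function name, the array layout, and the 0.5 Å lower bound on the distance scale are assumptions of this sketch.

```python
import numpy as np


def tm_score_fixed_superposition(model_ca, reference_ca):
    """TM score for pre-superimposed, residue-matched C-alpha coordinates.

    Both inputs are arrays of shape (n_residues, 3) in angstroms.
    The score is normalized by the length of the reference structure.
    """
    model_ca = np.asarray(model_ca, dtype=float)
    reference_ca = np.asarray(reference_ca, dtype=float)
    if model_ca.shape != reference_ca.shape:
        raise ValueError("coordinate arrays must have identical shapes")

    n_res = reference_ca.shape[0]
    # Length-dependent distance scale d0 (Zhang and Skolnick, 2004).
    # The 0.5 A floor guards against very short chains (assumption of this sketch).
    d0 = max(1.24 * np.cbrt(n_res - 15) - 1.8, 0.5)

    # Per-residue C-alpha deviation after superposition.
    dist = np.linalg.norm(model_ca - reference_ca, axis=1)
    return float(np.mean(1.0 / (1.0 + (dist / d0) ** 2)))
```

Because each residue contributes a value between 0 and 1 that decays with its deviation, large local errors lower the score less severely than they would raise an RMSD.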

Figure 1. Alternative conformations of transporters and G-protein-coupled receptors (GPCRs) can be predicted by AlphaFold2 (AF2).

(A) Representative models of the transporter LAT1 in inward-facing (IF) and outward-facing (OF) conformations. Experimental structures shown in gray and models shown in colors. (B) Comparison of AF2 models with inactive/active or IF/OF experimental structures as a function of multiple sequence alignment (MSA) depth for GPCRs (top) and transporters (bottom), respectively. All models shown here were generated without templates. Dashed lines indicate the template modeling (TM) score between experimental structures and are shown for reference. (C) Supplementing shallow MSAs with OF templates allows AF2 to predict the OF conformation of MCT1. (D) Experimental structures superimposed over models with the greatest TM scores. Inactive/IF and active/OF cartoons shown on the top and bottom in teal and orange, respectively.

Figure 1—figure supplement 1. Example principal component analysis (PCA) of ASCT2 models generated by AlphaFold2 (AF2) containing outlier models.

Colors correspond to multiple sequence alignment (MSA) depth in sequences, while orange and teal diamonds refer to the outward- and inward-facing experimental structures, respectively. Left: outlier models are indicated by gray arrows and could be clearly delineated along PC2. Right: removal of these models reveals collective variables along PC2.

Figure 1—figure supplement 2. Conformational homogeneity as a function of multiple sequence alignment (MSA) depth.

Conformational homogeneity was defined by classifying models as more similar to one of the two experimental structures based on their template modeling (TM) scores and calculating the fraction of models in the larger group. MCT1 and proteins in the training set (CCR5, MurJ, PfMATE, and SERT) were biased toward one specific conformation (uniformity = 1) even when using shallow MSAs. The x-axis is shown on a log2 scale for clarity. Models generated using MSAs with 16 sequences were omitted from this figure.

Figure 1—figure supplement 3. Templates contribute to conformational sampling only when shallow multiple sequence alignments (MSAs) are provided.

(A) PTH1R. (B) LAT1.

Figure 1—figure supplement 4. Protein targets with one conformation in the training set cannot be predicted in the alternative conformation.

The following experimental structures served as references for the training and the alternative conformations, respectively: 5UIWr and 7F1Qr (CCR5), 5T77a and 6NC9a (MurJ), 3VVNa and 6FHZa (PfMATE), and 5I6Xa and 6DZZa (SERT).

Accurate models of all eight protein targets were obtained for at least one conformation (TM score ≥0.9), consistent with published performance statistics (Figure 1B). MSAs with hundreds or thousands of sequences were generally observed to engender tighter clustering in conformations specific to each protein. Decreasing the depth of the subsampled MSAs, by contrast, appeared to promote the generation of alternative conformations in most proteins. The increased diversity coincided with the generation of misfolded or outlier models. However, unlike the models of interest that resembled experimentally determined structures, misfolded models virtually never coclustered and could thus be identified and excluded from further analysis (example shown in Figure 1—figure supplement 1). Increasing the depth of subsampled MSAs had the desirable effect of eliminating these models, while also limiting the extent to which alternative conformations were sampled. Thus, our results revealed a delicate balance that must be achieved to generate models that are both diverse and natively folded. No general pattern was readily apparent regarding the ideal MSA depth required to achieve this balance, even when accounting for sequence length of the target (Figure 1—figure supplement 2).
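
The conformational homogeneity reported in Figure 1—figure supplement 2 is defined as the fraction of models in the larger of two groups, where each model is assigned to whichever experimental structure it matches with the higher TM score. A minimal sketch of that calculation is given below. The function name is hypothetical, and the handling of exact ties (assigned to the second structure) is an assumption, since ties are not addressed in the text.

```python
import numpy as np


def conformational_homogeneity(tm_to_state_a, tm_to_state_b):
    """Fraction of models in the larger of the two conformational groups.

    tm_to_state_a[i] and tm_to_state_b[i] are the TM scores of model i
    against the two experimental structures.
    """
    a = np.asarray(tm_to_state_a, dtype=float)
    b = np.asarray(tm_to_state_b, dtype=float)
    if a.shape != b.shape or a.size == 0:
        raise ValueError("need two non-empty score arrays of equal length")

    # Classify each model by the experimental structure it resembles more.
    # Exact ties fall into the second group (assumption of this sketch).
    closer_to_a = int(np.sum(a > b))
    closer_to_b = a.size - closer_to_a
    return max(closer_to_a, closer_to_b) / a.size
```

By construction the value ranges from 0.5 (models split evenly between the two conformations) to 1 (all models closer to a single conformation).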

One target, MCT1, was exclusively modeled by AF2 in either inward-facing (IF) or fully occluded conformations; over 99% of the models had TM scores of ≥0.9 and <0.9 to the IF and outward-facing (OF) structures, respectively, regardless of MSA depth. Therefore, we investigated the effect of providing templates of homologs in exclusively OF conformations alongside MSAs of various sizes (see Methods for details on template selection). Accurate OF models were obtained only with MSAs containing 16–32 sequences and constituted a minor population in an ensemble dominated by IF models. Thus, the generation of large numbers of models appeared to be necessary to yield intermediate conformations of interest. Similar results were observed when we modeled PTH1R using either inactive or active templates, as well as LAT1 using either OF or IF templates (Figure 1—figure supplement 3), further indicating that the information content provided by the templates diminishes as the depth of the MSA increases.

Overall, these results demonstrate that both conformations of all eight protein targets could be predicted with AF2 to high accuracy (TM score ≥0.9) by using MSAs that are far shallower than the default. However, because the optimal MSA depth and choice of templates varied for each protein, these parameters need to be explored for conformational sampling of a particular target.

Predicted conformational fluctuations correlate with implied conformational dynamics

To further investigate the structural heterogeneity predicted by these models, we calculated each residue’s Cα atom distance between the two superimposed experimental structures, as well as each residue’s root mean square fluctuation (RMSF) among all 50 models following structure-based alignment (Figure 2). Correlation between these two measures was observed in most cases and was notable for ASCT2, LAT1, CGRPR, and MCT1 with templates (R² ≥ 0.75). The exception was MCT1 without templates, which was likely due to a lack of conformational diversity among the sampled models. The inclusion of templates restored this correlation in MCT1 but contributed negligibly to those of PTH1R and LAT1 (Figure 2—figure supplement 1). The correlation demonstrates that the flexibility predicted by AF2 is related to the protein’s dynamics inferred from the experimental structures. In contrast with a recent preprint (Saldaño et al., 2021), the predicted flexibility values failed to correlate with their pLDDT values, which reflect the confidence of the AF2 prediction of each residue’s local environment (Mariani et al., 2013).
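
The comparison described above can be sketched in a few lines of NumPy. This is an illustration only; the calculations in this study were carried out in CPPTRAJ (see Materials and methods). The sketch assumes that all coordinates are Cα positions that have already been superimposed and share the same residue order, and the function names are hypothetical.

```python
import numpy as np


def ca_displacement(state_a, state_b):
    """Per-residue C-alpha distance between two superimposed structures.

    Inputs have shape (n_residues, 3); the output has shape (n_residues,).
    """
    diff = np.asarray(state_a, dtype=float) - np.asarray(state_b, dtype=float)
    return np.linalg.norm(diff, axis=1)


def rmsf(models):
    """Per-residue root mean square fluctuation across aligned models.

    models has shape (n_models, n_residues, 3).
    """
    models = np.asarray(models, dtype=float)
    mean_structure = models.mean(axis=0)
    # Squared deviation of every residue in every model from the mean position.
    sq_dev = ((models - mean_structure) ** 2).sum(axis=2)
    return np.sqrt(sq_dev.mean(axis=0))


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two 1D arrays."""
    r = np.corrcoef(np.asarray(x, dtype=float), np.asarray(y, dtype=float))[0, 1]
    return float(r ** 2)
```

Low-confidence residues can be excluded before computing the correlation by applying a boolean mask (for example, pLDDT > 75) to both per-residue arrays.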

Figure 2. Comparison between the movement undergone by each Cα atom between the two superimposed experimental structures and their root mean square fluctuation (RMSF) values among AlphaFold2 (AF2) models.

Residues with low confidence (pLDDT ≤75) were omitted from this plot for all proteins except PTH1R. Multiple sequence alignment (MSA) sizes of 128 sequences were used for all predictions, except for MCT1 with templates, which instead used 32 sequences to capture the outward-facing (OF) conformation. pLDDT refers to each residue’s predicted accuracy, with a value of 100 indicating maximum confidence.

Figure 2—figure supplement 1. Comparison between the movement undergone by each Cα atom between the two superimposed experimental structures of PTH1R and LAT1 and their root mean square fluctuation (RMSF) values among AlphaFold2 (AF2) models generated with templates.

Distributions of predicted models relative to the experimental structures

Visual examination suggested that many of the predicted models fall ‘in between’ the two experimentally determined conformations (example shown in Figure 3A). Furthermore, certain structural features expected to be conformationally heterogeneous, such as long loops, appeared to be nearly identical across these models. Both observations raised questions about the relationship between the diversity of the predicted models and the breadth of the conformational ensembles bracketed by the experimental structures. To quantitatively place the predicted conformational variance in the context of the experimentally determined structures, we used principal component analysis (PCA), which reduces the multidimensional space to a smaller space representative of the main conformational motions. In our benchmark set, the first principal component (PC1) captured 64.9% ± 16.1% of the structural variations among the models generated using MSAs with 32 or more sequences (Figure 3B), while comparison of PC1/PC2 values suggested that the predicted dynamics deviate from simple interpolation of two end states (Figure 3—figure supplement 1). The experimental structures virtually always occupied well-defined extreme positions. In every case, a correlation was evident between each model’s PC1 value and its TM scores. Indeed, the models with the most extreme PC1 values were also among the most accurate: average TM scores were 0.94 for the top 1, top 3, and top 10 PC1 models, and Pearson’s correlation coefficients between PC1 and TM scores of the ensemble of models exceeded 0.8 for all transporters in this dataset. Moreover, the experimental structures virtually always flanked the AF2 models along PC1. The exception, PTH1R, was determined in a partially inactive and active conformation (Zhao et al., 2019), suggesting that models extending beyond the former state along PC1 may represent the fully inactive conformation. Therefore, these results indicate that accurate representative models of conformations of interest can be selected from the extreme positions along PC1.
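
The selection of models at the extremes of PC1 can be illustrated with a short NumPy sketch. It is not the CPPTRAJ workflow used in this study; it assumes that the models have already been superimposed, that only the atoms of interest (for example, non-loop Cα atoms) are included, and that the ensemble is not perfectly uniform. Function names are hypothetical.

```python
import numpy as np


def pca_on_models(models):
    """PCA of aligned Cartesian coordinates.

    models has shape (n_models, n_atoms, 3). Returns the projection of each
    model onto every principal component and the fraction of variance
    explained by each component.
    """
    x = np.asarray(models, dtype=float)
    x = x.reshape(x.shape[0], -1)
    x = x - x.mean(axis=0)  # center each coordinate across the ensemble
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    projections = u * s  # column 0 holds the PC1 value of each model
    variance = s ** 2
    return projections, variance / variance.sum()


def extreme_pc1_models(projections, n=1):
    """Indices of the n models at each end of PC1 (lowest first, highest first)."""
    order = np.argsort(projections[:, 0])
    return order[:n], order[::-1][:n]
```

The sign of a principal component is arbitrary, so which end of PC1 corresponds to which conformation must be determined by inspecting the selected models or by comparing them with a reference structure.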

Figure 3. Distinct conformations can be delineated using principal component analysis (PCA).

(A) Conformational heterogeneity in AlphaFold2 (AF2) models of LAT1. Experimental inward-facing (IF) and outward-facing (OF) conformations shown in teal and orange, respectively, while the gallery of AF2 models generated using 128 sequences are shown in gray. (B) Distribution of AF2 models generated using multiple sequence alignments (MSAs) with 32 or more sequences across the first principal component (PC1) following PCA (gray traces). Scatter plots comparing each model’s position along PC1 and its structural similarity to experimentally determined structures. Teal: similarity to IF (transporters) or inactive (G-protein-coupled receptors, GPCRs) conformation. Orange: similarity to OF (transporters) or active (GPCRs) conformation. Each model is shown twice, once in teal and once in orange. Native structures are shown as black dots.

Figure 3—figure supplement 1. Models sampled by AlphaFold2 (AF2) in multiple conformations cannot be fully explained by linear interpolation of two end structures.

Principal component analysis (PCA) of AF2 models, with colors corresponding to the depth of the multiple sequence alignments (MSAs) used for prediction. Experimental structures in an inward-facing or inactive state are shown as teal diamonds, while structures in an outward-facing or active state are shown as orange diamonds.

Figure 3—figure supplement 2. Example predictions of the adhesion G-protein-coupled receptor (aGPCR) GPR114/ADGRG5.

Top: kernel density estimate of the first principal component (PC1) following principal component analysis (PCA) on all AlphaFold2 (AF2) models. Bottom: comparison of PC1 and template modeling (TM) score values; alignments were measured from the AF2 database model, which is indicated by the green dot. Right: models extracted from the cluster centers appear to adopt three distinct conformations.

Limited conformational sampling is observed for proteins with structures in the AF2 training set

A follow-up question centers on whether this strategy can yield similar results for proteins with one conformation present in the AF2 training set. We investigated this question using four membrane proteins with two experimentally determined conformations, at least one of which was included in the AF2 training set: the class A GPCR CCR5 (Zheng et al., 2017; Zhang et al., 2021), the serotonin transporter SERT (Coleman et al., 2016; Coleman et al., 2019), the multidrug transporter PfMATE (Tanaka et al., 2013; Zakrzewska et al., 2019), and the lipid flippase MurJ (Kuk et al., 2017; Kuk et al., 2019). Using the template-free prediction pipeline outlined above, we determined the resultant models’ similarity to the structures included in and absent from the training set. Unlike the results presented above, the majority of the transporter models generated this way were more similar to the conformation present in the training set than the conformation absent from the training set (i.e., their TM scores were greater; Figure 1—figure supplement 4). The conformational diversity of these models, including those generated using shallow MSAs, was far more limited than what was generally observed for the proteins discussed above, with the exception of MCT1 (Figure 1—figure supplement 2). Although conformational diversity was demonstrated to a limited extent by the generation of occluded models of MurJ and PfMATE, none of the models observed adopted the alternative conformer. By contrast, while models of CCR5 were less biased toward the training set conformation, deep MSAs reduced conformational diversity. This divergence in performance may stem from the composition of the AF2 training set, which featured the structures of many active GPCRs but no structures, for example, of IF MATEs (Claxton et al., 2021).

Concluding remarks: proposed workflow and future directions

Our results indicate that the state-of-the-art de novo structural modeling algorithm AF2 can be manipulated to accurately model alternative conformations of transporters and GPCRs whose structures were not available in the training set. The use of shallow MSAs was instrumental to obtaining structurally diverse models in most proteins, and in one case (MCT1) accurate modeling of alternative conformations also required the manual curation of template structures. Thus, while the results presented here provide a blueprint for obtaining AF2 models of alternative conformations, they also argue against an optimal one-size-fits-all approach for sampling the conformational space of every protein with high accuracy. Indeed, whereas the DeepMind team reportedly required templates to obtain models of LmrP in an OF conformation (Jumper et al., 2021b), we found that this procedure was usually unnecessary. Accurate representatives of distinct conformers were generally obtainable with exhaustive sampling and could be identified by performing PCA and selecting models at the extreme positions of PC1. Nevertheless, prediction pipelines will likely require a combination of iterative fine-tuning specific to each target of interest followed by experimental verification to identify proposed conformers. Moreover, this approach showed limited success when applied to transporters whose structures were used to train AF2, hinting at the possibility that traditional methods may still be required to capture alternative conformers (Crawley et al., 2011; Ollikainen et al., 2013).

As a final verification of this proposed pipeline, we tested it on GPR114/ADGRG5, a class B2 adhesion GPCR whose structure has not been experimentally determined. The structural model deposited in the AF2 database, which likely depicts an active conformation that diverges from the structure of the homolog GPR97 (Ping et al., 2021), could be recapitulated by using deep MSAs. The use of shallow MSAs (≤64 sequences), by contrast, yielded a range of intermediate conformations distributed across three well-separated clusters (Figure 3—figure supplement 2). One of these clusters contains models with an orientation of TM6 and TM7 that fully occludes the orthosteric site and partially blocks the cytosolic pocket where G proteins bind. The physiological relevance of these proposed structural movements nonetheless requires experimental validation.

While these results reinforce the notion that AF2 can provide models to guide biophysical studies of conformationally heterogeneous membrane proteins, they represent a methodological ‘hack’, rather than an explicit objective built into the algorithm’s architecture. Several preprints have provided evidence that AF2, despite its accuracy, likely does not learn the energy landscapes underpinning protein folding and function (Saldaño et al., 2021; Pak et al., 2021; Akdel et al., 2021). Moreover, AF2 does not directly account for the lipid environment, which has been experimentally shown to bias the conformational equilibria of membrane proteins (Martens et al., 2018; Muller et al., 2019; Immadisetty et al., 2019). As our results show, the exploration of the conformational space is in part a byproduct of low sequence information provided for inference. Ultimately, they highlight the need for further development of artificial intelligence methods capable of learning the conformational flexibility intrinsic to protein structures.

Materials and methods

Overview of the prediction pipeline

Prediction runs were executed using AlphaFold v2.0.1 and a modified version of ColabFold (Mirdita et al., 2021) that is available at https://github.com/delalamo/af2_conformations (Del Alamo, 2021b; copy archived at swh:1:rev:d60db86886186e80622deaa91045caccaf4103d3). The pipeline used in this study differs from the default AF2 pipeline in several respects. First, all MSAs were obtained using the MMSeqs2 server (Steinegger and Söding, 2017), rather than the default databases. Second, template search was disabled, except when explicitly performed with specific templates of interest (see below). Third, the number of recycles was set to one, rather than the default of three. Finally, models were not refined following their prediction. This study utilized all 5 neural networks when predicting structures without templates, with 10 predictions per neural network per MSA size. The following residues were omitted from modeling: 1–131 and 401–461 of CGRPR, 1–247 of FZD7, 1–175 and 492–593 of PTH1R, and 1–49 of LAT1.

MSA subsampling

MSA subsampling was carried out randomly by AF2, and depth values were controlled by modifying ‘max_msa_clusters’ and ‘max_extra_msa’ parameters prior to execution. The former parameter determines the number of randomly chosen sequence clusters provided to the AF2 neural network. The latter parameter determines the number of extra sequences used to compute additional summary statistics. Throughout this manuscript, we refer to the latter when describing the depth of the MSA used for prediction and set the former to half this value in all cases except when 5120 sequences were used, in which case we set the former to 512. No manual intervention was carried out to fine-tune the composition of these alignments.
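
The relationship between the reported MSA depth and the two parameters can be summarized in a short sketch. The helper below is hypothetical and only encodes the rule stated above; how the resulting values are written into the AF2 model configuration is not shown and depends on the code base in use.

```python
def msa_subsampling_params(depth):
    """Map a reported MSA depth to the two AF2 subsampling parameters.

    'depth' is the value quoted throughout the manuscript and corresponds to
    max_extra_msa. max_msa_clusters is half of that value, except at a depth
    of 5120, where it is set to 512.
    """
    if depth < 2:
        raise ValueError("depth must be at least 2 sequences")
    max_msa_clusters = 512 if depth == 5120 else depth // 2
    return {"max_msa_clusters": max_msa_clusters, "max_extra_msa": depth}


# Example depths taken from values mentioned in the text.
for depth in (16, 32, 128, 5120):
    print(depth, msa_subsampling_params(depth))
```

Keeping the two parameters tied together in one helper avoids accidentally changing one without the other when scanning across depths.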

Template-based predictions

Templates were fetched using the MMSeqs2 server used by ColabFold. All templates were manually inspected, and those with structures similar to the desired conformation of interest were retained. The following templates were used: OF MCT1 with FucB (PDB 3O7Pa and 3O7Qa, 18% sequence identity calculated using the Needleman-Wunsch algorithm; Needleman and Wunsch, 1970; Madeira et al., 2019); OF LAT1 with AdiC (3OB6a and 5J4Ib, 22%); IF LAT1 with b(0,+)AT1 (6LI9, 45%), KCC3 (6Y5Ra, 12%), GkApcT (6F34a, 24%), NKCC1 (6NPLa, 6PZTa, and 6NPHa, 12%), BasC (6F2Ga and 6F2Wa, 29%), and GadC (4DJIa and 4DJKb, 21%); active PTH1R with GCGR (6LMLr and 6VCBr, 30%), PAC1 (6M1I, 34%), and CRF1 (6P9X, 34%); inactive PTH1R with GLP1 (6LN2, 30%) and GCGR (5YQZ, 5XF1, and 5EE7, 28%). Template processing then proceeded as previously described, except that the parameter ‘subsample_templates’ was set to True and the template similarity cutoff was reduced from 10% to 1%. Additionally, as only 2 of the 5 AF2 neural networks were parametrized to use templates, each of these 2 neural networks generated 25 models in order to arrive at 50 total models per MSA depth.
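
The bookkeeping behind the 50 models generated per MSA depth can be sketched as follows. The function and the dictionary keys are hypothetical, and the network index is only a counter; it does not identify which of the AF2 parameter sets were used.

```python
from itertools import product


def prediction_plan(msa_depths, use_templates):
    """Enumerate the prediction jobs for each MSA depth.

    Without templates: 5 neural networks x 10 predictions each.
    With templates:    2 neural networks x 25 predictions each.
    Either way, 50 models are produced per MSA depth.
    """
    n_networks = 2 if use_templates else 5
    per_network = 25 if use_templates else 10
    return [
        {"msa_depth": depth, "network": network, "replicate": replicate}
        for depth, network, replicate in product(
            msa_depths, range(1, n_networks + 1), range(per_network)
        )
    ]


jobs = prediction_plan([16, 32], use_templates=True)
print(len(jobs))  # 2 depths x 2 networks x 25 replicates
```

Each replicate draws a different random subsample of the MSA, which is what produces the conformational diversity within a given depth.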

Structural analysis

TM scores were calculated using TM-align (Zhang and Skolnick, 2005). PCA and RMSF calculations were carried out in CPPTRAJ (Roe and Cheatham, 2013). Loop residues were omitted from PCA.

Acknowledgements

This study was funded by the National Institutes of Health (HSM: GM 128087) and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through CRC 1423, project number 421152132, subproject Z04. The authors would like to thank Dr. John Jumper for explaining how the DeepMind team predicted the structure of LmrP in CASP14 and Mr. Taylor Jones for helpful discussions on modeling MCT1 without templates.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Hassane S Mchaourab, Email: hassane.mchaourab@vanderbilt.edu.

Jens Meiler, Email: jens.meiler@vanderbilt.edu.

Janice L Robertson, Washington University in St Louis, United States.

Kenton J Swartz, National Institute of Neurological Disorders and Stroke, National Institutes of Health, United States.

Funding Information

This paper was supported by the following grants:

  • National Institutes of Health GM 128087 to Hassane S Mchaourab.

  • Deutsche Forschungsgemeinschaft CRC 1423, project number 421152132, subproject Z04 to Jens Meiler.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review and editing.

Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review and editing.

Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – review and editing.

Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing – review and editing.

Additional files

Transparent reporting form

Data availability

All scripts and data presented in this study are made available for download at https://github.com/delalamo/af2_conformations (copy archived at swh:1:rev:d60db86886186e80622deaa91045caccaf4103d3).

The following dataset was generated:

Del Alamo D. 2022. Sampling alternative conformational states of transporters and receptors with AlphaFold2. GitHub. conformations

References

  1. Akdel M, Pires DEV, Porta Pardo E, Jänes J, Zalevsky AO, Mészáros B, Bryant P, Good LL, Laskowski RA, Pozzati G, Shenoy A, Zhu W, Kundrotas P, Ruiz Serra V, Rodrigues CHM, Dunham AS, Burke D, Borkakoti N, Velankar S, Frost A, Lindorff-Larsen K, Valencia A, Ovchinnikov S, Durairaj J, Ascher DB, Thornton JM, Davey NE, Stein A, Elofsson A, Croll TI, Beltrao P. A structural biology community assessment of AlphaFold 2 applications. bioRxiv. 2021 doi: 10.1101/2021.09.26.461876.
  2. Bavnhøj L, Paulsen PA, Flores-Canales JC, Schiøtt B, Pedersen BP. Molecular mechanism of sugar transport in plants unveiled by structures of glucose/H+ symporter STP10. Nature Plants. 2021;7:1409–1419. doi: 10.1038/s41477-021-00992-0.
  3. Boehr DD, Nussinov R, Wright PE. The role of dynamic conformational ensembles in biomolecular recognition. Nature Chemical Biology. 2009;5:789–796. doi: 10.1038/nchembio.232.
  4. Campbell E, Kaltenbach M, Correy GJ, Carr PD, Porebski BT, Livingstone EK, Afriat-Jurnou L, Buckle AM, Weik M, Hollfelder F, Tokuriki N, Jackson CJ. The role of protein dynamics in the evolution of new enzyme function. Nature Chemical Biology. 2016;12:944–950. doi: 10.1038/nchembio.2175.
  5. Claxton DP, Jagessar KL, Mchaourab HS. Principles of Alternating Access in Multidrug and Toxin Extrusion (MATE) Transporters. Journal of Molecular Biology. 2021;433:166959. doi: 10.1016/j.jmb.2021.166959.
  6. Coleman JA, Green EM, Gouaux E. X-ray structures and mechanism of the human serotonin transporter. Nature. 2016;532:334–339. doi: 10.1038/nature17629.
  7. Coleman JA, Yang D, Zhao Z, Wen PC, Yoshioka C, Tajkhorshid E, Gouaux E. Serotonin transporter-ibogaine complexes illuminate mechanisms of inhibition and transport. Nature. 2019;569:141–145. doi: 10.1038/s41586-019-1135-1.
  8. Cournia Z, Allen TW, Andricioaei I, Antonny B, Baum D, Brannigan G, Buchete N-V, Deckman JT, Delemotte L, Del Val C, Friedman R, Gkeka P, Hege H-C, Hénin J, Kasimova MA, Kolocouris A, Klein ML, Khalid S, Lemieux MJ, Lindow N, Roy M, Selent J, Tarek M, Tofoleanu F, Vanni S, Urban S, Wales DJ, Smith JC, Bondar A-N. Membrane Protein Structure, Function, and Dynamics: a Perspective from Experiments and Theory. The Journal of Membrane Biology. 2015;248:611–640. doi: 10.1007/s00232-015-9802-0.
  9. Crawley SW, Gharaei MS, Ye Q, Yang Y, Raveh B, London N, Schueler-Furman O, Jia Z, Côté GP. Autophosphorylation activates Dictyostelium myosin II heavy chain kinase A by providing a ligand for an allosteric binding site in the alpha-kinase domain. The Journal of Biological Chemistry. 2011;286:2607–2616. doi: 10.1074/jbc.M110.177014.
  10. Debruycker V, Hutchin A, Masureel M, Ficici E, Martens C, Legrand P, Stein RA, Mchaourab HS, Faraldo-Gómez JD, Remaut H, Govaerts C. An embedded lipid in the multidrug transporter LmrP suggests a mechanism for polyspecificity. Nature Structural & Molecular Biology. 2020;27:829–835. doi: 10.1038/s41594-020-0464-y.
  11. Del Alamo D, Govaerts C, Mchaourab HS. AlphaFold2 predicts the inward-facing conformation of the multidrug transporter LmrP. Proteins. 2021a;89:1226–1228. doi: 10.1002/prot.26138.
  12. Del Alamo D. Prediction of alternative conformations using AlphaFold 2. swh:1:rev:d60db86886186e80622deaa91045caccaf4103d3. Software Heritage. 2021b. https://archive.softwareheritage.org/swh:1:dir:732ab369a1b141030fbc5728d9ab3f50e222b3ea;origin=https://github.com/delalamo/af2_conformations;visit=swh:1:snp:cb0ae8b45df2c5f548867364bbfe52debb482bda;anchor=swh:1:rev:d60db86886186e80622deaa91045caccaf4103d3
  13. Drew D, Boudker O. Shared Molecular Mechanisms of Membrane Transporters. Annual Review of Biochemistry. 2016;85:543–572. doi: 10.1146/annurev-biochem-060815-014520.
  14. Ehrenmann J, Schöppe J, Klenk C, Rappas M, Kummer L, Doré AS, Plückthun A. High-resolution crystal structure of parathyroid hormone 1 receptor in complex with a peptide agonist. Nature Structural & Molecular Biology. 2018;25:1086–1092. doi: 10.1038/s41594-018-0151-4.
  15. Garaeva AA, Guskov A, Slotboom DJ, Paulino C. A one-gate elevator mechanism for the human neutral amino acid transporter ASCT2. Nature Communications. 2019;10:1–8. doi: 10.1038/s41467-019-11363-x.
  16. Garibsingh RAA, Ndaru E, Garaeva AA, Shi Y, Zielewicz L, Zakrepine P, Bonomi M, Slotboom DJ, Paulino C, Grewer C, Schlessinger A. Rational design of ASCT2 inhibitors using an integrated experimental-computational approach. PNAS. 2021;118:e2104093118. doi: 10.1073/pnas.2104093118.
  17. Gusach A, Maslov I, Luginina A, Borshchevskiy V, Mishin A, Cherezov V. Beyond structure: emerging approaches to study GPCR dynamics. Current Opinion in Structural Biology. 2020;63:18–25. doi: 10.1016/j.sbi.2020.03.004.
  18. Immadisetty K, Hettige J, Moradi M. Lipid-Dependent Alternating Access Mechanism of a Bacterial Multidrug ABC Exporter. ACS Central Science. 2019;5:43–56. doi: 10.1021/acscentsci.8b00480.
  19. Josephs TM, Belousoff MJ, Liang Y-L, Piper SJ, Cao J, Garama DJ, Leach K, Gregory KJ, Christopoulos A, Hay DL, Danev R, Wootten D, Sexton PM. Structure and dynamics of the CGRP receptor in apo and peptide-bound forms. Science (New York, N.Y.) 2021;372:eabf7258. doi: 10.1126/science.abf7258.
  20. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. Highly accurate protein structure prediction with AlphaFold. Nature. 2021a;596:583–589. doi: 10.1038/s41586-021-03819-2.
  21. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, Bridgland A, Meyer C, Kohl SAA, Ballard AJ, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Petersen S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Silver D, Vinyals O, Senior AW, Kavukcuoglu K, Kohli P, Hassabis D. Applying and improving AlphaFold at CASP14. Proteins. 2021b;89:1711–1721. doi: 10.1002/prot.26257.
  22. Kazmier K, Claxton DP, Mchaourab HS. Alternating access mechanisms of LeuT-fold transporters: trailblazing towards the promised energy landscapes. Current Opinion in Structural Biology. 2017;45:100–108. doi: 10.1016/j.sbi.2016.12.006.
  23. Kuk ACY, Mashalidis EH, Lee SY. Crystal structure of the MOP flippase MurJ in an inward-facing conformation. Nature Structural & Molecular Biology. 2017;24:171–176. doi: 10.1038/nsmb.3346.
  24. Kuk ACY, Hao A, Guan Z, Lee SY. Visualizing conformation transitions of the Lipid II flippase MurJ. Nature Communications. 2019;10:1736. doi: 10.1038/s41467-019-09658-0.
  25. Liang YL, Belousoff MJ, Fletcher MM, Zhang X, Khoshouei M, Deganutti G, Koole C, Furness SGB, Miller LJ, Hay DL, Christopoulos A, Reynolds CA, Danev R, Wootten D, Sexton PM. Structure and Dynamics of Adrenomedullin Receptors AM1 and AM2 Reveal Key Mechanisms in the Control of Receptor Phenotype by Receptor Activity-Modifying Proteins. ACS Pharmacology & Translational Science. 2020;3:263–284. doi: 10.1021/acsptsci.9b00080.
  26. Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, Basutkar P, Tivey ARN, Potter SC, Finn RD, Lopez R. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Research. 2019;47:W636–W641. doi: 10.1093/nar/gkz268.
  27. Mariani V, Biasini M, Barbato A, Schwede T. lDDT: A local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics (Oxford, England) 2013;29:2722–2728. doi: 10.1093/bioinformatics/btt473.
  28. Martens C, Stein RA, Masureel M, Roth A, Mishra S, Dawaliby R, Konijnenberg A, Sobott F, Govaerts C, Mchaourab HS. Lipids modulate the conformational dynamics of a secondary multidrug transporter. Nature Structural & Molecular Biology. 2016;23:744–751. doi: 10.1038/nsmb.3262.
  29. Martens C, Shekhar M, Borysik AJ, Lau AM, Reading E, Tajkhorshid E, Booth PJ, Politis A. Direct protein-lipid interactions shape the conformational landscape of secondary transporters. Nature Communications. 2018;9:1–12. doi: 10.1038/s41467-018-06704-1.
  30. Masureel M, Martens C, Stein RA, Mishra S, Ruysschaert J-M, Mchaourab HS, Govaerts C. Protonation drives the conformational switch in the multidrug transporter LmrP. Nature Chemical Biology. 2014;10:149–155. doi: 10.1038/nchembio.1408.
  31. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold - Making protein folding accessible to all. bioRxiv. 2021 doi: 10.1101/2021.08.15.456425.
  32. Muller MP, Jiang T, Sun C, Lihan M, Pant S, Mahinthichaichan P, Trifan A, Tajkhorshid E. Characterization of Lipid-Protein Interactions and Lipid-Mediated Modulation of Membrane Protein Function through Molecular Simulation. Chemical Reviews. 2019;119:6086–6161. doi: 10.1021/acs.chemrev.8b00608.
  33. Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology. 1970;48:443–453. doi: 10.1016/0022-2836(70)90057-4.
  34. Nicoludis JM, Gaudet R. Applications of sequence coevolution in membrane protein biochemistry. Biochimica et Biophysica Acta. Biomembranes. 2018;1860:895–908. doi: 10.1016/j.bbamem.2017.10.004.
  35. Ollikainen N, Smith CA, Fraser JS, Kortemme T. Flexible backbone sampling methods to model and design protein alternative conformations. Methods in Enzymology. 2013;523:61–85. doi: 10.1016/B978-0-12-394292-0.00004-7.
  36. Ovchinnikov S, Kinch L, Park H, Liao Y, Pei J, Kim DE, Kamisetty H, Grishin NV, Baker D. Large-scale determination of previously unsolved protein structures using evolutionary information. eLife. 2015;4:e09248. doi: 10.7554/eLife.09248.
  37. Pak MA, Markhieva KA, Novikova MS, Petrov DS, Vorobyev IS, Maksimova ES, Kondrashov FA, Ivankov DN. Using AlphaFold to predict the impact of single mutations on protein stability and function. bioRxiv. 2021 doi: 10.1101/2021.09.19.460937.
  38. Pereira J, Simpkin AJ, Hartmann MD, Rigden DJ, Keegan RM, Lupas AN. High-accuracy protein structure prediction in CASP14. Proteins. 2021;89:1687–1699. doi: 10.1002/prot.26171.
  39. Ping Y-Q, Mao C, Xiao P, Zhao R-J, Jiang Y, Yang Z, An W-T, Shen D-D, Yang F, Zhang H, Qu C, Shen Q, Tian C, Li Z, Li S, Wang G-Y, Tao X, Wen X, Zhong Y-N, Yang J, Yi F, Yu X, Xu HE, Zhang Y, Sun J-P. Structures of the glucocorticoid-bound adhesion receptor GPR97–Go complex. Nature. 2021;589:620–626. doi: 10.1038/s41586-020-03083-w.
  40. Roe DR, Cheatham TE. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. Journal of Chemical Theory and Computation. 2013;9:3084–3095. doi: 10.1021/ct400341p.
  41. Saldaño T, Escobedo N, Marchetti J, Zea DJ, Mac Donagh J, Velez Rueda AJ, Gonik E, Melani AG, Novomisky Nechcoff J, Salas MN, Peters T, Demitroff N, Fernandez Alberti S, Palopoli N, Fornasari MS, Parisi G. Impact of protein conformational diversity on AlphaFold predictions. bioRxiv. 2021 doi: 10.1101/2021.10.27.466189.
  42. Shaw DE, Maragakis P, Lindorff-Larsen K, Piana S, Dror RO, Eastwood MP, Bank JA, Jumper JM, Salmon JK, Shan Y, Wriggers W. Atomic-level characterization of the structural dynamics of proteins. Science (New York, N.Y.) 2010;330:341–346. doi: 10.1126/science.1187409.
  43. Steinegger M, Söding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology. 2017;35:1026–1028. doi: 10.1038/nbt.3988.
  44. Tanaka Y, Hipolito CJ, Maturana AD, Ito K, Kuroda T, Higuchi T, Katoh T, Kato HE, Hattori M, Kumazaki K, Tsukazaki T, Ishitani R, Suga H, Nureki O. Structural basis for the drug extrusion mechanism by a MATE multidrug transporter. Nature. 2013;496:247–251. doi: 10.1038/nature12014.
  45. Tunyasuvunakool K, Adler J, Wu Z, Green T, Zielinski M, Žídek A, Bridgland A, Cowie A, Meyer C, Laydon A, Velankar S, Kleywegt GJ, Bateman A, Evans R, Pritzel A, Figurnov M, Ronneberger O, Bates R, Kohl SAA, Potapenko A, Ballard AJ, Romera-Paredes B, Nikolov S, Jain R, Clancy E, Reiman D, Petersen S, Senior AW, Kavukcuoglu K, Birney E, Kohli P, Jumper J, Hassabis D. Highly accurate protein structure prediction for the human proteome. Nature. 2021;596:590–596. doi: 10.1038/s41586-021-03828-1.
  46. Wang J, Hua T, Liu ZJ. Structural features of activated GPCR signaling complexes. Current Opinion in Structural Biology. 2020;63:82–89. doi: 10.1016/j.sbi.2020.04.008.
  47. Wang N, Jiang X, Zhang S, Zhu A, Yuan Y, Xu H, Lei J, Yan C. Structural basis of human monocarboxylate transporter 1 inhibition by anti-cancer drug candidates. Cell. 2021;184:370–383. doi: 10.1016/j.cell.2020.11.043.
  48. Xu J, Zhang Y. How significant is a protein structure similarity with TM-score = 0.5? Bioinformatics (Oxford, England) 2010;26:889–895. doi: 10.1093/bioinformatics/btq066.
  49. Xu L, Chen B, Schihada H, Wright SC, Turku A, Wu Y, Han G-W, Kowalski-Jahn M, Kozielewicz P, Bowin C-F, Zhang X, Li C, Bouvier M, Schulte G, Xu F. Cryo-EM structure of constitutively active human Frizzled 7 in complex with heterotrimeric Gs. Cell Research. 2021;31:1311–1314. doi: 10.1038/s41422-021-00525-6.
  50. Xue J, Xie T, Zeng W, Jiang Y, Bai XC. Cryo-EM structures of human ZnT8 in both outward- and inward-facing conformations. eLife. 2020;9:e58823. doi: 10.7554/eLife.58823.
  51. Yan R, Zhao X, Lei J, Zhou Q. Structure of the human LAT1-4F2hc heteromeric amino acid transporter complex. Nature. 2019;568:127–130. doi: 10.1038/s41586-019-1011-z.
  52. Yan R, Li Y, Müller J, Zhang Y, Singer S, Xia L, Zhong X, Gertsch J, Altmann KH, Zhou Q. Mechanism of substrate transport and inhibition of the human LAT1-4F2hc amino acid transporter. Cell Discovery. 2021;7:16. doi: 10.1038/s41421-021-00247-4.
  53. Yang S, Wu Y, Xu T-H, de Waal PW, He Y, Pu M, Chen Y, DeBruine ZJ, Zhang B, Zaidi SA, Popov P, Guo Y, Han GW, Lu Y, Suino-Powell K, Dong S, Harikumar KG, Miller LJ, Katritch V, Xu HE, Shui W, Stevens RC, Melcher K, Zhao S, Xu F. Crystal structure of the Frizzled 4 receptor in a ligand-free state. Nature. 2018;560:666–670. doi: 10.1038/s41586-018-0447-x.
  54. Zakrzewska S, Mehdipour AR, Malviya VN, Nonaka T, Koepke J, Muenke C, Hausner W, Hummer G, Safarian S, Michel H. Inward-facing conformation of a multidrug resistance MATE family transporter. PNAS. 2019;116:12275–12284. doi: 10.1073/pnas.1904210116.
  55. Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004;57:702–710. doi: 10.1002/prot.20264.
  56. Zhang Y, Skolnick J. TM-align: A protein structure alignment algorithm based on the TM-score. Nucleic Acids Research. 2005;33:2302–2309. doi: 10.1093/nar/gki524.
  57. Zhang H, Chen K, Tan Q, Shao Q, Han S, Zhang C, Yi C, Chu X, Zhu Y, Xu Y, Zhao Q, Wu B. Structural basis for chemokine recognition and receptor activation of chemokine receptor CCR5. Nature Communications. 2021;12:1–12. doi: 10.1038/s41467-021-24438-5.
  58. Zhao L-H, Ma S, Sutkeviciute I, Shen D-D, Zhou XE, de Waal PW, Li C-Y, Kang Y, Clark LJ, Jean-Alphonse FG, White AD, Yang D, Dai A, Cai X, Chen J, Li C, Jiang Y, Watanabe T, Gardella TJ, Melcher K, Wang M-W, Vilardaga J-P, Xu HE, Zhang Y. Structure and dynamics of the active human parathyroid hormone receptor-1. Science (New York, N.Y.) 2019;364:148–153. doi: 10.1126/science.aav7942.
  59. Zheng Y, Han GW, Abagyan R, Wu B, Stevens RC, Cherezov V, Kufareva I, Handel TM. Structure of CC Chemokine Receptor 5 with a Potent Chemokine Antagonist Reveals Mechanisms of Chemokine Recognition and Molecular Mimicry by HIV. Immunity. 2017;46:1005–1017. doi: 10.1016/j.immuni.2017.05.002.

Editor's evaluation

Janice L Robertson

del Alamo and colleagues illustrate that restricting the depth of the input multiple sequence alignment allows AlphaFold2 to predict diverse conformational ensembles of transporters and receptors, as opposed to single static models reflecting individual states. Although they are limited to a small number of test cases of membrane proteins, the examples are of interest to members of the community. This work presents a validation of a simple approach that may be applicable to all proteins and is thus an exciting advance that is expected to be of broad interest.

Decision letter

Editor: Janice L Robertson
Reviewed by: Janice L Robertson

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Sampling the conformational landscapes of transporters and receptors with AlphaFold2" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Janice L Robertson as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by a Reviewing Editor and Kenton Swartz as the Senior Editor.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

In general, the reviewers were enthusiastic about your finding that reducing the size of the sequence alignments input into AlphaFold2 increases conformational diversity for predictions of transporter membrane protein folds. While the test set is small, the validation provided is convincing, and the results are likely to be broadly useful to many who study protein conformational changes. However, it was found that many details were lacking about the methods, which will limit the ability of others to reproduce these findings or advance this approach further. With that, the following revisions are required in order to describe the methods in appropriate detail and increase the quantitative presentation of the analysis. In addition, it is important to temper general claims made throughout the paper to acknowledge that these findings are based on a small test set of proteins. Since these essential revisions focus mainly on writing with some minor additions to the analysis, we expect that these changes will be tractable within a reasonable time frame.

Essential revisions:

1) Elaboration of the methods used. Additional details are needed in order to be able to evaluate the validity and reproducibility of the approach. Specifically,

– Include a brief description of the AF2 protocol and each point at which variability is introduced, i.e. by sequence alignment, template choice or recycling.

– The alignments used to develop the models should be provided. Specific details on how the visual inspection of the alignments guided their refinement should also be included. What is padding of the MSAs? How were the MSAs trimmed from 5120 to 32? What was the distribution of lengths of the sequences included? What were the sequences that ended up being included and what was the sequence diversity in the sets used?

– Earlier in the paper, it is stated that loops were removed from the sequence alignments. However, they are later discussed as being generated in the models. Provide more details about when the loops were included in the structure prediction.

– For some of the targets, the template-based modeling clearly improved sampling of various conformations and for others it didn't. How were the templates selected for the template-based modeling?

– What are the PDBs used in the structural analysis? These should be listed explicitly in the pertinent figure and in the methods.

– Define pLDDT.

– What does "eliminating postprocessing with OpenMM" constitute?

– How were misfolded models identified? Providing a reference is not sufficient here.

– To address the predictive power of this approach please clarify which models were used for the PCA. Were the principal components computed from low-MSA AlphaFold2 predictions only, rather than from the large-MSA AF2 predictions? This would make the point moot, since the PCs reflect the range of conformational changes observed in multiple models, not a subset. Previous studies (Bahar and colleagues) suggested that PCA allows for prediction, but that PC1 is not always the useful component, and so the question arises of how to select the correct PC to make the prediction.

2) Additional analysis:

– Analysis of the model accuracy with alignment quality. How do the current results depend on alignment quality and diversity? Which sequences are included in the 32, and how do your findings depend on this selection?

– Analysis of the model accuracy with sequence length. While sequence information is examined, the authors say that no general pattern was apparent regarding the ideal MSA depth. Yet, a more common strategy, namely, to compare sequence sets using a factor related to the length (L) of the protein (or perhaps the core of the protein being modeled) may reveal more. Indeed, by reducing the dataset to 32 sequences, only the longest proteins were starting to include misfolded examples. Overall, it would be more straightforward to compare models built with, e.g. L*2, L/2 and L/5 sequences. While this requires building additional models it would also provide a clearer outcome and strategy that future users could follow. A bonus may be that it would reduce the chances of misfolded models that need to be filtered out. At the minimum, the authors should reframe the data they have as a function of each protein's length.

– Analysis of template usage. What were the templates used? Was the performance of AF2 dependent on the sequence similarity between the template(s) and the target?

– Quantification of conformations. There are many occasions where the discussion of structural similarities/differences is qualitative, e.g. “virtually every transporter model superimposed nearly perfectly with the training set conformation, and none resembled the alternative conformation”. This statement should be accompanied by quantitative data. Furthermore, the different known conformational states, i.e. IF, OF and occluded, require a quantitative definition to support statements like “One target, MCT1, was exclusively modeled by AF2 in either IF or fully occluded conformations regardless of MSA depth. Notably, these results closely parallel those reported by DeepMind during their attempt to model multiple conformations of LmrP in CASP14.”.

– Along these lines, it is reported that conformational variability is not obtained for the targets that were included in the AF2 training set, and yet the extent of conformational diversity appears similar to that in the analysis presented in Figure 1. For example, MurJ appears to show the same degree of conformational sampling with 32 sequences as ASCT2. A more objective analysis of the conformational sampling is required to define the dynamic range explored by the structural conformations, especially since some of the endpoint structures are quite similar to each other.

3) Please respond to and address the additional recommendations provided by the reviewers.

Reviewer #1 (Recommendations for the authors):

1. The conformational ensemble from AF2 appears to move along certain structural paths in the different analyses. How does this compare to a linear interpolation between the endpoint structures?

2. In Figure 1, it is recommended that the axes of Figure 1B be scaled similarly to the format used in Figure S3 since the experimental TM-score differences are quite different between the different proteins. Aside from ASCT2, and potentially MCT1 with templates, the dynamic range of the conformational change appears to be minimal, but this may just be difficult to see due to the current plot format. In addition, move Figure S1 into this main figure to allow the reader to discern the structural and conformational variability in this test set. Finally, please add all of the PDBs used for the experimental comparison structures, both in this figure and in the methods.

3. Is the term “ground truth structures” referring to the crystal structures or other experimental structures? Please change this term, as an experimental structure does not correspond to a “truth” but is a physically accessible conformational state of the protein under those experimental conditions.

Reviewer #2 (Recommendations for the authors):

1. First, I would disagree with the title of Figure S5 and the beginning of the corresponding title, which seem to be categorical about the lack of exploration of alternate conformations for these examples, but then somewhat contradict the rest of the paragraph, where it is explained that some cases (MurJ and CCR5) behave differently from the others. These discrepancies should be resolved.

2. Second, I think it would be of value for the readership to mention that no function is included to describe the membrane in these modelling processes – even when the lipids themselves may be critical to shift these conformational equilibria. This observation actually makes the authors’ findings all the more remarkable, but also perhaps harder to interpret.

Reviewer #3 (Recommendations for the authors):

1. Change Lat1 -> LAT1

2. The following statement is unclear and should be elaborated. “Several preprints have provided evidence that AF2, despite its accuracy, likely does not learn the energy landscapes underpinning protein folding and function [39,53,54]. We believe that our results bolster these conclusions and highlight the need for further development of artificial intelligence methods capable of learning the conformational flexibility intrinsic to protein structures.”

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled “Sampling the conformational landscapes of transporters and receptors with AlphaFold2” for further consideration by eLife. Your revised article has been evaluated by Kenton Swartz (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

1. The reviewers found that the current title may lead the reader to misinterpretation. An alternate title “Sampling alternative conformational states of transporters and receptors with AlphaFold2” is more appropriate and should be adopted.

2. The findings in Figure 1 Suppl2, that the number of sequences isn’t correlated with an increase in conformational homogeneity, and displays erratic dependence for some proteins (especially ASCT2, LAT1 and STP10), are surprising. Consequently, it seems necessary to alter some statements in the manuscript accordingly. For example, in the abstract: “reducing the depth of the input MSAs is conducive to the generation of accurate models in multiple conformations by AF2”.

eLife. 2022 Mar 3;11:e75751. doi: 10.7554/eLife.75751.sa2

Author response


Essential revisions:

1) Elaboration of the methods used. Additional details are needed in order to be able to evaluate the validity and reproducibility of the approach. Specifically,

– Include a brief description of the AF2 protocol and each point at which variability is introduced, i.e. by sequence alignment, template choice or recycling.

(0.1.1) We have added several sentences introducing the general pipeline at the beginning of “Results and Discussion” (references omitted for clarity):

“The default three-stage AF2 pipeline consists of (1) querying of sequence databases and generation of an MSA, (2) inference via a neural network using a randomly resampled subset of this MSA containing up to 5120 sequences, which is repeated a total of three times (a process termed “recycling”), and (3) resolution of steric clashes and bond geometry using a constrained all-atom molecular dynamics simulation. The neural networks used for prediction were trained on all structures deposited in the Protein Data Bank (PDB) on or before April 30th, 2018 [11]. Therefore, by necessity, this study is restricted to proteins whose structures were absent from the PDB before this date and have since been determined at atomic resolution in two or more conformations.”

Additionally, to reflect both this comment and other comments made by Reviewers (see below), the following paragraph was modified to detail how this pipeline was changed:

“However, the resulting models were largely identical to one another and failed to shed light on the target protein’s conformational space. To diversify the models generated by AF2, we reduced the number of recycles to one and restricted the depth of the randomly subsampled MSAs to contain as few as 16 sequences. To sample the conformational landscape more exhaustively, we generated fifty models of each protein for each MSA size, while skipping the final MD simulation to reduce the pipeline's total computational cost.”

Finally, we have rewritten the Methods section to accommodate additional detail for the prediction pipeline described herein:

“Overview of the prediction pipeline

Prediction runs were executed using AlphaFold v2.0.1 and a modified version of ColabFold [54] that is available at www.github.com/delalamo/af2_conformations. The pipeline used in this study differs from the default AF2 pipeline in several respects. First, all MSAs were obtained using the MMSeqs2 server [55], rather than the default databases. Second, template search was disabled, except when explicitly performed with specific templates of interest (see below). Third, the number of recycles was set to one, rather than the default of three. Finally, models were not refined following their prediction. Unlike the ColabFold pipeline, however, this study utilized all five neural networks when predicting structures without templates, with ten predictions per neural network per MSA size. Additionally, as only two of the five neural networks can use templates to supplement MSAs for structure prediction, they each performed twenty-five predictions per MSA size when performing template-based predictions. The following residues were omitted from modeling: 1-131 and 401-461 of CGRPR, 1-247 of FZD7, 1-175 and 492-593 of PTH1R, and 1-49 of LAT1.”
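The bookkeeping in the paragraph above (five networks with ten predictions each, or two template-capable networks with twenty-five each, for fifty models per MSA depth) can be sketched as follows. This is only an illustration of the counting, not the authors' code: the function name `prediction_schedule`, the constants, and the network indices are hypothetical placeholders, and no AlphaFold2 call is made.

```python
# Hypothetical sketch of the run schedule described in the Methods text.
# Network indices are placeholders and are not AF2 model names.
N_NETWORKS = 5           # all five networks are used without templates
N_TEMPLATE_NETWORKS = 2  # only two networks can use templates
MODELS_PER_DEPTH = 50    # fifty models per protein per MSA depth


def prediction_schedule(msa_depths, use_templates=False):
    """Return (msa_depth, network_index, replicate) tuples for one protein."""
    n_networks = N_TEMPLATE_NETWORKS if use_templates else N_NETWORKS
    per_network = MODELS_PER_DEPTH // n_networks  # 10 without templates, 25 with
    return [
        (depth, network, replicate)
        for depth in msa_depths
        for network in range(1, n_networks + 1)
        for replicate in range(per_network)
    ]


if __name__ == "__main__":
    runs = prediction_schedule([16, 32])
    print(len(runs), "template-free predictions")
    runs = prediction_schedule([16, 32], use_templates=True)
    print(len(runs), "template-based predictions")
```

Either branch yields the same number of models per depth, which is what keeps the template-free and template-based ensembles comparable in size.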

“MSA subsampling

MSA subsampling was carried out randomly by AF2, and depth values were controlled by modifying "max_msa_clusters" and "max_extra_msa" parameters prior to execution. The former parameter determines the number of randomly chosen sequence clusters provided to the AF2 neural network. The latter parameter determines the number of extra sequences used to compute additional summary statistics. Throughout this manuscript, we refer to the latter when describing the depth of the MSA used for prediction and set the former to half this value in all cases except when 5120 sequences were used, in which case we set the former to 512. No manual intervention was carried out to fine-tune the composition of these alignments.”
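The depth rule stated above can be written out as a small helper. The parameter names "max_msa_clusters" and "max_extra_msa" come from the text; the function name `msa_depth_settings` is hypothetical, the use of integer division for "half this value" is an assumption, and how the values are passed into the AF2 configuration is not shown here.

```python
def msa_depth_settings(max_extra_msa):
    """Return both depth parameters for a given MSA depth.

    Follows the rule quoted above: max_msa_clusters is half of
    max_extra_msa, except at a depth of 5120, where it is set to 512.
    Integer division is an assumption of this sketch.
    """
    if max_extra_msa < 2:
        raise ValueError("MSA depth must be at least 2")
    if max_extra_msa == 5120:
        max_msa_clusters = 512
    else:
        max_msa_clusters = max_extra_msa // 2
    return {
        "max_msa_clusters": max_msa_clusters,
        "max_extra_msa": max_extra_msa,
    }


if __name__ == "__main__":
    # Depths named in the text; other depths would follow the same rule.
    for depth in (16, 32, 5120):
        print(msa_depth_settings(depth))
```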

“Template-based predictions

Templates were fetched using the MMSeqs2 server used by ColabFold. All templates were manually inspected, and those with structures similar to the desired conformation of interest were retained. The following templates were used: outward-facing MCT1 with FucB (PDB 3O7Pa and 3O7Qa, 18% sequence identity calculated using Needleman-Wunsch56,57); outward-facing LAT1 with AdiC (3OB6a and 5J4Ib, 22%); inward-facing LAT1 with b(0,+)AT1 (6LI9, 45%), KCC3 (6Y5Ra, 12%), GkApcT (6F34a, 24%), NKCC1 (6NPLa, 6PZTa, and 6NPHa, 12%), BasC (6F2Ga and 6F2Wa, 29%), and GadC (4DJIa and 4DJKb, 21%); active PTH1R with GCGR (6LMLr and 6VCBr, 30%), PAC1 (6M1I, 34%), and CRF1 (6P9X, 34%); inactive PTH1R with GLP1 (6LN2, 30%) and GCGR (5YQZ, 5XF1, and 5EE7, 28%). Template processing then proceeded as previously described, except that the parameter “subsample_templates” was set to True and the template similarity cutoff was reduced from 10% to 1%. Additionally, as only two of the five AF2 neural networks were parametrized to use templates, each of these two neural networks generated twenty-five models in order to arrive at fifty total models per MSA depth.”

“Structural analysis

TM-scores were calculated using TM-align33. Principal component analysis and RMSF calculations were carried out in CPPTRAJ58. Loop residues were omitted from PCA.”
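The authors ran PCA in CPPTRAJ. As an illustration only, the same kind of analysis can be sketched in NumPy, which is a swapped-in tool rather than the published workflow. The sketch assumes the models have already been superimposed, uses synthetic coordinates, and all function names are hypothetical; the optional mask mirrors the omission of loop residues.

```python
import numpy as np


def pca_of_models(coords, keep_atoms=None):
    """PCA of superimposed models.

    coords: array of shape (n_models, n_atoms, 3), already superimposed.
    keep_atoms: optional boolean mask of length n_atoms (False for loops).
    Returns (projections, explained_variance_ratio).
    """
    coords = np.asarray(coords, dtype=float)
    if keep_atoms is not None:
        coords = coords[:, np.asarray(keep_atoms, dtype=bool), :]
    flat = coords.reshape(coords.shape[0], -1)
    centered = flat - flat.mean(axis=0)
    # Rows of `components` are the principal axes in coordinate space.
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2
    explained = variance / variance.sum()
    projections = centered @ components.T
    return projections, explained


def synthetic_models(n_models=30, n_atoms=20, seed=0):
    """Toy ensemble: noisy interpolation between two random structures."""
    rng = np.random.default_rng(seed)
    start = rng.normal(size=(n_atoms, 3))
    end = start + rng.normal(size=(n_atoms, 3))
    t = np.linspace(0.0, 1.0, n_models)[:, None, None]
    noise = rng.normal(scale=0.01, size=(n_models, n_atoms, 3))
    return start + t * (end - start) + noise


projections, explained = pca_of_models(synthetic_models())
print(f"PC1 explains {explained[0]:.1%} of the variance")
```

In this toy ensemble nearly all variance falls on PC1 because the models were built by interpolation; real ensembles need not behave this way.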

– The alignments used to develop the models should be provided. Specific details on how the visual inspection of the alignments guided their refinement should also be included. What is padding of the MSAs? How were the MSAs trimmed from 5120 to 32? What was the distribution of lengths of the sequences included? What were the sequences that ended up being included and what was the sequence diversity in the sets used?

(0.1.2) We have clarified in “Results and Discussion” that all manipulation of the MSA was performed automatically by the AF2 program. Our pipeline in this study simply reduced the depth of the subsampled MSAs without manually tuning their content. Our changes are described above in response to the previous comment. Additionally, we have edited the text in “Abstract” to emphasize this point:

“Whereas models generated using the default AF2 pipeline are conformationally homogeneous and nearly identical to one another, reducing the depth of the input multiple sequence alignments (MSAs) by stochastic subsampling led to the generation of accurate models in multiple conformations.”

We note that this critical point is reinforced in the “Methods” section (see 0.1.1 above). Finally, we have modified the text throughout to emphasize that AF2 randomly determined the composition of the subsampled MSAs used for structure prediction (for example, by removing the word “padding” on page 3). We therefore believe that details regarding the optimal composition of the MSAs required to obtain alternative conformations are beyond the scope of this publication. Moreover, to our knowledge, AF2 currently does not provide a way to extract the subsampled MSA used for a prediction.

– Earlier in the paper, it is stated that loops were removed from the sequence alignments. However, they are later discussed as being generated in the models. Provide more details about when the loops were included in the structure prediction.

(0.1.3) We have clarified the text to mention that sequence truncation was limited to the N- and C-termini:

“The sequences of all targets were truncated at the N- and C-termini to remove large soluble domains attached to the membrane proteins and/or intrinsically disordered regions.”
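Terminal truncation of this kind amounts to dropping 1-indexed, inclusive residue ranges before prediction. The sketch below uses the omitted ranges listed in the rewritten Methods; the helper names are hypothetical, and the 461-residue placeholder sequence is a stand-in, not the real CGRPR sequence.

```python
# Omitted residue ranges (1-indexed, inclusive) as listed in the Methods.
OMITTED = {
    "CGRPR": [(1, 131), (401, 461)],
    "FZD7": [(1, 247)],
    "PTH1R": [(1, 175), (492, 593)],
    "LAT1": [(1, 49)],
}


def retained_positions(length, omitted):
    """Return the 1-indexed residue numbers kept after truncation."""
    drop = set()
    for start, end in omitted:
        drop.update(range(start, end + 1))
    return [i for i in range(1, length + 1) if i not in drop]


def truncate(sequence, omitted):
    """Remove the omitted ranges from a one-letter amino acid sequence."""
    kept = retained_positions(len(sequence), omitted)
    return "".join(sequence[i - 1] for i in kept)


# Placeholder sequence of assumed length 461 (not the real CGRPR sequence).
placeholder = "A" * 461
print(len(truncate(placeholder, OMITTED["CGRPR"])))
```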

We have also modified “Methods” to account for the removal of loop residues during PCA.

– For some of the targets, the template-based modeling clearly improved sampling of various conformations and for others it didn't. How were the template selected for the template-based modeling?

(0.1.4) We have added the relevant text to the manuscript:

“Therefore, we investigated the effect of providing templates of homologs in exclusively OF conformations alongside MSAs of various sizes (see Methods for details on template selection). Accurate OF models were obtained only with MSAs containing 16 to 32 sequences and constituted a minor population in an ensemble dominated by IF models.”

Additional changes were introduced in Methods (see 0.1.1 above).

– What are the PDBs used in the structural analysis? These should be listed explicitly in the pertinent figure and in the methods.

(0.1.5) We have modified Figure 1 and the caption of Figure 1 —figure supplement 4 (formerly Figure S5) to include all relevant PDB accession codes. Some of the modifications made to Figure 1 are described below.

– Define pLDDT.

(0.1.6) We have modified the relevant section in “Results and Discussion” to include a definition:

“In contrast with a recent preprint35, the predicted flexibility values failed to correlate with their pLDDT values, which reflect the confidence of the AF2 prediction of each residue’s local environment36.”

– What does "eliminating postprocessing with OpenMM" constitute?

(0.1.7) We have modified the text to provide clarification (see 0.1.1 above).

– How were misfolded models identified? Providing a reference is not sufficient here.

(0.1.8) In light of both comments made by the Reviewers below, as well as new calculations carried out in response to these comments, we have revised this paragraph to include both new quantitative descriptions of results and a PCA-based approach for identifying and removing misfolded outlier models:

“Accurate models of all eight protein targets were obtained for at least one conformation (TM-score ≥0.9), consistent with published performance statistics (Figure 1B). MSAs with hundreds or thousands of sequences were generally observed to engender tighter clustering in conformations specific to each protein. Decreasing the depth of the subsampled MSAs, by contrast, appeared to promote the generation of alternative conformations. The increased diversity coincided with the generation of misfolded or outlier models. However, unlike the models of interest that resembled experimentally determined structures, misfolded models virtually never co-clustered and could thus be identified and excluded from further analysis (example shown in Figure 1 —figure supplement 1). Increasing the depth of subsampled MSAs had the desirable effect of eliminating these models, while also limiting the extent to which alternative conformations were sampled. Thus, our results revealed a delicate balance that must be achieved to generate models that are both diverse and natively folded. No general pattern was readily apparent regarding the ideal MSA depth required to achieve this balance, even when accounting for sequence length of the target (Figure 1 —figure supplement 2).”
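The observation that misfolded models virtually never co-cluster suggests one simple way to flag them: mark any model with too few neighbors in a low-dimensional projection such as PC1/PC2. This is a hypothetical illustration, not the authors' criterion; the radius and neighbor threshold below are arbitrary assumptions and the points are made up.

```python
import math


def flag_isolated(points, radius, min_neighbors):
    """Return True for each point with fewer than `min_neighbors` other
    points within `radius` (Euclidean distance)."""
    flags = []
    for i, p in enumerate(points):
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        flags.append(neighbors < min_neighbors)
    return flags


# Made-up PC1/PC2 coordinates: two tight clusters and one isolated model.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (20.0, -20.0),
]
print(flag_isolated(points, radius=1.0, min_neighbors=2))
```

Any threshold of this kind would need to be tuned per target, since the spread of well-folded models differs between proteins.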

We have also replaced the previous supplemental figure (formerly named Figure S2, now Figure 1 —figure supplement 1), which only showed an isolated example of an outlier model, with an example principal component analysis of ASCT2 that we encountered during this study. This was the same approach we used when generating Figure 3 and Figure 3 —figure supplement 2 (formerly called Figure S6).

– To address the predictive power of this approach, please clarify which models were used for the PCA. Were the principal components computed from low-MSA AlphaFold2 predictions only, rather than from the large-MSA AF2 predictions? That would make the point moot, since the PCs reflect the range of conformational changes observed in multiple models, not a subset. Previous studies (Bahar and colleagues) suggested that PCA allows for prediction, but that PC1 is not always the useful component, and so the question arises of how to select the correct PC to make the prediction.

(0.1.9) We have clarified in the text that dimensionality reduction using PCA was carried out on all models obtained for all MSA depth values:

“In our benchmark set, the first principal component (PC1) captured 64.9±16.1% of the structural variations among all the models generated using MSAs with 32 or more sequences (Figure 3B).”

We also edited the caption of Figure 3:

We also added Figure 3 —figure supplement 1 showing both PC1 and PC2 to illustrate the limited interpretive power of PC1.

2) Additional analysis:

– Analysis of the model accuracy with alignment quality. How do the current results depend on alignment quality and diversity? Which sequences are included in the 32, and how do your findings depend on this selection?

(0.2.1) We clarified in our text that the composition of the sequences being subsampled is decided entirely randomly by AF2 (see 0.1.2 above).

– Analysis of the model accuracy with sequence length. While sequence information is examined, the authors say that no general pattern was apparent regarding the ideal MSA depth. Yet, a more common strategy, namely, to compare sequence sets using a factor related to the length (L) of the protein (or perhaps the core of the protein being modeled) may reveal more. Indeed, by reducing the dataset to 32 sequences, only the longest proteins were starting to include misfolded examples. Overall, it would be more straightforward to compare models built with, e.g. L*2, L/2 and L/5 sequences. While this requires building additional models it would also provide a clearer outcome and strategy that future users could follow. A bonus may be that it would reduce the chances of misfolded models that need to be filtered out. At the minimum, the authors should reframe the data they have as a function of each protein's length.

(0.2.2) We appreciate this suggestion and have added Figure 1 —figure supplement 2 to compare structural variation as a function of MSA depth normalized by sequence length. We reran the protocol with a more comprehensive range of MSA depths in an attempt to identify a pattern. However, due to the small size of our test set, our results are inconclusive. Nevertheless, they have allowed us to make a clarifying point regarding the modeling of proteins in the training set (discussed below).
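Normalizing MSA depth by sequence length is a one-line calculation, and the length-scaled depths proposed by the reviewers (L*2, L/2, L/5) follow the same idea. The helpers below are hypothetical, and the 400-residue length is a made-up example rather than one of the benchmark targets.

```python
def depth_per_residue(msa_depth, sequence_length):
    """MSA depth normalized by the length of the modeled sequence."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return msa_depth / sequence_length


def depths_from_length(sequence_length, factors=(2.0, 0.5, 0.2)):
    """MSA depths scaled to sequence length: L*2, L/2 and L/5 by default."""
    return [max(1, round(sequence_length * f)) for f in factors]


length = 400  # made-up sequence length
print(depth_per_residue(32, length))
print(depths_from_length(length))
```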

– Analysis of template usage. What were the templates used? Was the performance of AF2 dependent on the sequence similarity between the template(s) and the target?

(0.2.3) We have modified the “Methods” section to describe our protocol and have added Table S3 listing the templates used (see 0.1.4 above).

– Quantification of conformations. There are many occasions where the discussion of structural similarities/differences is qualitative, e.g. “virtually every transporter model superimposed nearly perfectly with the training set conformation, and none resembled the alternative conformation”. This statement should be accompanied by quantitative data. Furthermore, the different known conformational states, i.e. IF, OF and occluded, require a quantitative definition to support statements like “One target, MCT1, was exclusively modeled by AF2 in either IF or fully occluded conformations regardless of MSA depth. Notably, these results closely parallel those reported by DeepMind during their attempt to model multiple conformations of LmrP in CASP14.”

(0.2.4) We have made several changes to the text as recommended:

“Accurate models of all eight protein targets were obtained for at least one conformation (TM-score > 0.9), consistent with published performance statistics.”

“One target, MCT1, was exclusively modeled by AF2 in either IF or fully occluded conformations; over 99% of the models had TM-scores of ≥0.9 and <0.9 to the IF and OF structures, respectively, regardless of MSA depth.”

“Overall, these results demonstrate that both conformations of all eight protein targets could be predicted with AF2 to high accuracy (TM-score ≥0.9) by using MSAs that are far shallower than the default.”
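The TM-score criterion used in these statements can be expressed as a small classifier. This is a sketch with made-up scores: the 0.9 cutoff comes from the text, while the function names and the "both"/"neither" labels are assumptions added to cover cases the text does not name.

```python
def classify_model(tm_to_if, tm_to_of, cutoff=0.9):
    """Label a model by which experimental structure it matches (TM >= cutoff)."""
    near_if = tm_to_if >= cutoff
    near_of = tm_to_of >= cutoff
    if near_if and near_of:
        return "both"
    if near_if:
        return "IF"
    if near_of:
        return "OF"
    return "neither"


def fraction_with_label(scores, label, cutoff=0.9):
    """Fraction of (tm_to_if, tm_to_of) pairs that receive the given label."""
    labels = [classify_model(a, b, cutoff) for a, b in scores]
    return labels.count(label) / len(labels)


# Made-up TM-scores, for illustration only.
scores = [(0.95, 0.82), (0.93, 0.85), (0.80, 0.94), (0.70, 0.65)]
print(fraction_with_label(scores, "IF"))
```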

Additionally, in response to comments made by another Reviewer, we have removed the reference to the CASP14 result in this subsection of Results and Discussion.

– Along these lines, it is reported that conformational variability is not obtained by the targets that were included in the AF2 training set, and yet the extent of conformational diversity appears similar to that analysis presented in Figure 1. For example, MurJ appears to show the same degree of conformational sampling with 32 sequences as for ASCT2. A more objective analysis of the conformational sampling is required to define the dynamic range explored by the structural conformations, especially since some of the endpoint structures are quite similar to each other.

(0.2.5) We have modified the language accordingly:

“Unlike the results presented above, the majority of the transporter models generated this way were more similar to the conformation present in the training set than the conformation absent from the training set (i.e., their TM-scores were greater; Figure 1 —figure supplement 4).”

Additionally, we have plotted these results in Figure 1 —figure supplement 2 above to demonstrate their structural homogeneity relative to proteins absent from the training set (see 0.2.2 above).

Reviewer #1 (Recommendations for the authors):

1. The conformational ensemble from AF2 appears to move along certain structural paths in the different analyses. How does this compare to a linear interpolation between the endpoint structures?

(1.1) This question dovetails with the recommendation outlined by the Editor above. We have therefore added a supplemental figure to show that the first principal component describes only a subset of the variation observed in the models (see 0.1.9 above).

This Figure is referenced in the following sentence in “Results and Discussion” subsection “Distributions of predicted models relative to the experimental structures”:

“In our benchmark set, the first principal component (PC1) captured 64.9±16.1% of the structural variations among the models generated using MSAs with 32 or more sequences (Figure 3B), while comparison of PC1/PC2 values suggested that the predicted dynamics deviate from simple interpolation of two end states (Figure 3 —figure supplement 1).”

2. In Figure 1, it is recommended that the axes of Figure 1B be scaled similarly to the format used in Figure S3 since the experimental TM-score differences are quite different between the different proteins. Aside from ASCT2, and potentially MCT1 with templates, the dynamic range of the conformational change appears to be minimal, but this may just be difficult to see due to the current plot format. In addition, move Figure S1 into this main figure to allow the reader to discern the structural and conformational variability in this test set. Finally, please add all of the pdbs used for the experimental comparison structures, both in this figure, and in the methods.

(1.2) We have modified Figure 1 in accordance with these recommendations (please see 0.1.5 above).

3. Is the term “ground truth structures” referring to the crystal structures or other experimental structures? Please change this term, as an experimental structure does not correspond to a “truth” but to a physically accessible conformational state of the protein under those experimental conditions.

(1.3) We have changed the text to avoid using this term in the text:

“For most proteins considered, we report a striking correlation between the breadth of structures predicted by AF2 and the cryo-EM and/or X-ray crystal structures.”

“To quantitatively place the predicted conformational variance in the context of the experimentally determined structures, we used principal component analysis (PCA), which reduces the multidimensional space to a smaller space representative of the main conformational motions.”

“(In Figure 3B) – Scatter plots comparing each model's position along PC1 and its structural similarity to experimentally determined structures.”

Reviewer #2 (Recommendations for the authors):

1. First, I would disagree with the title of Figure S5 and the beginning of the corresponding title, which seem to be categorical about the lack of exploration of alternate conformations for these examples, but then somewhat contradict the rest of the paragraph, where it is explained that some cases (MurJ and CCR5) behave differently from the others. These discrepancies should be resolved.

(2.1) We have changed the title of the section per the Reviewer’s comments:

“Limited conformational sampling is observed for proteins with structures in the training set”

We have also rewritten the paragraph to better reflect the extent to which models generated with shallow MSAs sample alternative conformations:

“The conformational diversity of these models, including those generated using shallow MSAs, was far more limited than what was generally observed for the proteins discussed above, with the exception of MCT1 (Figure 1 —figure supplement 2). Although conformational diversity was demonstrated to a limited extent by the generation of occluded models of MurJ and PfMATE, none of the models observed adopted the alternative conformer. By contrast, while models of CCR5 were less biased toward the training set conformation, deep MSAs reduced conformational diversity. This divergence in performance may stem from the composition of the AF2 training set, which featured the structures of many active GPCRs but no structures, for example, of inward-facing MATEs45.”

Finally, we edited a statement in “Concluding remarks”:

“Moreover, this approach showed limited success when applied to transporters whose structures were used to train AF2, hinting at the possibility that traditional methods may still be required to capture alternative conformers46,47.”

2. Second, I think it would be of value for the readership to mention that no function is included to describe the membrane in these modelling processes – even when the lipids themselves may be critical to shift these conformational equilibria. This observation actually makes the authors’ findings all the more remarkable, but also perhaps harder to interpret.

(2.2) We have added a sentence to reflect the membrane’s absence from structural inference:

“Several preprints have provided evidence that AF2, despite its accuracy, likely does not learn the energy landscapes underpinning protein folding and function35,49,50. Moreover, AF2 does not directly account for the lipid environment, which has been experimentally shown to bias the conformational equilibria of membrane proteins51–53. As our results show that exploration of the conformational space is in part a byproduct of low sequence information provided for inference, they highlight the need for further development of artificial intelligence methods capable of learning the conformational flexibility intrinsic to protein structures.”

Reviewer #3 (Recommendations for the authors):

1. Change Lat1 -> LAT1

(3.1.1) We have made the modification to the name of the protein throughout the text, Figures, and Tables.

2. The following statement is unclear and should be elaborated. “Several preprints have provided evidence that AF2, despite its accuracy, likely does not learn the energy landscapes underpinning protein folding and function39,53,54. We believe that our results bolster these conclusions and highlight the need for further development of artificial intelligence methods capable of learning the conformational flexibility intrinsic to protein structures.”

(3.1.2) In response to another comment made by Reviewer #2, we have edited this final paragraph (see 2.2 above).

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

1. The reviewers found that the current title may lead the reader to misinterpretation. An alternate title “Sampling alternative conformational states of transporters and receptors with AlphaFold2” is more appropriate and should be adopted.

We have changed the manuscript title as recommended.

2. The findings in Figure 1 Suppl2, that the number of sequences isn’t correlated with an increase in conformational homogeneity, and displays erratic dependence for some proteins (especially ASCT2, Lat1 and STP10), are surprising. Consequently, it seems necessary to alter some statements in the manuscript, accordingly. For example, in the abstract: “reducing the depth of the input MSAs is conducive to the generation of accurate models in multiple conformations by AF2”.

We have rewritten several statements throughout the text per the Reviewers' recommendations:

Abstract: “Whereas models of most proteins generated using the default AF2 pipeline are conformationally homogeneous and nearly identical to one another, reducing the depth of the input multiple sequence alignments (MSAs) by stochastic subsampling led to the generation of accurate models in multiple conformations.”

Introduction: “Our results demonstrate that reducing the depth of the input MSAs is often conducive to the generation of accurate models in multiple conformations by AF2, suggesting that the algorithm's outstanding predictive performance can be extended to sample alternative structures of the same target.”

Results and Discussion: “Decreasing the depth of the subsampled MSAs, by contrast, appeared to promote the generation of alternative conformations in most proteins.”

Results and Discussion: “The use of shallow MSAs was instrumental to obtaining structurally diverse models in most proteins, and in one case (MCT1) accurate modeling of alternative conformations also required the manual curation of template models.”

Associated Data


    Data Citations

    1. Del Alamo D. 2022. Sampling alternative conformational states of transporters and receptors with AlphaFold2. GitHub. conformations

    Supplementary Materials

    Transparent reporting form

    Data Availability Statement

    All scripts and data presented in this study are made available for download at https://github.com/delalamo/af2_conformations (copy archived at swh:1:rev:d60db86886186e80622deaa91045caccaf4103d3).



    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
