Expanding the toolkit for membrane protein modeling in Rosetta

Julia Koehler Leman; Benjamin K Mueller; Jeffrey J Gray

Bioinformatics. 2016 Dec 15;33(5):754–756. doi: 10.1093/bioinformatics/btw716

PMCID: PMC5860042 PMID: 28011777

Abstract

Motivation: A range of membrane protein modeling tools has been developed in the past 5–10 years, yet few of these tools are integrated and make use of existing functionality for soluble proteins. To extend existing methods in the Rosetta biomolecular modeling suite for membrane proteins, we recently implemented RosettaMP, a general framework for membrane protein modeling. While RosettaMP facilitates implementation of new methods, addressing real-world biological problems also requires a set of accessory tools that are used to carry out standard modeling tasks.

Results: Here, we present six modeling tools, including de novo prediction of single trans-membrane helices, making mutations and refining the structure with different amounts of flexibility, transforming a protein into membrane coordinates and optimizing its embedding, computing a Rosetta energy score, and visualizing the protein in the membrane bilayer. We present these methods with complete protocol captures that allow non-expert modelers to carry out the computations.

Availability and Implementation: The presented tools are part of the Rosetta software suite, available at www.rosettacommons.org.

Contact: julia.koehler.leman@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.

1 Introduction

Membrane proteins are involved in a variety of essential cellular functions, comprise about 30% of gene products (Tan et al., 2008), and are targeted by over 50% of drugs on the market (Bakheet and Doig, 2009; Overington et al., 2006). They are extremely difficult to study by experimental methodologies, which is highlighted by the limited number of structures in the Protein Data Bank (PDB), amounting to only 2% (Rose et al., 2013) of deposited structures. Computational structure prediction, modeling and design are therefore critical in facilitating our understanding of membrane protein structure and function (Koehler Leman et al., 2015).

The number of membrane protein prediction tools is on the rise, and the quality of these tools improves as the database of membrane protein structures available for derivation and testing grows. Most of these methods are stand-alone tools that require their own implementation of membrane-specific score functions and their own representation of the membrane bilayer. Further, the user is required to become familiar with each tool, its prediction accuracies and its file format conversions. In contrast, adapting existing soluble protein tools for membrane proteins has enormous merit in the development of new computational approaches.

One extensive tool for structure prediction, docking and design is the software suite Rosetta (Leaver-Fay et al., 2011). In addition to the variety of prediction tools that are readily available for biomolecular modeling, Rosetta also includes widely tested energy functions for soluble environments, which are a combination of knowledge-based and physics-based terms (Leaver-Fay et al., 2013). Rosetta is further used for large-scale, highly parallel, high-throughput applications. It is developed by the Rosetta Commons, a consortium of laboratories that includes hundreds of scientists worldwide, and is licensed by over 20 000 academic users as well as many pharmaceutical companies, enabling the development of drug therapies worldwide. Rosetta is therefore ideally suited for extension to improve membrane protein modeling (Fig. 1).

Fig. 1. RosettaMP tools are described in detail in the Supplementary Material. The provided protocol captures contain complete command lines that allow non-experts to use these modeling tools. The names of the executables are given in red in the center at the bottom and the numbers correspond to the section numbering in the Supplement: (1) de novo modeling of a single transmembrane helix, (2) transformation of a membrane protein into the membrane coordinate frame, (3) creation of a span file from the protein structure, (4) residue mutation with various amounts of protein flexibility, (5) computation of an energy score and visualization of clashes, (6) visualization of the protein in the membrane bilayer.

We recently created an implementation of a general platform for membrane protein modeling in Rosetta, termed RosettaMP (Alford et al., 2015). This framework includes a representation of the membrane bilayer and connects to the previously established high- and low-resolution score functions for the hydrophobic environment in the membrane (Barth et al., 2007; Yarov-Yarovoy et al., 2006, 2012). We further showed that RosettaMP can be combined with existing Rosetta applications to create new protocols for membrane protein modeling, for instance for high-resolution refinement, prediction of free energy changes upon mutation, protein-protein docking and assembly of symmetric complexes (Alford et al., 2015). Even though the implementation of completely novel protocols is simplified, the development of modeling methods with high predictive power requires thorough testing, analysis and verification of prediction accuracies against existing or similar methods. This process is time-consuming and laborious. In addition to creating complex, novel applications that require careful testing and benchmarking, modeling membrane proteins for biological applications also requires a number of smaller, ‘standard’ applications that are not completely novel tools, yet are necessary and extremely useful for biomolecular modeling.

Here, we describe six membrane protein modeling tools that were implemented in Rosetta. They expand Rosetta’s existing toolset for membrane protein modeling and are useful and of general interest for a variety of modeling problems applied to biological systems. We provide complete protocols with specific command lines that allow researchers without extensive modeling background to carry out these computations. The applications are either new implementations of well-known modeling problems or were previously inaccessible to the user in this form. The membrane score functions used for these protocols were described previously (Alford et al., 2015; Barth et al., 2007; Yarov-Yarovoy et al., 2006, 2012) and unless otherwise noted, the applications use the high-resolution membrane score function mpframework_smooth_fa_2012.wts.

In addition to describing the setup of Rosetta and the preparation of input files, we illustrate how to (i) create a (transmembrane) helix from an amino acid sequence without the use of peptide fragments, (ii) transform a protein structure into the membrane coordinate frame and (iii) create a Rosetta span file from a known structure. We further explain how to (iv) make mutations in a known structure, (v) compute an energy score in the membrane bilayer and (vi) visualize the protein in the membrane.

2 Methods

Here we introduce six tools for membrane protein modeling in Rosetta with complete protocol captures, which are described in detail in the Supplementary Material and online at https://www.rosettacommons.org/demos/latest/demos-by-category#protocol-captures_membranes or http://juliakoehlerleman.blogspot.com/p/tutorials-protocol-captures.html. These tutorials allow non-expert modelers to carry out a variety of modeling tasks. We further provide additional scripts to validate and visualize many of the features presented.

A single trans-membrane helix can now be easily modeled de novo from sequence alone without the need for fragment insertion. In conjunction with high-resolution refinement, this is an ideal starting point for modeling single trans-membrane helix domains, allowing extension towards modeling full constructs of single trans-membrane span proteins when structures of intra- and extracellular domains are known.
We present an application that allows simple transformation of a protein into the membrane coordinate frame. Protein embedding in the membrane can be further optimized using the high-resolution membrane score function. During optimization, the method searches for an optimal membrane position along the membrane normal (z-axis) and, within four planes, for an optimal normal vector; the outcome is deterministic.
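As a geometric illustration of the paragraph above, the following Python sketch moves coordinates into a membrane frame and then scans the membrane position along z on a fixed grid. It assumes the membrane center lies at the origin with the normal along +z. The function names, the input center and normal, and the user-supplied `score` callable are hypothetical; the sketch does not reproduce the Rosetta implementation or its score function, and it omits the search over normal vectors.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / length for x in v)


def rotation_onto_z(normal):
    """Rotation matrix (list of rows) that maps `normal` onto (0, 0, 1)."""
    nx, ny, nz = _unit(normal)
    # Rotation axis = normal x z = (ny, -nx, 0); |axis| = sin(angle), nz = cos(angle).
    ax, ay = ny, -nx
    s = math.hypot(ax, ay)
    c = nz
    if s < 1e-12:
        # Already parallel (identity) or antiparallel (half-turn about x).
        if c > 0:
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    kx, ky, kz = ax / s, ay / s, 0.0
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3)) for j in range(3)]
          for i in range(3)]
    # Rodrigues formula: R = I + sin(angle) * K + (1 - cos(angle)) * K^2
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]


def to_membrane_frame(coords, center, normal):
    """Translate `center` to the origin, then rotate `normal` onto +z."""
    R = rotation_onto_z(normal)
    out = []
    for p in coords:
        d = [p[i] - center[i] for i in range(3)]
        out.append(tuple(sum(R[i][j] * d[j] for j in range(3)) for i in range(3)))
    return out


def scan_z_offset(coords, score, z_min=-10.0, z_max=10.0, step=1.0):
    """Try every z shift on a fixed grid; return (best_shift, best_score).

    `score` is a hypothetical callable taking a list of (x, y, z) tuples.
    The grid is fixed and ties keep the first hit, so the result is deterministic.
    """
    n_steps = int(round((z_max - z_min) / step))
    best = None
    for i in range(n_steps + 1):
        dz = z_min + i * step
        shifted = [(x, y, z + dz) for x, y, z in coords]
        value = score(shifted)
        if best is None or value < best[1]:
            best = (dz, value)
    return best
```

Because the scan visits the same grid points in the same order on every run, repeated calls with the same input return the same shift, which mirrors the deterministic behavior described above.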
We further present the creation of a spanning topology file which is required for the majority of membrane protein modeling applications in Rosetta.
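The idea of deriving membrane-spanning segments from a structure can be sketched in a few lines of Python. The sketch assumes the structure is already in a membrane frame with the normal along z, takes one z-coordinate per residue (for example the Cα atom), and uses a hypothetical half-thickness of 15 Å and a hypothetical minimum span length. The function name is illustrative, and the actual Rosetta application may apply additional criteria and writes its own file format, which is not reproduced here.

```python
def spans_from_z(ca_z, half_thickness=15.0, min_length=3):
    """Return 1-based inclusive (start, end) residue ranges inside the slab.

    ca_z           -- one z-coordinate per residue, in residue order
    half_thickness -- assumed half-thickness of the membrane slab (hypothetical)
    min_length     -- shortest stretch reported as a span (hypothetical)
    """
    spans = []
    start = None
    for resnum, z in enumerate(ca_z, start=1):
        inside = abs(z) <= half_thickness
        if inside and start is None:
            start = resnum                      # entering the slab
        elif not inside and start is not None:
            if resnum - start >= min_length:    # leaving the slab
                spans.append((start, resnum - 1))
            start = None
    # Close a span that runs to the last residue.
    if start is not None and len(ca_z) - start + 1 >= min_length:
        spans.append((start, len(ca_z)))
    return spans
```

For a single helix crossing the slab once, the function returns one range; a chain that dips into the slab for fewer than `min_length` residues yields no span for that stretch.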
In addition, we introduce a protocol to model mutations with different levels of backbone and side-chain flexibility, which will be useful to assess the effect of a mutation on the protein structure.
The scoring method is one of the most basic, yet useful applications. The standard method scores a protein in a given coordinate frame. Optionally, the protein can be transformed into membrane coordinates and the protein embedding can be optimized before scoring. This tool is useful to analyze Rosetta output models and investigate differences between them. Per-residue scores in the PDB files are useful for more detailed analyses, and can be easily visualized with the provided script.
Visualizing the protein orientation with respect to the membrane is essential for examining and validating results. The three visualization applications differ by input and task and can also be used to troubleshoot newly developed protocols created in PyRosetta (Chaudhury et al., 2010) or RosettaScripts (Fleishman et al., 2011). One of the tools allows real-time visualization in PyMOL, which is useful for making movies of Rosetta simulations.

3 Results and discussion

In this paper, we present six applications for membrane protein modeling in Rosetta. All applications come with complete protocol captures (see Supplementary Material and https://www.rosettacommons.org/demos/latest/demos-by-category#protocol-captures_membranes or http://juliakoehlerleman.blogspot.com/p/tutorials-protocol-captures.html), which increases the usability of these methods for non-expert modelers and broadens the user base of Rosetta protocols in the scientific community. Most of the applications presented here address standard modeling problems that are indispensable for tackling biological questions involving membrane proteins. These tools are best used in conjunction with major modeling protocols to address real-world scientific questions.

4 Conclusion

We presented six accessory tools for membrane protein modeling and design in the RosettaMP framework, which can be used in conjunction with major modeling protocols, including but not limited to those in the Rosetta software suite. To improve the ease of use of these tools, we provide complete command lines and protocol captures that will enable non-expert modelers to carry out these computations.

Acknowledgements

The authors would like to thank Rebecca Alford for the implementation of the mp_viewer and for technical help. We also thank the anonymous reviewers for their suggestions.

Funding

Funding was provided from NIH R01 GM-078221 to JJG and JKL, NIH T32 NS007491 to BKM and RosettaCommons to JKL.

Conflict of Interest: none declared.

Footnotes

Associate Editor: Anna Tramontano

References

Alford R.F. et al. (2015) An integrated framework advancing membrane protein modeling and design. PLoS Comput. Biol., 11, e1004398.
Bakheet T.M., Doig A.J. (2009) Properties and identification of human protein drug targets. Bioinformatics, 25, 451–457.
Barth P. et al. (2007) Toward high-resolution prediction and design of transmembrane helical protein structures. Proc. Natl. Acad. Sci. U. S. A., 104, 15682–15687.
Chaudhury S. et al. (2010) PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta. Bioinformatics, 26, 689–691.
Fleishman S.J. et al. (2011) RosettaScripts: a scripting language interface to the Rosetta macromolecular modeling suite. PLoS One, 6, e20161.
Koehler Leman J. et al. (2015) Computational modeling of membrane proteins. Proteins Struct. Funct. Bioinf., 83, 1–24.
Leaver-Fay A. et al. (2011) ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol., 487, 545–574.
Leaver-Fay A. et al. (2013) Scientific benchmarks for guiding macromolecular energy function improvement. Methods Enzymol., 523, 109–143.
Overington J.P. et al. (2006) How many drug targets are there? Nat. Rev. Drug Discov., 5, 993–996.
Rose P.W. et al. (2013) The RCSB Protein Data Bank: new resources for research and education. Nucleic Acids Res., 41, D475–D482.
Tan S. et al. (2008) Membrane proteins and membrane proteomics. Proteomics, 8, 3924–3932.
Yarov-Yarovoy V. et al. (2006) Multipass membrane protein structure prediction using Rosetta. Proteins, 62, 1010–1025.
Yarov-Yarovoy V. et al. (2012) Structural basis for gating charge movement in the voltage sensor of a sodium channel. Proc. Natl. Acad. Sci. U. S. A., 109, E93–E102.
