Author manuscript; available in PMC: 2015 Sep 2.
Published in final edited form as: Structure. 2014 Sep 2;22(9):1223–1224. doi: 10.1016/j.str.2014.08.004

Less Is More: Structures of Difficult Targets with Minimal Constraints

Neil R Lloyd 1, Deborah S Wuttke 1,*
PMCID: PMC4260533  NIHMSID: NIHMS646077  PMID: 25185825

Abstract

By merging recent experimental and computational methodology advances, resolution-adapted structural recombination Rosetta has emerged as a powerful strategy for solving the structure of traditionally challenging targets. In this issue of Structure, Sgourakis and colleagues solve the structure of one such target, the immunoevasin protein m04, using this approach.


By all measures, the field of structural biology has been remarkably successful. Methodological advances in X-ray crystallography, NMR spectroscopy, and electron microscopy have dramatically expanded the breadth of critical biomolecular systems accessible to rapid high-resolution structural characterization. Testimony to these successes includes both the Protein Data Bank reaching the benchmark of 100,000 deposited structures and the explosion of papers using structural data to derive biological insights.

Even with these advances, however, many targets remain recalcitrant to all attempts to solve their high-resolution structures. Once expression hurdles are overcome, many proteins are only sparingly or transiently soluble, or form only low-resolution crystals if they crystallize at all. While progress is being made in engineering proteins to enhance their solubility or crystallizability, there is as yet no guarantee of success. Furthermore, these approaches can be quite labor intensive and frequently entail screening dozens to hundreds of constructs. Generating a well-behaved sample remains a significant bottleneck in structure determination.

Concurrently, computational modeling and prediction approaches have advanced, providing more reliable models with greater accuracy (Kryshtafovych et al., 2014). Particularly exciting are the successes achieved with strategies, such as Rosetta and I-TASSER, that sample protein fragments derived from the structural database (Dantas et al., 2003; Roy et al., 2010). One strength of this family of approaches is the relative accuracy of the local structures achieved. These strategies, however, become limited by conformational sampling at even modest protein sizes, restricting their use to smaller systems. Additionally, robust strategies for cross-validation of the predicted structures are required.

A particularly exciting advance has been the recent development of hybrid techniques that combine the best features of structure determination and modeling approaches while simultaneously addressing the caveats of each. In the approach termed resolution-adapted structural recombination (RASREC) Rosetta, some of the most easily accessible experimental NMR data are incorporated directly into a Rosetta-type calculation, allowing the calculation to home in on the experimentally defined conformational space (Raman et al., 2010; Lange and Baker, 2012). The data provided by NMR are ideally suited to improve and augment the Rosetta calculation (illustrated for the difficult target Est3 in Figure 1). The secondary structure information derived from backbone chemical shift values is used to bias the selection of fragments in the library for structure prediction. Importantly, a small set of long-range nuclear Overhauser effect (NOE) data constrains the conformational space sampled by the calculation, reining in the explosion of search space that accompanies increasing protein size. The structure can be further refined using orientational data provided by residual dipolar couplings (RDCs). Because resonance assignments and long-range amide-amide NOEs are usually available early in the traditional NMR structure determination pipeline, and RDCs are readily obtained from aligned samples once backbone assignments are in hand, the data requirements are well matched to what can be measured experimentally. This is particularly true for samples that are not stable at the high concentrations necessary for the complete side chain assignments required to assign the NOEs used in a traditional structure determination, which are also confounded by ambiguities due to resonance overlap, particularly in larger systems. Advantageously, methyl/methyl NOEs, readily obtainable in even large systems using selectively labeled samples, further constrain the calculation (Tugarinov et al., 2006).
The original implementation of this approach described impressive convergence with structures determined using traditional approaches (Lange and Baker, 2012). This algorithm has been improved and has been shown to give accurate structures for several proteins ranging in size from 15 to 40 kDa (Lange and Baker, 2012; Warner et al., 2011; Lange et al., 2012).
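To make the role of the sparse NOE data concrete, the following is a minimal sketch of how a handful of long-range distance restraints can discriminate between candidate models. It is an illustration only, not the scoring function used by RASREC-Rosetta: the flat-bottom squared penalty, the function name `noe_penalty`, the atom labels, and the 5 Å upper bound are all assumptions chosen for the example.

```python
import math


def noe_penalty(coords, restraints):
    """Sum of squared violations of upper-bound distance restraints.

    coords: dict mapping a (hypothetical) atom label to an (x, y, z) tuple in Å.
    restraints: list of (atom_i, atom_j, upper_bound_in_Å) tuples.
    A restraint contributes nothing while the distance stays within its bound.
    """
    total = 0.0
    for atom_i, atom_j, upper in restraints:
        excess = math.dist(coords[atom_i], coords[atom_j]) - upper
        if excess > 0.0:
            total += excess ** 2
    return total


# Toy data (hypothetical): one amide-amide restraint between residues 10 and 85.
restraints = [("HN_10", "HN_85", 5.0)]
model_a = {"HN_10": (0.0, 0.0, 0.0), "HN_85": (4.0, 0.0, 0.0)}  # satisfies it
model_b = {"HN_10": (0.0, 0.0, 0.0), "HN_85": (9.0, 0.0, 0.0)}  # violates by 4 Å

print(noe_penalty(model_a, restraints))  # 0.0
print(noe_penalty(model_b, restraints))  # 16.0
```

Even a single long-range restraint of this kind removes from consideration every fold in which the two residues are far apart, which is why a small number of them can restrict the search so effectively.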

Figure 1. Impact of Sparse NMR Restraints on RASREC-Rosetta Structural Ensembles.

Example shown is from the RASREC-Rosetta structure of Est3 (170 amino acids) (Rao et al., 2014). Superpositions of the 20 lowest energy structures obtained using an increasing number of sparse NMR restraints are shown to illustrate the improvement in both convergence and topology as additional types of data are included. Ensembles obtained by adding (A) 252 dihedral angle constraints from chemical shift (CS) data, (B) 37 HN-HN NOE distance constraints, (C) 112 RDC orientation constraints, and (D) 97 HN-CH3 and CH3-CH3 NOE distance constraints.

In this issue of Structure, Sgourakis et al. (2014) use this strategy to solve the structure of the murine cytomegalovirus (MCMV) immunoevasin protein known as m04/gp34. The MCMV immunoevasin proteins are employed by the virus to sabotage the host immune response by binding major histocompatibility complex class I (MHC-1) and interfering with antigen presentation. As such, a greater understanding of this family of proteins should aid the development of future therapeutics. Although of a size typically amenable to routine structure determination, m04/gp34 was recalcitrant to crystallization and was not soluble long enough to complete NMR data collection. However, sufficient NMR data could be acquired to solve the structure using RASREC-Rosetta. The authors discovered that this protein adopts a novel β-topology reminiscent of an immunoglobulin (Ig) fold, although detailed analysis revealed such distinctive deviations from the canonical Ig fold that a convergent evolutionary mechanism is proposed. These important insights derived from the structure will undoubtedly open up new avenues of research related to the role of protein glycosylation and mechanisms of MHC-1 binding. This structure, defined by an ensemble with a root-mean-square deviation of 1.25 Å for heavy atoms, was achieved with only 1.5 long-range restraints per ten residues.
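As a rough illustration of what an ensemble root-mean-square deviation measures, the sketch below computes the average RMSD of each model from the mean coordinates of the ensemble. It rests on stated assumptions: the models are already superimposed, every model lists the same atoms in the same order, and RMSD to the mean is the chosen convention (the text does not say whether the reported value is to the mean or pairwise). The function name and toy coordinates are hypothetical.

```python
import math


def ensemble_rmsd_to_mean(models):
    """Average RMSD (in Å) of each model from the ensemble mean coordinates.

    models: list of models; each model is a list of (x, y, z) tuples for the
    same atoms in the same order. Models are assumed to be pre-superimposed.
    """
    n_models = len(models)
    n_atoms = len(models[0])
    # Mean position of every atom across the ensemble.
    mean = [
        tuple(sum(model[i][k] for model in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]
    per_model = []
    for model in models:
        sq_sum = sum(math.dist(model[i], mean[i]) ** 2 for i in range(n_atoms))
        per_model.append(math.sqrt(sq_sum / n_atoms))
    return sum(per_model) / n_models


# Toy ensemble (hypothetical): two two-atom models displaced by 2 Å along z,
# so every atom sits 1 Å from its mean position.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
print(ensemble_rmsd_to_mean(toy))  # 1.0
```

A tightly converged ensemble gives a small value; a real calculation would first superimpose the models, for example with a least-squares fit, before applying this step.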

In addition to delivering the structure of an important and difficult target, these first applications also provide benchmarks for experimentally cross-validating RASREC-Rosetta structures (Sgourakis et al., 2014; Warner et al., 2011; Rao et al., 2014). While the approach has been validated in several cases by independent structure determination (Raman et al., 2010; Warner et al., 2011; Lange et al., 2012), widespread adoption requires system-specific validation to build confidence in the resulting structures. Fortunately, NMR techniques can also provide these critical supporting data. To ensure that the final structure is not biased by a specific restraint, both Sgourakis et al. (2014) and Rao et al. (2014) report structures calculated with only a subset of all the sparse data and show that the topology is not impacted, although convergence is reduced. Importantly, side chain packing was not impacted, because much of that information comes from the database constraints. Sgourakis et al. (2014) also use the Qfree metric, typically employed in standard structure determinations, to report on the consistency of the structures with restraints not used in the calculation (Cornilescu et al., 1998). Relaxation and H/D exchange data independently report on the accuracy of the calculated topology by pinpointing flexible loops and accessible surfaces. Mapping these supporting data on the structure further increases confidence in the placement of these elements. While the community has yet to arrive at a consensus on the validation of structures derived via hybrid methods, the standard methods employed for reporting and validating NMR structures are arguably highly suitable for this purpose.
Although some may argue that the use of extensive database information to obtain high-resolution structures blurs the distinction between structures and predictions, experimental data and parallel conventional structure determination provide testimony that the combination is a powerful way to obtain accurate structures and may prove invaluable where either approach alone would fail.
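The idea behind a free quality factor can be sketched in a few lines. The example assumes one commonly used definition of the RDC Q factor, the root-mean-square difference between observed and back-calculated couplings divided by the root-mean-square of the observed couplings, applied to couplings that were withheld from the structure calculation. The function name and the numerical values are hypothetical, and back-calculating couplings from a model (which requires fitting an alignment tensor) is outside the scope of the sketch.

```python
import math


def q_factor(observed, calculated):
    """Q = rms(D_obs - D_calc) / rms(D_obs) for paired RDC values in Hz.

    When the couplings passed in were left out of the structure calculation,
    the result plays the role of a free (cross-validated) Q. Lower is better.
    """
    if len(observed) != len(calculated):
        raise ValueError("observed and calculated RDC lists differ in length")
    if not observed:
        raise ValueError("at least one RDC is required")
    numerator = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    denominator = sum(o ** 2 for o in observed)
    return math.sqrt(numerator / denominator)


# Toy values (hypothetical): two withheld couplings and their back-calculated
# counterparts from a model refined without them.
withheld_observed = [3.0, -4.0]
back_calculated = [3.0, -1.0]
print(q_factor(withheld_observed, back_calculated))  # approximately 0.6
```

Because the withheld couplings never influenced the model, a low value indicates agreement that cannot be attributed to fitting those same data.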

With these successes of RASREC-Rosetta, it is clear that this approach is poised to become a leading strategy in structure determination. As both the efficiency of the approach and the quality of the structures obtained become widely appreciated, RASREC-Rosetta may become the method of choice even when routine structure determination is feasible. It will be exciting to see its extension to a wider range of systems, including membrane proteins and complexes.

Acknowledgments

We would like to thank Arthur Pardi for helpful comments and Timsi Rao for generating ensembles. We acknowledge the National Science Foundation and the NIH (R01 GM059414 and T32 GM065103) for supporting our program.

References

  1. Cornilescu G, Marquardt JL, Ottiger M, Bax A. J Am Chem Soc. 1998;120:6836–6837.
  2. Dantas G, Kuhlman B, Callender D, Wong M, Baker D. J Mol Biol. 2003;332:449–460. doi: 10.1016/s0022-2836(03)00888-x.
  3. Kryshtafovych A, Fidelis K, Moult J. Proteins. 2014;82(Suppl 2):164–174. doi: 10.1002/prot.24448.
  4. Lange OF, Baker D. Proteins. 2012;80:884–895. doi: 10.1002/prot.23245.
  5. Lange OF, Rossi P, Sgourakis NG, Song Y, Lee HW, Aramini JM, Ertekin A, Xiao R, Acton TB, Montelione GT, Baker D. Proc Natl Acad Sci USA. 2012;109:10873–10878. doi: 10.1073/pnas.1203013109.
  6. Raman S, Lange OF, Rossi P, Tyka M, Wang X, Aramini J, Liu G, Ramelot TA, Eletsky A, Szyperski T, et al. Science. 2010;327:1014–1018. doi: 10.1126/science.1183649.
  7. Rao T, Lubin JW, Armstrong GS, Tucey TM, Lundblad V, Wuttke DS. Proc Natl Acad Sci USA. 2014;111:214–218. doi: 10.1073/pnas.1316453111.
  8. Roy A, Kucukural A, Zhang Y. Nat Protoc. 2010;5:725–738. doi: 10.1038/nprot.2010.5.
  9. Sgourakis NG, Natarajan K, Ying J, Vogeli B, Boyd LF, Margulies DH, Bax A. Structure. 2014;22(this issue):1263–1273. doi: 10.1016/j.str.2014.05.018.
  10. Tugarinov V, Kanelis V, Kay LE. Nat Protoc. 2006;1:749–754. doi: 10.1038/nprot.2006.101.
  11. Warner LR, Varga K, Lange OF, Baker SL, Baker D, Sousa MC, Pardi A. J Mol Biol. 2011;411:83–95. doi: 10.1016/j.jmb.2011.05.022.
