Abstract
Although chaperone‐assisted protein crystallization remains a comparatively rare undertaking, the number of crystal structures of polypeptides fused to maltose‐binding protein (MBP) that have been deposited in the Protein Data Bank (PDB) has grown dramatically during the past decade. Altogether, 102 fusion protein structures were detected by Basic Local Alignment Search Tool (BLAST) analysis. Collectively, these structures comprise a range of sizes, space groups, and resolutions that are typical of the PDB as a whole. While most of these MBP fusion proteins were equipped with short inter‐domain linkers to increase their rigidity, fusion proteins with long linkers have also been crystallized. In some cases, surface entropy reduction mutations in MBP appear to have facilitated the formation of crystals. A comparison of the structures of fused and unfused proteins, where both are available, reveals that MBP‐mediated structural distortions are very rare.
Keywords: chaperone‐assisted crystallization, crystallization chaperone, crystallization tag, maltose‐binding protein, MBP fusion protein, surface entropy reduction mutagenesis
MBP as a Crystallization Chaperone
In 2003, Smyth et al. reviewed the crystal structures of fusion proteins with large affinity tags.1 At that time, the list contained one glutathione S‐transferase (GST), one thioredoxin (TRX), and four maltose‐binding protein (MBP) fusions. Among these six reported fusion protein structures, only three (all of them MBP fusions) were ever deposited in the PDB (1MG1, 1HSJ, 1MH3). Presently, over a decade later, there are still only three TRX (3DXB, 4KCA, 4KCB) and six GST (1B8X, 1BG5, 1DUG, 1GNE, 3QMZ, 4AI6) fusion protein structures in the PDB, but the number of MBP fusion protein structures has skyrocketed to more than 100 (Fig. 1; Table 1). It therefore seems like an opportune time to examine the large number of MBP fusion structures in some detail and see what lessons can be drawn from them.
Figure 1.

Cumulative number of MBP fusion protein structures deposited in the PDB between 1999 and 2015.
Table 1.
Structures of MBP Fusion Proteins
| PDB | Protein | Organism | Length | Linker sequence | RES. | YR. | S.G. | REF. |
|---|---|---|---|---|---|---|---|---|
| 1Y4C | Designed helical protein | NA | 108 | DEALKDAQTNSSSNNNNNNNNNLGIEGRSSEEL | 1.9 | 2004 | P212121 | 6 |
| 4O4B | Inositol hexakisphosphate kinase EhIP6KA | E. histolytica | 239 | DEALKDAQTNSGSDITSLYKKAGSAAAVLEENLYFQGSFTDGQYL | 1.8 | 2013 | P212121 | 16 |
| 1YTV | V1 vasopressin receptor fragment | H. sapiens | 59 | DEALKDAQTNSSSNNNNNNNNNNLGIEENLYFQGSSFPCC | 1.8 | 2005 | P21 | 23 |
| 4JKM | Beta‐glucuronidase | C. perfringens | 599 | DEALKDAQTNSSSNNNNNNNNNNRDLGTENLYFQSNAMLYPI | 2.3 | 2013 | C2221 | 20 |
| 3HST | RNase H domain | M. tuberculosis | 141 | DEALKDAQTNSSSNNNNNNNNNNLGIEGRSVKVV | 2.2 | 2009 | P21 | 36 |
| 3OAI | Myelin protein extracellular domain | H. sapiens | 115 | DEALKDAQTNNNNNNNNNNNNNNNNNNNNIVVYT | 2.1 | 2010 | P21 | 11 |
| 1T0K | L30/RNA complex | S. cerevisiae | 105 | DEALKDAQTNSSSVPGRGSIEGRMAPVK | 3.2 | 2004 | P41212 | 5 |
| 1NMU | L30 | S. cerevisiae | 104 | DEALKDAQTNSSSVPGRGSIEGRAAPVKS | 2.3 | 2003 | P212121 | 3 |
| 3A3C | Tim40 core domain | S. cerevisiae | 73 | DEALKDAQTNSSSVPGRGSIEGRPEFAY | 2.5 | 2009 | P212121 | 27 |
| 4GIZ | E6 | HPV16 | 142 | DAALAAAQTNAAAELTLQELLGEERFQDPQ | 2.5 | 2012 | P212121 | 42 |
| 4IRL | GBP‐NLRP1 CARD domain | D. rerio | 95 | DAALAAAQTNAARAAAASEFVD | 1.5 | 2013 | P212121 | 13 |
| 3MQ9 | Tetherin | H. sapiens | 88 | DEALKDAQTRIT/AARDG | 2.8 | 2010 | P21 | 39 |
| 4GLI | Survival motor neuron protein | H. sapiens | 32 | DEALKDAQTRIT/MLISW | 1.9 | 2012 | C2 | 47 |
| 4PQK | RepA | S. aureus | 120 | DEALKDAQT/HNFDD | 3.4 | 2014 | P1 | 55 |
| 4B3N | TRIM5alpha | M. mulatta | 202 | DEALKDAQTRIT/RRVFR | 3.3 | 2012 | C2 | 45 |
| 3LBS | PRR‐IC domain | H. sapiens | 20 | DEALKDA/ADPGYD | 2.2 | 2010 | P212121 | 10 |
| 3LC8 | PRR‐IC domain | H. sapiens | 20 | DEALKDA/ADPGYD | 2.0 | 2010 | P212121 | 10 |
| 4R0Y | GKAP GH1 domain | R. norvegicus | 140 | DEALAMDGHWFL | 2.0 | 2014 | P21212 | 57 |
| 2OK2 | MutS C‐domain | E. coli | 31 | DEALKDAQTHMEETSP | 2.0 | 2007 | C2 | 7 |
| 4WMS | MCL1 | H. sapiens | 149 | DEALKDAQTGSELYRQ | 1.9 | 2014 | P21212 | 60 |
| 4WMTa | MCL1 | H. sapiens | 149 | DEALKDAQTGSELYRQ | 2.4 | 2014 | P21212 | 60 |
| 4WMUa | MCL1 | H. sapiens | 149 | DEALKDAQTGSELYRQ | 1.6 | 2014 | P21212 | 60 |
| 4WMVb | MCL1 | H. sapiens | 149 | DEALKDAQTGSELYRQ | 2.4 | 2014 | P21212 | 60 |
| 4WMWb | MCL1 | H. sapiens | 149 | DEALKDAQTGSELYRQ | 1.9 | 2014 | P21212 | 60 |
| 4WMXb | MCL1 | H. sapiens | 149 | DEALKDAQTGSELYRQ | 2.0 | 2014 | P21212 | 60 |
| 4WGI | MCL1 | H. sapiens | 149 | DEALKDAQTGSELYRQ | 1.9 | 2014 | P21212 | 59 |
| 3CSG | Monobody | NA | 90 | DEALKDAQTRITKGSSVPTNL | 1.8 | 2008 | P41 | 29 |
| 2OBG | Monobody | NA | 90 | DEALKDAQTRITKGSSVPTNL | 2.3 | 2006 | P41 | 25 |
| 3CSB | Monobody | NA | 91 | DEALKDAQTRITKGSSGSSVPTNL | 2.0 | 2008 | P41212 | 29 |
| 3WAI | Oligosaccharyltransferase C‐domain | A. fulgidus | 368 | DEALKDAQTNKQWYD | 1.9 | 2013 | C2 | 44 |
| 2VGQ | IPS‐1 CARD domain | H. sapiens | 94 | DEALKDAQTNSAMAFA | 2.1 | 2007 | P41212 | 26 |
| 3Q25 | Alpha‐synuclein fragment | H. sapiens | 19 | DEALKDAQTNSSSMDVFM | 1.9 | 2010 | P43212 | 43 |
| 3Q26 | Alpha‐synuclein fragment | H. sapiens | 33 | DEALKDAQTNSSSKAKEG | 1.5 | 2010 | P212121 | 43 |
| 3Q27 | Alpha‐synuclein fragment | H. sapiens | 26 | DEALKDAQTNSSSKTKEG | 1.3 | 2010 | P21 | 43 |
| 3Q28 | Alpha‐synuclein fragment | H. sapiens | 22 | DEALKDAQTNSSSKTKEQ | 1.6 | 2010 | P21 | 43 |
| 3Q29 | Alpha‐synuclein fragment | H. sapiens | 19 | DEALKDAQTNSSSKTKEQ | 2.3 | 2010 | P212121 | 43 |
| 2ZXT | Tim40 core domain | S. cerevisiae | 95 | DEALKDAQTNSSSVPGRG | 3.0 | 2009 | P212121 | 27 |
| 1MG1 | Gp21 ectodomain | HTLV | 84 | DAALAAAQTNAAAMSLAS | 2.5 | 1999 | H3 | 21 |
| 4OZQ | KIF14 | M. musculus | 349 | DAALAAAQTNAAAENSQV | 2.7 | 2014 | P1 | 18 |
| 2XZ3 | BLV TM hairpin | B. taurus | 92 | DAALAAAQTNAAALSHQR | 1.9 | 2011 | H3 | 8 |
| 4H1G | Kar3 motor domain | C. albicans | 344 | DAALAAAQTNAAALKGNI | 2.1 | 2012 | P21 | 48 |
| 3G7V | Amylin | H. sapiens | 37 | DAALAAAQTNAAAKCNTA | 1.9 | 2011 | P21 | 34 |
| 3G7W | Amylin | H. sapiens | 22 | DAALAAAQTNAAAKCNTA | 1.7 | 2009 | P41212 | 34 |
| 2NVU | NEDD8 activating enzyme E1 | H. sapiens | 429 | DAALAAAQTNAAADWEGR | 2.8 | 2006 | P3221 | 24 |
| 3OB4 | Ara h 2 | A. duranensis | 130 | DAALAAAQTNAAARRCQS | 2.7 | 2010 | C2 | – |
| 4EGC | Six1 | H. sapiens | 189 | DAALAAAQTNAAAMSMLP | 2.0 | 2013 | P21212 | 12 |
| 3O3U | RAGE | H. sapiens | 210 | DAALAAAQTNAAAASAQN | 1.5 | 2010 | P21212 | 41 |
| 4KYC | Polymerase foot domain | Menangle virus | 50 | DAALAAAQTNAAASGYKM | 1.9 | 2013 | P43212 | 15 |
| 4KYD | Polymerase foot domain | HPIV4b | 50 | DAALAAAQTNAAADALKI | 2.2 | 2013 | H3 | 15 |
| 4KYE | Polymerase foot domain | HPIV4b | 50 | DAALAAAQTNAAADALKI | 2.6 | 2013 | C2 | 15 |
| 4QVH | Phosphopantetheinyl transferase PptT | M. tuberculosis | 227 | DAALAAAQTNAAAMTVGT | 1.7 | 2014 | P212121 | 56 |
| 3PY7 | E6 | BPV1 | 152 | DAALAAAQTNAAAMDDLD | 2.3 | 2010 | C2221 | 42 |
| 3F5F | 2‐O‐Sulfotransferase | G. gallus | 288 | DAALAAAQTNAAADEEDD | 2.6 | 2008 | P63 | 33 |
| 4NDZ | 2‐O‐Sulfotransferase | G. gallus | 288 | DAALAAAQTNAAADEEDD | 3.4 | 2013 | P212121 | – |
| 4KV3 | EccD1 ubiquitin‐like domain | M. tuberculosis | 90 | DAALAAAQTNAAAMATTR | 2.2 | 2013 | P65 | – |
| 4EDQ | Myosin‐binding protein C | M. musculus | 107 | DAALAAAQTNAAADDPIG | 1.6 | 2012 | P1 | – |
| 3DM0 | Rack1 | A. thaliana | 324 | DAALAAAQTNAAAGLVLK | 2.4 | 2008 | P21 | 31 |
| 3H4Z | Der p 7 | D. pteronyssinus | 190 | DAALAAAQTNAAADPIHY | 2.3 | 2009 | C2 | 9 |
| 4EXK | STM14_2015 | S. enterica | 103 | DAALAAAQTNAAANTPSP | 1.3 | 2012 | P3221 | – |
| 4BL8 | SUFU | H. sapiens | 460 | DAALAAAQTNAAAPGLHA | 3.0 | 2013 | P212121 | 46 |
| 4BL9 | SUFU | H. sapiens | 385 | DAALAAAQTNAAAPGLHA | 2.8 | 2013 | P1 | 46 |
| 4BLA | SUFU | H. sapiens | 385 | DAALAAAQTNAAAPGLHA | 3.5 | 2013 | P21212 | 46 |
| 4BLB | SUFU | H. sapiens | 382 | DAALAAAQTNAAAPGLHA | 2.8 | 2013 | P21 | 46 |
| 4BLD | SUFU | H. sapiens | 382 | DAALAAAQTNAAAPGLHA | 2.8 | 2013 | P21 | 46 |
| 4PE2 | PilA1 CD160 | C. difficile | 149 | DAALAAAQTNAAASNINK | 1.7 | 2014 | I121 | 17 |
| 4TSM | PilA1 CD160 | C. difficile | 149 | DAALAAAQTNAAASNINK | 1.9 | 2014 | C2 | 17 |
| 4OGM | PilA1 CD160 | C. difficile | 149 | DAALAAAQTNAAASNINK | 2.2 | 2014 | P212121 | 17 |
| 3EF7 | ZP3 ZPN domain | M. musculus | 110 | DAALAAAQTNAAAVKVEC | 3.1 | 2008 | P21212 | 30 |
| 3D4G | ZP3 ZPN domain | M. musculus | 110 | DAALAAAQTNAAAVKVEC | 2.3 | 2008 | P1 | 30 |
| 3D4C | ZP3 ZPN domain | M. musculus | 110 | DAALAAAQTNAAAVKVEC | 2.9 | 2008 | I222 | 30 |
| 4KEG | SPLUNC1 | H. sapiens | 196 | DEALAAAQTNAAASPTGL | 2.5 | 2013 | P3221 | – |
| 4N4X | SPLUNC1 | H. sapiens | 214 | DEALAAAQTNAAASPTGL | 2.5 | 2013 | P3221 | – |
| 4O2X | ClpS | P. falciparum | 136 | DEALAAAQTNAAANLEKI | 2.7 | 2013 | P61 | – |
| 4IKM | CARD8 CARD domain | H. sapiens | 92 | DAALAAAQTNAVDAAFVK | 2.5 | 2012 | P6522 | 50 |
| 4IFP | NLRP1 CARD domain | H. sapiens | 84 | DAALAAAQTNAVDLHFVD | 2.0 | 2012 | C2 | 49 |
| 3VD8 | AIM2 PYD domain | H. sapiens | 93 | DAALAAAQTNAVDMESKY | 2.1 | 2012 | P212121 | 51 |
| 4WJV | Nsa2 | S. cerevisiae | 22 | DEALAAAQTNAAAAMDTDG | 3.2 | 2014 | C2 | 19 |
| 1MH3 | MATa1 homeodomain | S. cerevisiae | 50 | DAALAAAQTAAAAAISPQA | 2.1 | 2002 | P43212 | 22 |
| 1MH4 | MATa1 homeodomain | S. cerevisiae | 50 | DAALAAAQTAAAAAISPQA | 2.3 | 2002 | P212121 | 22 |
| 4JBZ | Mcm10 coiled coil | X. laevis | 32 | DAALAAAQTNAAAMGVCQEK | 2.4 | 2013 | C2 | 14 |
| 3MP1 | Sgf29 Tudor domain | S. cerevisiae | 149 | DEALAAAQTNAAAEFGSSYW | 2.6 | 2010 | P212121 | 38 |
| 3MP6 | Sgf29 Tudor domain | S. cerevisiae | 149 | DEALAAAQTNAAAEFGSSYW | 1.5 | 2010 | P212121 | 38 |
| 3MP8 | Sgf29 Tudor domain | S. cerevisiae | 149 | DEALAAAQTNAAAEFGSSYW | 1.9 | 2010 | P212121 | 38 |
| 1HSJ | SarR | S. aureus | 115 | DEALAAAQTNAAAEFMSKIN | 2.3 | 2000 | P1 | 2 |
| 1R6Z | Argonaute 2 PAZ domain | D. melanogaster | 116 | DEALKDAQTNAAAEFVDISH | 2.8 | 2003 | P65 | 4 |
| 4RWF | CALCLR:RAMP2 | H. sapiens | 217 | DEALKDAQTNAAAEFGGTVK | 1.8 | 2014 | P212121 | 58 |
| 4RWG | CALCLR:RAMP1 | H. sapiens | 219 | DEALKDAQTNAAAEFTTACQ | 2.4 | 2014 | C2 | 58 |
| 4MY2 | Norrin | H. sapiens | 103 | DEALKDAQTNAAAEFIMDSD | 2.4 | 2013 | P21212 | 53 |
| 3C4M | PTH1R extracellular domain | H. sapiens | 159 | DEALKDAQTNAAAEFDDVMT | 1.9 | 2008 | P21 | 28 |
| 3L2J | PTH1R extracellular domain | H. sapiens | 159 | DEALKDAQTNAAAEFDDVMT | 3.2 | 2009 | C2221 | 37 |
| 4LOG | NR2E3 ligand binding domain | H. sapiens | 194 | DEALKDAQTNAAAEFLDSIH | 2.7 | 2013 | P21212 | 52 |
| 3N93 | CRFR2ɑ extracellular domain | H. sapiens | 102 | DEALKDAQTNAAAEFAALLH | 2.5 | 2010 | C2221 | – |
| 3N94 | PAC1R extracellular domain | H. sapiens | 95 | DEALKDAQTNAAAEFAIFKK | 1.8 | 2010 | P212121 | 40 |
| 3N95c | CRFR2ɑ extracellular domain | H. sapiens | 102 | DEALKDAQTNAAAEFAALLH | 2.7 | 2010 | P21 | – |
| 3N96c | CRFR2ɑ extracellular domain | H. sapiens | 102 | DEALKDAQTNAAAEFAALLH | 2.7 | 2010 | P21 | – |
| 3EHS | CRFR1 extracellular domain | H. sapiens | 96 | DEALKDAQTNAAAEFSLQDQ | 2.8 | 2008 | P41212 | 32 |
| 3EHT | CRFR1 extracellular domain | H. sapiens | 96 | DEALKDAQTNAAAEFSLQDQ | 3.4 | 2008 | P41212 | 32 |
| 3EHU | CRFR1 extracellular domain | H. sapiens | 96 | DEALKDAQTNAAAEFSLQDQ | 2.8 | 2013 | P1 | 32 |
| 4NUF | SHP | M. musculus | 206 | DEALKDAQTNAAAEFPHRTC | 2.8 | 2013 | P212121 | 54 |
| 4XAI | TLX ligand‐binding domain | T. castaneum | 201 | DEALKDAQTNAAAEFPSAIC | 2.6 | 2014 | P212121 | 61 |
| 4XAJ | TLX ligand‐binding domain | H. sapiens | 202 | DEALKDAQTNAAAEFTESVC | 3.6 | 2014 | P212121 | 61 |
| 3H3G | PTH1R extracellular domain | H. sapiens | 165 | DEALKDAQTNAAAEFDDVMT | 1.9 | 2009 | P41212 | 35 |
NA, not applicable; RES., resolution (Å); YR., year deposited in PDB; S.G., space group; REF., reference.
a Identical MBP fusion proteins co‐crystallized with different ligands.
b Identical MBP fusion proteins soaked with different ligands.
c Identical MBP fusion proteins co‐crystallized with different urocortin 2 peptides.
Why are there so many more structures of MBP fusion proteins in the PDB than of any other variety of fusion protein? In some cases MBP fusion protein structures simply have been presented as faits accomplis, with no backstory whatsoever.2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 In most instances, however, at least a brief description of the difficulties encountered with expression, solubility, stability, or crystallization of the target protein (or a combination of these factors) is provided as a rationale for crystallizing the MBP fusion protein.21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 Remarkably, in only two cases is any mention made of attempts to crystallize proteins using multiple fusion partners. In one instance (1MG1), a GST fusion protein was also constructed, but it proved to be insoluble.21 In the other case, several different N‐terminal fusion partners were tested as crystallization chaperones for the MCL1 protein.60 The lysozyme fusion protein was insoluble. Although the corresponding SUMO, TRX, and MBP fusion proteins were soluble, only the latter yielded crystals (4WMS). It seems likely that the frequent choice of MBP as a fusion partner stems from a combination of its reputation for exerting a favorable impact on the yield, solubility, and stability of its fusion partners62, 63, 64 and its historical precedent as a successful crystallization tag.21, 65
Origin and Diversity of Inter‐Domain Linkers
The list of MBP fusion protein structures in Table 1 is organized according to the nature of their inter‐domain linkers. The linker residues (not part of either protein) are colored red. The first 11 entries are fusion proteins with relatively long linkers. The poly‐asparagine motif originates from the pMal‐C2 MBP fusion vector and its derivatives,66 which are marketed by New England Biolabs. If there is a logical explanation underpinning the choice of poly‐asparagine for a linker, it is by now forgotten and buried deeply in the historical record. The linker in 4O4B exhibits the signature sequence of Gateway recombinational cloning. The origin of the linker sequences in 1T0K, 1NMU, and 3A3C is uncertain. The existence of these 11 fusion protein structures appears to contradict the dogma that conformational flexibility between domains of a fusion protein is inimical to crystallization. That nearly all of the long linkers include a designed site for endoproteolysis (IEGR for Factor Xa or ENLYFQG for TEV protease) suggests that crystallization of these fusion proteins was undertaken as a fallback strategy (perhaps because proteolytic removal of the tag failed) rather than as the primary objective. Not surprisingly, there is no interpretable electron density for most of the linker residues in these structures.
The next six entries in Table 1 are fusion proteins with no inter‐domain linkers, which would be expected to have the greatest degree of rigidity. A slash mark (/) denotes the boundary between MBP and its fusion partners in these cases. There are a variety of ways to create translational fusions with seamless junctions (i.e., without residual cloning artifacts), the most common being overlap extension PCR.67 The C terminus of wild‐type MBP ends with the sequence DAQTRITK,68 which adopts an α‐helical conformation. The amino acid sequences reveal that one or more residues were deleted from the C terminus of MBP during the construction of these fusion proteins, although no rationale was provided for this in the accompanying publications.
The next group of linkers consists of the short sequences N, NS, HM, GS, GSS, AMD, GSSGSS, or NSSS. The latter linker is derived from the NEB vector pMal‐C2, wherein the three consecutive serine residues correspond to a unique SacI restriction site in the plasmid DNA that can be used for cloning.66 Hence, fusion protein expression vectors with the NSSS linker were constructed by ligating a passenger protein PCR product with an in‐frame SacI site at its proximal end. (“Passenger protein” is a term used to describe an open reading frame that is fused to the C terminus of MBP). In the case of the NS linker, the “sticky ends” generated by SacI were removed prior to ligation with a blunt‐ended PCR product.69 It is unclear what purpose is served by the asparagine residue that precedes the three serines; arginine occupies this position in wild‐type MBP. Yet, for whatever reason, the vast majority of the linker sequences in Table 1 include this amino acid substitution. Exceptions include the monobody fusion proteins with GSS (3CSG and 2OBG) or GSSGSS (3CSB) linkers.29
By far the most common inter‐domain linker consists of the four residues NAAA. There are 36 fusion protein structures with this linker sequence along with three amino acid substitutions near the C terminus of MBP (colored blue in Table 1). The three consecutive alanine residues coincide with a unique NotI restriction site in the plasmid DNA (Fig. 2). NotI recognizes the octanucleotide sequence GCGGCCGC. This enables passenger proteins to be joined in frame to the C terminus of MBP using the NotI site, which is added to the proximal end of the insert by PCR. The origin of the three point mutations near the C terminus of MBP can be traced back to the very first MBP fusion protein structure to have been determined (1MG1), the HTLV gp21 ectodomain fused to MBP.21 It was anticipated that the gp21 moiety would be a homotrimer, thus three of the charged residues near the C‐terminus of MBP were changed to alanine in a preemptive effort to avoid electrostatic repulsion. Although gp21 is indeed trimeric, the mutated residues were found not to be located in close proximity to one another in the crystal structure. It seems that this early vector design was subsequently disseminated widely in the structural biology community. There are a few variations on this theme. Several fusion proteins with the NAAA linker lack the proximal amino acid substitution in MBP. In 1MH3 and 1MH4, two of the earliest MBP fusion proteins to be crystallized, five consecutive alanine residues were utilized with a PstI site embedded in them for cloning.22 In the 22 fusion proteins with the linker sequence NAAAEF, the EF sequence corresponds to an EcoRI site (GAA TTC) that was used for cloning (H. E. Xu, Personal Communication), although this is not mentioned in the accompanying publications.28, 32, 35, 37, 40 Similarly, in the three fusion proteins with the linker sequence NAVD, the VD dipeptide probably corresponds to a SalI restriction site (GTC GAC), but this could not be verified from the published information.49, 50, 51
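The correspondence between restriction sites and linker residues described above can be checked by translating each site in the reading frame given in the text. The following is a minimal sketch in Python using the standard genetic code; the `translate` helper and the table construction are illustrative and are not taken from any of the cited publications.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# NotI (GCGGCCGC) is 8 bp long. Whichever base completes the third codon,
# GCN encodes alanine, so the site always reads as three alanines in this frame.
for n in BASES:
    assert translate("GCGGCCGC" + n) == "AAA"

print(translate("GAATTC"))  # EcoRI site read as GAA TTC
print(translate("GTCGAC"))  # SalI site read as GTC GAC
```

Read in these frames, the EcoRI and SalI sites encode the EF and VD dipeptides, respectively, which is consistent with the NAAAEF and NAVD linkers listed in Table 1.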
Figure 2.

Schematic diagram of a popular MBP fusion junction for crystallographic applications. See text for discussion.
Of course it would be useful to know if there is a particular linker or linkers that are most likely to yield useful crystals. Unfortunately, however, a definitive answer to this question is lacking because negative results are rarely reported. Nevertheless, the preponderance of the NAAA linker among the MBP fusion proteins that have been crystallized suggests that this would be a good choice.
Purification of Fusion Proteins
At this time eleven of the MBP fusion protein structures in the PDB have no accompanying publications (Table 1), and therefore no information is available about how these proteins were purified. The majority of the other 91 fusion proteins were purified first by amylose affinity chromatography and then by gel filtration. Consequently, the structures of most MBP fusion proteins in the PDB contain a ligand in the maltose‐binding pocket. In a small number of cases an additional purification step, usually ion exchange chromatography, was also employed.
MBP undergoes a substantial conformational change upon binding to maltose,70 and the two distinct conformations of MBP could give rise to different crystal contacts. For this reason, some fusion proteins were engineered to also include a hexahistidine tag attached either to the N terminus of MBP or the C terminus of the passenger protein. This enabled these fusion proteins to be partially purified by immobilized metal affinity chromatography rather than amylose affinity chromatography (with maltose), allowing crystallization screens to be performed both in the presence and absence of maltose if desired. In at least one case (3WAI), this strategy proved to be crucial, as only the ligand‐free form of the fusion protein could be crystallized.44 In another case, different crystal forms were obtained in the presence and absence of ligand.46 Curiously, however, even when MBP fusion proteins included a polyhistidine tag joined either to the N terminus of MBP (4O4B, 4JKM, 3MQ9, 4B3N, 2OK2, 3CSG, 2VGQ, 4WJV) or to the C‐terminus of the passenger protein (4IRL, 4BL8, 4BL9, 4BLA, 4BLB, 4BLD, 4PE2, 4TSM, 4OGM, 3EF7, 3D4G, 3D4C, 4O2X, 4IKM, 4IFP, 3VD8, 3C4M, 3L2J, 4LOG, 4RWF, 4RWG, 3N93, 3N94, 3N95, 3N96, 3EHS, 3EHT, 3EHU, 3H3G), most of the time amylose affinity chromatography was employed during purification or maltose was added to the protein prior to crystallization anyway. The use of both affinity techniques might facilitate the removal of truncated fusion proteins if the polyhistidine tag is joined to the C‐terminus of the passenger protein because then it would be flanked by N‐ and C‐terminal tags, creating an “affinity sandwich.” It is more difficult to understand why maltose was added to His‐MBP fusion proteins prior to screening for crystals.
Although, as discussed above, crystallization of an MBP fusion protein is generally undertaken as a fallback strategy after conventional methods have failed, Dr. H. Eric Xu of the Van Andel Research Institute has adopted a fusion‐based approach as a primary method for the production of class B G protein‐coupled receptor (GPCR) ligand‐binding domains.28 These (normally) extracellular domains, which often contain disulfide bonds, are produced in the cytosol of Escherichia coli as MBP fusion proteins. MBP maintains its fusion partners in a soluble state62, 64 while co‐expression of the bacterial protein disulfide isomerase DsbC in a trxB/gor strain71 aids in the oxidation and reshuffling of disulfide bonds. In some cases, additional disulfide reshuffling is carried out with partially purified fusion proteins in vitro. The MBP fusion proteins also have C‐terminal polyhistidine tags. Therefore, successive affinity chromatography on nickel and amylose columns ensures that only full‐length fusion proteins are purified. Residual aggregates are removed by gel filtration chromatography. The last 15 structures listed in Table 1 were obtained in this manner.
Phasing
In almost all cases, the initial phases used to solve the structures of MBP fusion proteins were obtained by molecular replacement, using MBP as a search model. The ability to obtain phases in this manner was recognized early on as a potential advantage of using MBP as a crystallization tag.1 At least 10 different MBP structures from the PDB, both ligand‐free and ligand‐bound, have been used as search models for molecular replacement. This approach has worked well because, at 42 kDa, MBP usually accounts for a substantial fraction of the X‐ray scattering atoms in the crystal lattice. In some cases, models of the passenger protein were used in addition to or instead of MBP for phasing. One of the fusion protein structures (4EGC) was phased de novo by anomalous scattering of selenomethionine residues.12 Another as yet unpublished structure (4IKM) contained iodotyrosine but evidently was solved by molecular replacement anyway.
One might expect that as the protein to which MBP is fused becomes larger and larger, the application of molecular replacement would become more difficult. Indeed, this appears to be the case. The structure of 4JKM, the largest MBP fusion protein to be crystallized, was solved by molecular replacement using a homolog of the 599 residue β‐glucuronidase fusion partner as a search model rather than MBP.20 The structure of MBP‐NEDD8 E1, which was crystallized as a multiprotein complex (2NVU), was also phased by molecular replacement without the assistance of MBP.24 The structure of 3WAI, on the other hand, which has a fusion partner of approximately the same size as MBP, was solved by molecular replacement with MBP, although the phases were subsequently improved by anomalous scattering from selenomethionine residues.44 Other MBP fusion protein structures with very large fusion partners (4H1G, 3DM0, 4BL8, 4BL9, 4BLA, 4BLB, 4BLD) were phased by molecular replacement using a combination of MBP and a homolog of the fusion partner as search models.31, 46, 48
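The trend described above can be illustrated with a rough estimate of the fraction of the fusion protein mass contributed by MBP. The sketch below takes the 42 kDa figure for MBP from the text and assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value from the source; the function name is hypothetical. It ignores linkers (unless supplied), ligands, and any additional chains in the crystal, so the numbers are approximate.

```python
MBP_MASS_KDA = 42.0           # mass of MBP quoted in the text
AVG_RESIDUE_MASS_KDA = 0.110  # assumption: ~110 Da per residue (rule of thumb)


def mbp_mass_fraction(passenger_length, linker_length=0):
    """Approximate fraction of the fusion protein mass contributed by MBP."""
    other_mass = (passenger_length + linker_length) * AVG_RESIDUE_MASS_KDA
    return MBP_MASS_KDA / (MBP_MASS_KDA + other_mass)


# Passenger lengths taken from Table 1.
for pdb_id, length in [("3Q25", 19), ("3WAI", 368), ("4JKM", 599)]:
    print(pdb_id, round(mbp_mass_fraction(length), 2))
```

Under these assumptions, MBP accounts for well over 90% of the mass when the passenger is a short peptide, about half for 3WAI, and less than 40% for 4JKM, which is in line with the observation that MBP alone becomes a less effective search model as the passenger grows.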
General Characteristics of MBP Fusion Protein Structures
Passenger proteins range in length from 19 to 599 residues (Table 1). About 71% of the passenger proteins are fewer than 150 residues in length, with the greatest number being between 101 and 150 amino acids long [Fig. 3(a)]. It is not known whether large passenger proteins crystallize less readily than smaller ones when fused to MBP or if more attempts have been made to crystallize small fusion proteins than large ones. Some of the shortest passengers (<60 residues) are fragments of larger proteins that may be too small to adopt stably folded conformations.
Figure 3.

General characteristics of MBP fusion proteins. (a) Lengths of passenger proteins in intervals of amino acid residues. (b) Resolution of MBP fusion protein structures in Å (black bars, left ordinate scale) versus the sum total of all unique structures in the PDB (gray squares, right ordinate scale). (c) Percentage of MBP fusion proteins crystallized in each of 16 different space groups (black) and percentage of all unique proteins in the PDB that have been crystallized in the same space groups (gray).
The resolution of MBP fusion protein structures resembles a normal distribution between 1.3 and 3.5 Å with an average slightly higher than 2 Å [Fig. 3(b)]. Sixty‐seven percent of the structures are 2.5 Å resolution or better. The average resolution of all structures in the PDB is slightly higher, but it should be noted that the average molecular weight of the MBP fusion proteins is greater than the average molecular weight of all structures in the PDB. MBP fusion proteins have been crystallized in 16 different space groups. Their relative frequency parallels that of the PDB as a whole, with P212121, P21212, P21, and C2 being the most common [Fig. 3(c)]. The average solvent content of the 102 MBP fusion protein crystals is 54% compared with an average of 50% for all protein crystals in the PDB. The most remarkable thing about these structures is how unremarkable they are: in virtually all respects, they are rather typical of other structures in the PDB.
While the vast majority of MBP fusion protein structures are single polypeptides, a handful of them have been crystallized as multiprotein complexes or in complex with oligopeptides. Examples of the latter type are 4WJV, 4RWF, 4RWG, 3N93, 3N95, 3N96, 3EHT, 3EHU, 4NUF, 4XAI, 4XAJ, and 3H3G. In 4EGC, MBP fused to human Six1 was co‐crystallized with the Eya domain of Eya2.12 The most spectacular example of a multiprotein complex containing an MBP fusion protein to have been crystallized thus far is the ubiquitin‐like protein NEDD8 in complex with its heterodimeric E1 ligase.24 In this complex, the catalytic subunit of the E1 ligase is fused to MBP. There is also one structure of an MBP fusion protein in complex with an RNA molecule (1T0K): Saccharomyces cerevisiae ribosomal protein L30 with a fragment of rRNA.5
Surface Entropy Reduction Mutants
A protein that fails to crystallize can sometimes be induced to do so by replacing clusters of long, flexible amino acid side chains on its surface with alanine residues. This general technique, termed surface entropy reduction mutagenesis (SERM), was originally described by Derewenda and his colleagues.72 The Pedersen laboratory was the first to combine the use of MBP as a crystallization chaperone with the SERM technique.65 Nineteen of the MBP fusion protein structures deposited in the PDB include surface entropy reduction mutations in MBP (Table 2). The amino acid substitutions are D83A/K84A, E173A/N174A, K240A, or a combination of these. In 14 of these 19 structures, one or more of the mutated residues participates in crystal contacts (being closer than 4.5 Å apart73), as has frequently been observed in other instances of SERM.74 A detailed analysis of the crystal contacts mediated by the mutated residues in these structures failed to reveal any conserved interactions between MBP molecules (data not shown). Intuitively, and in the absence of any evidence to the contrary, it seems that the most promising strategy may be to utilize the MBP variant in which all of the surface entropy reduction mutations are present simultaneously. The introduction of additional mutations might further increase the probability of obtaining crystals.
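The distance criterion used above to define a crystal contact can be expressed compactly. The following Python sketch assumes that the atomic coordinates of the mutated residue and of a neighboring, symmetry‐related molecule are already available as (x, y, z) tuples in Å; generating the symmetry mates is outside its scope, and the function names are hypothetical rather than those of any particular crystallographic package.

```python
import math

CONTACT_CUTOFF = 4.5  # Å, the threshold quoted in the text


def min_distance(atoms_a, atoms_b):
    """Shortest interatomic distance between two sets of (x, y, z) coordinates."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def is_crystal_contact(residue_atoms, neighbor_atoms, cutoff=CONTACT_CUTOFF):
    """True if any atom of the residue lies within the cutoff of the neighbor."""
    return min_distance(residue_atoms, neighbor_atoms) < cutoff


# Hypothetical coordinates, for illustration only.
residue = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
print(is_crystal_contact(residue, [(5.0, 0.0, 0.0)]))  # nearest atoms 3.5 Å apart
print(is_crystal_contact(residue, [(8.8, 0.0, 0.0)]))  # nearest atoms 7.3 Å apart
```

With this definition, a pair of residues separated by 7.3 Å, as noted for 3VD8 in Table 2, would not be counted as a contact.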
Table 2.
Surface Entropy Reduction Mutations in MBP and Their Participation in Crystal Contacts
| PDB code | D83A/K84A | E173A/N174A | K240A | Reference |
|---|---|---|---|---|
| 3DM0 | ✓ | ✓ (+) | – | |
| 3H4Z | ✓ (+) | ✓ (+) | ✓ (+) | 9 |
| 4EXK | ✓ (+) | ✓ (+) | ✓ (+) | – |
| 4IKM | ✓ | ✓ | ✓ | 49 |
| 4IFP | ✓ (+) | ✓ | ✓ (+) | 48 |
| 3VD8 | ✓* | ✓* | ✓ | 50 |
| 4IRL | ✓ (+) | ✓ | ✓ (+) | 13 |
| 4JBZ | ✓ | ✓ (+) | ✓ | 14 |
| 3OB4 | ✓ | ✓ | ✓ (+) | – |
| 4EGC | ✓ (+) | 12 | ||
| 4KYC | ✓ | 15 | ||
| 3PY7 | ✓ | ✓ | 41 | |
| 4QVH | ✓ (+) | 55 | ||
| 4WJV | ✓ | ✓ (+) | 19 | |
| 4OGM | ✓ (+) | ✓ | ✓ | 17 |
| 4TSM | ✓ | ✓ (+) | ✓ | 17 |
| 4KYE | ✓ (+) | 15 | ||
| 4KYD | ✓ (+) | 15 | ||
| 4GIZ | ✓ | ✓ | 41 |
✓ Signifies the presence of the indicated surface entropy reduction mutations in MBP. (+) Indicates participation of the mutated residue(s) in one or more crystal contacts (closer than 4.5 Å). *N174A is close to K84A in an adjacent molecule (7.3 Å). Although not close enough to constitute a crystal contact, the original side chains probably would have clashed in this packing arrangement.
Impact of MBP on the Structure of its Fusion Partners
Perhaps the greatest concern about the crystallization chaperone approach in general is that the chaperone, MBP in this case, may distort or otherwise interfere with the conformation of its fusion partner. In other words, there may be significant structural differences between the fused and unfused forms of a protein. In 24 cases, it was possible to compare the structures of unfused proteins solved by X‐ray crystallography or NMR with those of the corresponding MBP fusions to assess how frequently such differences occur. All pairs of protein structures that have been determined in both fused and unfused states were aligned using PDBeFold75 and PyMOL76 and the results are summarized in Table 3.
Table 3.
Structural Alignments of Fused and Unfused Proteins
| MBP fusion | Unfused | R.M.S.D. (Å) | Matches |
|---|---|---|---|
| 4O4B | 4O4C | 0.74 | 233/239 |
| 3OAI | 1NEU | 0.94 | 113/115 |
| 4B3N | 2LM3a | 0.63 | 175/202b |
| 2OK2 | 3ZLJ | 0.93 | 31/31 |
| 3WAI | 3WAJ | 0.52 | 294/368b, c |
| 2VGQ | 4P4H | 0.67 | 93/94 |
| 1NMU | 1CK2a | 1.65 | 100/104 |
| 3G7V | 2KB8a | 3.97d | 25/37b |
| 2NVU | 3GZN | 0.65 | 338/429e |
| 3O3U | 4LP4 | 1.05 | 209/210 |
| 4KV3 | 4KV2 | 0.55 | 90/90 |
| 4EDQ | 3CX2 | 0.84 | 105/107 |
| 3H4Z | 3UV1 | 0.95 | 187/190 |
| 4EXK | 2LV4a | 1.43 | 103/103 |
| 4BL8 | 4KM9 | 1.30 | 354/371b, f |
| 4IFP | 3KAT | 0.71 | 82/84 |
| 3VD8 | 4O7Q | 1.03 | 91/93 |
| 4KEG | 4KGO | 1.20 | 160/196b |
| 1R6Z | 1VYNa | 1.71 | 112/116 |
| 4QVH | 4QJK | 0.68 | 224/227 |
| 4WMS | 4WMR | 1.17 | 145/149 |
| 4WMTg | 4WMR | 0.82 | 148/149 |
| 4WMUg | 4WMR | 0.92 | 148/149 |
| 4N4X | 4KGH | 1.01 | 157/197b |
a. NMR structure.
b. Missing electron density for segment(s) of the passenger protein.
c. Conformational shift in the position of N‐terminal domains, but N‐terminal domains align very well.
d. Both of these polypeptides adopt an α‐helical conformation, but the helices do not align well.
e. Conformational shift in the position of C‐terminal domains, but C‐terminal domains align very well.
f. C‐termini do not align well.
g. These fusion proteins are identical to 4WMS but were co‐crystallized with two different small molecules.
The average r.m.s.d. for C‐α atoms between the fused passenger proteins and their unfused counterparts is approximately 1 Å in most cases, indicating very close agreement between their structures. Minor differences between the conformations of fused and unfused proteins occur most frequently at the ends of the passenger proteins and less frequently in the loops between secondary structure elements. Not unexpectedly, larger r.m.s.d. values are usually observed when the unfused structure was determined by NMR (e.g., 1CK2, 2KB8, 2LV4, 1VYN), but in none of these cases are the structures radically different.
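The trend described above can be checked directly against the values in Table 3. The sketch below is only an illustrative tabulation, not part of the original analysis. It lists each of the 24 pairs once, flags the pairs whose unfused structure was determined by NMR, and compares simple summary statistics. The variable and function names are hypothetical.

```python
from statistics import mean, median

# (MBP fusion, unfused, r.m.s.d. in Å, unfused structure determined by NMR?)
TABLE_3 = [
    ("4O4B", "4O4C", 0.74, False), ("3OAI", "1NEU", 0.94, False),
    ("4B3N", "2LM3", 0.63, True),  ("2OK2", "3ZLJ", 0.93, False),
    ("3WAI", "3WAJ", 0.52, False), ("2VGQ", "4P4H", 0.67, False),
    ("1NMU", "1CK2", 1.65, True),  ("3G7V", "2KB8", 3.97, True),
    ("2NVU", "3GZN", 0.65, False), ("3O3U", "4LP4", 1.05, False),
    ("4KV3", "4KV2", 0.55, False), ("4EDQ", "3CX2", 0.84, False),
    ("3H4Z", "3UV1", 0.95, False), ("4EXK", "2LV4", 1.43, True),
    ("4BL8", "4KM9", 1.30, False), ("4IFP", "3KAT", 0.71, False),
    ("3VD8", "4O7Q", 1.03, False), ("4KEG", "4KGO", 1.20, False),
    ("1R6Z", "1VYN", 1.71, True),  ("4QVH", "4QJK", 0.68, False),
    ("4WMS", "4WMR", 1.17, False), ("4WMT", "4WMR", 0.82, False),
    ("4WMU", "4WMR", 0.92, False), ("4N4X", "4KGH", 1.01, False),
]


def summarize(rows):
    """Return count, mean, median, and maximum r.m.s.d. for a list of rows."""
    values = [row[2] for row in rows]
    return {"n": len(values), "mean": mean(values),
            "median": median(values), "max": max(values)}


xray = [row for row in TABLE_3 if not row[3]]
nmr = [row for row in TABLE_3 if row[3]]

for label, rows in (("all pairs", TABLE_3), ("X-ray", xray), ("NMR", nmr)):
    s = summarize(rows)
    print(f"{label:9s} n={s['n']:2d} mean={s['mean']:.2f} "
          f"median={s['median']:.2f} max={s['max']:.2f}")
```

Because only five of the pairs involve an NMR structure, and one of them (3G7V/2KB8) is a short helical peptide with a much larger r.m.s.d. than the rest, the NMR mean should be read as a rough indication rather than a robust estimate.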
Large conformational shifts occur in only two pairs of fused and unfused structures. The alignment between 3WAI and 3WAJ returns an average r.m.s.d. for C‐α atoms of just 0.52 Å over 294 of 368 residues. The non‐matching residues correspond to approximately 70 residues at the N terminus of the polypeptide that is fused to MBP, which is only a fragment of the full‐length Archaeoglobus fulgidus AglB enzyme. In the structure of unfused, full‐length AglB (3WAJ), by contrast, these 70 residues are preceded by a large α‐helical N‐terminal domain and are sandwiched between it and the remainder of the protein, effectively locking them into the observed conformation.77 The N‐terminal α‐helical domain is absent from the truncated AglB that is fused to MBP, so it is unable to constrain the movement of the 70 N‐terminal residues of this AglB fragment. Additionally, in the structure of the MBP fusion protein, the C‐terminal α‐helix of MBP forms a contiguous, extended helix with the N‐terminal segment of the AglB fragment, which may contribute to its differing orientation in the fusion protein. The conformational shift of this “mini domain” can be seen when the fragment of AglB from the MBP fusion protein structure is aligned with the corresponding residues from the structure of the full‐length protein [Fig. 4(a)]. However, the structures of the two mini domains align very well [Fig. 4(b)], indicating that although their relative positions are different in 3WAI and 3WAJ, their folds are unchanged. This is the only known case in which there is some evidence to suggest that MBP may have been at least partially responsible for a structural distortion (rearrangement) of its fusion partner.
Figure 4.

Conformational shifts observed in fused and unfused structures. (a) Structural alignment of 3WAI (magenta) and 3WAJ (blue). (b) Structural alignment of N‐terminal “mini domains” of AglB in 3WAI (magenta) and 3WAJ (blue). (c) Structural alignment of MBP‐UBA3 from 2NVU (magenta) with unfused UBA3 from 3GZN (blue). (d) Structural alignment of the ubiquitin fold domains from fused (magenta) and unfused (blue) UBA3.
The alignment between the relevant polypeptide chains in 2NVU and 3GZN yields an average r.m.s.d. of 0.65 Å for matching residues (338/429). This case is complicated by the fact that both the fused and unfused structures have been crystallized as multiprotein complexes. The complexes are composed of the ubiquitin‐like protein NEDD8 in complex with its heterodimeric E1 activating enzyme, which consists of a regulatory domain (APPBP1) and a catalytic domain (UBA3). In one complex (2NVU), UBA3 is fused to the C terminus of MBP.24 The non‐matching residues in the UBA3 alignment can be ascribed to a substantial conformational shift in the orientation of its C‐terminal domain (also known as the ubiquitin fold domain or UBF) [Fig. 4(c)], yet the fold of this domain is the same in the two molecules [Fig. 4(d)]. An additional protein, the E2 enzyme Ubc12, is present in the complex with unfused UBA3 (3GZN).77 According to Huang et al.,24 the conformational shift of the UBA3 UBF domain is required to create the binding surface for Ubc12. Moreover, in the 2NVU structure, all crystal contacts involving MBP are with regions of APPBP1, UBA3, and NEDD8 that have identical conformations in previous structures that do not include MBP.24, 78 Therefore, this structural rearrangement is not caused by MBP.
In summary, based on the examination of 24 pairs of fused and unfused structures, it seems that no major structural derangements have occurred entirely as a consequence of fusion to MBP. This does not preclude their existence, however. MBP‐induced structural distortions may occur in some cases, but the affected fusion proteins may fail at the expression, purification, or crystallization stage and so never be observed.
One might also wonder if the structure of MBP is altered by its fusion partners. When structural alignments of the MBP domains from all 105 fusion protein structures were performed with high resolution structures of unfused MBP (either 1ANF for the ligand‐bound conformation or 1LLF for the unliganded protein), no significant structural deviations were detected (data not shown), except for 4JKM wherein the N‐terminal residues of MBP may have been traced incorrectly due to poor electron density in this region.
Future Directions
The growing number of MBP fusion protein structures in the PDB is evidence that the crystallization chaperone strategy is being utilized more frequently, and we can expect that the number of such structures will continue to increase. As shown above, there is little evidence that MBP alters the structure of its fusion partners. MBP appears to be a generally effective crystallization chaperone, but other proteins may be equally or even more effective. Thus far, however, very little effort has been made to identify alternatives to MBP. Additionally, the repertoire of tools for chaperone‐assisted crystallization might be further expanded by employing proteins that have been selected to bind tightly to MBP, such as monobodies,25, 29 Fab fragments,79, 80 and Designed Ankyrin Repeat Proteins (DARPins),81 for co‐crystallization with MBP fusion proteins. This approach may increase the probability of obtaining crystals without the need to express and purify more than one MBP fusion protein.
Methods
Fusion protein structures were identified by using the Basic Local Alignment Search Tool (BLAST)82 server to query the PDB with the amino acid sequences of E. coli MBP,68 E. coli TRX,83 and Schistosoma japonicum GST.84 The matches were then inspected manually to identify the fusion proteins. Crystal contacts involving surface entropy reduction mutations in MBP were identified by manual inspection using the “measure” tool in PyMOL.76 Structural alignments were performed with the PDBeFold server75 and PyMOL76 after removing irrelevant atoms from the PDB files. Data from the Protein Data Bank (PDB) that were used in this analysis were extracted in August 2015.
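For readers unfamiliar with the metric reported in Table 3, the r.m.s.d. over matched C‐α atoms is the square root of the mean squared distance between paired atoms after superposition. The sketch below illustrates only that final calculation. It assumes the two coordinate sets have already been superposed and paired residue by residue (the step that PDBeFold and PyMOL perform), and the function name and coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """R.m.s.d. (Å) between two equal-length lists of superposed C-alpha coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical matched C-alpha positions (Å): the second set is the first
# set displaced by exactly 1 Å along the y axis.
fused = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
unfused = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

print(f"{ca_rmsd(fused, unfused):.2f} Å over {len(fused)} matched residues")
```

Residues that cannot be paired, for example those lacking electron density, are excluded before this calculation, which is why the number of matched residues in Table 3 is reported alongside each r.m.s.d. value.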
Acknowledgments
The author wishes to thank Alexander Wlodawer and Danielle Needle for critical feedback on the manuscript. The author is also indebted to Dr. George Lountos, and to Drs. Marek Grabowski, Wladek Minor and Zbigniew Dauter, for their help with the calculation of the solvent content of MBP fusion protein crystals and of protein crystals in the PDB as a whole, respectively.
The author declares that he has no competing interests to disclose.
References
- 1. Smyth DR, Mrozkiewicz MK, McGrath WJ, Listwan P, Kobe B (2003) Crystal structures of fusion proteins with large affinity tags. Protein Sci 12:1313–1322.
- 2. Liu Y, Manna A, Li R, Martin WE, Murphy RC, Cheung AL, Zhang G (2001) Crystal structure of the SarR protein from Staphylococcus aureus. Proc Natl Acad Sci USA 98:6877–6882.
- 3. Chao JA, Prasad GS, White SA, Stout CD, Williamson JR (2003) Inherent protein structural flexibility at the RNA‐binding interface of L30e. J Mol Biol 326:999–1004.
- 4. Song JJ, Liu J, Tolia NH, Schneiderman J, Smith SK, Martienssen RA, Hannon GJ, Joshua‐Tor L (2003) The crystal structure of the argonaute2 PAZ domain reveals an RNA binding motif in RNAi effector complexes. Nat Struct Biol 10:1026–1032.
- 5. Chao JA, Williamson JR (2004) Joint X‐ray and NMR refinement of the yeast L30e‐mRNA complex. Structure 12:1165–1176.
- 6. LaPorte SL, Forsyth CM, Cunningham BC, Miercke LJ, Akhavan D, Stroud RM (2005) De novo design of an IL‐4 antagonist and its structure at 1.9 Å. Proc Natl Acad Sci USA 102:1889–1894.
- 7. Mendillo ML, Putnam CD, Kolodner RD (2007) Escherichia coli MutS tetramerization domain structure reveals that stable dimers but not tetramers are essential for DNA mismatch repair in vivo. J Biol Chem 282:16345–16354.
- 8. Lamb D, Schüttelkopf AW, van Aalten DMF, Brighty DW (2011) Charge‐surrounded pockets and electrostatic interactions with small ions modulate the activity of retroviral fusion proteins. PLoS Pathog 7:e1001268.
- 9. Mueller GA, Edwards LL, Aloor JJ, Fessler MB, Glesner J, Pomés A, Chapman MD, London RE, Pedersen LC (2010) The structure of the dust mite allergen Der p 7 reveals similarities to innate immune proteins. J Allergy Clin Immunol 125:909–917.
- 10. Zhang Y, Gao X, Garavito RM (2011) Structural analysis of the intracellular domain of (pro)renin receptor fused to maltose‐binding protein. Biochem Biophys Res Commun 407:674–679.
- 11. Liu Z, Wang Y, Yedidi RS, Brunzelle JS, Kovari IA, Sohi J, Kamholz J, Kovari LC (2012) Crystal structure of the extracellular domain of human myelin protein zero. Proteins 80:307–313.
- 12. Patrick AN, Cabrera JH, Smith AL, Chen XS, Ford HL, Zhao R (2013) Structure‐function analyses of the human SIX1‐EYA2 complex reveals insights into metastasis and BOR syndrome. Nat Struct Mol Biol 20:447–453.
- 13. Jin T, Huang M, Smith P, Jiang J, Xiao S (2013) Structure of the caspase‐recruitment domain from a zebrafish guanylate‐binding protein. Acta Cryst F69:855–860.
- 14. Du W, Josephrajan A, Adhikary S, Bowles T, Bielinsky AK, Eichman BF (2013) Mcm10 self‐association is mediated by an N‐terminal coiled‐coil domain. PLoS ONE 8:e70518.
- 15. Yegambaram K, Bulloch EMM, Kingston RL (2013) Protein domain definition should allow for conditional disorder. Protein Sci 22:1502–1518.
- 16. Wang H, DeRose EF, London RE, Shears SB (2014) IP6K structure and the molecular determinants of catalytic specificity in an inositol phosphate kinase family. Nat Commun 5:4178.
- 17. Piepenbrink KH, Maldarelli GA, Martinez de la Peña CF, Dingle TC, Mulvey GL, Lee A, von Rosenvinge E, Armstrong GD, Donnenberg MS, Sundberg EJ (2015) Structural and evolutionary analyses show unique stabilization strategies in the type IV pili of Clostridium difficile. Structure 23:385–396.
- 18. Arora K, Talje L, Asenjo AB, Andersen P, Atchia K, Joshi M, Sosa H, Allingham JS, Kwok BH (2014) KIF14 binds tightly to microtubules and adopts a rigor‐like conformation. J Mol Biol 426:2997–3015.
- 19. Baßler J, Paternoga H, Holdermann I, Thoms M, Granneman S, Barrio‐Garcia C, Nyarko A, Lee W, Stier G, Clark SA, Schraivogel D, Kallas M, Beckmann R, Tollervey D, Barbar E, Sinning I, Hurt E (2014) A network of assembly factors is involved in remodeling rRNA elements during preribosome maturation. J Cell Biol 207:481–498.
- 20. Wallace BD, Roberts AB, Pollet RM, Ingle JD, Biernat KA, Pellock SJ, Venkatesh MK, Guthrie L, O'Neal SK, Robinson SJ, Dollinger M, Figueroa E, McShane SR, Cohen RD, Jin J, Frye SV, Zamboni WC, Pepe‐Ranney C, Mani S, Kelly L, Redinbo MR (2015) Structure and inhibition of microbiome β‐glucuronidases essential to the alleviation of cancer drug toxicity. Chem Biol 22:1238–1249.
- 21. Kobe B, Center RJ, Kemp BE, Poumbourios P (1999) Crystal structure of human T cell leukemia virus type 1 gp21 ectodomain crystallized as a maltose‐binding protein chimera reveals structural evolution of retroviral transmembrane proteins. Proc Natl Acad Sci USA 96:4319–4324.
- 22. Ke A, Wolberger C (2003) Insights into binding cooperativity of MATa1/MATα2 from the crystal structure of a MATa1 homeodomain‐maltose binding protein chimera. Protein Sci 12:306–312.
- 23. Adikesavan NV, Mahmood SS, Stanley N, Xu Z, Wu N, Thibonnier M, Shoham M (2005) A C‐terminal segment of the V1R vasopressin receptor is unstructured in the crystal structure of its chimera with maltose‐binding protein. Acta Cryst F61:341–345.
- 24. Huang DT, Hunt HW, Zhuang M, Ohi MD, Holton JM, Schulman BA (2007) Basis for a ubiquitin‐like protein thioester switch toggling E1‐E2 affinity. Nature 445:394–398.
- 25. Koide A, Gilbreth RN, Esaki K, Tereshko V, Koide S (2007) High‐affinity single‐domain binding proteins with a binary‐code interface. Proc Natl Acad Sci USA 104:6632–6637.
- 26. Potter JA, Randall RE, Taylor GL (2008) Crystal structure of human IPS‐1/MAVS/VISA/Cardif caspase activation recruitment domain. BMC Struct Biol 8:11.
- 27. Kawano S, Yamano K, Naoé M, Momose T, Terao K, Nishikawa S, Watanabe N, Endo T (2009) Structural basis of yeast Tim40/Mia40 as an oxidative translocator in the mitochondrial intermembrane space. Proc Natl Acad Sci USA 106:14403–14407.
- 28. Pioszak AA, Xu HE (2008) Molecular recognition of parathyroid hormone by its G protein‐coupled receptor. Proc Natl Acad Sci USA 105:5034–5039.
- 29. Gilbreth RN, Esaki K, Koide A, Sidhu SS, Koide S (2008) A dominant conformational role for amino acid diversity in minimalist protein‐protein interfaces. J Mol Biol 381:407–418.
- 30. Monné M, Han L, Schwend T, Burendahl S, Jovine L (2008) Crystal structure of the ZP‐N domain of ZP3 reveals the core fold of animal egg coats. Nature 456:653–657.
- 31. Ullah H, Scappini EL, Moon AF, Williams LV, Armstrong DL, Pedersen LC (2008) Structure of a signal transduction regulator, RACK1, from Arabidopsis thaliana. Protein Sci 17:1771–1780.
- 32. Pioszak AA, Parker NR, Suino‐Powell K, Xu HE (2008) Molecular recognition of corticotropin‐releasing factor by its G‐protein coupled receptor CRFR1. J Biol Chem 283:32900–32912.
- 33. Bethea HN, Xu D, Liu J, Pedersen LC (2008) Redirecting the substrate specificity of heparan sulfate 2‐O‐sulfotransferase by structurally guided mutagenesis. Proc Natl Acad Sci USA 105:18724–18729.
- 34. Wiltzius JJW, Sievers SA, Sawaya MR, Eisenberg D (2009) Atomic structures of IAPP (amylin) fusions suggest a mechanism for fibrillation and the role of insulin in the process. Protein Sci 18:1521–1530.
- 35. Pioszak AA, Parker NR, Gardella TJ, Xu HE (2009) Structural basis for parathyroid hormone‐related protein binding to the parathyroid hormone receptor and design of conformation‐selective peptides. J Biol Chem 284:28382–28391.
- 36. Watkins HA, Baker EN (2010) Structural and functional characterization of an RNase HI domain from the bifunctional protein Rv2228c from Mycobacterium tuberculosis. J Bacteriol 192:2878–2886.
- 37. Pioszak AA, Harikumar KG, Parker NR, Xu HE (2010) Dimeric arrangement of the parathyroid hormone receptor and a structural mechanism for ligand‐induced dissociation. J Biol Chem 285:12435–12444.
- 38. Bian C, Xu C, Ruan J, Lee KK, Burke TL, Tempel W, Barsyte D, Li J, Wu M, Zhou BO, Fleharty BE, Paulson A, Allali‐Hassani A, Zhou JQ, Grant PA, Workman JL, Zang J, Min J (2011) Sgf29 binds histone H3K4me2/3 and is required for SAGA complex recruitment and histone H3 acetylation. EMBO J 30:2829–2842.
- 39. Yang H, Wang J, Jia X, McNatt MW, Zang T, Pan B, Meng W, Wang HW, Bieniasz PD, Xiong Y (2010) Structural insight into the mechanisms of enveloped virus tethering by tetherin. Proc Natl Acad Sci USA 107:18428–18432.
- 40. Kumar S, Pioszak A, Zhang C, Swaminathan K, Xu HE (2011) Crystal structure of the PAC1R extracellular domain unifies a consensus fold for hormone recognition by class B G‐protein coupled receptors. PLoS ONE 6:e19682.
- 41. Park H, Boyington JC (2010) The 1.5 Å crystal structure of human receptor for advanced glycation endproducts (RAGE) ectodomains reveals unique features determining ligand binding. J Biol Chem 285:40762–40770.
- 42. Zanier K, Charbonnier S, Sidi AO, McEwen AG, Ferrario MG, Poussin‐Courmontagne P, Cura V, Brimer N, Babah KO, Ansari T, Muller I, Stote RH, Cavarelli J, Vande Pol S, Travé G (2013) Structural basis for hijacking of cellular LxxLL motifs by papillomavirus E6 oncoproteins. Science 339:694–698.
- 43. Zhao M, Cascio D, Sawaya MR, Eisenberg D (2011) Structures of segments of α‐synuclein fused to maltose‐binding protein suggest intermediate states during amyloid formation. Protein Sci 20:996–1004.
- 44. Matsumoto S, Shimada A, Kohda D (2013) Crystal structure of the C‐terminal globular domain of the third paralog of the Archaeoglobus fulgidus oligosaccharyltransferases. BMC Struct Biol 13:11.
- 45. Yang H, Ji X, Zhao G, Ning J, Zhao Q, Aiken C, Gronenborn AM, Zhang P, Xiong Y (2012) Structural insight into HIV‐1 capsid recognition by rhesus TRIM5α. Proc Natl Acad Sci USA 109:18372–18377.
- 46. Cherry AL, Finta C, Karlström M, Jin Q, Schwend T, Astorga‐Wells J, Zubarev RA, Del Campo M, Criswell AR, de Sanctis D, Jovine L, Toftgård R (2013) Structural basis of SUFU‐GLI interaction in human hedgehog signaling regulation. Acta Cryst D69:2563–2579.
- 47. Martin R, Gupta K, Ninan NS, Perry K, Van Duyne GD (2012) The survival motor neuron protein forms soluble glycine zipper oligomers. Structure 20:1929–1939.
- 48. Delorme C, Joshi M, Allingham JS (2012) Crystal structure of the Candida albicans Kar3 kinesin motor domain fused to maltose‐binding protein. Biochem Biophys Res Commun 428:427–432.
- 49. Jin T, Curry J, Smith P, Jiang J, Xiao TS (2013) Structure of the NLRP1 caspase recruitment domain suggests potential mechanisms for its association with procaspase‐1. Proteins 81:1266–1270.
- 50. Jin T, Huang M, Smith P, Jiang J, Xiao TS (2013) The structure of the CARD8 caspase‐recruitment domain suggests its association with the FIIND domain and procaspases through adjacent surfaces. Acta Cryst F69:482–487.
- 51. Jin T, Perry A, Smith P, Jiang J, Xiao TS (2013) Structure of the absent in melanoma 2 (AIM2) pyrin domain provides insights into the mechanisms of AIM2 autoinhibition and inflammasome assembly. J Biol Chem 288:13225–13235.
- 52. Tan MH, Zhou XE, Soon FF, Li X, Li J, Yong EL, Melcher K, Xu HE (2013) The crystal structure of the orphan nuclear receptor NR2E3/PNR ligand binding domain reveals a dimeric auto‐repressed conformation. PLoS ONE 8:e74359.
- 53. Ke J, Harikumar KG, Erice C, Chen C, Gu X, Wang L, Parker N, Cheng Z, Xu W, Williams BO, Melcher K, Miller LJ, Xu HE (2013) Structure and function of Norrin in assembly and activation of Frizzled4‐Lrp5/6 complex. Genes Dev 27:2305–2319.
- 54. Zhi X, Zhou XE, He Y, Zechner C, Suino‐Powell KM, Kliewer SA, Melcher K, Mangelsdorf DJ, Xu HE (2014) Structural insights into gene repression by the orphan nuclear receptor SHP. Proc Natl Acad Sci USA 111:839–844.
- 55. Schumacher MA, Tonthat NK, Kwong SM, Chinnam NB, Liu MA, Skurray RA, Firth N (2014) Mechanism of staphylococcal multiresistance plasmid replication origin assembly by the RepA protein. Proc Natl Acad Sci USA 111:9121–9126.
- 56. Jung J, Bashiri G, Johnston JM, Brown AS, Ackerley DF, Baker EN (2014) Crystal structure of the essential Mycobacterium tuberculosis phosphopantetheinyl transferase PptT, solved as a fusion protein with maltose binding protein. J Struct Biol 188:274–278.
- 57. Tong J, Yang H, Eom SH, Chun C, Im YJ (2014) Structure of the GH1 domain of guanylate kinase‐associated protein from Rattus norvegicus. Biochem Biophys Res Commun 452:130–135.
- 58. Booe JM, Walker CS, Barwell J, Kuteyi G, Simms J, Jamaluddin MA, Warner ML, Bill RM, Harris PW, Brimble MA, Poyner DR, Hay DL, Pioszak AA (2015) Structural basis for receptor activity‐modifying protein‐dependent selective peptide recognition by a G protein‐coupled receptor. Mol Cell 58:1040–1052.
- 59. Fang C, D'Souza B, Thompson CF, Clifton MC, Fairman JW, Fulroth B, Leed A, McCarren P, Wang L, Wang Y, Feau C, Kaushik VK, Palmer M, Wei G, Golub TR, Hubbard BK, Serrano‐Wu MH (2014) Single diastereomer of a macrolactam core binds specifically to myeloid cell leukemia 1 (MCL1). ACS Med Chem Lett 5:1308–1312.
- 60. Clifton MC, Dranow DM, Leed A, Fulroth B, Fairman JW, Abendroth J, Atkins KA, Wallace E, Fan D, Xu G, Ni ZJ, Daniels D, Van Drie J, Wei G, Burgin AB, Golub TR, Hubbard BK, Serrano‐Wu MH (2015) A maltose‐binding protein fusion construct yields a robust crystallography platform for MCL1. PLoS ONE 10:e0125010.
- 61. Zhi X, Zhou XE, He Y, Searose‐Xu K, Zhang CL, Tsai CC, Melcher K, Xu HE (2015) Structural basis for corepressor assembly by the orphan nuclear receptor TLX. Genes Dev 29:440–450.
- 62. Kapust RB, Waugh DS (1999) Escherichia coli maltose binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci 8:1668–1674.
- 63. Raran‐Kurussi S, Keefe K, Waugh DS (2015) Positional effects of fusion partners on the yield and solubility of MBP fusion proteins. Protein Expr Purif 110:159–164.
- 64. Raran‐Kurussi S, Waugh DS (2012) The ability to enhance the solubility of its fusion partners is an intrinsic property of maltose‐binding protein but their folding is either spontaneous or chaperone‐mediated. PLoS ONE 7:e49589.
- 65. Moon AF, Mueller GA, Zhong X, Pedersen LC (2010) A synergistic approach to protein crystallization: combination of a fixed‐arm carrier with surface entropy reduction. Protein Sci 19:901–913.
- 66. Riggs P (2000) Expression and purification of recombinant proteins by fusion to maltose‐binding protein. Mol Biotechnol 15:51–63.
- 67. Ho SN, Hunt HD, Horton RM, Pullen JK, Pease LR (1989) Site‐directed mutagenesis by overlap extension using the polymerase chain reaction. Gene 77:51–59.
- 68. Duplay P, Bedouelle H, Fowler A, Zabin I, Saurin W, Hofnung M (1984) Sequences of the malE gene and of its product, the maltose‐binding protein of Escherichia coli K12. J Biol Chem 259:10606–10613.
- 69. Wartell RM, Reznikoff WS (1980) Cloning restriction endonuclease fragments with protruding single‐stranded ends. Gene 9:307–319.
- 70. Döring K, Surrey T, Nollert P, Jähnig F (1999) Effects of ligand binding on the internal dynamics of maltose‐binding protein. Eur J Biochem 266:477–483.
- 71. Bessette PH, Aslund F, Beckwith J, Georgiou G (1999) Efficient folding of proteins with multiple disulfide bonds in the Escherichia coli cytoplasm. Proc Natl Acad Sci USA 96:13703–13708.
- 72. Goldschmidt L, Eisenberg D, Derewenda ZS (2014) Salvage or recovery of failed targets by mutagenesis to reduce surface entropy. Methods Mol Biol 1140:201–209.
- 73. Carugo O, Argos P (1997) Protein‐protein crystal‐packing contacts. Protein Sci 6:2261–2263.
- 74. Derewenda ZS (2011) It's all in the crystals…. Acta Cryst D67:243–248.
- 75. Krissinel E, Henrick K (2004) Secondary‐structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Cryst D60:2256–2268.
- 76. The PyMOL Molecular Graphics System, Version 1.7.4, Schrödinger, LLC.
- 77. Matsumoto S, Shimada A, Nyirenda J, Igura M, Kawano Y, Kohda D (2013) Crystal structures of an archaeal oligosaccharyltransferase provide insights into the catalytic cycle of N‐linked protein glycosylation. Proc Natl Acad Sci USA 110:17868–17873.
- 78. Brownell JE, Sintchak MD, Gavin JM, Liao H, Bruzzese FJ, Bump NJ, Soucy TA, Milhollen MA, Yang X, Burkhardt AL, Ma J, Loke HK, Lingaraj T, Wu D, Hamman KB, Spelman JJ, Cullis CA, Langston SP, Vyskocil S, Sells TB, Mallender WD, Visiers I, Li P, Claiborne CF, Rolfe M, Bolen JB, Dick LR (2010) Substrate‐assisted inhibition of ubiquitin‐like protein‐activating enzymes: the NEDD8 E1 inhibitor MLN4924 forms a NEDD8‐AMP mimetic in situ. Mol Cell 37:102–111.
- 79. Rizk SS, Paduch M, Heithaus JH, Duguid EM, Sandstrom A, Kossiakoff AA (2011) Allosteric control of ligand‐binding affinity using engineered conformation‐specific effector proteins. Nat Struct Mol Biol 18:437–442.
- 80. Mukherjee S, Ura M, Hoey RJ, Kossiakoff AA (2015) A new versatile immobilization tag based on the ultra high affinity and reversibility of the calmodulin‐calmodulin binding peptide interaction. J Mol Biol 427:2707–2725.
- 81. Binz HK, Amstutz P, Kohl A, Stumpp MT, Briand C, Forrer P, Grütter MG, Plückthun A (2004) High‐affinity binders selected from designed ankyrin repeat protein libraries. Nat Biotechnol 22:575–582.
- 82. Altschul S, Gish W, Miller W, Myers E, Lipman D (1990) Basic local alignment search tool. J Mol Biol 215:403–410.
- 83. Höög JO, von Bahr‐Lindström H, Josephson S, Wallace BJ, Kushner SR, Jörnvall H, Holmgren A (1984) Nucleotide sequence of the thioredoxin gene from Escherichia coli. Biosci Rep 4:917–923.
- 84. Smith DB, Rubira MR, Simpson RJ, Davern KM, Tiu WU, Board PG, Mitchell GF (1988) Expression of an enzymatically active parasite molecule in Escherichia coli: Schistosoma japonicum glutathione S‐transferase. Mol Biochem Parasitol 27:249–256.
