Abstract
Gaussia luciferase (GLuc) is a small luciferase (18.2 kDa; 168 residues) that is attracting much attention as a reporter protein, but the lack of structural information has hampered further applications. Here, we report the first solution structure of a fully active, recombinant GLuc determined by heteronuclear multidimensional NMR. We obtained a natively folded GLuc by bacterial expression and efficient refolding using a Solubility Enhancement Peptide (SEP) tag. Nearly complete assignments of GLuc’s 1H, 13C and 15N backbone signals were obtained. The GLuc structure was determined using CYANA, which automatically identified over 2500 NOEs, of which > 570 were long-range. GLuc is an all-alpha-helix protein made of nine helices. The region spanning residues 10–18, 36–81 and 96–145, which contains eight of the nine helices, was determined with a Cα-atom RMSD of 1.39 Å ± 0.39 Å. The structure of GLuc is novel and unique. Two homologous sequential repeats form two anti-parallel bundles made of four helices and tied together by three disulfide bonds. The N-terminal helix 1 is surrounded by these four helices. Further, we found a hydrophobic cavity lined by several residues that previous mutational studies had identified as responsible for bioluminescence, and we thus hypothesize that this is a catalytic cavity, where the hydrophobic coelenterazine binds and the bioluminescence reaction takes place.
Subject terms: Solution-state NMR, Oxidoreductases
Introduction
Luciferase (Luc) is a generic term for bioluminescent enzymes that catalyze the oxidation of a substrate, often termed luciferin (ref. 1, pages xix–xxi). Together with GFP, Luc is widely employed as a reporter protein2–4. Gaussia luciferase (GLuc), isolated from the marine copepod Gaussia princeps5, produces bright blue light by oxidizing coelenterazine. GLuc is small and has a molecular mass of 18.2 kDa (excluding the secretion tag). Nonetheless, its bioluminescence intensity is strong (200-fold higher than that of Firefly Luciferase and Renilla Luciferase, the two most widely used luciferases), and it is thus considered a potentially ideal reporter protein6. Attempts to improve or redesign GLuc’s bioluminescence characteristics have included lengthening its luminescence half-life7–9 and red-shifting its emission peak from 480 nm10,11, a wavelength that is absorbed by tissues during in vivo applications12. However, structural information at atomic resolution is still not available, making the redesign process tedious.
GLuc contains 10 cysteines, and previous studies demonstrated that the natively folded GLuc contains five disulfide bonds. The presence of five disulfide bonds increases the risk of misfolding when GLuc is produced in bacteria, resulting in a low yield13. To overcome this misfolding problem, several methods, including fusion with the pelB leader sequence7,14, cell-free systems15 and low-temperature expression16, have been reported, but the yield of natively folded GLuc remained insufficient for high-resolution structural studies. We previously developed a Solubility Enhancement Peptide tag (SEP tag17–19). We showed that attaching a SEP tag containing nine aspartic acids to GLuc’s C-terminus increases the solubility of GLuc, resulting in spontaneous refolding and the formation of native disulfide bonds. Indeed, we obtained nearly 1 mg of soluble and functional GLuc from 200 ml of E. coli culture in Luria–Bertani (LB) medium11,13,20.
Here, we used the SEP-tag-fused GLuc construct to produce a sufficient amount of uniformly 15N- and 13C-labeled GLuc for NMR studies. Heteronuclear multidimensional NMR spectroscopy enabled over 99% of the backbone 1H, 13C, and 15N chemical shifts of GLuc to be assigned. Flexible regions and highly stable regions were identified by 1H–15N heteronuclear NOE21 and H/D exchange experiments22. The three-dimensional structure, calculated using CYANA (ver. 3.98)23, was determined with a backbone (Cα) RMSD of 1.39 Å ± 0.39 Å (excluding residues in the flexible regions).
Results
Expression and purification of GLuc
The natively folded GLuc possesses ten cysteines that form five disulfide bonds, which can easily form incorrectly when the protein is expressed in E. coli and the cysteines are air-oxidized in vitro. Here, we used a SEP-Tag, C9D, which solubilizes the protein during air-oxidization and refolding, thereby increasing the yield of natively folded and active GLuc20. The final yield of GLuc after tag cleavage and two rounds of HPLC purification (Supplementary Fig. 1) was 1.5 mg per liter of M9 minimal medium culture, which was sufficient for NMR analysis. GLuc’s identity and folding were confirmed by, respectively, MALDI-TOF mass spectrometry (15N labeled GLuc, calculated = 19,055.8 Da, experimental = 19,062.5 Da, see Supplementary Fig. 2) and bioluminescence activity measurement. The 15N labeled GLuc’s activity was essentially identical to that given in our previous reports11,13 (Supplementary Fig. 3). To date, the yield of natively folded active GLuc is almost nil when expressed without the C9D tag20, and the solubilization tag was thus essential to obtain the present amount of protein, though it was removed once the protein had folded into its native conformation.
NMR analysis
The 1H–15N HSQC spectrum exhibited dispersed and sharp peaks (Fig. 1), indicating a stable and well-folded structure. Almost all backbone resonances were visible in the heteronuclear NMR experiments, and over 99% of the backbone 1H, 13C and 15N resonances of non-proline residues were unambiguously assigned. C136 was the only residue whose backbone H–N chemical shifts remained unassigned. The broadened signals around residue C136 suggested that the region encompassing the C136/C148 disulfide bond undergoes structural exchange, consistent with the 15N relaxation dispersion of D138 and L140 (Table 1), which would explain why the C136 H–N pair was undetected.
Table 1.
Residue | R2(50 Hz) − R2(1 kHz) (1/s) |
---|---|
V12 | 6.66 |
S16 | 4.84 |
T21 | 2.47 |
D26 | 2.12 |
G28 | 5.60 |
L37 | 2.88 |
A47 | 2.61 |
S61 | 6.93 |
M69 | 4.87 |
K70 | 2.07 |
G75 | 4.78 |
T79 | 6.43 |
T125 | 2.26 |
D138 | 4.46 |
L140 | 4.07 |
T150 | 4.46 |
A152 | 10.85 |
Differences of effective 15N transverse relaxation rates at CPMG rates of 50 Hz and 1 kHz are listed only for peaks that are resolved and whose rate difference is > 2 1/s. Residues that contribute to the central cavity (V12, S16, S61 and T79) were underlined in the original table. A larger difference was observed for residues in the C-terminal flexible region (T150 and A152, indicated by bold letters in the original table), indicative of its structural dynamics. The two-state model analysis of CPMG rates of 50, 100, 150, 200, 250, 300, 400, 500, 600, 800 and 1000 Hz showed that the estimated exchange rate was 2500 ± 800 1/s. Because the residues showing R2 dispersion are spread over several blocks, we judged that further residue-specific analysis using a single exchange rate would be unreliable.
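The relation between an exchange rate and the R2 differences listed in Table 1 can be illustrated with a short sketch. It assumes the fast-exchange (Luz-Meiboom) form of the two-state model, which the text does not specify; only the exchange rate (2500 1/s) and the list of CPMG rates are taken from the text, while `R2_0` and `PHI_EX` are hypothetical values chosen for illustration.

```python
import math

def r2_eff_fast_exchange(nu_cpmg, r2_0, phi_ex, k_ex):
    """Effective R2 (1/s) at CPMG rate nu_cpmg (Hz), Luz-Meiboom fast-exchange form.

    r2_0   : exchange-free transverse relaxation rate (1/s)
    phi_ex : p_a * p_b * delta_omega**2 in (rad/s)^2
    k_ex   : exchange rate (1/s)
    """
    x = k_ex / (4.0 * nu_cpmg)
    return r2_0 + (phi_ex / k_ex) * (1.0 - math.tanh(x) / x)

# CPMG rates and exchange rate as given in the text.
CPMG_RATES_HZ = [50, 100, 150, 200, 250, 300, 400, 500, 600, 800, 1000]
K_EX = 2500.0

# Hypothetical parameters, NOT from the paper.
R2_0 = 15.0
PHI_EX = 20000.0

curve = [r2_eff_fast_exchange(nu, R2_0, PHI_EX, K_EX) for nu in CPMG_RATES_HZ]

# The quantity tabulated in Table 1: R2(50 Hz) - R2(1 kHz).
delta_r2 = curve[0] - curve[-1]
print(f"R2(50 Hz) - R2(1 kHz) = {delta_r2:.2f} 1/s")
```

With these illustrative parameters the difference is about 6.5 1/s, the same order as the larger entries in Table 1. A full analysis would fit `r2_0`, `phi_ex` and `k_ex` to the measured curve for each residue.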
The side-chain atoms were automatically assigned by FLYA24 (a function of CYANA) using the aliphatic atoms identified in the 3D HCCH-TOCSY, 15N- and 13C-edited NOESY spectra. The assignments were confirmed by visual inspection and, when necessary, corrected manually using the NMR spectra viewer and analyzer MagRO25,26. We assigned over 82.4% of the 1H, 13C and 15N atoms of the entire GLuc molecule.
The secondary structure elements were analyzed by TALOS+ using the 1H, 13C and 15N chemical shifts (Fig. 2a). TALOS+ indicated that GLuc contains 36.9% helix and 4.7% sheet, in reasonable agreement with our previous prediction based on the consensus of seven publicly available secondary structure predictors (30% helix and 4% sheet) as well as with the results of our Circular Dichroism (CD) analysis (30% helix and 12% sheet)13. In addition, the locations of the helices calculated by TALOS+ and those from the secondary structure prediction mostly overlapped (Supplementary Fig. 4).
Structure calculation and disulfide bond determination
Since GLuc was a monomer as demonstrated by AUC, all NOEs were used as intramolecular NOEs (Fig. 1). The statistics of NOEs assigned during the 19 CYANA runs are shown in Table 2. Even though distance constraints for hydrogen bonds, disulfide bonds, and some manually assigned NOEs were included in the CYANA calculations in addition to the standard automatically assigned NOEs, the target functions were reasonably small (4.07 ± 0.59), indicating that the resulting structures were consistent with the experimental data.
Table 2.
Parameter | Value |
---|---|
NOE distance restraints* | |
All | 2573.4 ± 42.4 |
Intra-residue (|i − j| = 0) | 565.9 ± 14.7 |
Sequential (|i − j| = 1) | 728.6 ± 6.6 |
Medium-range (2 ≤ |i − j| ≤ 4) | 700.5 ± 18.1 |
Long-range (|i − j| ≥ 5) | 579.7 ± 32.2 |
Dihedral angle | 183 |
Hydrogen bonds | 25 |
Disulfide bonds | 3 |
RMSD to the representative structure** | |
Backbone Cα (residues 10–18, 36–81, 96–145) | 1.39 Å ± 0.39 Å |
Heavy atoms (residues 10–18, 36–81, 96–145) | 1.84 Å ± 0.20 Å |
Backbone Cα (helices without α2) | 1.30 Å ± 0.36 Å |
Heavy atoms (helices without α2) | 1.74 Å ± 0.45 Å |
Ramachandran plot (representative structure) | |
Residues in most favored regions | 71.8% |
Residues in additionally allowed regions | 20.4% |
Residues in generously allowed regions | 4.9% |
Residues in disallowed regions | 2.8% |
Statistics are from 19 rounds of CYANA calculation with different random seeds37. * The averaged numbers of NOEs and their standard deviations are calculated over the 19 rounds of CYANA calculations. ** Averaged over the 18 structures (excluding the representative structure).
Ellman’s assay indicated that all ten cysteines (C52, C56, C59, C65, C77, C120, C123, C127, C136, and C148) are oxidized in the active GLuc, and thus that they should form five disulfide bonds20. Three disulfide bonds, C59/C120, C65/C77, and C136/C148, were unambiguously visible in the NMR structures, but the pairing of the remaining four cysteines (C52, C56, C123, and C127), which were close to each other, was less straightforward to determine. In order to determine the remaining two disulfide bonds, we set the distance between the gamma sulfur (Sγ) atoms to > 2 Å so that any of these cysteines could freely pair with any of the remaining three. We then identified cysteine pairs with an Sγ distance < 3 Å in the 380 structures obtained from the 19 rounds. As a result, C52/C127 and C56/C123 were the most favored pairs and were observed in 92.4% and 56.1% of the calculated structures, respectively. On the other hand, C52/C56 and C123/C127 were observed in only 13.7% and 19.5% of the structures.
The disulfide bond combination among the cysteines (C52, C56, C59, C120, C123 and C127) was further characterized by limited proteolysis of GLuc with trypsin and Liquid Chromatography–Mass Spectrometry (LC–MS) analysis of the fragments. A peptide fragment with a molecular weight of 4259.01 Da was observed and was identified as the combination of fragments A50-R54, G55-K64 and D106-K129 (Supplementary Methods, Supplementary Table 1 and Supplementary Fig. 5). This indicated the presence of the C52/C127 disulfide bond and of at least one other bond, C56/C123 or C59/C120, corroborating the NMR calculations.
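The pair-counting step described above can be sketched as follows. This is a minimal illustration, assuming that the Sγ coordinates of the four ambiguous cysteines have already been extracted from each structure; the function names and the toy coordinates are hypothetical, and only the 3 Å threshold comes from the text.

```python
import math
from collections import Counter
from itertools import combinations

SG_CUTOFF = 3.0  # Å, Sγ-Sγ distance threshold used in the text

def bonded_pairs(sg_coords, cutoff=SG_CUTOFF):
    """Return the cysteine pairs whose Sγ atoms lie closer than `cutoff`.

    sg_coords maps a residue number to the (x, y, z) position of its Sγ atom.
    """
    pairs = set()
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sg_coords.items()), 2):
        if math.dist(xyz_a, xyz_b) < cutoff:
            pairs.add((res_a, res_b))
    return pairs

def pair_frequencies(ensemble):
    """Percentage of structures in which each cysteine pair is observed."""
    counts = Counter()
    for sg_coords in ensemble:
        counts.update(bonded_pairs(sg_coords))
    return {pair: 100.0 * n / len(ensemble) for pair, n in counts.items()}

# Toy two-structure ensemble with made-up coordinates (not from the paper).
toy_ensemble = [
    {52: (0.0, 0.0, 0.0), 127: (2.0, 0.0, 0.0), 56: (10.0, 0.0, 0.0), 123: (12.0, 0.0, 0.0)},
    {52: (0.0, 0.0, 0.0), 127: (2.1, 0.0, 0.0), 56: (10.0, 0.0, 0.0), 123: (15.0, 0.0, 0.0)},
]
freqs = pair_frequencies(toy_ensemble)
for pair, pct in sorted(freqs.items()):
    print(f"C{pair[0]}/C{pair[1]}: {pct:.1f}%")
```

Applied to the 380 calculated structures, a count of this kind would yield percentages such as those quoted above. Note that the sketch does not forbid one cysteine from being counted in two pairs within the same structure.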
Overall fold and dynamic features of GLuc
The structure with the lowest overall target function among the 20 NMR-derived structures in each round was selected. Nineteen structures from 19 rounds were used for further analysis. Among the nineteen structures, seven structures formed the putatively correct disulfide bonds (C52/C127, C56/C123, C59/C120, C65/C77, C136/C148). We finally selected the structure with the lowest average pairwise RMSD (against all other eighteen structures) as the representative structure. This structure also forms the putatively correct disulfide bonds.
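The final selection step, choosing the structure with the lowest average pairwise RMSD to all others, can be sketched as below. The sketch assumes that a symmetric pairwise RMSD matrix has already been computed (for example after superposition of the well-defined residues); the function name and the 4 × 4 matrix are hypothetical and stand in for the 19 × 19 matrix of the actual ensemble.

```python
def representative_index(rmsd):
    """Index of the structure with the lowest mean pairwise RMSD to all others.

    rmsd is a symmetric square matrix (list of lists) with zeros on the diagonal.
    """
    n = len(rmsd)
    mean_rmsd = [
        sum(rmsd[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    return min(range(n), key=mean_rmsd.__getitem__)

# Made-up RMSD values in Å, for illustration only.
toy_rmsd = [
    [0.0, 1.2, 1.5, 2.0],
    [1.2, 0.0, 1.1, 1.6],
    [1.5, 1.1, 0.0, 1.9],
    [2.0, 1.6, 1.9, 0.0],
]
best = representative_index(toy_rmsd)
print("representative structure index:", best)
```

In this toy matrix the second structure (index 1) has the lowest mean RMSD to the others and would be chosen as the representative.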
The nineteen superimposed NMR-derived structures with the lowest overall target function show that GLuc has nine helices (α1-α9, Figs. 2e and 3), and the location of all helices except α2 essentially corroborates the TALOS+ prediction (Fig. 2a,e). The N-terminus (residues 1–9) and C-terminus (residues 146–168) of GLuc are highly disordered (Fig. 2c). GLuc’s main structure is formed by residues 10–145, in which the structure of residues 10–18, 35–81 and 97–145 was well defined, with an average backbone RMSD of the eighteen other structures to the representative structure of 1.39 Å (Table 2). Residues 19–34 and 82–96 were highly disordered and can be considered intrinsically disordered regions (IDRs27, Figs. 2d and 3). The structures of helices α1 and α3-α9 were well defined, with an average backbone RMSD of 1.30 Å (Table 2). It has been reported that α3-loop-α4-loop-α5 (α3-α5, residues 37–72) and α7-loop-α8-loop-α9 (α7–α9, residues 109–143) are repeat sequences13,28. The present structural analysis shows that GLuc’s two repeat sequences are connected by the second IDR (residues 82–96) and form an anti-parallel bundle (α3 + α8 pair and α4 + α7 pair) that surrounds the N-terminal α1 helix. The anti-parallel bundles are firmly tied together by three disulfide bonds (C52/C127, C56/C123, and C59/C120), resulting in high local stability (Fig. 3). Moreover, all residues in the well-defined region exhibited 1H–15N heteronuclear NOE values that were larger and more uniform than those of residues in the N- and C-termini or in the two IDRs, confirming that the well-defined regions obtained by calculation were consistent with the rigid regions determined by the HN NOE values (Fig. 2c,d).
Discussion
The structure of GLuc is novel, as we detected no similar structures in the Protein Data Bank using DALI29 (Supplementary Fig. 6). It is even quite different from the structures of Renilla luciferase (RLuc)30, Oplophorus Luciferase (OLuc)31 and apoaequorin32, which, like GLuc, use coelenterazine as a substrate and are ATP-independent luciferases. The anti-parallel bundle of helices in the GLuc fold, which exhibits pseudo-twofold symmetry, can be divided into two moieties. Though both showed well-defined backbone structures, the experimental data indicated differences in the side-chain packing stability. The side chains of residues 52–123 are tightly packed, whereas those of residues 19–51 and 124–151 are loosely packed (Fig. 3d). The high stability of the former is in good agreement with its residues showing extremely low H/D exchange rates, whereas residues in the latter exhibited high H/D exchange rates, indicative of low stability (Fig. 2b and Supplementary Fig. 7). In the tightly packed moiety, we found several hydrophilic residues with well-determined side-chain structures. For instance, the chemical shifts of R76-Hε, Q112-Hε 1/2, and Q116-Hε 1/2 were clearly different from the averaged values observed for a flexible side chain. Furthermore, many NOEs were assigned to these atoms, corroborating that these side chains are involved in hydrogen bonds stabilizing the tightly packed moiety. Interestingly, the side chains of R76 and Q112 are stacked on the aromatic rings of F113 and F104, respectively, apparently shifting the NMR signals of these protons through the ring current effect (Fig. 3d). Finally, the hydroxyl protons of T66 and T124 were also clearly visible, suggesting that they are involved in hydrogen bonds and thus in the N-terminal capping of helices α5 and α8, respectively (Fig. 3d).
Surface accessibility analysis of the representative structure indicated a noticeable cavity located among the central α1, α4 and α7 (Fig. 4a,b and Supplementary Fig. 8). The cavity was made by 19 residues: N10, V12, A13, V14, S16, N17, F18, L60, S61, I63, K64, C65, R76, C77, H78, T79, F113, I114, V117 (the hydrophobic residues are underlined for the reader's convenience; Fig. 4c). Similar cavities formed by these 19 residues were identified in all eighteen other NMR-derived structures, though the sizes and shapes of the cavities showed some variation because of the limited resolution of the NMR structures. The 19 residues are distributed over three structural segments: the α4 + α7 most rigid block, the short R76-T79 loop, and the central α1. As mentioned above, α4 + α7 is a rigid block stabilized by three disulfide bonds: C52/C127, C56/C123 and C59/C120 (Fig. 3c). It has been reported that GLuc N (residues 1–91) and C (residues 92–168) fragments can be used for PCAs (Protein-fragment Complementation Assays), where interactions between the fragments are detected by luminescence. However, the luminescence intensity was about 10% of that of the full-length, intact GLuc33, suggesting that without the aforementioned inter-fragment disulfide bonds, the cavity is formed in a very dynamic and partial manner, resulting in weaker bioluminescence. In addition to the rigid structure of the cavity, we observed a structural exchange suggested by 15N relaxation dispersion (Table 1). S61 was close to both V12 and T79 (Fig. 4c), and these three residues exhibited the largest dispersions after A152 (Table 1). We thus hypothesized that the structural exchange is related to the open and closed forms of this cavity.
A flexible docking simulation indicated that the cavity was large enough to accommodate coelenterazine, and this was verified for all seven models that formed five disulfide bonds (Supplementary Fig. 9). H/D exchange indicated that the amide protons of N17, L60, S61, F113, I114 and V117, which are located around the cavity, were visible after 20 min incubation in D2O, and among them, the L60, S61, I114, and V117 signals were visible even after 18 h (Fig. 2b and Supplementary Fig. 7). These four residues are located in the α4 + α7 most rigid block (Fig. 2b,e), indicating that the cavity wall is rigid. The hydrophobic character of the cavity’s interior suggests a putative role in recruiting coelenterazine, a small, poorly soluble molecule. Furthermore, three activity-related residues, R76, C77 and H78 (ref. 10), are located in the short R76-T79 loop that is near the α4 + α7 most rigid block and stabilized through the C65/C77 disulfide bond. C65 and C77 are also associated with bioluminescence activity. In particular, the mutation of C77 resulted in the loss of luminescence10. This can be rationalized by hypothesizing that breaking the C65/C77 disulfide bond would destroy the entire cavity structure and inactivate GLuc.
Sequence alignment also suggests that the cavity plays a role as a binding pocket for coelenterazine. First, seven cavity-forming residues, L60, S61, V117 (in the hydrophobic region); R76, H78 (in the activity-related loop); and C65, C77 (the disulfide bond), are highly conserved in 12 luciferases (MoLuc, MpLuc, etc.; see Supplementary Fig. 10). C65, R76, C77 and V117 were fully conserved, and L60, S61 and H78 had a 92% conservation ratio. Furthermore, the structures of OLuc, RLuc and apoaequorin also contain a similar hydrophobic cavity. Altogether, these observations strongly suggest that the cavity constitutes the coelenterazine-binding site and is thus essential for GLuc’s bioluminescence activity (Fig. 4c and Supplementary Fig. 10a). In addition, we noticed that the central α1 contains seven residues located near the cavity (N10, V12, A13, V14, S16, N17, F18) that are weakly conserved (Supplementary Fig. 10a), suggesting that α1 contributes neither to substrate recruitment nor to catalysis, but rather assists the cavity formation.
Additionally, we noticed that several residues in the C-terminal region (K141-D168) are remarkably conserved (Supplementary Fig. 10a) despite their high flexibility as assessed by heteronuclear NOE analysis (Fig. 2), which may suggest that they are functionally or perhaps structurally important. The sequence alignment of residues 27–97 with residues 98–168 indicates that K141-F151 in the C-terminal region has a high similarity with K70-Y80 where the aforementioned activity-related loop (R76-T79) is located (Supplementary Fig. 10b). Furthermore, our previous mutational analysis demonstrated that W143, L144 and F151 also play an important role in GLuc’s activity11, and it is of interest to note that these conserved residues are disordered in our NMR structure and can be defined as IDRs.
Finally, let us note that several lines of evidence suggested that these residues are not completely disordered. First, residues around F151 exhibited low 1H–15N NOE values (0.0 to −0.2, Fig. 2c), suggesting a disordered state on the nano- or picosecond time scale, and the R2-dispersion experiments indicated that these residues experience a micro- or millisecond time scale exchange between folded and unfolded states rather than being in a perfectly flexible state (Table 1). This exchange between a folded and a less folded state was further corroborated by the observation of strong intra-residue and sequential NOEs in the 3D 15N-edited NOESY, and the relatively broad line shapes of the peaks in the 2D 1H–15N HSQC (data not shown). Taken together, our results raise the possibility of an active participation of flexible regions (that can be considered as IDRs) in the coelenterazine oxidation reaction.
Conclusion
We produced a recombinant 13C, 15N labeled GLuc in E. coli, and assigned nearly all of the backbone and most of the side-chain chemical shifts. The N- and C-termini, as well as the segment located between α1 and α3 (encompassing α2) and the loop between α5 and α6, were flexible. GLuc’s structure is unique and is made of nine helices, constituting two anti-parallel bundles, which are formed by helices α3-α4 and α7-α8, respectively, both parts of the homologous sequential repeats. The helices are tied together by disulfide bonds to form a 4-finger structure with a pseudo-twofold symmetry surrounding the N-terminal helix 1. Finally, we identified a hydrophobic cavity where coelenterazine is most likely to bind and the catalytic reaction to occur, but it is difficult to determine whether one or two coelenterazines can bind to the cavity, and how the isolated N- and C-terminal regions can retain some activity28,34. Presently, our structure does not favor either a single or a multiple binding site model. In addition, the large and fluctuating nature of the cavity (Table 1), together with the location of the activity-related C-terminal IDR somewhat far from the cavity (Fig. 4b), may result in a catalytic reaction that does not adhere to traditional binding models. The fold of GLuc is novel, and we believe that the structural and dynamic information reported above will open an avenue for redesigning the bioluminescence activity of GLuc and thereby widen its scope of application.
Methods
Expression system
A DNA sequence encoding the wild-type GLuc gene (UniProtKB ID: Q9BLZ2) without the 17-residue secretion tag and with the E100A and G103R mutations, which increase protein expression, was synthesized as reported previously13. The GLuc sequence was flanked by an N-terminal His-tag and a C-terminal SEP-tag (Solubility Enhancement Peptide tag, C9D) to facilitate protein expression, refolding, and purification20. Two Factor Xa cleavage sites were inserted between GLuc and the His-tag/SEP-Tag. The GLuc gene, named GLuc-TG13, was inserted into pET21c (Novagen) at the NdeI/BamHI site to construct p21GLucTG with ampicillin resistance.
Protein expression and purification
p21GLucTG was transformed into BL21(DE3), and the cells were pre-cultured in 1 L Luria–Bertani (LB) medium at 37 °C with shaking at 250 rpm. When the OD at 590 nm reached 1.0, the E. coli cells were collected by gentle centrifugation and transferred to 1 L of M9 medium containing 13C-glucose and 15NH4Cl. Isopropyl β-D-thiogalactoside (IPTG) was added at a final concentration of 1 mM to induce protein expression, and the temperature was lowered to 25 °C to minimize the formation of inclusion bodies. After 4 h with shaking at 250 rpm, the cells were harvested by centrifugation and sonicated. GLuc was purified from the supernatant fraction using a Nickel Nitrilotriacetic Acid (NTA) column followed by overnight dialysis at 4 °C against 50 mM Tris–HCl, pH 8.0. GLuc was then air-oxidized for 3 days under the same conditions in order to form the five disulfide bridges. Residual misfolded GLuc was removed using reversed-phase High-Performance Liquid Chromatography (HPLC). The protein concentration was determined using a Bradford assay35. Factor Xa was added to GLuc dissolved in 50 mM Tris–HCl, 100 mM NaCl, and 5 mM CaCl2 at a ratio of 1:100 (w/w), and the sample was incubated for 8 h at 37 °C and 100 rpm for enzymatic cleavage of the His- and SEP-Tags. Uncleaved GLuc was removed using a second round of reversed-phase HPLC. GLuc identity was confirmed by MALDI-TOF mass spectrometry on an ABI SCIEX TOF/TOF 5800 (Thermo Fisher Scientific Inc., Massachusetts, USA). GLuc folding was confirmed by bioluminescence activity measurement on an FP-8000 fluorescence spectrophotometer (JASCO International Co., Ltd, Tokyo, Japan). GLuc was freeze-dried and kept as a powder at −30 °C until use.
Analytical Ultracentrifugation (AUC)
Sedimentation equilibrium experiments were carried out using an Optima XL-A analytical ultracentrifuge (Beckman-Coulter, Inc., Brea, California, USA) with a four-hole An60Ti rotor at 20 °C. Before centrifugation, GLuc samples were dialyzed overnight against 50 mM MES and 100 mM NaCl at pH 4.7. The solvent density (1.006983 g/cm3) was determined using DMA 5000 (Anton Paar). Each sample was then transferred into a cell with a six-channel centerpiece. The sample concentrations were 1.2, 0.6, and 0.3 mg/mL. Data were obtained at 12,000, 22,000, and 37,000 rpm. A total equilibration time of 24 h was used for each speed, with absorbance scans at 280 nm taken every 4 h to ensure that equilibrium had been reached. Data analysis was performed by global analysis of all of the data sets obtained at different concentrations and rotor speeds using SEDPHAT36.
NMR analysis and structure calculation
NMR experiments for resonance assignments and the 1H–15N heteronuclear NOE experiment were conducted using 0.2 mM 15N single-labeled or 15N, 13C double-labeled GLuc protein dissolved in 50 mM MES buffer, pH 6.0, and 2 mM NaN3, at 293 K with 8% (v/v) D2O in a 5 mm Shigemi microtube (Shigemi Co., Ltd, Tokyo, Japan). NMR spectra were acquired on a Bruker Avance-III 700 MHz spectrometer equipped with a 5 mm CPTXI cryoprobe. Two-dimensional and three-dimensional NMR experiments (1H–15N HSQC, HNCACB, CBCA(CO)NH, HNCA, HNCO, HN(CA)CO) were performed for the backbone 15N and 13C assignments. 15N–TOCSY–HSQC, 15N–NOESY–HSQC, and HCCH–TOCSY were used for backbone and side-chain signal assignments. The H/D exchange 1H–15N HSQC experiment was performed under the above-described conditions but by dissolving GLuc’s freeze-dried powder in D2O instead of H2O. The transverse relaxation rate (R2) dispersion experiments for backbone 15N atoms were performed on a Bruker Avance-III 900 MHz spectrometer, using a pulse scheme including constant-time relaxation-compensated CPMG pulse sequences37. A series of 2D experiments were acquired with various CPMG pulse rates (50, 100, 150, 200, 250, 300, 400, 500, 600, 800, 1000 Hz) in the two 20 ms CPMG blocks. The 15N CPMG irradiation intensity was set to 3125 Hz (~ 35 ppm). To avoid errors in peak intensity arising from the offset effect, two sets of relaxation experiments with different 15N irradiation centers (111 ppm and 125 ppm) were carried out. For each signal, we read the series of peak intensities from the spectrum that gave the smaller error due to the offset effect.
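The correspondence between the CPMG field strength in Hz and in ppm can be checked with a short calculation. This is a sketch that assumes the 15N Larmor frequency is obtained from the nominal 900 MHz 1H frequency and the approximate 15N/1H frequency ratio of 0.10133; the variable and function names are illustrative.

```python
# Approximate 15N/1H resonance frequency ratio (an assumed standard value).
RATIO_15N_TO_1H = 0.101329

def n15_larmor_mhz(h1_mhz):
    """15N Larmor frequency (MHz) for a given 1H spectrometer frequency (MHz)."""
    return h1_mhz * RATIO_15N_TO_1H

def hz_to_ppm(width_hz, larmor_mhz):
    """Convert a frequency width in Hz to ppm at the given Larmor frequency."""
    return width_hz / larmor_mhz

n15_mhz = n15_larmor_mhz(900.0)             # about 91.2 MHz
cpmg_field_ppm = hz_to_ppm(3125.0, n15_mhz)  # CPMG field strength in ppm

# Separation of the two 15N irradiation centers (111 and 125 ppm) in Hz.
offset_hz = (125.0 - 111.0) * n15_mhz

print(f"15N frequency: {n15_mhz:.1f} MHz")
print(f"3125 Hz = {cpmg_field_ppm:.1f} ppm")
print(f"14 ppm = {offset_hz:.0f} Hz")
```

Under these assumptions 3125 Hz corresponds to roughly 34 ppm, consistent with the approximate value of 35 ppm quoted in the text, and the two carrier positions are separated by about 1.3 kHz.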
Three-dimensional structure determination
Automated NOE assignments and structure calculations were performed using CYANA (ver. 3.98) on a PC cluster equipped with 20-core Intel Xeon E5-4627v3 (3.0 GHz) processors, using the manually assigned chemical shifts and a list of NOE chemical shifts derived from the 3D 15N- and 13C-edited NOESY spectra of the aliphatic and aromatic regions. For each cycle of CYANA calculation, 20 out of 100 structures were selected after 10,000 steps of simulated annealing using distance constraints derived from automatically assigned NOEs. The detailed algorithm and strategy are described elsewhere23. For the automated assignment of the NOE peaks, the tolerances were set to 0.04, 0.4 and 0.4 ppm for 1H, 15N and 13C signals, respectively. Nineteen rounds of CYANA calculations with different random seeds were performed. It should be noted that, for a small number of the NOEs, the automated assignment was ambiguous and depended on the random seed. We thus calculated the structures using restraints that were slightly different from set to set, selected the best structure from the structures generated using each of the sets, and reported the ensemble of structures. The use of many random seeds has been discussed previously: an ensemble of structures calculated using ambiguous assignments becomes more diverse and safer, which reduces the risk that the structure ensemble is biased by wrong NOE assignments38. According to experimentally determined slow-exchanging backbone amide protons and threonine (Thr) side-chain OH atoms, we applied 24 sets of distance constraints, two of them for hydrogen bonds including Thr-OH related to N-terminal capping of α-helices and 21 of them for backbone amide protons, to the 19 rounds of CYANA calculations.
Other software
The secondary structure elements of GLuc were predicted by submitting the backbone chemical shifts to TALOS+ (https://spin.niddk.nih.gov/bax/nmrserver/talos/)39. Root-Mean-Square Deviation (RMSD) was calculated using the Bio.PDB module of Biopython (https://biopython.org)40. Three-dimensional images of the NMR structures were generated using PyMOL. The flexible docking simulation was calculated using AutoDock (Ver. 4.2.6) and AutoDockTools (Ver. 1.5.6)41.
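The bookkeeping of the multi-seed protocol (19 rounds, 100 conformers per round, the 20 with the lowest target function kept, and the best one retained per round) can be sketched as follows. This is not CYANA code: the random numbers merely stand in for target-function values that CYANA would report, and all function names are hypothetical.

```python
import random

def select_round(target_functions, keep=20):
    """Indices of the `keep` conformers with the lowest target function, best first."""
    order = sorted(range(len(target_functions)), key=target_functions.__getitem__)
    return order[:keep]

def run_rounds(n_rounds=19, n_conformers=100, keep=20, first_seed=1):
    """Simulate the selection bookkeeping over several independently seeded rounds."""
    rounds = []
    for r in range(n_rounds):
        rng = random.Random(first_seed + r)  # one seed per round
        # Stand-in values; a real run would read these from the CYANA output.
        tf = [rng.uniform(3.0, 12.0) for _ in range(n_conformers)]
        kept = select_round(tf, keep)
        rounds.append({"round": r, "kept": kept, "best": kept[0], "best_tf": tf[kept[0]]})
    return rounds

results = run_rounds()
total_kept = sum(len(r["kept"]) for r in results)
print(f"{len(results)} rounds, {total_kept} structures kept, "
      f"{len(results)} best-of-round structures")
```

The totals reproduce the numbers used in the text: 19 × 20 = 380 structures for the disulfide pairing statistics and 19 best-of-round structures for the final ensemble.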
Acknowledgements
We thank members of the Kuroda Laboratory for discussion, and help and advice with the experimental operations. We are especially thankful to Dr. Tetsuya Kamioka and Mr. Fumiya Suzuki (TUAT) for advice and kind help with protein expression and purification. We are also thankful to Dr. Seketsu Fukuzawa (JEOL, Ltd) for advice and discussion on LC–MS analysis. We are grateful to Prof. Yanhong Bai, Zhengzhou University of Light Industry, for her generous support of this international joint research.
Author contributions
N.W., Y.K. and T.Y. conceived the project, analyzed the structure, and wrote the manuscript. N.W., K.T., and T.Y. performed the NMR experiments and analyzed the data. N.K. and T.Y. performed and analyzed the structure calculation with N.W.’s assistance. S.U. and T.S. performed the AUC experiments. T.Y. and T.S. performed the LC-MS analysis. All authors contributed to finalizing the manuscript and approved it.
Data deposition
The chemical shifts have been deposited in the Biological Magnetic Resonance Bank (BMRB) under accession No. 36288, and the atomic coordinates have been deposited in the Protein Data Bank under accession number PDB-ID: 7D2O. The expression vector for GLuc-TG (p21GLucTG) has been deposited in Addgene (ID: 124660).
Funding
This work was financially supported by a grant-in-aid from the Japan Society for the Promotion of Science (JSPS) KAKENHI-23651213 and 26560432 to Y.K., the Institute for Global Innovation Research at TUAT, and the Doctoral Scientific Research Foundation of Zhengzhou University of Light Industry to N.W. (No. 2018BSJJ020).
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Nan Wu and Naohiro Kobayashi.
Contributor Information
Yutaka Kuroda, Email: ykuroda@cc.tuat.ac.jp.
Toshio Yamazaki, Email: toshio.yamazaki@riken.jp.
Supplementary information is available for this paper at 10.1038/s41598-020-76486-4.
References
- 1. Shimomura O. Bioluminescence: Chemical Principles and Methods. Singapore: World Scientific; 2006.
- 2. Oyama H, et al. Gaussia luciferase as a genetic fusion partner with antibody fragments for sensitive immunoassay monitoring of clinical biomarkers. Anal. Chem. 2015;87:12387–12395. doi: 10.1021/acs.analchem.5b04015.
- 3. Kaskova ZM, Tsarkova AS, Yampolsky IV. 1001 lights: luciferins, luciferases, their mechanisms of action and applications in chemical analysis, biology and medicine. Chem. Soc. Rev. 2016;45:6048–6077. doi: 10.1039/C6CS00296J.
- 4. Yi S, Liu N-N, Hu L, Wang H, Sahni N. Base-resolution stratification of cancer mutations using functional variomics. Nat. Protoc. 2017;12:2323. doi: 10.1038/nprot.2017.086.
- 5. Szent-Gyorgyi CS, Bryan BJ. Luciferases, fluorescent proteins, nucleic acids encoding the luciferases and fluorescent proteins and the use thereof in diagnostics, high throughput screening and novelty items. U.S. Patent 6232107-B (2001).
- 6. Tannous BA, Kim DE, Fernandez JL, Weissleder R, Breakefield XO. Codon-optimized Gaussia luciferase cDNA for mammalian gene expression in culture and in vivo. Mol. Ther. 2005;11:435–443. doi: 10.1016/j.ymthe.2004.10.016.
- 7. Maguire CA, et al. Gaussia luciferase variant for high-throughput functional screening applications. Anal. Chem. 2009;81:7102–7106. doi: 10.1021/ac901234r.
- 8. Welsh JP, Patel KG, Manthiram K, Swartz JR. Multiply mutated Gaussia luciferases provide prolonged and intense bioluminescence. Biochem. Biophys. Res. Commun. 2009;389:563–568. doi: 10.1016/j.bbrc.2009.09.006.
- 9. Degeling MH, et al. Directed molecular evolution reveals Gaussia luciferase variants with enhanced light output stability. Anal. Chem. 2013;85:3006–3012. doi: 10.1021/ac4003134.
- 10. Kim SB, Suzuki H, Sato M, Tao H. Superluminescent variants of marine luciferases for bioassays. Anal. Chem. 2011;83:8732–8740. doi: 10.1021/ac2021882.
- 11. Wu N, Kamioka T, Kuroda Y. A novel screening system based on VanX-mediated autolysis-Application to Gaussia luciferase. Biotechnol. Bioeng. 2016;113:1413–1420. doi: 10.1002/bit.25910.
- 12. Gheysens O, Mottaghy FM. Method of bioluminescence imaging for molecular imaging of physiological and pathological processes. Methods. 2009;48:139–145. doi: 10.1016/j.ymeth.2009.03.013.
- 13. Wu N, Rathnayaka T, Kuroda Y. Bacterial expression and re-engineering of Gaussia princeps luciferase and its use as a reporter protein. BBA-Proteins Proteom. 2015;1854:1392–1399. doi: 10.1016/j.bbapap.2015.05.008.
- 14. Tannous BA. Gaussia luciferase reporter assay for monitoring biological processes in culture and in vivo. Nat. Protoc. 2009;4:582–591. doi: 10.1038/nprot.2009.28.
- 15. Goerke AR, Loening AM, Gambhir SS, Swartz JR. Cell-free metabolic engineering promotes high-level production of bioactive Gaussia princeps luciferase. Metab. Eng. 2008;10:187–200. doi: 10.1016/j.ymben.2008.04.001.
- 16. Rathnayaka T, Tawa M, Sohya S, Yohda M, Kuroda Y. Biophysical characterization of highly active recombinant Gaussia luciferase expressed in Escherichia coli. BBA-Proteins Proteom. 2010;1804:1902–1907. doi: 10.1016/j.bbapap.2010.04.014.
- 17. Islam MM, Khan MA, Kuroda Y. Analysis of amino acid contributions to protein solubility using short peptide tags fused to a simplified BPTI variant. BBA-Proteins Proteom. 2012;1824:1144–1150. doi: 10.1016/j.bbapap.2012.06.005.
- 18. Khan MA, Islam MM, Kuroda Y. Analysis of protein aggregation kinetics using short amino acid peptide tags. BBA-Proteins Proteom. 2013;1834:2107–2115. doi: 10.1016/j.bbapap.2013.06.013.
- 19. Nautiyal K, Kuroda Y. A SEP tag enhances the expression, solubility and yield of recombinant TEV protease without altering its activity. New Biotechnol. 2018;42:77–84. doi: 10.1016/j.nbt.2018.02.006.
- 20. Rathnayaka T, et al. Solubilization and folding of a fully active recombinant Gaussia luciferase with native disulfide bonds by using a SEP-Tag. BBA-Proteins Proteom. 2011;1814:1775–1778. doi: 10.1016/j.bbapap.2011.09.001.
- 21. Bah A, et al. Folding of an intrinsically disordered protein by phosphorylation as a regulatory switch. Nature. 2014;519:106. doi: 10.1038/nature13999.
- 22. Yi Q, Baker D. Direct evidence for a two-state protein unfolding transition from hydrogen-deuterium exchange, mass spectrometry, and NMR. Protein Sci. 2008;5:1060–1066. doi: 10.1002/pro.5560050608.
- 23. Güntert P, Buchner L. Combined automated NOE assignment and structure calculation with CYANA. J. Biomol. NMR. 2015;62:453–471. doi: 10.1007/s10858-015-9924-9.
- 24. Schmidt E, Güntert P. A new algorithm for reliable and general NMR resonance assignment. J. Am. Chem. Soc. 2012;134:12817–12829. doi: 10.1021/ja305091n.
- 25. Johnson BA, Blevins RA. NMR View: a computer program for the visualization and analysis of NMR data. J. Biomol. NMR. 1994;4:603–614. doi: 10.1007/BF00404272.
- 26. Kobayashi N, et al. KUJIRA, a package of integrated modules for systematic and interactive analysis of NMR data directed to high-throughput NMR structure studies. J. Biomol. NMR. 2007;39:31–52. doi: 10.1007/s10858-007-9175-5.
- 27. Ota M, et al. An assignment of intrinsically disordered regions of proteins based on NMR structures. J. Struct. Biol. 2013;181:29–36. doi: 10.1016/j.jsb.2012.10.017.
- 28. Inouye S, Sahara Y. Identification of two catalytic domains in a luciferase secreted by the copepod Gaussia princeps. Biochem. Biophys. Res. Commun. 2008;365:96–101. doi: 10.1016/j.bbrc.2007.10.152.
- 29. Hasegawa H, Holm L. Advances and pitfalls of protein structural alignment. Curr. Opin. Struc. Biol. 2009;19:341–348. doi: 10.1016/j.sbi.2009.04.003.
- 30. Loening AM, Fenn TD, Gambhir SS. Crystal structures of the luciferase and green fluorescent protein from Renilla reniformis. J. Mol. Biol. 2007;374:1017–1028. doi: 10.1016/j.jmb.2007.09.078.
- 31. Tomabechi Y, et al. Crystal structure of nanoKAZ: the mutated 19 kDa component of Oplophorus luciferase catalyzing the bioluminescent reaction with coelenterazine. Biochem. Biophys. Res. Commun. 2016;470:88–93. doi: 10.1016/j.bbrc.2015.12.123.
- 32. Head JF, Inouye S, Teranishi K, Shimomura O. The crystal structure of the photoprotein aequorin at 2.3 Å resolution. Nature. 2000;405:372–376. doi: 10.1038/35012659.
- 33. Remy I, Michnick SW. A highly sensitive protein-protein interaction assay based on Gaussia luciferase. Nat. Methods. 2006;3:977–979. doi: 10.1038/nmeth979.
- 34. Hunt EA, et al. Truncated variants of Gaussia luciferase with tyrosine linker for site-specific bioconjugate applications. Sci. Rep. 2016;6:26814. doi: 10.1038/srep26814.
- 35. Bradford MM. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 1976;72:248–254. doi: 10.1016/0003-2697(76)90527-3.
- 36. Vistica J, et al. Sedimentation equilibrium analysis of protein interactions with global implicit mass conservation constraints and systematic noise decomposition. Anal. Biochem. 2004;326:234–256. doi: 10.1016/j.ab.2003.12.014.
- 37. Tollinger M, Skrynnikov NR, Mulder FAA, Forman-Kay JD, Kay LE. Slow dynamics in folded and unfolded states of an SH3 domain. J. Am. Chem. Soc. 2001;123:11341–11352. doi: 10.1021/ja011300z.
- 38. Buchner L, Güntert P. Increased reliability of nuclear magnetic resonance protein structures by consensus structure bundles. Structure. 2015;23:425–434. doi: 10.1016/j.str.2014.11.014.
- 39. Shen Y, Delaglio F, Cornilescu G, Bax A. TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J. Biomol. NMR. 2009;44:213–223. doi: 10.1007/s10858-009-9333-z.
- 40. Manderick B, Hamelryck T. PDB file parser and structure class implemented in Python. Bioinformatics. 2003;19:2308–2310. doi: 10.1093/bioinformatics/btg299.
- 41. Morris GM, et al. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 2009;30:2785–2791. doi: 10.1002/jcc.21256.