Abstract
The production of highly purified native soluble proteins in large quantities is crucial for studying protein structure and function. Odorant binding proteins (OBPs) are small, soluble, extracellular proteins with multiple disulfide bonds, whose functions include, but are not limited to, binding hydrophobic molecules and delivering them to their corresponding receptors expressed on insect olfactory receptor neurons. Expression of proteins with multiple disulfide bonds like OBPs usually results in insolubility and low yield, which has been a significant barrier to understanding their biological roles and physiological functions. In the E. coli system, expression of OBPs often results in insoluble inclusion bodies or a limited amount of periplasmic soluble proteins. Although expression of OBPs in eukaryotic systems such as Sf9 insect cells or yeast Pichia pastoris can increase the solubility of the protein, the process remains insufficient. Additionally, monitoring the purity and native apo state of the protein is critical for establishing the correct conformation of the protein. In this study, we employed an E. coli host with an altered intracellular environment to produce cytosolic soluble OBP44a protein, which yielded over 100 mg/L. We monitored the integrity of disulfide bonds throughout the purification process using LC-MS and used NMR to ensure the final product adopted a single conformation. Our study presents an efficient method for obtaining large quantities of soluble proteins in a single conformation, which enables extensive in vitro studies of secreted proteins like OBPs.
Keywords: Protein with disulfide bonds, E. coli host with altered cytosolic environment, LC-MS, NMR
1. Introduction
Odorant binding proteins (OBPs) are critical for the survival of insects, facilitating their abilities to find food, partners, and even bite sites. These proteins are abundant in an insect’s sensory system and are present in various forms [1, 2]. For instance, the fruit fly Drosophila melanogaster has 52 OBPs detected by RNA-seq [2], while the mosquito Anopheles gambiae contains 81 [3]. Though there is low homology among OBP members, they share some distinct common features. They are small (13–17 kDa) soluble proteins secreted into extracellular spaces, such as the sensillum lymph. Typically, OBPs consist of 6 helices that form a hydrophobic pocket. This pocket possesses high affinity for binding to semiochemicals such as pheromones, plant volatiles, and animal odorants. Disulfide bonds form between cysteine residues in OBPs, which defines their signature fold. Most OBPs have 3 disulfide bonds and are labeled as the classic type, while others have 2 disulfide bonds (C-Minus type) or 4–6 disulfide bonds (Plus-C type). The consensus is that one function of OBPs is to bind and solubilize hydrophobic odorant molecules, delivering them to receptors on olfactory receptor neurons (ORN). In turn, this transmits signals to the brain and evokes a cascade of behavioral responses. However, there is mounting evidence indicating that OBPs exist in non-chemosensory tissues, and may be involved in other functions such as food delivery [4], reproduction [5, 6], development [7, 8], and anti-inflammatory cellular responses [9, 10].
For more than two decades, OBPs have been expressed as recombinant proteins for in vitro studies of their structures, properties, and binding capacities. One expression system involves encoding the gene in a baculovirus. The subsequent infection of insect Sf9 cells with the recombinant baculovirus leads to production of secreted OBP [11, 12]. Another expression system uses the yeast Pichia pastoris as a host and induces OBP expression with the addition of methanol [13–17]. Generating viral particles to infect Sf9 cells is a laborious process, and the OBP expression level is relatively low, with a yield of only 1–2 mg/L [11]. The yield from Pichia secretion was in the range of 60–100 mg/L, but the induction with methanol took 4–7 days, and the signal peptide affected the N-terminal cleavage, leading to heterogeneous proteins [16, 17]. More often, E. coli is used as an expression host due to easy gene manipulation, fast clone selection, high yield, and low cost. Since E. coli’s cytosolic redox environment is reducing, OBP is expressed as inclusion bodies without the formation of disulfide bonds [18–23]. Afterwards, the inclusion bodies must be refolded into soluble protein either by simple dialysis or by dialysis with the help of an oxidizing/reducing shuffle system. In 1999, Wojtasek and Leal [24] cloned the pheromone binding protein of the silk moth Bombyx mori into pET22b with the periplasmic signal peptide pelB. The periplasm’s oxidative environment facilitates the formation of soluble proteins with disulfide bonds. Since then, this system of using a pET22b/pelB vector coupled with E. coli BL21 as the host has been used in a series of studies [25–29]. However, the yield from this expression system was only 5–10 mg/L, as the periplasmic space is limited [24, 27]. Origami, a derivative of E. coli BL21 with gor/trxB mutations, was later used to replace E. coli BL21, with the hope of making cytosolic soluble OBP [30]. However, results were mixed, as 80% of the protein made was found in the form of inclusion bodies [31].
OBP44a is an odorant binding protein from the model organism Drosophila melanogaster, whose functions and structure have not been characterized. To support future research, a large amount of pure OBP44a is needed. We set out to express OBP44a as a cytosolic soluble protein using SHuffle® Express E. coli [32] as the host, aiming to eliminate the refolding step and increase expression levels. We also employed modern techniques such as LC-MS to determine whether the disulfide bonds remained intact throughout the expression and purification process, and NMR to investigate the conformations of the final product.
2. Materials and methods
2.1. DNA construct and plasmid
The full OBP44a DNA sequence was cloned from reverse-transcribed Drosophila brain cDNA. Because the first 18 amino acids of OBP44a are the signal peptide for extracellular secretion, the signal peptide sequence was excluded during later cloning. The OBP44a DNA sequence (aa 19 – aa 142) was fused with an ATG start site, a 3 bp linker, a 24 bp 8-His tag, an 18 bp TEV cleavage sequence, and a 12 bp GSGP linker at the N-terminus of OBP44a. The whole cassette was cloned into the pJ414 plasmid, which uses the T7 promoter to drive the expression of the OBP44a cassette. Later, the 12 bp GSGP linker was deleted to yield the exact aa 19 – aa 142 sequence after the TEV cleavage site. The sequence was verified by DNA sequencing (GeneWiz). The plasmid was amplified using One Shot™ TOP10 Chemically Competent E. coli cells (ThermoFisher Scientific) and stored at −20 °C for future use.
2.2. Expression test using shake flasks
The pJ414-His-TEV-OBP44a 19–143 plasmid was transformed into BL21 Competent E. coli or SHuffle® T7 Competent E. coli (New England Biolabs, USA). A total of 20 ng of plasmid was added to 50 µl of competent cells and incubated on ice for 10 min. The cell mix was heat-pulsed in a 42 °C water bath for 90 seconds and plated on LB agar with 100 mg/L ampicillin. One colony was chosen from the plate after overnight incubation at 30 °C, and inoculated into 30 ml of modified LB* with 100 mg/L ampicillin in a 250 ml shake flask. After overnight incubation at 30 °C with 220 rpm agitation, the culture was used to inoculate 30 ml of modified LB* with 100 mg/L ampicillin in a 250 ml baffled glass shake flask, with an initial OD600 of 1. Cells were induced with 0.5 mM IPTG when OD600 reached ~3.0, and were incubated at either 30 °C or 16 °C afterwards. Glucose was added to keep its concentration between 1 and 3 g/L. Samples were taken during the induction period and pellets were saved at −80 °C for future protein analysis using SDS-PAGE.
2.3. Protein analysis using SDS-PAGE
Cell pellets (2 mg) were resuspended in 200 µl of lysis buffer (50 mM HEPES) and sonicated for 12 seconds at 75% power (Sonics VCX-130 Vibra-Cell Ultrasonic Liquid Processor). Both the supernatant and pellet of the cell lysate were loaded onto Bis-Tris gels with 5X SDS sample buffer, and electrophoresis was run at 200 V for 35 min. SDS gels were stained with 0.01% Coomassie blue R250 and washed with destaining solution [33]. Gels were then scanned at 700 nm using the Odyssey infrared scanner (Li-Cor), and the intensity of each band was determined using the Odyssey software. The concentration of OBP44a in each sample was determined by comparing the band intensity of the sample to that of a BSA sample of known concentration.
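The band-intensity comparison described above can be sketched as a single-point calibration. This is only an illustration, assuming the scanner response is linear in protein amount and that the sample and BSA lanes were loaded with equal volumes; the function name and the intensity values are hypothetical, and only the 0.1 mg/mL BSA concentration comes from the figure legends.

```python
def estimate_concentration(sample_intensity, standard_intensity, standard_conc_mg_per_ml):
    """Single-point estimate of protein concentration from band intensity.

    Assumes a linear relationship between band intensity and protein amount
    and equal loading volumes for the sample and standard lanes.
    """
    if standard_intensity <= 0:
        raise ValueError("standard intensity must be positive")
    return sample_intensity / standard_intensity * standard_conc_mg_per_ml


# Hypothetical intensities in arbitrary scanner units (not measured values).
bsa_intensity = 1000.0
obp_intensity = 2500.0

# BSA standard loaded at 0.1 mg/mL, as in the gel legends.
print(estimate_concentration(obp_intensity, bsa_intensity, 0.1))
```

A multi-point BSA dilution series would give a more robust estimate than this single reference lane, at the cost of extra lanes on each gel.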
2.4. Fermentation to produce isotopically labeled OBP44a
One colony was chosen from a plate after transformation using SHuffle® T7 Express Competent E. coli (NEB C3029J) as the host, and inoculated into 30 ml of medium with 100 mg/L ampicillin in a 250 ml shake flask. After overnight incubation at 30 °C with 220 rpm agitation, 5 ml of the culture was used to inoculate 500 ml of medium in a 2.8 L baffled shake flask, which was then grown in an incubator at 30 °C with the same level of agitation. After overnight growth, four of these flasks were combined and used to inoculate 4 L of medium in the fermenter. A 14 L BioFlo 115 fermenter (Eppendorf, Hauppauge, NY) with a working volume of 6 L was used for the fermentation run. The fermenter was filled with a sterile-filtered defined medium containing 10 g/L K2HPO4, 10 g/L KH2PO4, 9 g/L Na2HPO4, 2 g/L 15NH4Cl, 2.4 g/L K2SO4, 3 g/L glucose, 100 mg/L ampicillin, 5 mM MgCl2, and 5 ml/L trace metal solution. Temperature was controlled at 30 °C and pH was maintained at 6.8 with the addition of sodium hydroxide. Air flow was controlled at 3 liters per minute throughout the fermentation. Before inoculation, the medium was saturated with air at 30 °C and the dissolved oxygen (DO) reading from an InPro 6000 optical O2 probe (Mettler Toledo, MA) was set to 100%, corresponding to an oxygen concentration of 7.5 mg/L. During the growth period, DO was controlled at 25% by continuously increasing agitation. After induction with 0.5 mM IPTG, 3 g/L of glucose was added to keep the glucose concentration between 1 and 3 g/L. Glucose concentration was measured offline using GlucCell™ (CESCO Bioengineering Co., Ltd.). 13C-labeled glucose was used in place of regular glucose to yield 13C/15N-labeled OBP44a. BioCommand Batch Control software from NBS was used for online data collection. Samples were taken at different elapsed fermentation times (EFT). Cells were centrifuged and pellets were stored at −80 °C for SDS-PAGE analysis and protein purification.
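As a small worked illustration of the set points above, the following sketch converts a DO reading in percent air saturation to mg/L using the stated calibration (100% = 7.5 mg/L at 30 °C) and checks whether an offline glucose reading falls within the 1–3 g/L target band. The function names are hypothetical and are not part of the fermenter control software.

```python
DO_SATURATION_MG_PER_L = 7.5  # 100% air saturation at 30 °C, as stated in the text


def do_percent_to_mg_per_l(do_percent):
    """Convert a DO reading (% of air saturation) to mg/L of dissolved oxygen."""
    return do_percent / 100.0 * DO_SATURATION_MG_PER_L


def glucose_in_band(glucose_g_per_l, low=1.0, high=3.0):
    """Return True if an offline glucose reading lies in the target band (g/L)."""
    return low <= glucose_g_per_l <= high


# The 25% DO set point used during growth, expressed in mg/L.
print(do_percent_to_mg_per_l(25))

# A hypothetical offline reading of 0.8 g/L would fall below the band.
print(glucose_in_band(0.8))
```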
2.5. Purification of OBP44a
Frozen pellets of E. coli cells (5 g) were suspended in 50 ml of 50 mM Tris, pH 8.0, 200 mM NaCl buffer, which also contained a cocktail of protease inhibitors (cOmplete™, EDTA free, Roche). The cells were then lysed using a microfluidizer at 17,000 psi (2 cycles) at 4 °C. The cell lysate was centrifuged (19,000 rpm for 35 min) and the supernatant was filtered through a 0.22 μm filter before being loaded onto a 5 mL Ni-NTA Superflow column (Qiagen). An AKTA FPLC with Unicorn software (GE Healthcare) was used to load the supernatant and wash the column with 10 CV of buffer at 2 mL/min. The protein was then eluted with a linear gradient up to 400 mM imidazole over 30 CV. The OBP44a-containing fractions were pooled (~28 ml) and injected into a 30 ml Slide-A-Lyzer cassette with 10K MWCO (ThermoFisher Scientific). The cassette was then dialyzed three times against 1.8 L of 50 mM Tris buffer, pH 8.0.
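The column-volume (CV) figures above translate into buffer volumes and run times as sketched below. The sketch assumes the linear gradient starts at 0 mM imidazole, which the text does not state; the helper names are hypothetical.

```python
COLUMN_VOLUME_ML = 5.0   # 5 mL Ni-NTA Superflow column
WASH_CV = 10             # wash length in column volumes
GRADIENT_CV = 30         # length of the linear imidazole gradient
FLOW_ML_PER_MIN = 2.0    # flow rate stated for loading and washing


def cv_to_ml(cv, column_volume_ml=COLUMN_VOLUME_ML):
    """Convert a number of column volumes to millilitres."""
    return cv * column_volume_ml


def run_time_min(volume_ml, flow_ml_per_min):
    """Time needed to pump a given volume at a given flow rate."""
    return volume_ml / flow_ml_per_min


def imidazole_mm_at(cv_into_gradient, start_mm=0.0, end_mm=400.0,
                    gradient_cv=GRADIENT_CV):
    """Imidazole concentration at a point in a linear gradient.

    start_mm=0.0 is an assumption; only the 400 mM end point is given.
    """
    if not 0 <= cv_into_gradient <= gradient_cv:
        raise ValueError("position must lie within the gradient")
    fraction = cv_into_gradient / gradient_cv
    return start_mm + fraction * (end_mm - start_mm)


wash_ml = cv_to_ml(WASH_CV)
print(wash_ml, run_time_min(wash_ml, FLOW_ML_PER_MIN))
print(cv_to_ml(GRADIENT_CV), imidazole_mm_at(15))
```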
After dialysis, the protein concentration was measured using a NanoDrop Microvolume Spectrophotometer (ThermoFisher Scientific), and OBP44a was then cleaved by TEV protease (Tobacco Etch Virus nuclear-inclusion-a endopeptidase). The TEV protease was made in-house using a plasmid deposited into Addgene by David Waugh’s lab (Addgene #8827), following the protocol published by the same group [34]. The enzymatic cleavage was carried out with 1 mg of TEV per 10 mg of OBP44a in 50 mM Tris buffer, pH 8.0, without addition of DTT (dithiothreitol, ThermoFisher Scientific R0861) or TCEP (Tris(2-carboxyethyl)phosphine hydrochloride, Sigma-Aldrich C4706), and the mixture was incubated at 4 °C overnight with rocking. The digested mixture was then loaded onto a pre-equilibrated 5 mL Ni-NTA Superflow column (Qiagen) at a flow rate of 1 mL/min to remove the cleaved His-tag, remaining uncleaved protein, and additional impurities. The flowthrough fractions were immediately collected and analyzed by SDS-PAGE.
Following the purification of 15N-labeled and 13C/15N-labeled OBP44a by FPLC, the protein dissolved in Tris buffer (50 mM, pH 8) was subjected to reverse-phase HPLC on an Agilent 1100 series system fitted with a Zorbax C8 preparative column (Agilent, Savage, MD). The fractions collected by FPLC were concentrated using a spin column (Pall, Microsep Advance Centrifugal device, 3 kD MWCO) spun at 4,000 rpm for 15 minutes at 10 °C, yielding solutions on the order of 500 μM. Aliquots of this solution (about 250 μL) were diluted four-fold in water (0.1% TFA) prior to injection into the HPLC system. Two mobile phases, water and acetonitrile (Fisher Chemical), each containing 0.1% trifluoroacetic acid (TFA) (v/v), were used to produce a gradient that was run between 0 and 70% acetonitrile. OBP44a eluted at 44% acetonitrile. The fractions containing the protein were frozen and lyophilized overnight on an SP Scientific Benchtop Pro Omnitronics system. To neutralize residual trifluoroacetate ions after HPLC and form the hydrochloride salt of the protein, the samples were reconstituted in 0.01 M HCl prior to lyophilization. This step was pursued over 48 hours to ensure complete removal of TFA. The dry powder was stored at −20 °C until ready for NMR analysis.
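The dilution step before HPLC injection can be checked with a few lines of arithmetic. The sketch below uses the approximate values quoted above (about 500 μM stock, about 250 μL aliquots, four-fold dilution); the function name is hypothetical.

```python
def diluted_sample(stock_um, aliquot_ul, fold):
    """Return (final concentration in µM, final volume in µL, amount in nmol).

    The amount of protein is unchanged by dilution:
    µM × µL gives pmol, so dividing by 1000 gives nmol.
    """
    final_um = stock_um / fold
    final_ul = aliquot_ul * fold
    amount_nmol = stock_um * aliquot_ul / 1000.0
    return final_um, final_ul, amount_nmol


conc_um, volume_ul, amount_nmol = diluted_sample(500, 250, 4)
print(conc_um, volume_ul, amount_nmol)
```

With these inputs each injection carries roughly 125 nmol of protein in about 1 mL.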
2.6. LC-Mass spectrometry for intact protein
Proteins were injected into a reverse-phase HPLC system (Agilent 1100 series HPLC, Agilent Technologies) with a ZORBAX 300SB-C18 column, 2.1 × 50 mm, 3.5 μm (Agilent Technologies), and introduced into the mass spectrometer as described [35]. Positive ion electrospray ionization (ESI) mass spectra for intact protein were obtained with an Agilent 6224 mass spectrometer equipped with an ESI interface and a time-of-flight (TOF) mass detector (Agilent Technologies). Mass spectra were analyzed and deconvoluted using MassHunter Qualitative Analysis software, version B.07.00 (Agilent Technologies).
2.7. NMR sample preparation and experiments
Post-HPLC OBP44a in powder form was dissolved in potassium phosphate buffer (20 mM, pH 6.62). For most samples, the buffer also contained 0.5 mM ethylenediaminetetraacetic acid (EDTA). An Agilent 8453 nanodrop system was used to determine absorbance at 280 nm and assess concentration based on the predicted extinction coefficient of the protein (11,710 M−1 cm−1). Typical protein concentrations at this stage were on the order of 600–1,200 μM.
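The concentration estimate from absorbance at 280 nm follows the Beer-Lambert law, c = A / (ε · l). A minimal sketch is given below using the predicted extinction coefficient quoted above; the path length and dilution factor are assumptions the reader must set for their own instrument, and the example absorbance is hypothetical.

```python
EXT_COEFF_M_CM = 11710.0  # predicted extinction coefficient at 280 nm (M^-1 cm^-1)


def concentration_um(a280, path_cm=1.0, dilution=1.0):
    """Protein concentration in µM from A280 via the Beer-Lambert law.

    path_cm is the optical path length; dilution is the fold-dilution
    applied to the sample before measurement (1.0 means undiluted).
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    molar = a280 / (EXT_COEFF_M_CM * path_cm) * dilution
    return molar * 1e6


# Hypothetical reading: A280 = 1.171 measured with a 0.1 cm path length.
print(round(concentration_um(1.171, path_cm=0.1), 1))
```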
1H-15N Heteronuclear Single Quantum Coherence (HSQC) experiments on OBP44a dissolved in potassium phosphate buffer (20 mM, pH 6.62, 0.5 mM EDTA) were performed on solutions in the 100 μM concentration range. Each sample, which contained 10% (v/v) D2O, was placed in a Shigemi tube (Shigemi Co. Ltd.). The NMR data were collected on a 600 MHz Bruker Avance III instrument fitted with a triple resonance z-axis gradient cryoprobe. Typical parameters included a sample temperature of 298 K; a 1H Larmor frequency of 600.33 MHz; 1H and 15N carrier frequencies set to ~4.7 and ~118 ppm, respectively; 2048 (t2) × 256 (t1) data points; and 8 transients. The spectra were processed using NMRPipe [36] and visualized with the Collaborative Computational Project for NMR (CCPN) software [37].
3. Results and discussion
3.1. The E. coli host suitable for OBP44a expression
As shown in Figure 1A, when E. coli BL21 was used as the expression host, all OBP44a was found in the pellets as inclusion bodies, even after induction at 16 °C overnight. This is expected and consistent with numerous reports [18–23], because the reducing cytosolic environment of E. coli is not conducive to the formation of the disulfide bonds required for the proper folding of OBP44a. In contrast, more than 60% of OBP44a was expressed as soluble protein when SHuffle T7 E. coli was used as the expression host. SHuffle E. coli is a strain that not only contains trxB and gor mutations, creating a cytosolic redox potential comparable to that of the ER [38], but also has constitutive chromosomal expression of DsbC (disulfide bond isomerase), which corrects mis-oxidized proteins into their native form [32]. It is possible that the solubility of OBP44a could be further improved by introducing helper proteins such as PDI, MPD2, or AhpF. However, we chose SHuffle E. coli alone since the expression level already exceeded what we needed.
Figure 1. Comparison of OBP44a expression using different E. coli strains.

(A) OBP44a expression using E. coli BL21 or SHuffle T7 (NEB C3026J). Blue arrow indicates OBP44a. M: Bio-Rad Precision Plus Protein™ Marker; 1: supernatant of BL21 cells without induction; 2: supernatant of BL21 cells induced at 37 °C for 2 hours; 3: supernatant of BL21 cells induced at 16 °C for 16 hours; 4: supernatant of SHuffle E. coli without induction; 5: supernatant of SHuffle E. coli induced at 30 °C for 2 hours; 6: supernatant of SHuffle E. coli induced at 16 °C for 16 hours; 7: pellets of BL21 cells without induction; 8: pellets of BL21 cells induced at 37 °C for 2 hours; 9: pellets of BL21 cells induced at 16 °C for 16 hours; 10: pellets of SHuffle E. coli without induction; 11: pellets of SHuffle E. coli induced at 30 °C for 2 hours; 12: pellets of SHuffle E. coli induced at 16 °C for 16 hours; 13: BSA at 0.1 mg/mL. (B) OBP44a expression level and solubility in SHuffle T7 C3026J and C3029J. Top panel: soluble OBP44a expressed by SHuffle T7 C3026J (in blue) and C3029J (in orange) at 30 °C and 16 °C; bottom panel: percentage of soluble OBP44a expressed by SHuffle T7 C3026J (in blue) and C3029J (in orange) at 30 °C and 16 °C.
There are four SHuffle strains, and three of them, C3026J, C3029J, and C3030J, accommodate vectors using a T7 promoter. Though C3030J is supposed to be a superior strain with tighter expression control, we found that these cells died after induction with IPTG, resulting in no OBP44a expression. Figure 1B compares the expression of OBP44a with C3026J and C3029J as hosts. The percentage of soluble OBP44a increased from 63% to 71% when C3029J was used as the expression host. The total amount of soluble protein expressed increased to more than 100 mg/L when cells were induced at 30 °C for 4 hours or at 16 °C overnight. SHuffle C3026J is derived from E. coli K12 and is auxotrophic for leucine and isoleucine, which makes it expensive to use when 13C- and 15N-labeled OBP44a is needed for NMR studies. Therefore, C3029J, which originated from E. coli B cells with OmpT and Lon deletions, was chosen as the expression host for the subsequent study. As shown in the top panel of Figure 1B, induction of OBP44a at 30 °C produced more protein than induction at 16 °C. In addition, volumetric productivity increased with a longer induction time at 30 °C. If soluble OBP44a production were limited by the rate at which DsbC corrects mis-oxidized protein, that would explain why lowering the temperature did not improve solubility or productivity, since DsbC activity would decrease as the temperature decreases.
Before proceeding to other experiments, the four-hour post-induction sample was examined by LC-MS. This sample was dominated by one protein, shown as the main peak in Supplement 1B. Supplement 1C shows a high-quality MS spectrum after deconvolution. Supplement 1A shows the mass of OBP44a with oxidized cysteines as calculated by GPMAW, while Supplement 1D shows the actual mass determined by LC-MS. The mass measured by LC-MS was 16,081.5 Da, which matched the calculated mass of 16,080.9 Da for OBP44a with two disulfide bonds and the His-tag. Given this agreement between calculated and measured mass, it was concluded that the soluble OBP44a expressed by SHuffle C3029J formed two disulfide bonds among its four cysteines.
3.2. Effect of dissolved oxygen on OBP44a expression
With the deletions of trxB and gor, SHuffle E. coli cells are under constant oxidative stress, and are even more so when growing in minimal medium, as it lacks the small redox-active molecules (such as cysteine and glutathione) supplied by yeast extract. The manufacturer, NEB, suggests growing SHuffle E. coli at lower dissolved oxygen levels of 3–5% rather than 20–30%. Since minimal medium is required for 15N-labeled OBP44a, the effect of dissolved oxygen level on cell growth and OBP44a expression was studied using minimal medium with SHuffle C3029J as the expression host. This was done at a 2 L fermentation scale, replacing 15N ammonium chloride with unlabeled ammonium chloride. Figure 2A shows cells being induced for 6 hours with dissolved oxygen maintained above 14% using a constant agitation of 400 rpm. Figure 2B shows cells being induced at 3% dissolved oxygen by varying the agitation speed. Both cultures were induced at an OD600 of 2. The final OD600 reached 7 in the fermentation with dissolved oxygen maintained above 14%, compared with only 6 in the fermentation with dissolved oxygen at 3%, as seen in Figure 2C. Cells consumed the same amount of glucose, as shown in Figure 2D, but the specific OBP44a expression level (mg/g pellets) was higher for dissolved oxygen above 14% (14.6 mg/g) than for dissolved oxygen at 3% (13.3 mg/g). Overall, cells produced more OBP44a at the higher dissolved oxygen level, with a volumetric productivity of 103 mg/L, compared with only 79 mg/L at 3% dissolved oxygen. This indicated that neither cell growth nor OBP44a expression by recombinant SHuffle E. coli was adversely affected by higher oxygen levels. Since 3% is around the critical oxygen concentration affecting E. coli growth [39], it is understandable that the final OD600 was lower when cells were induced at 3% dissolved oxygen. As long as the dissolved oxygen level is maintained above 10%, cell growth is not limited by oxygen. However, for fermenter operation, it is wise to maintain a level above 10% to allow room for adequate process control. Therefore, 25% dissolved oxygen was chosen for large scale fermentation to produce OBP44a.
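The two productivity measures quoted above are linked by the wet pellet yield: volumetric productivity (mg/L) equals specific expression (mg/g pellet) multiplied by pellet yield (g/L). The sketch below back-calculates the pellet yield implied by the reported numbers. These pellet yields are derived values for illustration only and are not reported measurements; the function name is hypothetical.

```python
def implied_pellet_yield_g_per_l(volumetric_mg_per_l, specific_mg_per_g):
    """Wet pellet yield (g/L) implied by volumetric and specific productivity."""
    if specific_mg_per_g <= 0:
        raise ValueError("specific expression must be positive")
    return volumetric_mg_per_l / specific_mg_per_g


# Reported values for the two dissolved oxygen conditions.
high_do = implied_pellet_yield_g_per_l(103, 14.6)   # DO above 14%
low_do = implied_pellet_yield_g_per_l(79, 13.3)     # DO at 3%

print(round(high_do, 2), round(low_do, 2))
```

Under this reading, most of the gap in volumetric productivity between the two conditions would come from the difference in biomass rather than from the difference in specific expression.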
Figure 2. Effect of dissolved oxygen levels on OBP44a Expression.

(A) Online data from 2L fermentation where OBP44a was induced at dissolved oxygen levels higher than 14%. (B) Online data from 2L fermentation where OBP44a was induced at 3% dissolved oxygen. (C) Comparison of OD600 and volumetric productivity between the two induction conditions. (D) Comparison of glucose consumption and OBP44a specific expression level between the two induction conditions.
3.3. Large scale fermentation of 15N and 13C/15N labeled OBP44a
Figure 3A–C details the online and offline data of expressing OBP44a using 15N ammonium chloride at a 6 L scale. Cells doubled every 2 hours before induction rather than every hour, which is typical for E. coli growing in minimal medium when the temperature is lowered from 37 °C to 30 °C. Once induced, cells continued to grow, but at a slower rate, especially towards the end of the 6-hour induction period. Fermentation samples were lysed using sonication, and the supernatants of the lysates were run on a Bis-Tris gel. The Coomassie stained gel in Figure 3D shows a band of ~16 kDa appearing 2 hours post induction. This band intensified as the induction proceeded, and after 6 hours of induction OBP44a amounted to more than 10% of the total soluble protein expressed. The total OBP44a expressed reached 108 mg/L, in agreement with the 2 L fermentation data. Again, the final fermentation sample was checked using LC-MS, and the mass detected was 16,280.8 Da, which was 1.3 Da lower than the calculated mass. We often observe this with 15N-labeled proteins in our lab; the lower mass is attributed to the substrate, 15N ammonium chloride, being only 99% pure. Similar growth curves and protein production levels were observed for 13C/15N-labeled OBP44a.
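The doubling times mentioned above can be expressed as specific growth rates using μ = ln 2 / t_d, assuming simple exponential growth before induction. The sketch below is illustrative only and the function names are hypothetical.

```python
import math


def specific_growth_rate(doubling_time_h):
    """Specific growth rate (per hour) for exponential growth."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


def fold_increase(elapsed_h, doubling_time_h):
    """Fold increase in biomass after elapsed_h hours of exponential growth."""
    return 2 ** (elapsed_h / doubling_time_h)


# 2-hour doubling observed here versus a 1-hour doubling for comparison.
print(round(specific_growth_rate(2.0), 3), round(specific_growth_rate(1.0), 3))

# Biomass increase over 6 hours at a 2-hour doubling time.
print(fold_increase(6, 2))
```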
Figure 3. 6L fermentation to produce 15N labeled OBP44a.

(A) Online agitation and dissolved oxygen (DO) profile shows DO was kept above 15% by varying agitation. (B) Offline data of OD600 before and after induction and OBP44a produced (mg/L) at 2, 4, 5, 6 hours post induction. (C) Online data shows temperature, pH and airflow controlled at desired level. (D) Coomassie stained gel of fermentation samples (supernatants of cell lysates). M: Bio-Rad Precision Plus Protein™ Marker; 1: without induction; 2: 2 hours post induction; 3: 4 hours post induction; 4: 5 hours post induction; 5: 6 hours post induction.
3.4. Purification of OBP44a using Ni-NTA and His-tag cleavage by TEV
His-tagged OBP44a was easily captured by a Ni-NTA column, as shown in Figures 4A and 4B. The combined 7 fractions from FPLC normally had ~75 mg of OBP44a in 28 ml, which was obtained from 5 g of pellets.
Figure 4. Purification of OBP44a using Ni-NTA column.

(A) Online profile from AKTA FPLC. The black line corresponds to the percentage of 1M imidazole. The blue line displays the 280 nm absorbance, with OBP appearing in the second peak. Fraction numbers are indicated by brown bars. (B) Coomassie stained gel of the fractions. M: Bio-Rad Precision Plus Protein™ Marker; 1: supernatant before loading to Ni-NTA column; 2: fraction D14; 3: fraction D13; 4: fraction D12; 5: fraction D11; 6: fraction D10; 7: fraction D9; 8: fraction D8; 9: fraction D7; 10: fraction D6; 11: fraction D5; 12: fraction D4; 13: fraction D3; 14: BSA at 0.1 mg/mL.
The TEV protease used for cleaving off the His-tag was made in-house using a plasmid deposited into Addgene by David Waugh’s lab (Addgene #8827). That group suggests that the digestion buffer should contain some form of reducing agent in order to obtain effective cleavage. Since OBP44a is a protein with two disulfide bonds, the effect of the reducing agent TCEP at a low concentration of 0.1 mM was investigated. LC-MS was used to measure the mass of OBP44a before and after cleavage by TEV. As shown in Table 1A, all the masses of OBP44a after cleavage with 0.1 mM TCEP in the buffer were at least 1 Da greater than the mass with two disulfide bonds, as calculated by GPMAW. Based on our experience with disulfide bond breakage by TCEP, this indicated that one disulfide bond was broken within the protein. In contrast, both disulfide bonds were intact when no reducing agent was added. In order to both keep the integrity of the disulfide bonds and achieve a high percentage of cleavage, the ratio of TEV to OBP44a had to be as high as 1:10, as shown in Table 1B. In other words, the His-tag on OBP44a could be completely removed without the addition of any reducing agent, as long as TEV was added at a higher ratio to OBP44a.
Table 1. Cleavage of the histidine tag from OBP44a by TEV protease.
(A) Effect of TCEP in the digestion buffer on the integrity of disulfide bonds. (B) Cleavage efficiency under different ratios of TEV to OBP44a, without reducing agent in the buffer at 4 °C overnight.
(A)
| Tube # | TCEP (mM) | TEV:OBP44a | Temperature and duration | Measured mass minus calculated mass with two disulfide bonds (Da) |
|---|---|---|---|---|
| 1 | 0 | 1:10 | 24 °C, 3 hours | 0.09 |
| 2 | 0 | 1:100 | 24 °C, 3 hours | 0.17 |
| 3 | 0.1 | 1:10 | 24 °C, 3 hours | 1.27 |
| 4 | 0.1 | 1:100 | 24 °C, 3 hours | 1.22 |
| 5 | 0 | 1:10 | 4 °C, overnight | 0.14 |
| 6 | 0 | 1:100 | 4 °C, overnight | 0.14 |
| 7 | 0.1 | 1:10 | 4 °C, overnight | 1.26 |
| 8 | 0.1 | 1:100 | 4 °C, overnight | 1.21 |
(B)
| TEV:OBP44a | Cleavage % |
|---|---|
| 1:100 | 60% |
| 1:50 | 77% |
| 1:25 | 95% |
| 1:10 | 100% |
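The cleavage data in Table 1B can be used to pick a TEV dose, as in the sketch below. It encodes the four tested ratios, returns the leanest tested ratio that reaches a target cleavage percentage, and computes the corresponding TEV mass. It does not interpolate between tested ratios, and the function names are hypothetical.

```python
# Table 1B: parts of OBP44a per 1 part TEV (by mass) -> cleavage percentage,
# without reducing agent at 4 °C overnight.
CLEAVAGE_BY_RATIO = {100: 60, 50: 77, 25: 95, 10: 100}


def leanest_ratio_for(target_percent):
    """Largest tested OBP44a:TEV ratio that reaches the target cleavage."""
    candidates = [parts for parts, pct in CLEAVAGE_BY_RATIO.items()
                  if pct >= target_percent]
    if not candidates:
        raise ValueError("no tested ratio reaches the requested cleavage")
    return max(candidates)


def tev_mass_mg(obp_mass_mg, substrate_parts):
    """Mass of TEV needed for a 1:substrate_parts TEV:OBP44a ratio."""
    return obp_mass_mg / substrate_parts


parts = leanest_ratio_for(100)
# Example: ~75 mg of OBP44a pooled from the first Ni-NTA column.
print(parts, tev_mass_mg(75, parts))
```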
After cleavage by TEV, the overnight digestion mixture was loaded onto a second Ni-NTA column. As shown in Figure 5, the flowthrough fractions contained OBP44a with more than 99% purity, while TEV, the His-tag, uncleaved OBP44a, and other contaminants were bound to the column.
Figure 5. Coomassie stained gel of flowthrough fractions from second Ni-NTA column after histidine tag cleavage.

The blue arrow indicates the cleaved His-tag. M: Bio-Rad Precision Plus Protein™ Marker; 1: overnight digestion mixture before loading to 2nd Ni-NTA column; 2: fraction A2; 3: fraction A3; 4: fraction A4; 5: fraction A5.
3.5. Confirmation of disulfide bonds by LC-MS
In order to confirm that both disulfide bonds within OBP44a were intact after the 2nd Ni-NTA column, the purified OBP44a was incubated with 5 mM DTT at 37 °C for 30 min. The mass before DTT incubation was 14,269.0 Da, as shown in Figure 6A. After half an hour of incubation with DTT, there were two peaks in both the DAD and TIC spectra (Figure 6B). After deconvolution, the first peak had a mass of 14,270.8 Da, which was 1.8 Da more than the original sample, indicating the breakage of one disulfide bond. The second peak had a mass of 14,273.0 Da, which was 4.0 Da more than the original sample, indicating the breakage of two disulfide bonds. Therefore, OBP44a kept the integrity of its disulfide bonds through the process of expression by SHuffle E. coli and purification by Ni-NTA column and TEV cleavage.
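The reasoning above relies on the fact that reducing one disulfide bond adds two hydrogen atoms, or about 2.016 Da, to the protein. A minimal sketch of that interpretation follows; rounding the mass shift to the nearest whole number of bonds is a simplification chosen for this illustration, and the function name is hypothetical.

```python
H_MASS_DA = 1.008  # average atomic mass of hydrogen


def reduced_disulfides(mass_before_da, mass_after_da):
    """Estimate how many disulfide bonds were reduced from a mass shift.

    Each reduced disulfide adds two hydrogens (about 2.016 Da). The shift
    is rounded to the nearest whole number of bonds.
    """
    shift = mass_after_da - mass_before_da
    return round(shift / (2 * H_MASS_DA))


# Deconvoluted masses reported for the DTT experiment.
print(reduced_disulfides(14269.0, 14270.8))  # peak 1
print(reduced_disulfides(14269.0, 14273.0))  # peak 2
```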
Figure 6. LC-MS report of OBP44a after 2nd Ni-NTA.

(A) UV and TIC spectrum of the purified sample, and the deconvoluted mass is 14269.0. (B) UV and TIC spectrum after the purified sample incubated with 5 mM DTT at 37 °C for 30 mins. The deconvoluted mass for peak 1 is 14270.8 and 14273.0 for peak 2.
3.6. Detection of multiple states of purified OBP44a by NMR
Figure 7A shows the NMR spectrum of OBP44a after purification by the second Ni-NTA column. A large number of resonances appear broad with overlapping peaks revealing that the protein exists in multiple distinct states. These species could represent the apo form of OBP44a and its ligand-bound states. These unknown ligands were low in molecular weight, otherwise there would be multiple bands on Coomassie stained gel. Also, these unknown ligands could not be detected by LC-MS, indicating they were most likely separated from OBP44a through the C18 column on LC. This phenomenon of OBP in ligand-bound states was not surprising, as it had been reported previously. In 1999, Wojtasek and Leal [24] purified B. mori PBP from E. coli’s periplasm and found the protein forming a single band on SDS-PAGE gel, but appearing in multiple forms on ion-exchange chromatography. Leite et al. (2009) [27] treated A. aegypti OBP1 after purification in a size exclusion column with citric acid buffer at pH 4.5, in order to obtain OBP1 in only its apo state. Wang et al. [19] washed isolated inclusion bodies extensively with 1M urea before refolding, ridding the proteins of ligands and leaving only apo AeOBP22. We tried treating OBP44a with 8M urea, followed by sequentially washing/dialyzing with 6M, 4M, 2M and 0M urea buffers. However, ligand-bound OBP44a persisted, as shown in Figure 7B. Since the ligands were evidently removed by the C18 column in our LC-MS analysis, a C8 prep column was used to get rid of the unknown ligands from OBP44a. The NMR spectrum in Figure 7C shows the single apo state of OBP44a after purification through a C8 column. Therefore, it was necessary to further purify OBP44a using a C8 column, checking the prep on NMR to ensure it existed in only the apo state.
Figure 7. Mixed-state of OBP44a as detected by NMR.

(A) NMR spectrum of OBP44a after 2nd Ni-NTA. (B) NMR spectrum of OBP44a after urea denaturation/refolding. Examples of NMR resonances that are broad with overlapping peaks, indicating multiple states, are boxed in red. (C) NMR spectrum of OBP44a after elution from a C8 reverse-phase column. NMR resonances are dispersed with narrow widths, in contrast to A and B, revealing a single conformation of the apo state.
4. Conclusions
Even though odorant binding proteins have been studied for decades, their biological roles and physiological functions have not been fully explored. There is growing evidence that OBPs are involved in many more critical biological activities than odorant binding alone. Therefore, the need to produce OBPs as recombinant proteins for in vitro studies has been steadily increasing. The yield of OBPs expressed via extracellular secretion from insect cells was only 1–2 mg/L [11], and the yield from E. coli periplasmic expression was only slightly better (5–10 mg/L) [24, 27]. OBP expression in Pichia pastoris achieved yields as high as 60–100 mg/L [16], but the secreted protein was heterogeneous at its N-terminus, and expensive 13C methanol had to be used for induction whenever 13C-labeled protein was needed [17]. Expressing OBPs as inclusion bodies requires an extra refolding step, which could introduce unexpected refolding errors. Lastly, even though Origami E. coli possesses an oxidative cytosolic environment that facilitates disulfide-bond formation, it lacks a critical protein, DsbC, that shuffles mis-oxidized disulfide bonds and helps fold the OBP into its native soluble form. In fact, Tsitsanou et al. still had to refold AgamOBP48 from inclusion bodies, even though Origami B (Novagen) E. coli was chosen as the expression host [31]. In our hands, OBP44a was expressed as a soluble protein with a yield as high as 100 mg/L by using SHuffle E. coli, which constitutively expresses DsbC in the cytosol. Therefore, we recommend using SHuffle E. coli for expression of OBPs in future studies, especially OBPs with more than three disulfide bonds (Plus-C type) and OBPs with multiple non-consecutive disulfide bonds.
OBPs are small proteins with multiple disulfide bonds and a binding pocket that binds tightly to many small molecules. As a result, they tend to carry bound impurities when they are extracted from E. coli. Their expression and purification therefore present unique challenges, but also an opportunity to apply modern techniques such as LC-MS and NMR as controls during the process. By using LC-MS, the integrity of the disulfide bonds was monitored from initial expression all the way to the final purified product. NMR detected bound impurities that both the Coomassie-stained gel and LC-MS failed to detect. Only by applying NMR could a pure product with a single conformation be guaranteed.
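The LC-MS check on disulfide integrity rests on simple mass arithmetic: each disulfide bond that forms removes two hydrogen atoms, so the intact protein becomes roughly 2.016 Da lighter per bond. The following is a minimal sketch of that reasoning, not the procedure used in this study. The function names, the 0.5 Da tolerance, and the 14000.0 Da reduced mass in the usage note are hypothetical illustrations; the actual calculated mass of OBP44a is the one reported in Supplement 1.

```python
from typing import Optional

# Average mass of one hydrogen atom in daltons (standard atomic weight).
H_AVG_MASS = 1.00794


def disulfide_mass_shift(n_bonds: int) -> float:
    """Mass lost (Da) when n_bonds disulfide bonds form (two H atoms per bond)."""
    if n_bonds < 0:
        raise ValueError("n_bonds must be non-negative")
    return 2 * H_AVG_MASS * n_bonds


def expected_mass(reduced_mass: float, n_bonds: int) -> float:
    """Expected average mass of the protein with n_bonds disulfide bonds formed."""
    return reduced_mass - disulfide_mass_shift(n_bonds)


def infer_disulfide_count(
    observed_mass: float,
    reduced_mass: float,
    max_bonds: int,
    tol: float = 0.5,  # assumed tolerance in Da; depends on the instrument
) -> Optional[int]:
    """Return the bond count whose expected mass matches the observed mass.

    Returns None when no count from 0 to max_bonds falls within tol.
    """
    for n in range(max_bonds + 1):
        if abs(observed_mass - expected_mass(reduced_mass, n)) <= tol:
            return n
    return None
```

For example, with a hypothetical fully reduced mass of 14000.0 Da, `infer_disulfide_count(13995.97, 14000.0, max_bonds=3)` matches the two-bond state, whereas an observed mass equal to the reduced mass matches zero bonds. Because consecutive states differ by only about 2 Da, the tolerance must stay well below 1 Da for the assignment to be unambiguous.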
In conclusion, our study provides a feasible solution to the low yield problem in previous expression systems for OBPs and establishes a comprehensive set of quality control methods for purification of the proteins. We believe that our findings will facilitate further studies of OBPs and provide insights into their biological roles and physiological functions.
Supplementary Material
Supplement 1. (A) Calculated mass of OBP44a with two disulfide bonds, as reported by GPMAW software. (B-D) LC-MS report of 4-hour post induction sample after expressing OBP44a using C3029J as the host and inducing at 30 °C.
Highlights.
Expressed OBP44a, a protein with two disulfide bonds, as cytosolic soluble protein using SHuffle E. coli
Achieved OBP44a yield of 100 mg/L using fermentation
Monitored integrity of disulfide bonds using LC-MS
Guaranteed single protein conformation using NMR
Acknowledgments
We would like to thank Dr. Duck-Yeon Lee for his support in acquiring and analyzing the LC-MS data, and Dr. Marie-Paule Strub for deleting four amino acids unrelated to OBP44a from the original construct. These investigations were supported by the Intramural Research Programs of the National Heart, Lung, and Blood Institute (NHLBI) and the National Institute of Neurological Disorders and Stroke (NINDS) of the NIH.
Footnotes
Formula for modified LB: 10 g/L tryptone, 5 g/L yeast extract (YE), 5 g/L NaCl, 3.5 g/L glucose, 0.5 g/L MgSO4∙7H2O, 12 g/L K2HPO4, pH 7.0
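Because the modified LB recipe is given per liter, preparing a batch of another size is a straightforward scaling. The sketch below illustrates that arithmetic; the dictionary and function names are hypothetical, and the batch volume is whatever the reader needs rather than a value taken from this study.

```python
# Modified LB components in grams per liter, as listed in the footnote.
# The medium is adjusted to pH 7.0, which this sketch does not model.
MODIFIED_LB_G_PER_L = {
    "tryptone": 10.0,
    "yeast extract": 5.0,
    "NaCl": 5.0,
    "glucose": 3.5,
    "MgSO4.7H2O": 0.5,
    "K2HPO4": 12.0,
}


def scale_recipe(volume_l: float) -> dict:
    """Return grams of each component needed for volume_l liters of medium."""
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    return {name: conc * volume_l for name, conc in MODIFIED_LB_G_PER_L.items()}
```

Calling `scale_recipe(2.0)` returns the mass of each component for a 2 L batch, for instance 20 g of tryptone and 24 g of K2HPO4.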
References
- 1. Brito NF, Moreira MF, Melo ACA. A look inside odorant-binding proteins in insect chemoreception. J Insect Physiol 2016;95:51–65. doi: 10.1016/j.jinsphys.2016.09.008.
- 2. Larter NK, Sun JS, Carlson JR. Organization and function of Drosophila odorant binding proteins. Elife 2016;5:e20242. doi: 10.7554/eLife.20242.
- 3. Pelosi P, Iovinella I, Zhu J, Wang GR, Dani FR. Beyond chemoreception: diverse tasks of soluble olfactory proteins in insects. Biol Rev 2018;93(1):184–200. doi: 10.1111/brv.12339.
- 4. Ishida Y, Ishibashi J, Leal WS. Fatty Acid Solubilizer from the Oral Disk of the Blowfly. PLoS One 2013;8(1):e51779. doi: 10.1371/journal.pone.0051779.
- 5. Sun YL, Huang LQ, Pelosi P, Wang CZ. Expression in Antennae and Reproductive Organs Suggests a Dual Role of an Odorant-Binding Protein in Two Sibling Helicoverpa Species. PLoS One 2012;7(1):e30040. doi: 10.1371/journal.pone.0030040.
- 6. Sirot LK, Poulson RL, McKenna MC, Girnary H, Wolfner MF, Harrington LC. Identity and transfer of male reproductive gland proteins of the dengue vector mosquito, Aedes aegypti: Potential tools for control of female feeding and reproduction. Insect Biochem Molec 2008;38(2):176–89. doi: 10.1016/j.ibmb.2007.10.007.
- 7. Costa-da-Silva AL, Kojin BB, Marinotti O, James AA, Capurro ML. Expression and accumulation of the two-domain odorant-binding protein AaegOBP45 in the ovaries of blood-fed Aedes aegypti. Parasite Vector 2013;6:364. doi: 10.1186/1756-3305-6-364.
- 8. Marinotti O, Ngo T, Kojin BB, Chou SP, Nguyen B, Juhn J, et al. Integrated proteomic and transcriptomic analysis of the Aedes aegypti eggshell. Bmc Dev Biol 2014;14. doi: 10.1186/1471-213x-14-15.
- 9. Calvo E, Mans BJ, Ribeiro JMC, Andersen JF. Multifunctionality and mechanism of ligand binding in a mosquito antiinflammatory protein. P Natl Acad Sci USA 2009;106(10):3728–33. doi: 10.1073/pnas.0813190106.
- 10. Mans BJ, Calvo E, Ribeiro JMC, Andersen JF. The crystal structure of D7r4, a salivary biogenic amine-binding protein from the malaria mosquito Anopheles gambiae. Journal of Biological Chemistry 2007;282(50):36626–33. doi: 10.1074/jbc.M706410200.
- 11. Krieger J, Raming K, Prestwich GD, Frith D, Stabel S, Breer H. Expression of a Pheromone-Binding Protein in Insect Cells Using a Baculovirus Vector. Eur J Biochem 1992;203(1–2):161–6. doi: 10.1111/j.1432-1033.1992.tb19841.x.
- 12. Newcomb RD, Sirey TM, Rassam M, Greenwood DR. Pheromone binding proteins of Epiphyas postvittana (Lepidoptera: Tortricidae) are encoded at a single locus. Insect Biochem Mol Biol 2002;32(11):1543–54. doi: 10.1016/s0965-1748(02)00075-9.
- 13. Danty E, Briand L, Michard-Vanhee C, Perez V, Arnold G, Gaudemer O, et al. Cloning and expression of a queen pheromone-binding protein in the honeybee: an olfactory-specific, developmentally regulated protein. J Neurosci 1999;19(17):7468–75. doi: 10.1523/JNEUROSCI.19-17-07468.1999.
- 14. Lartigue A, Gruez A, Briand L, Blon F, Bezirard V, Walsh M, et al. Sulfur single-wavelength anomalous diffraction crystal structure of a pheromone-binding protein from the honeybee Apis mellifera L. J Biol Chem 2004;279(6):4459–64. Epub 20031031. doi: 10.1074/jbc.M311212200.
- 15. Pesenti ME, Spinelli S, Bezirard V, Briand L, Pernollet JC, Campanacci V, et al. Queen bee pheromone binding protein pH-induced domain swapping favors pheromone release. J Mol Biol 2009;390(5):981–90. Epub 20090528. doi: 10.1016/j.jmb.2009.05.067.
- 16. Briand L, Lescop E, Bezirard V, Birlirakis N, Huet JC, Henry C, et al. Isotopic double-labeling of two honeybee odorant-binding proteins secreted by the methylotrophic yeast Pichia pastoris. Protein Expr Purif 2001;23(1):167–74. doi: 10.1006/prep.2001.1478.
- 17. Briand L, Perez V, Huet JC, Danty E, Masson C, Pernollet JC. Optimization of the production of a honeybee odorant-binding protein by Pichia pastoris. Protein Expr Purif 1999;15(3):362–9. doi: 10.1006/prep.1998.1027.
- 18. Prestwich GD. Bacterial expression and photoaffinity labeling of a pheromone binding protein. Protein Sci 1993;2(3):420–8. doi: 10.1002/pro.5560020314.
- 19. Wang J, Murphy EJ, Nix JC, Jones DNM. Aedes aegypti Odorant Binding Protein 22 selectively binds fatty acids through a conformational change in its C-terminal tail. Sci Rep-Uk 2020;10(1):3300. doi: 10.1038/s41598-020-60242-9.
- 20. Kruse SW, Zhao R, Smith DP, Jones DNM. Structure of a specific alcohol-binding site defined by the odorant binding protein LUSH from Drosophila melanogaster. Nat Struct Biol 2003;10(9):694–700. doi: 10.1038/nsb960.
- 21. Laughlin JD, Ha TS, Jones DNM, Smith DP. Activation of pheromone-sensitive neurons is mediated by conformational activation of pheromone-binding protein. Cell 2008;133(7):1255–65. doi: 10.1016/j.cell.2008.04.046.
- 22. Northey T, Venthur H, De Biasio F, Chauviac FX, Cole A, Ribeiro KALR, et al. Crystal Structures and Binding Dynamics of Odorant-Binding Protein 3 from two aphid species Megoura viciae and Nasonovia ribisnigri. Sci Rep-Uk 2016;6:24739. doi: 10.1038/srep24739.
- 23. Spinelli S, Lagarde A, Iovinella I, Legrand P, Tegoni M, Pelosi P, et al. Crystal structure of Apis mellifera OBP14, a C-minus odorant-binding protein, and its complexes with odorant molecules. Insect Biochem Molec 2012;42(1):41–50. doi: 10.1016/j.ibmb.2011.10.005.
- 24. Wojtasek H, Leal WS. Conformational change in the pheromone-binding protein from Bombyx mori induced by pH and by interaction with membranes. Journal of Biological Chemistry 1999;274(43):30950–6. doi: 10.1074/jbc.274.43.30950.
- 25. Wogulis M, Morgan T, Ishida Y, Leal WS, Wilson DK. The crystal structure of an odorant binding protein from Anopheles gambiae: evidence for a common ligand release mechanism. Biochem Biophys Res Commun 2006;339(1):157–64. Epub 20051109. doi: 10.1016/j.bbrc.2005.10.191.
- 26. Lautenschlager C, Leal WS, Clardy J. Bombyx mori pheromone-binding protein binding nonpheromone ligands: implications for pheromone recognition. Structure 2007;15(9):1148–54. doi: 10.1016/j.str.2007.07.013.
- 27. Leite NR, Krogh R, Xu W, Ishida Y, Iulek J, Leal WS, et al. Structure of an odorant-binding protein from the mosquito Aedes aegypti suggests a binding pocket covered by a pH-sensitive “Lid”. PLoS One 2009;4(11):e8006. Epub 20091126. doi: 10.1371/journal.pone.0008006.
- 28. Mao Y, Xu X, Xu W, Ishida Y, Leal WS, Ames JB, et al. Crystal and solution structures of an odorant-binding protein from the southern house mosquito complexed with an oviposition pheromone. Proc Natl Acad Sci U S A 2010;107(44):19102–7. Epub 20101018. doi: 10.1073/pnas.1012274107.
- 29. Zheng J, Li J, Han L, Wang Y, Wu W, Qi X, et al. Crystal structure of the Locusta migratoria odorant binding protein. Biochem Biophys Res Commun 2015;456(3):737–42. Epub 20141215. doi: 10.1016/j.bbrc.2014.12.048.
- 30. Tsitsanou KE, Thireou T, Drakou CE, Koussis K, Keramioti MV, Leonidas DD, et al. Anopheles gambiae odorant binding protein crystal complex with the synthetic repellent DEET: implications for structure-based design of novel mosquito repellents. Cell Mol Life Sci 2012;69(2):283–97. Epub 20110614. doi: 10.1007/s00018-011-0745-z.
- 31. Tsitsanou KE, Drakou CE, Thireou T, Vitlin Gruber A, Kythreoti G, Azem A, et al. Crystal and solution studies of the “Plus-C” odorant-binding protein 48 from Anopheles gambiae: control of binding specificity through three-dimensional domain swapping. J Biol Chem 2013;288(46):33427–38. Epub 20131004. doi: 10.1074/jbc.M113.505289.
- 32. Lobstein J, Emrich CA, Jeans C, Faulkner M, Riggs P, Berkmen M. SHuffle, a novel Escherichia coli protein expression strain capable of correctly folding disulfide bonded proteins in its cytoplasm. Microb Cell Fact 2012;11:56.
- Tropea JE, Cherry S, Waugh DS. Expression and purification of soluble His(6)-tagged TEV protease. Methods Mol Biol 2009;498:297–307. doi: 10.1007/978-1-59745-196-3_19.
- 33. Apffel A, Fischer S, Goldberg G, Goodley PC, Kuhlmann FE. Enhanced sensitivity for peptide mapping with electrospray liquid chromatography-mass spectrometry in the presence of signal suppression due to trifluoroacetic acid-containing mobile phases. J Chromatogr A 1995;712(1):177–90. doi: 10.1016/0021-9673(95)00175-m.
- 34. Delaglio F, Grzesiek S, Vuister GW, Zhu G, Pfeifer J, Bax A. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 1995;6(3):277–93. doi: 10.1007/BF00197809.
- 35. Skinner SP, Fogh RH, Boucher W, Ragan TJ, Mureddu LG, Vuister GW. CcpNmr AnalysisAssign: a flexible platform for integrated NMR analysis. J Biomol NMR 2016;66(2):111–24. Epub 20160923. doi: 10.1007/s10858-016-0060-y.
- 36. Bessette PH, Aslund F, Beckwith J, Georgiou G. Efficient folding of proteins with multiple disulfide bonds in the Escherichia coli cytoplasm. Proc Natl Acad Sci U S A 1999;96(24):13703–8. doi: 10.1073/pnas.96.24.13703.
- 37. Chen J, Tannahill AL, Shuler ML. Design of a system for the control of low dissolved oxygen concentrations: critical oxygen concentrations for Azotobacter vinelandii and Escherichia coli. Biotechnol Bioeng 1985;27(2):151–5. doi: 10.1002/bit.260270208.
