Protein Science: A Publication of the Protein Society
2024 May 27;33(6):e5029. doi: 10.1002/pro.5029

High‐throughput system for the thermostability analysis of proteins

Sae Ito 1, Ryo Matsunaga 1,2, Makoto Nakakido 1,2, Daisuke Komura 3, Hiroto Katoh 3, Shumpei Ishikawa 3, Kouhei Tsumoto 1,2,4
PMCID: PMC11129621  PMID: 38801228

Abstract

Thermal stability of proteins is a primary metric for evaluating their physical properties. Although researchers have attempted to predict it using machine learning frameworks, the performance of these models has depended on the quality and quantity of published data. This is due to a technical limitation: thermodynamic characterization of protein denaturation by fluorescence or calorimetry has been challenging to perform in a high‐throughput manner. Obtaining a melting curve that derives solely from the target protein requires laborious purification, making it far from practical to prepare a hundred or more samples in a single workflow. Here, we aimed to overcome this throughput limitation by leveraging the high protein secretion efficacy of Brevibacillus and consecutive treatment with plate‐scale purification methodologies. By handling the entire process of expression, purification, and analysis on a per‐plate basis, we enabled the direct observation of protein denaturation in 384 samples within 4 days. To demonstrate a practical application of the system, we conducted a comprehensive analysis of 186 single mutants of a single‐chain variable fragment of nivolumab, obtaining melting temperatures (T m) ranging from −9.3 to +10.8°C relative to the wild‐type sequence. Our findings will allow for data‐driven stabilization in protein design and will streamline rational approaches.

Keywords: data‐driven, high‐throughput, protein engineering, single‐chain variable fragment, thermal stability

1. INTRODUCTION

Thermal stability is a pivotal assessment criterion considered during protein design. At elevated temperatures, folded proteins, which are often described as only marginally more stable than their unfolded states, undergo partial denaturation owing to a shift in the folding equilibrium, leading to overall structural collapse. Concurrently, the hydrophobic patches exposed by unfolding attract each other through hydrophobic interactions, triggering disordered aggregation (Fleishman & Baker, 2012). Thus, thermal stability is associated with both the structural and the colloidal stability of proteins (Brummitt et al., 2011; Melien et al., 2020; Wang, 2005). Several measurements can be used to describe this property in terms of temperature and energetic units. Principally, the midpoint temperature at which half of the proteins denature is referred to as the melting temperature, T m, while the unfolding enthalpy, ΔH, represents the energy difference between the folded state and the unfolded state. T m typically provides a common and interpretable indicator of the thermodynamic equilibrium (Miotto et al., 2019; Sanfelice & Temussi, 2016) of proteins.
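To make the relationship between T m and ΔH concrete, the following is a minimal sketch of a two‐state unfolding model based on the van't Hoff relation. It is not part of the original study: the function name, the example T m of 50°C, and the ΔH of 300 kJ/mol are hypothetical, and the model assumes a temperature‐independent ΔH (no heat capacity change), which is a simplification.

```python
import math

R_GAS = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(temp_c, tm_c, delta_h_kj_mol):
    """Fraction of unfolded protein at temp_c under a two-state model.

    Assumption for illustration only: the unfolding enthalpy does not
    depend on temperature (delta Cp = 0).
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    # van't Hoff relation: ln K = (dH / R) * (1/Tm - 1/T), so K = 1 at T = Tm
    k_unfold = math.exp((delta_h_kj_mol / R_GAS) * (1.0 / tm - 1.0 / t))
    return k_unfold / (1.0 + k_unfold)


# Hypothetical protein: Tm = 50 degC, dH = 300 kJ/mol
for temp in (40.0, 50.0, 60.0):
    print(temp, round(fraction_unfolded(temp, 50.0, 300.0), 3))
```

At T = T m the equilibrium constant is 1, so exactly half of the molecules are unfolded, which matches the definition of the melting temperature given above.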

The development of molecules with high thermal stability is a potent strategy for maintaining intact and active structures under a wide range of conditions. Thus far, the introduction of mutations has been a major approach to this end, resulting in minimal effects on activity and structure. Conventionally, in silico calculations of mutational effects have been conducted using rational design approaches; notable examples rely on energetic calculations or sequence conservations (Alford et al., 2017; Kulshreshtha et al., 2016; Li et al., 2013). Although these powerful tools have succeeded in designing stable mutants, their success rates are not high; therefore, empirical investigation remains essential. The numerical prediction of experimental parameters adds further complexity to this issue. One possible factor is that certain methods used to study protein folding, such as sequence statistics or physics‐based calculations based on sequences and structures, may not fully account for the temperature dependence of amino acid interactions. Consequently, their ability to accurately predict stability across a range of temperatures may be limited (Pucci & Rooman, 2014).

Recently, various machine learning approaches have been used to predict the physical properties of proteins, driven by diverse objectives. These methodologies encompass a wide range including (semi‐)supervised, and unsupervised learning (Kouba et al., 2023), utilizing diverse representations such as language models (Huang & Li, 2023) and graph‐based approaches (Gligorijević et al., 2021). Particularly for predicting the mutational effects on a specific protein property like thermal stability, supervised learning with a set of labeled data proves to be a desirable strategy (Li et al., 2024; Pucci et al., 2016). However, obtaining large amounts of experimental data at one time is technically challenging. As a result, the current training datasets (Nikam et al., 2021; Xavier et al., 2021) consist of individual reports conducted by different researchers. Here, importantly, the predictive models constructed through the learning are completely dependent on the historically accumulated data. Their nonstandardized and biased nature caused the model's accuracy and generalizability to reach a plateau (Louis & Abriata, 2021).

In this study, we developed a high‐throughput thermal stability analysis system to overcome this limitation, called “Brevity” (Brevibacillus thermodynamic stability analysis system). It allows for thermodynamic unfolding characterization by differential scanning fluorimetry (DSF) of up to 384 proteins within 4 days, with parallel processing of sequencing by a nanopore sequencer. The expression host Brevibacillus enables efficient secretion of active proteins (Mizukami et al., 2018; Yao et al., 2020), and the sequence of the target region in the plasmid inside the bacteria can be analyzed using direct PCR and nanopore sequencing.

We applied the system to the T m‐based mutational analysis of nivolumab scFv, an anti‐human programmed cell death protein 1 (hPD‐1) antibody, to comprehensively analyze the thermal stability of single mutants designed based on Rosetta Site Saturation Mutagenesis (SSM) calculation. The utility of our system was demonstrated by the quantitative thermostability characterization of 186 mutants, and several key residues responsible for stability were revealed through high‐throughput experiments.

2. RESULTS

2.1. Construction of Brevity

An overview of “Brevity” is presented in Figure 1. Expression vectors of a single‐chain variable fragment (scFv) contain VH and VL sequences linked by a (GGGGS)×3 linker followed by a 6xHis tag. DNA fragments containing the scFv sequence were inserted into a linearized vector to construct a plasmid library. Brevibacillus transformed with the plasmids were cultivated overnight on agar plates. Colonies were picked and inoculated into each well of a 96‐well plate, followed by shaking for 60 h. A portion of the culture media was used for sequencing. The supernatant containing the expressed scFvs was used for thermal stability analysis after treatment with the following purification procedures.

FIGURE 1. Overview of the high‐throughput thermal stability analysis system “Brevity”.

A human variable fragment (Fv) sequence pair was prepared for protocol development (Table S1, clone #1). After incubation, the culture supernatants of clone #1 were subjected to ammonium sulfate (AS) precipitation for subsequent purification using plate‐scale immobilized metal affinity chromatography (IMAC) (Figure 2). AS precipitation effectively removes low‐molecular‐weight components derived from culture medium (Matsunaga et al., 2023). However, the purity of the final sample was not enough for DSF measurements after treatment with 60% (v/v) saturated AS and IMAC, yielding initial high fluorescence (Figure 3a). Therefore, two‐step purification was performed for further refinement. By removing the soluble aggregates in the first step with lower AS concentration (20%–40%), we obtained a melting curve derived from the target scFv molecules with the optimal stepped gradient 30%–60% (v/v) of saturated AS (Figure 3a). Further analysis using size‐exclusion chromatography (SEC) revealed that the treatment selectively removed high‐molecular‐weight soluble aggregates (Figure 3b). scFv was eluted with EDTA from the IMAC resin and used for DSF measurements.

FIGURE 2. Overview of the plate‐to‐plate purification protocol in the high‐throughput system.

FIGURE 3. Thermal stability measurement and purity analysis of scFv clone #1 treated with different purification protocols. (a) Differential scanning fluorometry (DSF) thermograms of scFv clone #1 purified by one‐step or the corresponding stepwise‐gradient ammonium sulfate (AS) treatment and His‐tag affinity chromatography. (b) Size exclusion chromatogram of scFv clone #1 purified by AS treatment with the corresponding stepwise gradient and His‐tag affinity chromatography.

Applicability and accuracy were validated by analyzing eight representative scFv sequences, including the clone used for protocol development (clone #1). The expression vectors of each clone were used to transform Brevibacillus, and colonies were picked from each plate. The eight scFv clones exhibited varying stability properties, with T m values ranging from 42.9 to 67.9°C. The data were compared with those obtained using conventional methodologies, which involved purification via open‐column IMAC, followed by fine purification using SEC (Figures S1 and S2). The T m values of all scFvs prepared using the high‐throughput system were consistent with those of scFvs prepared by the conventional method, with a correlation coefficient of 0.99 and a coefficient of determination of 0.97 (Figure 4, Table S2).

FIGURE 4. Plots of determined T m between the conventional sample purification method and high‐throughput plate‐scale treatments. n = 1 for the conventional method, n = 3 for the high‐throughput system.
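As an illustration of how agreement between two sets of T m values can be quantified, the sketch below computes the Pearson correlation coefficient and its square. The text does not state how the coefficient of determination was calculated, so the squared Pearson coefficient used here is an assumption, and the T m values in the example are hypothetical rather than taken from Table S2.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical Tm values (degC) for eight clones measured by two methods
conventional = [43.0, 48.5, 52.0, 55.5, 60.0, 62.5, 65.0, 68.0]
high_throughput = [43.4, 49.0, 51.5, 56.0, 59.5, 63.0, 64.5, 67.6]

r = pearson_r(conventional, high_throughput)
print("r =", round(r, 3), "r^2 =", round(r * r, 3))
```

For a simple linear regression the coefficient of determination equals the squared Pearson coefficient, so it cannot exceed the correlation coefficient in magnitude.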

In this manner, scFv clones were expressed, purified, and analyzed in each well on a per‐plate basis. For instance, by culturing four plates using a plate shaker, a dataset of 384 samples was obtained within 4 days.

2.2. Sequencing by direct PCR and nanopore sequencer

The sequence of each scFv‐coding region in the plasmid was determined by nanopore sequencing of direct PCR products from bacteria in culture. Previously, a multiplexed barcoding system for E. coli was developed (Currin et al., 2019). We applied this methodology to our system and simultaneously determined the sequences of up to 1152 samples from 12 culture plates. Samples containing multiple clones were excluded from the analysis. We compensated for the limited accuracy of nanopore sequencing by targeting approximately 1200 reads per sample for the 792‐bp region and performing alignment. This allowed for in‐house parallel processing of thermal stability and sequencing analyses in a cost‐effective and high‐throughput manner while maintaining accuracy. Among the valid data in this study, the lowest read number was 232 and the average was 1160.

2.3. Construction of a single mutant library of nivolumab scFv

As a practical example of this system, we comprehensively analyzed designed nivolumab single mutants for the improvement of thermal stability. A Rosetta‐based web tool application, stabilize‐PM (Lyskov et al., 2013; Thieker et al., 2022), was used for SSM calculations based on the input structure, which was modeled from the crystal structure (PDBID: 5WT9). Each calculation was repeated three times, and the minimum value among the repetitions, which was considered to represent the most stable structure among the possibilities, was used. To reduce the number of mutants for experimental validation, we set a threshold of ΔE of −2 Rosetta energy units (REU) or lower in the output. Thus, 213 single mutants were selected, representing the top 5.2% of the 4123 calculated mutants (217 residues, each substituted with the 19 other amino acids) (Table S3). For residue numbering, the Rosetta numbering from the SSM computation was used, starting at 1 for the first residue and increasing by one for each residue, regardless of the chain designation. The prefix H. or L. in each mutant name indicates the VH or VL domain.
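The selection rule described above (take the minimum ΔE over the repeated runs, then keep mutants at or below the threshold) can be sketched as follows. This is an illustrative sketch rather than the authors' script: the function name, the dictionary layout, and the example ΔE values are hypothetical. The threshold is applied as "−2 REU or lower", following the wording of this section.

```python
def select_mutants(scores, threshold=-2.0):
    """Keep mutants whose best (minimum) delta-E over repeats is <= threshold.

    scores: dict mapping a mutant name to a list of delta-E values (REU),
            one value per repeated calculation.
    Returns a dict mapping each selected mutant to its minimum delta-E.
    """
    selected = {}
    for mutant, repeats in scores.items():
        best = min(repeats)  # most stabilizing prediction among the repeats
        if best <= threshold:
            selected[mutant] = best
    return selected


# Hypothetical delta-E values from three repeated calculations per mutant
example_scores = {
    "mut_A": [-2.5, -3.1, -2.8],
    "mut_B": [-1.2, -2.1, -1.9],
    "mut_C": [0.4, -0.3, 0.1],
}
print(select_mutants(example_scores))

# Size of the calculated library and fraction selected, as stated in the text
print(217 * 19)                      # 4123
print(round(100 * 213 / 4123, 1))    # 5.2
```

Taking the minimum over repeats favors the most optimistic prediction for each substitution, so a mutant such as the hypothetical `mut_B` passes even though only one of its three runs falls below the threshold.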

2.4. Variation and determination of thermal stability of designed mutants

The T m values of 190 single mutants were determined, and 23 mutants did not appear during the iterations of the experiments performed in this study. Four mutants (H.E6T, H.C96S, H.C96T, and L.Q90C) were excluded from the data because of their low yield, based on a threshold of peak height > 1000 relative fluorescence units (RFU) on the DSF melting curves. The boxplots of the remaining 186 data points are shown in Figure 5. Most clones exhibited a small deviation, whereas some boxplots had outliers (such as H.G33H and H.G42H) or unusually widespread plots (such as H.G42D and H.Q82I). The median value was used to define the T m for each mutant (Table S4). Despite the stability screening by Rosetta, the T m of the designed single mutants ranged from −9.3 to +10.8°C relative to the wild‐type. There was a poor correlation between empirical ΔT m and calculated ΔE (Figures S3 and S4). The mutations that enhanced thermal stability (ΔT m > 2.0°C) were categorized, with a few exceptions, into three groups according to their substitution patterns: residues that were stabilized upon substitution with (a) a variety of amino acids, (b, c) negatively or positively charged amino acids, and (d) aromatic amino acids (Table 1, Figure 6).
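The filtering and summarization steps in this paragraph can be sketched as below. The data layout and function name are hypothetical, and the sketch makes one assumption that the text does not specify: the peak‐height threshold is applied to individual wells, and a mutant is dropped when none of its wells pass.

```python
from statistics import median


def summarize_tm(measurements, wt_tm, min_peak_height=1000.0):
    """Median Tm and delta-Tm per mutant after a peak-height filter.

    measurements: dict mapping a mutant name to a list of
                  (tm_celsius, peak_height_rfu) tuples, one per well.
    wt_tm:        reference Tm of the wild-type in degC.
    Assumption: wells with peak height <= min_peak_height are discarded,
    and mutants with no remaining wells are excluded.
    """
    summary = {}
    for mutant, wells in measurements.items():
        tms = [tm for tm, peak in wells if peak > min_peak_height]
        if not tms:
            continue  # treated as low yield, excluded from the dataset
        tm_median = median(tms)
        summary[mutant] = (tm_median, round(tm_median - wt_tm, 1))
    return summary


# Hypothetical replicate wells: (Tm in degC, peak height in RFU)
example = {
    "mut_A": [(55.0, 2000.0), (56.0, 2500.0), (40.0, 500.0)],
    "mut_B": [(50.0, 300.0)],
}
print(summarize_tm(example, wt_tm=49.0))
```

Using the median rather than the mean limits the influence of single outlier wells of the kind noted for some clones in Figure 5.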

FIGURE 5. Box plots of melting temperature (T m) of nivolumab scFv mutants.

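TABLE 1. Categorized stability hit mutants of nivolumab (ΔT m > 2.0°C) and their ΔT m (region: Chothia).

Category | Chain | Region | wt residue | mut residue | ΔT m (°C)
Any | H | HFR1 | D21 | E | 3.0
Any | H | HFR1 | D21 | Q | 3.0
Any | H | HFR1 | D21 | A | 2.8
Any | H | HFR1 | D21 | S | 2.6
Any | H | HFR1 | D21 | K | 2.4
Any | H | HFR1 | D21 | V | 2.2
Any | H | CDR‐H1 | S32 | R | 3.4
Any | H | CDR‐H1 | S32 | V | 3.0
Any | H | CDR‐H1 | S32 | M | 2.4
Any | H | CDR‐H1 | S32 | I | 2.2
Any | H | CDR‐H1 | S32 | K | 2.2
Any | H | HFR2 | G33 | T | 8.0
Any | H | HFR2 | G33 | N | 7.2
Any | H | HFR2 | G33 | H | 7.0
Any | H | HFR2 | G33 | V | 6.2
Any | H | HFR2 | G33 | M | 5.8
Any | H | HFR2 | G33 | D | 5.2
Any | H | HFR2 | G33 | L | 5.2
Any | H | HFR2 | G33 | P | 2.7
Any | H | HFR2 | G33 | E | 2.4
Any | H | HFR2 | G33 | A | 2.0
Any | H | HFR2 | G33 | I | 2.0
Charged | H | CDR‐H1 | T28 | D | 4.5
Charged | H | HFR2 | K43 | D | 2.0
Charged | H | HFR3 | S85 | D | 2.3
Charged | H | CDR‐H3 | N99 | E | 3.0
Charged | L | CDR‐L1 | S31 | H | 2.0
Charged | L | CDR‐L1 | A34 | K | 9.4
Charged | L | CDR‐L1 | A34 | R | 4.6
Charged | L | CDR‐L3 | S91 | R | 10.8
Charged | L | CDR‐L3 | S91 | K | 4.1
Aromatic | H | HFR2 | V37 | F | 2.4
Aromatic | H | HFR2 | V50 | Y | 7.2
Aromatic | H | HFR2 | V50 | F | 7.1
Others | H | HFR2 | G42 | T | 3.0
Others | L | LFR3 | F83 | T | 2.2
Others | L | CDR‐L3 | S91 | Q | 2.0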

FIGURE 6. Crystal structure of hPD‐1 (brown surface) and nivolumab (blue ribbon) (PDBID: 5WT9) with the stability‐enhancing positions found in this study. Residues that were stabilized when substituted with (a) any, (b) negatively charged, (c) positively charged, or (d) aromatic residues are shown in green, and uncategorized residues are shown in yellow.

2.5. Validation of mutants with improved stabilities

The three clones with the highest thermal stability were validated by DSF and differential scanning calorimetry (DSC) after conventional fine purification by SEC (Table 2, Figures S6 and S7). The thermal stability of the mutants relative to that of the wild‐type was precisely assayed using high‐throughput analysis.

TABLE 2. Melting temperature (T m) and T onset of the most stabilized mutants of nivolumab scFv (values in °C).

Clone | High‐throughput T m | High‐throughput ΔT m | Conventional DSF T m (n = 3) | Conventional DSF ΔT m | Conventional DSC T m (n = 3) | Conventional DSC ΔT m | Conventional DSC T onset (n = 3) | Conventional DSC ΔT onset
wt | 49.0 | | 46.5 ± 0.2 | | 51.2 ± 1.0 | | 39.1 ± 0.2 |
L.S91R | 59.2 | 10.2 | 59.6 ± 0.4 | 13.1 | 60.7 ± 0.2 | 9.5 | 50.1 ± 0.2 | 11.0
L.A34K | 58.4 | 9.4 | 58.1 ± 0.1 | 11.6 | 59.4 ± 0.1 | 8.2 | 49.5 ± 0.2 | 10.4
H.V50Y | 56.2 | 7.2 | 56.8 ± 0.2 | 10.3 | 58.5 ± 0.2 | 7.3 | 47.4 ± 0.2 | 8.3

Note: Median values are shown for the high‐throughput method, and mean ± SD (n = 3) for the conventional analyses. For conventional DSF, the T m measured at a sample concentration of 8 μM is reported.

3. DISCUSSION

In this study, we developed a high‐throughput system to analyze the thermodynamic stability of proteins quantitatively. The system allowed for the direct measurement of T m and thermograms using DSF for 384 different proteins within 4 days. DSF is a widely known high‐throughput stability analysis method for proteins that enables T m determination of samples prepared in a 96‐well plate within 2 h using a real‐time qPCR machine. Despite the fast and easy‐to‐handle methodology, its application using many samples on a plate has been primarily limited to buffer (Houser et al., 2021; Wu et al., 2023) or low‐molecular‐weight ligand (Huynh & Partch, 2015; Li & Zhang, 2021) screening for the same protein. This is partly because, although melting curve assays require the absence of impurities, sample preparation of various proteins on a plate scale is technically difficult because of the laborious nature of conventional purification methods. We overcame this limitation by ensuring high sample purity through plate‐level processing. Two‐step AS precipitation was optimized to efficiently remove soluble aggregates from the solution. This resulted in a high ratio of scFv monomers to impurities after the affinity chromatography. Finally, distinguishable melting curves were successfully obtained by reducing the initial fluorescence, enabling the determination of T m. DNA sequencing was conducted in parallel with the thermal stability analysis of the secreted proteins. Compared with conventional sequencing methods, direct PCR from host bacteria streamlines experiments by shortening the plasmid extraction procedure. Combined with long‐read DNA sequencing technology using a nanopore sequencer, this enables fast, cost‐effective, and accurate analyses. The limitation in raw‐read accuracy was overcome by consensus sequences derived from hundreds of raw reads (Karst et al., 2021).

Innovative methodologies have recently been developed for high‐throughput data collection of protein stability. For instance, a method based on interaction activity was used to define the thermostability of 2700 distinct scFvs by measuring the loss of activity after heat incubation (Harmalkar et al., 2023). Additionally, the latest developments in mega‐scale proteolytic analyses using cDNA display have provided abundant and insightful data for deciphering folding stability (Tsuboyama et al., 2023). Compared with these emerging high‐throughput methodologies, our system is advantageous in terms of data quality and applicability. These advantages include (a) direct observation of the thermal denaturation process by DSF, (b) complete independence from protein activity, and (c) expected adaptability to diverse proteins with multiple domains that can be expressed by Brevibacillus (Panda et al., 2014; Yao et al., 2020). The workflow is structured so that throughput can be increased further through automation with dispensing robots.

We applied this method to the miniaturized antibody format scFv. Screening scFvs for thermostability can expand their potential applications in novel therapeutics (Dotti et al., 2014; Duan et al., 2021; Holliger & Hudson, 2005; Husain & Ellerman, 2018) by addressing their inherent instability compared with full‐length IgG antibodies and other formats. As a practical example, nivolumab single mutants were thoroughly analyzed: of the 213 designed mutants, T m values were obtained for 186. Most of the mutants showed small variations, although there were some outliers. Concentration‐dependent thermostability was identified in the DSF analysis of the wild‐type (Figure S8). A correlation between peak height, which represents protein concentration, and T m value was observed in both the wild‐type and the mutants collected by the high‐throughput system (Figure S9). The relationship between these two factors may partly explain the variation in the mutational analysis. Furthermore, mutations detrimental to overall folding may result in minimal or no monomer yield in the final solution after purification, thus eliminating the mutant from the analysis. However, mutants that are expressed but incompletely folded or prone to colloidal instability may not be distinguishable using this system, potentially contributing to T m errors. Cross‐contamination during the plate‐to‐plate sample preparation should also be considered.

We quantitatively screened the nivolumab mutants for improved thermostability at different positions. For example, substituting various amino acids for the 33rd glycine in the heavy chain consistently improved thermal stability (Figure 6a). The replacement of some residues with negatively or positively charged or aromatic amino acids resulted in enhanced thermal stability, indicating the potential formation of new electrostatic or hydrophobic interactions within the molecule (Figure 6b–d). In the case of the 91st serine in the light chain, mutation into arginine improved stability by forming a salt bridge with L.D50 or L.D101. The tyrosine substitution at the 50th valine in the heavy chain was considered to create a π‐stacking interaction with the 94th tryptophan, contributing to enhanced stability (Figure S7). The T m validated by DSF for some of these mutants and the wild‐type differed from that measured by DSC (Table 2), a phenomenon also reported in another study (Shi et al., 2013). As indicated by that study, the T m value measured by DSF is lowered when T onset, the temperature at which protein denaturation begins, is lower. Similarly, the ΔT m values determined by DSF exhibited a closer correlation with the ΔT onset values than did the ΔT m values determined by DSC. The discrepancy between the T m values validated by DSF and DSC for each construct would vary depending on how strongly the mutation affects T onset.

A poor correlation between the predicted stability scores and experimental thermostability was observed in our study, demonstrating the complexity of predicting thermodynamic denaturation based on the structure information and statistical functions (Figures S3–S5).

Although our system demonstrates accurate high‐throughput T m screening of different proteins, it has certain limitations. First, our demonstration was limited to the analysis of the antibody format, scFv, and its expected applicability to other protein families was not experimentally validated. Although Brevibacillus has proven high expression ability and efficacy for proteins, including various enzymes and antibody formats, such as VHH, scFv, and Fab (Mizukami et al., 2015, 2018; Mu et al., 2013; Onishi et al., 2013), not as much knowledge and experience has been accumulated as in E. coli. Moreover, the thermal stability dataset we constructed relied on the initial in silico mutation screening by Rosetta. Here, we found various substitution patterns in the analysis of nivolumab. However, although we successfully obtained a dataset with an abundance of stabilizing mutants (46% of the mutants showed improved T m), design biases may have been introduced by its scoring algorithm. Furthermore, the loss of scFv activity due to mutations was not considered in this study. Recently, our twin system BreviA (Matsunaga et al., 2023) enabled the high‐throughput interaction kinetic analysis of Fab. Combinatory analyses will facilitate the simultaneous optimization of interaction kinetics and thermal stability, deciphering the quantitative effects of mutations on different physicochemical parameters.

4. CONCLUSION

Data‐driven approaches for protein design are becoming increasingly popular. Although versatile thermal stability assessment tools have been developed to date, there is a lack of uniform data collection methods, especially for complex multidomain proteins, which has hindered their validation and optimization. To address these limitations, it is crucial to (1) design datasets based on individual strategies; (2) incorporate diverse proteins, including multi‐domain families; (3) acquire useful parameters for prediction under uniform conditions; and (4) manage the accuracy and abundance of data. We specifically designed a system to meet these requirements. Our system successfully measured the T m of single mutants of nivolumab scFv, deciphering the temperature‐dependent mutational effects on stability. We expect to facilitate the prediction of thermal stability of proteins by building upon this technical breakthrough.

5. MATERIALS AND METHODS

5.1. Design of nivolumab scFv mutants using Rosetta

We designed single mutants of nivolumab scFv with the PM application (Thieker et al., 2022) using the publicly accessible web server, ROSIE2 (Lyskov et al., 2013; Thieker et al., 2022). Before the energy calculations by Rosetta, the structure of nivolumab scFv was modeled by deleting the constant regions from the crystal structure of the nivolumab Fab fragment (PDBID: 5WT9), and relaxed structures were generated in five replicate calculations using MODELLER 10.4 software (Webb & Sali, 2016). The structure with the lowest energy score, indicating the highest structural stability, was chosen as the input structure for subsequent calculations by Rosetta. Saturation mutagenesis was conducted for the full sequence three times, and the minimum ΔE score for each amino acid substitution was used. Only the mutants with ΔE < −2.0 REU were tested experimentally (lower values indicate greater predicted stability).

5.2. Construction of scFv vectors

The nucleotide sequences of the scFv antibodies were codon‐optimized and synthesized by Twist Bioscience. The variable regions of each antibody, VH and VL sequences, were connected by a (GGGGS)×3 linker, followed by a 6×His tag at the C‐terminus, and inserted into the pNI vector (Takara Bio). The N‐terminus of the scFv was fused to the signal peptide sequence of pBIC3 (TaKaRa Bio, Shiga, Japan).

5.3. Construction of nivolumab mutant plasmid libraries

Double‐stranded DNA fragments (300 bp) containing each nivolumab mutation were synthesized by Twist Bioscience. The fragments with single mutations were divided into three regions to cover the entire nivolumab scFv region. Fragments from the same region were mixed and assembled with a linearized vector amplified from the nivolumab scFv vector containing all but the fragment region, using the NEBuilder HiFi DNA Assembly Master Mix (NEB). The reaction mixture was used to transform E. coli JM109 competent cells. All colonies that appeared on the agar plate containing ampicillin sodium were collected, and the mutant plasmid library was extracted from the collected bacteria.

5.4. Expression of scFvs

The plasmid libraries were used to transform Brevibacillus competent cells (Higeta Shoyu), and the bacteria were cultivated overnight on an agar plate containing neomycin sulfate. Each colony was inoculated and cultured at a 1 mL scale in a 96‐well deep‐well plate. We used 2SY medium supplemented with 10 g/L L‐proline and 200 mM L‐arginine hydrochloride for this study. The bacteria were incubated at 1000 rpm at 30°C for 60 h in a plate incubator (MBR‐034P, TAITEC) covered with a gas‐permeable seal. Bacterial cultures were sampled and diluted for nanopore sequencing, and the remaining medium was subjected to AS precipitation. For sequencing, 2 μL of the culture medium was diluted 100‐fold (stored at −30°C). The remaining medium was centrifuged at 1500 × g for 30 min, and 600 μL of supernatant was used for the following protein purification steps.

5.5. Nanopore sequencing

We used a sequencing system based on Currin's concept (Currin et al., 2019), with modifications. Initially, PCR was performed in a 384‐well plate format using TaKaRa Ex Premier DNA Polymerase (Takara Bio), with a diluted culture in each well as a template. The barcoded primer sequences used in this study are listed in Table S5. The forward primer positions were identical for each culture plate, whereas the reverse primer positions differed. This allowed a maximum of 12 culture plates to be analyzed simultaneously. The resulting PCR products were mixed and purified using NucleoSpin Gel and PCR Clean‐up (Macherey‐Nagel). Subsequently, sequencing libraries were prepared from the purified DNA solution using a Ligation Sequencing Kit V14 (Oxford Nanopore Technologies). The sequencing library was loaded into a Flow Cell (R10.4.1) (Oxford Nanopore Technologies), and a sequencing run was performed using MinION (Oxford Nanopore Technologies). Data collection and basecalling were performed using MinKNOW software (Oxford Nanopore Technologies). The resulting FASTQ files were classified by barcode using a custom‐made Python program. Sequence clustering was performed, and consensus sequences for each cluster were obtained using SeekDeep (Hathaway et al., 2018). The otu option was set to 98% homology to form clusters of sequences. The monoclonality of each sample was determined based on inter‐cluster and intra‐cluster mixing scores. The inter‐cluster mixing score was calculated by dividing the number of reads in the second cluster by the number of reads in the top cluster, with a threshold value of 0.1. If there were no reads in the second cluster, the calculation was performed as if the number of reads in the second cluster was one. The intra‐cluster mixing score was determined by identifying the highest mutation ratio in the consensus sequence of the top cluster using the SNP information output from the --writeOutFinalInternalSnps option. A threshold value of 0.2 was used. Only monoclonal samples were included in the dataset.
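The two monoclonality scores described above can be sketched as follows. This is not the custom program used in the study: the function names and input layout are hypothetical, and the sketch assumes that a sample passes when both scores are strictly below their thresholds, since the text does not state whether the comparison is inclusive.

```python
def inter_cluster_score(cluster_read_counts):
    """Reads in the second-largest cluster divided by reads in the largest.

    If there is no second cluster (or it is empty), its read count is
    treated as one, as described in the text.
    """
    counts = sorted(cluster_read_counts, reverse=True)
    if not counts or counts[0] <= 0:
        raise ValueError("the top cluster must contain at least one read")
    second = counts[1] if len(counts) > 1 and counts[1] > 0 else 1
    return second / counts[0]


def is_monoclonal(cluster_read_counts, top_cluster_snp_ratios,
                  inter_threshold=0.1, intra_threshold=0.2):
    """Classify a sample as monoclonal from the two mixing scores.

    top_cluster_snp_ratios: mutation ratios at variable positions of the
    top cluster's consensus (hypothetical input; empty if no SNPs found).
    Assumption: passing means strictly below each threshold.
    """
    inter = inter_cluster_score(cluster_read_counts)
    intra = max(top_cluster_snp_ratios, default=0.0)
    return inter < inter_threshold and intra < intra_threshold


print(is_monoclonal([1160], []))             # single clean cluster
print(is_monoclonal([1000, 200], [0.05]))    # second cluster too large
print(is_monoclonal([1000, 20], [0.35]))     # mixed positions in top cluster
```

The inter‐cluster score catches wells containing two clearly different sequences, whereas the intra‐cluster score catches mixtures of near‐identical sequences, such as two single mutants that fall into the same 98% cluster.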

5.6. Ammonium sulfate treatment and affinity chromatography

The supernatant from each well was purified using two methods. First, stepwise saturated‐AS precipitation was conducted with concentrations from 30% to 60%. Specifically, 600 μL of supernatant was initially treated with 257 μL of saturated AS in each well of a 96‐well plate. After centrifugation at 1500 × g for 30 min, the supernatant was transferred to a new plate. In the second step, 643 μL of saturated AS solution was added, and the plate was centrifuged at 1500 × g for 30 min to collect the precipitate containing scFvs. The precipitates were dissolved in 600 μL phosphate‐buffered saline (PBS) (50 mmol/L NaCl, 81 mmol/L Na2HPO4, 26.8 mmol/L KCl, 14.7 mmol/L KH2PO4). Subsequently, further purification was achieved using His MultiTrap TALON (Cytiva) according to the designated protocol with PBS as the wash buffer, and scFvs were eluted from the resin with 100 μL of elution buffer (PBS, 30 mM EDTA).
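The volume of saturated AS needed to reach a target saturation follows from a simple mixing balance, sketched below. The function name is hypothetical, and the calculation assumes that volumes are additive and that the whole supernatant is carried over between steps, neither of which is stated in the protocol.

```python
def saturated_as_to_add(volume_ul, current_fraction, target_fraction):
    """Volume of saturated AS (uL) to add to reach target_fraction (v/v).

    Mixing balance, assuming additive volumes:
        (current_fraction * V + x) / (V + x) = target_fraction
    which gives
        x = V * (target_fraction - current_fraction) / (1 - target_fraction)
    """
    if not 0.0 <= current_fraction < target_fraction < 1.0:
        raise ValueError("need 0 <= current < target < 1")
    return volume_ul * (target_fraction - current_fraction) / (1.0 - target_fraction)


# First step: 600 uL of culture supernatant brought from 0% to 30% saturation
step1 = saturated_as_to_add(600.0, 0.0, 0.30)
print(round(step1))  # 257

# Second step: the 30% supernatant (assumed to be fully transferred) to 60%
step2 = saturated_as_to_add(600.0 + step1, 0.30, 0.60)
print(round(step2))
```

The first step reproduces the 257 μL stated in the protocol; the second call shows how the same relation applies to the step from 30% to 60% saturation under the stated assumptions.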

5.7. Size exclusion chromatography

Fine purification by SEC was performed using a HiLoad 16/600 Superdex 75‐pg column (Cytiva) equilibrated with PBS.

5.8. Differential scanning fluorometry

DSF thermal stability measurements were performed using a CFX Real‐Time PCR System (Bio‐Rad). In the high‐throughput system, 1 μL of 100× SYPRO Orange stain was mixed with 19 μL of the scFv eluate from the IMAC resin, which contained EDTA and Co2+. Fluorescence was scanned every minute while the samples were heated at 1.0°C/min to obtain melting curves. T m was calculated by identifying the temperature at which the first derivative of the melting curve was minimized within the range of 30–80°C. The peak height was determined by subtracting the lowest RFU from the highest RFU in the thermogram. In the conventional method, scFv was prepared in a twofold dilution series over a concentration range of 1–8 μM.
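The T m and peak‐height definitions above can be illustrated with a short sketch. It is not the analysis code used in the study: the function names are hypothetical, the melting curve is synthetic, and the sketch assumes that the "first derivative" refers to the negative derivative −d(RFU)/dT commonly plotted by qPCR software, so that the unfolding transition appears as a minimum.

```python
import math


def first_derivative(temps, rfu):
    """dRFU/dT by central differences (one-sided at the two end points)."""
    n = len(temps)
    deriv = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        deriv.append((rfu[hi] - rfu[lo]) / (temps[hi] - temps[lo]))
    return deriv


def melting_temperature(temps, rfu, t_min=30.0, t_max=80.0):
    """Temperature at which -dRFU/dT is minimal within [t_min, t_max].

    Assumption: the sign convention is -dRFU/dT, so the steepest increase
    in fluorescence corresponds to the minimum of the plotted trace.
    """
    neg_deriv = [-d for d in first_derivative(temps, rfu)]
    window = [(nd, t) for nd, t in zip(neg_deriv, temps) if t_min <= t <= t_max]
    return min(window)[1]


def peak_height(rfu):
    """Highest RFU minus lowest RFU in the thermogram."""
    return max(rfu) - min(rfu)


# Synthetic sigmoidal melting curve with its midpoint at 55 degC,
# sampled every 1 degC (one reading per minute at 1.0 degC/min)
temps = [25.0 + i for i in range(71)]
rfu = [1000.0 / (1.0 + math.exp(-(t - 55.0) / 2.0)) for t in temps]

print(melting_temperature(temps, rfu))  # 55.0
print(peak_height(rfu) > 1000.0)        # False: just under the 1000 RFU amplitude
```

Restricting the search to 30–80°C avoids picking up derivative extrema caused by initial fluorescence decay or by post‐transition aggregation at the ends of the scan.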

5.9. Differential scanning calorimetry

DSC thermal stability measurements were conducted using a MicroCal PEAQ‐DSC instrument (Malvern). Here, 350 μL of 0.5 mg/mL scFv was prepared as a sample in PBS and heated from 10 to 100°C at a rate of 1°C/min. Data analysis was performed using the MicroCal PEAQ‐DSC 1.4 software (Malvern). The sample thermograms from 30 to 90°C were normalized by spline baseline subtraction with the response of the reference buffer. Curve fitting was performed using a non‐two‐state model.

AUTHOR CONTRIBUTIONS

Sae Ito: Conceptualization; methodology; investigation; validation; data curation; visualization; writing – original draft; writing – review and editing; formal analysis. Ryo Matsunaga: Conceptualization; methodology; funding acquisition; writing – original draft; writing – review and editing; formal analysis; data curation. Makoto Nakakido: Funding acquisition; writing – review and editing; methodology; project administration. Daisuke Komura: Writing – review and editing; resources. Hiroto Katoh: Writing – review and editing; resources. Shumpei Ishikawa: Funding acquisition; writing – review and editing; project administration; resources. Kouhei Tsumoto: Writing – review and editing; project administration; supervision; funding acquisition; conceptualization; resources.

Supporting information

APPENDIX S1. Supporting information.

PRO-33-e5029-s002.xlsx (30.7KB, xlsx)

TABLE S1. Eight scFv sequences used for accuracy validation of the high‐throughput thermal stability analysis system.

TABLE S2. Comparison of determined T m between the conventional sample preparation method and high‐throughput system.

TABLE S3. Results of the site saturation mutagenesis calculated by Rosetta.

TABLE S4. Determined T m of 186 single mutants of nivolumab scFv.

TABLE S5. Barcode primers used for the nanopore sequencing.

FIGURE S1. SEC chromatograms of scFvs prepared by the conventional method.

FIGURE S2. DSF thermograms of scFvs prepared by the conventional method in a concentration range of 1–8 μM (gray) and in the high‐throughput system (red).

FIGURE S3. Plot of the predicted scores ΔE and experimentally determined T m.

FIGURE S4. Heat map of the predicted scores ΔE and experimentally determined T m.

FIGURE S5. Plot of the predicted ΔT m by HoTMuSiC (v3.0 r3) and experimentally determined ΔT m.

FIGURE S6. Thermograms of stabilized mutants of nivolumab scFv measured by DSF.

FIGURE S7. Thermograms of nivolumab scFv wild‐type and stabilized mutants measured by DSC.

FIGURE S8. Concentration‐dependent thermostability of nivolumab scFv wild‐type identified in the DSF analysis.

FIGURE S9. Boxplots of T m of wild‐type and mutants that showed large variation, with a data range exceeding 3.0°C between the highest and lowest T m values.

PRO-33-e5029-s001.pdf (1.7MB, pdf)

ACKNOWLEDGMENTS

This work was supported by AMED under Grant Numbers JP19am0401010h0005 (to M.N. and S.I.), JP23tk0124002h0002 (to M.N. and S.I.), JP223fa627001 (UTOPIA) (to K.T.), JP223fa727002 (SCARDA) (to K.T.), JST ACT‐X under Grant Number JPMJAX222I (to R.M.), and the MEXT Data Creation and Utilization‐Type Material Research and Development Project under Grant Number JPMXP1122714694 (to K.T.). Brevibacillus competent cells were supplied by Higeta Shoyu Co., Ltd. Supercomputing resources used in this study were provided by the Human Genome Center of the Institute of Medical Science at the University of Tokyo.

Ito S, Matsunaga R, Nakakido M, Komura D, Katoh H, Ishikawa S, et al. High‐throughput system for the thermostability analysis of proteins. Protein Science. 2024;33(6):e5029. doi:10.1002/pro.5029

Reviewing Editor: Aitziber L. Cortajarena

Contributor Information

Ryo Matsunaga, Email: ryo-matsunaga@g.ecc.u-tokyo.ac.jp.

Kouhei Tsumoto, Email: tsumoto@bioeng.t.u-tokyo.ac.jp.

REFERENCES

  1. Alford RF, Leaver‐Fay A, Jeliazkov JR, O'Meara MJ, DiMaio FP, Park H, et al. The Rosetta all‐atom energy function for macromolecular modeling and design. J Chem Theory Comput. 2017;13:3031–3048.
  2. Brummitt RK, Nesta DP, Chang L, Chase SF, Laue TM, Roberts CJ. Nonnative aggregation of an IgG1 antibody in acidic conditions: part 1. Unfolding, colloidal interactions, and formation of high‐molecular‐weight aggregates. J Pharm Sci. 2011;100:2087–2103.
  3. Currin A, Swainston N, Dunstan MS, Jervis AJ, Mulherin P, Robinson CJ, et al. Highly multiplexed, fast and accurate nanopore sequencing for verification of synthetic DNA constructs and sequence libraries. Synth Biol. 2019;4:ysz025. doi:10.1093/synbio/ysz025
  4. Dotti G, Gottschalk S, Savoldo B, Brenner MK. Design and development of therapies using chimeric antigen receptor‐expressing T cells. Immunol Rev. 2014;257:107–126.
  5. Duan Y, Chen R, Huang Y, Meng X, Chen J, Liao C, et al. Tuning the ignition of CAR: optimizing the affinity of scFv to improve CAR‐T therapy. Cell Mol Life Sci. 2021;79:14.
  6. Fleishman SJ, Baker D. Role of the biomolecular energy gap in protein design, structure, and evolution. Cell. 2012;149:262–273.
  7. Gligorijević V, Renfrew PD, Kosciolek T, Leman JK, Berenberg D, Vatanen T, et al. Structure‐based protein function prediction using graph convolutional networks. Nat Commun. 2021;12:3168.
  8. Harmalkar A, Rao R, Richard Xie Y, Honer J, Deisting W, Anlahr J, et al. Toward generalizable prediction of antibody thermostability using machine learning on sequence and structure features. MAbs. 2023;15:2163584.
  9. Hathaway NJ, Parobek CM, Juliano JJ, Bailey JA. SeekDeep: single‐base resolution de novo clustering for amplicon deep sequencing. Nucleic Acids Res. 2018;46:e21.
  10. Holliger P, Hudson PJ. Engineered antibody fragments and the rise of single domains. Nat Biotechnol. 2005;23:1126–1136.
  11. Houser J, Kosourova J, Kubickova M, Wimmerova M. Development of 48‐condition buffer screen for protein stability assessment. Eur Biophys J. 2021;50:461–471.
  12. Huang T, Li Y. Current progress, challenges, and future perspectives of language models for protein representation and protein design. Innovation (Camb). 2023;4:100446.
  13. Husain B, Ellerman D. Expanding the boundaries of biotherapeutics with bispecific antibodies. BioDrugs. 2018;32:441–464.
  14. Huynh K, Partch CL. Analysis of protein stability and ligand interactions by thermal shift assay. Curr Protoc Protein Sci. 2015;79:28.9.1–28.9.14. doi:10.1002/0471140864.ps2809s79
  15. Karst SM, Ziels RM, Kirkegaard RH, Sørensen EA, McDonald D, Zhu Q, et al. High‐accuracy long‐read amplicon sequences using unique molecular identifiers with nanopore or PacBio sequencing. Nat Methods. 2021;18:165–169.
  16. Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, et al. Machine learning‐guided protein engineering. ACS Catal. 2023;13:13863–13895.
  17. Kulshreshtha S, Chaudhary V, Goswami GK, Mathur N. Computational approaches for predicting mutant protein stability. J Comput Aided Mol Des. 2016;30:401–412.
  18. Li Z, Yang Y, Zhan J, Dai L, Zhou Y. Energy functions in de novo protein design: current challenges and future prospects. Annu Rev Biophys. 2013;42:315–335.
  19. Li G, Yao S, Fan L. ProSTAGE: predicting effects of mutations on protein stability by using protein embeddings and graph convolutional networks. J Chem Inf Model. 2024;64:340–347.
  20. Li X, Zhang C. Using differential scanning fluorimetry (DSF) to detect ligand binding with purified protein. Methods in molecular biology. Clifton, NJ/New York, NY: Springer US; 2021. p. 183–186.
  21. Louis BBV, Abriata LA. Reviewing challenges of predicting protein melting temperature change upon mutation through the full analysis of a highly detailed dataset with high‐resolution structures. Mol Biotechnol. 2021;63:863–884.
  22. Lyskov S, Chou F‐C, Conchúir SÓ, Der BS, Drew K, Kuroda D, et al. Serverification of molecular modeling applications: the Rosetta online server that includes everyone (ROSIE). PLoS One. 2013;8:e63906.
  23. Matsunaga R, Ujiie K, Inagaki M, Fernández Pérez J, Yasuda Y, Mimasu S, et al. High‐throughput analysis system of interaction kinetics for data‐driven antibody design. Sci Rep. 2023;13:19417. doi:10.1038/s41598-023-46756-y
  24. Melien R, Garidel P, Hinderberger D, Blech M. Thermodynamic unfolding and aggregation fingerprints of monoclonal antibodies using thermal profiling. Pharm Res. 2020;37:78. doi:10.1007/s11095-020-02792-1
  25. Miotto M, Olimpieri PP, Di Rienzo L, Ambrosetti F, Corsi P, Lepore R, et al. Insights on protein thermal stability: a graph representation of molecular interactions. Bioinformatics. 2019;35:2569–2577.
  26. Mizukami M, Onishi H, Hanagata H, Miyauchi A, Ito Y, Tokunaga H, et al. Efficient production of trastuzumab fab antibody fragments in Brevibacillus choshinensis expression system. Protein Expr Purif. 2018;150:109–118.
  27. Mizukami M, Tokunaga H, Onishi H, Ueno Y, Hanagata H, Miyazaki N, et al. Highly efficient production of VHH antibody fragments in Brevibacillus choshinensis expression system. Protein Expr Purif. 2015;105:23–32.
  28. Mu T, Liang W, Ju Y, Wang Z, Wang Z, Roycik MD, et al. Efficient soluble expression of secreted matrix metalloproteinase 26 in Brevibacillus choshinensis. Protein Expr Purif. 2013;91:125–133.
  29. Nikam R, Kulandaisamy A, Harini K, Sharma D, Gromiha MM. ProThermDB: thermodynamic database for proteins and mutants revisited after 15 years. Nucleic Acids Res. 2021;49:D420–D424.
  30. Onishi H, Mizukami M, Hanagata H, Tokunaga M, Arakawa T, Miyauchi A. Efficient production of anti‐fluorescein and anti‐lysozyme as single‐chain anti‐body fragments (scFv) by Brevibacillus expression system. Protein Expr Purif. 2013;91:184–191.
  31. Panda AK, Bisht SS, DeMondal S, Senthil Kumar N, Gurusubramanian G, Panigrahi AK. Brevibacillus as a biological tool: a short review. Antonie Van Leeuwenhoek. 2014;105:623–639.
  32. Pucci F, Bourgeas R, Rooman M. Predicting protein thermal stability changes upon point mutations using statistical potentials: introducing HoTMuSiC. Sci Rep. 2016;6:23257. doi:10.1038/srep23257
  33. Pucci F, Rooman M. Stability curve prediction of homologous proteins using temperature‐dependent statistical potentials. PLoS Comput Biol. 2014;10:e1003689.
  34. Sanfelice D, Temussi PA. Cold denaturation as a tool to measure protein stability. Biophys Chem. 2016;208:4–8.
  35. Shi S, Semple A, Cheung J, Shameem M. DSF method optimization and its application in predicting protein thermal aggregation kinetics. J Pharm Sci. 2013;102:2471–2483.
  36. Thieker DF, Maguire JB, Kudlacek ST, Leaver‐Fay A, Lyskov S, Kuhlman B. Stabilizing proteins, simplified: a Rosetta‐based webtool for predicting favorable mutations. Protein Sci. 2022;31:e4428.
  37. Tsuboyama K, Dauparas J, Chen J, Laine E, Mohseni Behbahani Y, Weinstein JJ, et al. Mega‐scale experimental analysis of protein folding stability in biology and design. Nature. 2023;620:434–444.
  38. Wang W. Protein aggregation and its inhibition in biopharmaceutics. Int J Pharm. 2005;289:1–30.
  39. Webb B, Sali A. Comparative protein structure modeling using MODELLER. Curr Protoc Protein Sci. 2016;86:2.9.1–2.9.37.
  40. Wu T, Hornsby M, Zhu L, Yu JC, Shokat KM, Gestwicki JE. Protocol for performing and optimizing differential scanning fluorimetry experiments. STAR Protoc. 2023;4:102688.
  41. Xavier JS, Nguyen T‐B, Karmarkar M, Portelli S, Rezende PM, Velloso JPL, et al. ThermoMutDB: a thermodynamic database for missense mutations. Nucleic Acids Res. 2021;49:D475–D479.
  42. Yao D, Zhang K, Wu J. Available strategies for improved expression of recombinant proteins in Brevibacillus expression system: a review. Crit Rev Biotechnol. 2020;40:1044–1058.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society
