Author manuscript; available in PMC: 2020 Nov 12.
Published in final edited form as: Anal Chim Acta. 2019 Jul 10;1081:138–145. doi: 10.1016/j.aca.2019.07.007

Rapid, quantitative determination of aggregation and particle formation for antibody drug conjugate therapeutics with label-free Raman spectroscopy

Chi Zhang 1, Jeremy Samuel Springall 2,*, Xiangyang Wang 2, Ishan Barman 1,3,4,*
PMCID: PMC6750807  NIHMSID: NIHMS1535704  PMID: 31446951

Abstract

Lot release and stability testing of biologics are essential parts of the quality control strategy for ensuring therapeutic material dosed to patients is safe and efficacious, and consistent with previous clinical and toxicological experience. Characterization of protein aggregation is of particular significance, as aggregates may lose the intrinsic pharmaceutical properties as well as engage with the immune system, instigating undesirable downstream immunogenicity. While important, real-time identification and quantification of subvisible particles in monoclonal antibody (mAb) drug products remains inaccessible with existing techniques due to limitations in measurement time, sensitivity or experimental conditions. Here, owing to its exquisite molecular specificity, non-perturbative nature and lack of sample preparation requirements, we propose label-free Raman spectroscopy in conjunction with multivariate analysis as a solution to this unmet need. By leveraging subtle, but consistent, differences in vibrational modes of the biologics, we have developed a support vector machine-based regression model that provides fast, accurate prediction across a wide range of protein aggregation levels. Moreover, in blinded experiments, the model shows the ability to precisely differentiate between aggregation levels in mAb-like product samples pre- and post-isothermal incubation, where an increase in aggregate levels was experimentally determined. In addition to offering fresh insights into mAb-like product-specific aggregation mechanisms that can improve engineering of new protein therapeutics, our results highlight the potential of Raman spectroscopy as an in-line analytical tool for monitoring protein particle formation.

Keywords: Raman spectroscopy, protein aggregation, support vector machine

INTRODUCTION

Since their first licensing for clinical use nearly three decades ago, monoclonal antibodies (mAbs) have offered a powerful therapeutic route for targeting specific mutations and defects in protein structure and expression. The high specificity and affinity of mAbs have catalyzed their development for treating a wide range of pathologies [1], such as cancers, infectious diseases and inflammatory conditions, making them the fastest growing group of biotechnology-derived molecules in clinical trials [2]. By the end of 2017, 57 mAbs and 11 biosimilars had been approved by the U.S. Food and Drug Administration (FDA) and European Medicines Agency (EMA) [3] with the global value of the market estimated to be $20 billion per year [4].

However, the production of these therapeutic antibodies requires the use of very large cultures of mammalian cells followed by extensive purification steps leading to extremely high production costs. This is exacerbated by the lack of suitable metrology tools for rapid characterization of key attributes of the biologic product that are directly linked to its safety and efficacy. For instance, there is a critical unmet need to rapidly assess antibody stability during the development and manufacturing phases of a mAb product. Of all possible instabilities, protein aggregation presents a singular challenge. In addition to the mAb-specific aggregation propensity [5], the interplay of physicochemical parameters such as protein and ion concentrations, particulate contamination, pH, and temperature plays a key role in inducing aggregation in therapeutic formulations. Severe protein aggregation (resulting in protein particles) could lead mAbs to lose their pharmaceutical properties, hinder various upstream/downstream processes, and even stimulate immune response in patients causing harmful effects [6]. Although the latter response is poorly defined, there is emerging evidence of a difference in immune response for aggregated material in comparison to non-aggregated material [7,8]. By achieving rapid and accurate aggregation evaluation, the downstream purification of biopharmaceuticals can be optimized in real-time with the ultimate goal of enhancing the product quality during manufacturing campaigns.

High-performance size exclusion chromatography (HP-SEC) is widely employed for detailed characterization of therapeutic proteins and is often considered the reference method for qualitative and quantitative evaluation of aggregates [9–11]. Besides HP-SEC, sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) [12], asymmetrical field flow fractionation (AF4) [13], fluorescence spectroscopy [14], circular dichroism (CD) [15,16] and dynamic light scattering (DLS) [17] are also applied for quantification of protein aggregation. In addition, mass spectrometry and its variants are extensively used to confirm the fragment masses and, increasingly, to characterize oligomeric protein aggregates [18,19]. Although these techniques have been employed for protein aggregation characterization under various conditions, each has well-known limitations. For instance, mass spectrometry can be time-consuming and requires substantial expertise to properly execute. Meanwhile, DLS has relatively low sensitivity, with the obtained size distribution being biased towards larger particles owing to the dependence of the intensity on the sixth power of the diameter [20]. This is particularly limiting in quantifying solutions with low aggregation content. On the other hand, additional (often complicated) sample-specific preparation is needed for enabling fluorescence measurements. For the most widely employed technique, HP-SEC, the major limitation is that the run times can be long, making the analysis time-consuming and difficult to employ as an in-line mAb evaluation tool. Overall, current limitations of the analytical tools used to detect subvisible particles in mAb drug products make meaningful measurements challenging, with different methods often offering widely conflicting results [21].
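To make the sixth-power dependence concrete, the short sketch below compares the scattered intensity of two particle sizes under the stated I ∝ d^6 scaling. The two diameters are hypothetical values chosen only for illustration and are not measurements from this study.

```python
def relative_scattering(d_large_nm: float, d_small_nm: float) -> float:
    """Ratio of scattered intensities for two diameters, assuming I is proportional to d**6."""
    return (d_large_nm / d_small_nm) ** 6


# Hypothetical diameters, chosen only to illustrate the scaling.
monomer_nm = 10.0
aggregate_nm = 100.0
ratio = relative_scattering(aggregate_nm, monomer_nm)
print(f"One aggregate scatters like {ratio:.0f} monomers")
```

Under this scaling a tenfold larger particle dominates the signal by six orders of magnitude, which is why an intensity-weighted size distribution can hide a small aggregate fraction behind a few large particles.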

Development of an in-line method to support real-time analysis requires a rapid, label-free technique that can detect and quantify low aggregation levels under standard mAb manufacturing conditions. We propose label-free Raman spectroscopy (RS) as a solution to this unmet need based on the wealth of molecular information encoded in the vibrational spectrum. Protein secondary and tertiary structure is extensively studied through analysis of Raman peak positions and ratios of spectral features that characterize amide-I, amide-II, amide-III, and other backbone vibrations [22–24]. RS has also been successfully leveraged to investigate conformational changes in single protein crystals [25] as well as to monitor lyophilization [26] and side chain conformation [27]. While RS has been previously reported for the elucidation of conformational transitions and aggregation mechanisms in antibodies [28–31], its quantitative power has yet to be utilized for real-time determination of aggregation and to reveal how physicochemical properties may impact the formation of high molecular weight species.

Here, we seek to significantly extend the RS approach by combining it with multivariate analysis of frequent patterns expressed in the spectral profiles for predicting aggregation level of mAb-like products, with minimal sample preparation and in a near real-time manner. For this pilot study, antibody drug conjugate (ADC) samples were first measured by HP-SEC to determine the initial aggregation level. ADCs, which are mAbs attached to cytotoxic drugs by chemical linkers with labile bonds, present a particularly important and understudied cohort for determining protein aggregation. Following fractionation of the initial ADC sample solution, a series of mixture samples was established with various levels of aggregate content. Raman spectral data were acquired from these characterized mixture samples using a spectrometer customized for solution measurements. In addition, part of the initial ADC sample solution was stressed at 40°C for a month to generate product degradation at a faster rate than under the anticipated storage condition. Spectroscopic measurements were subsequently performed on the paired set of pre- and post-stress samples.

Two separate studies were performed with the acquired data: (A) Mixture sample study: this featured cross-validation and blinded prediction on a series of ADC mixture samples with different aggregation levels; and (B) Thermal stressing study: this involved independent prediction of protein aggregation level on ADC samples pre- and post-isothermal incubation. For study (A), principal component analysis (PCA) and 2D correlation analysis (2DCOS) were first used to reveal the subtle, but consistent, spectral changes observed with increasing proportion of mAb-like product aggregation. Without any prior knowledge of the amino acid sequence of the tested antibodies, we next developed a support vector machine (SVM)-based regression algorithm for estimation of a wide range of protein aggregation levels after training it with established aggregation mixture samples. For the thermal stressing study (B), the SVM-derived decision algorithm developed in the former study was imported to predict the (differential) levels of aggregation in the pre- and post-isothermal incubation ADC samples. The collective findings of our study pave the way for real-time, in-line mAb aggregation measurements with RS and, when viewed together with recent reports [31–34], support the expanded use of vibrational spectroscopy as a process analytical tool in the development and manufacturing of protein drugs and biosimilars.

MATERIALS AND METHODS

Sample Preparation.

To achieve the goals outlined for studies (A) and (B), multiple mAb samples were derived from an antibody-drug conjugate (ADC) that was kindly provided by MedImmune, LLC (Gaithersburg, MD, USA). Since the aggregation mechanism (and the corresponding spectral changes) are highly mAb-specific (as demonstrated in earlier reports [31]), we have focused on developing an accurate classifier and deriving mechanistic insights for aggregation determination for this biologic sample set. The parent ADC sample was first tested by high-performance size exclusion chromatography (HP-SEC) to determine the protein aggregation level. To facilitate the thermal stressing study, 10 mL of the original sample was incubated at 40°C for one month. The 40°C/75% relative humidity 1-month treatment is a recognized accelerated stability condition from the ICH Q5C [35]. In contrast to some of the prior studies that involved Raman measurements of protein aggregation [28–31], the temperature used in this investigation is significantly below the melting temperature (Tm) of typical mAbs and, hence, is not expected to produce any alteration of the secondary-tertiary structure of the protein molecules [36]. Using HP-SEC, protein monomers and high molecular weight species (HMWS) fractions were collected separately via peak fractionation. Various samples with 0%, 2%, 5%, 10%, 15%, 20%, 30%, 40% and 50% of HMWS were then generated by mixing purified monomer and HMWS samples. Samples were stored at 4°C until spectroscopic measurements. The sample solution was held in 100 µL fused quartz cuvettes and 10 mg/mL concentration was used for all samples. For the protein aggregation mixture samples, the dilution buffer was the HP-SEC mobile phase. On the other hand, the dilution buffer for the original unstressed and 40°C stressed sample was phosphate-buffered saline (PBS). Sample preparation and Raman spectral acquisition were repeated thrice to assess reproducibility of the measurements.
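The mixing step above can be expressed as a simple mass balance. The sketch below is illustrative only: it assumes the purified monomer stock contains no HMWS, the HMWS stock contains no monomer, and both stocks are at the 10 mg/mL stated in the text. The function name `mixing_volumes` and the 100 µL total volume are hypothetical choices, not details of the reported protocol.

```python
def mixing_volumes(target_hmws_pct, total_ul, monomer_mg_ml=10.0, hmws_mg_ml=10.0):
    """Return (monomer_ul, hmws_ul) that give the target HMWS mass percentage.

    Assumes the monomer stock contains no HMWS and the HMWS stock no monomer.
    """
    if not 0.0 <= target_hmws_pct <= 100.0:
        raise ValueError("target_hmws_pct must lie between 0 and 100")
    f = target_hmws_pct / 100.0
    # Mass balance: f = V_h*c_h / (V_m*c_m + V_h*c_h), with V_m + V_h = total_ul.
    hmws_ul = f * total_ul * monomer_mg_ml / (hmws_mg_ml * (1.0 - f) + f * monomer_mg_ml)
    return total_ul - hmws_ul, hmws_ul


levels = [0, 2, 5, 10, 15, 20, 30, 40, 50]  # HMWS levels listed in the text
plan = {pct: mixing_volumes(pct, total_ul=100.0) for pct in levels}
for pct, (monomer_ul, hmws_ul) in plan.items():
    print(f"{pct:>2}% HMWS: {monomer_ul:5.1f} uL monomer + {hmws_ul:5.1f} uL HMWS")
```

With equal stock concentrations the HMWS volume fraction equals the target mass fraction; the general formula only matters if the two fractions were collected at different concentrations.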

Instrumentation.

An Agilent 1260 Infinity HP-SEC was used to characterize the samples’ aggregation level. The column used for HP-SEC was TSKgel® G3000SWXL (L × I.D. 30 cm × 7.8 mm, 5 µm particle size) (King of Prussia, PA, USA) and each injection contained 250 µg protein sample with 1 mL/min mobile phase flow. Before each SEC experiment, PBS and gel filtration standard (GFS) were measured to verify system condition.

Raman spectra were acquired by using a µ-ChiralRAMAN-2X Raman spectrometer (ChiralRAMAN, Jupiter, FL, USA). Sample excitation was achieved by using a 532 nm diode laser (MPC6000, Laser Quantum, Fremont, CA, USA). The laser beam passed through a polarizer, a degree of circularity converter and two synchronized counter-rotating half-wave plates. The backscattered Raman signals were transmitted through a notch filter to remove the Rayleigh scattered photons, and the spectra were collected using a thermoelectrically cooled CCD camera (MityCCD E3011Bl-DVM, Critical Link, Syracuse, NY, USA). Laser intensity at the sample was kept constant at 50 mW for all measurements and the spectral resolution of the system was 7 cm−1. The exposure time for each spectral acquisition was 4 seconds.

Data Analysis.

The collected Raman spectra were imported into the MATLAB 2017a (Mathworks, Inc., Natick, MA, USA) environment for further analysis. The Raman wavenumber calibration was performed by using the neon lamp spectra acquired by the same system. Spectra were processed to remove interference from cosmic rays. The spectra were then subjected to a fifth-order best-fit polynomial-based fluorescence removal.
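A polynomial fluorescence removal of this kind can be sketched as follows. The text specifies only a fifth-order best-fit polynomial; the iterative clipping used here (refit, then cap the data at the fit so that peaks stop lifting the baseline) is one common variant and is an assumption of this sketch, as are the function name `remove_baseline`, the iteration count, and the synthetic test spectrum. The original analysis was carried out in MATLAB; Python with NumPy is used here for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial


def remove_baseline(wavenumbers, intensities, degree=5, n_iter=100):
    """Estimate a polynomial baseline by iterative clipping and subtract it."""
    x = np.asarray(wavenumbers, dtype=float)
    y = np.asarray(intensities, dtype=float)
    work = y.copy()
    for _ in range(n_iter):
        # Polynomial.fit rescales x internally, which keeps the fit well conditioned.
        baseline = Polynomial.fit(x, work, degree)(x)
        # Clip points above the fit so Raman peaks stop pulling the baseline up.
        work = np.minimum(work, baseline)
    return y - baseline, baseline


# Synthetic demonstration: one Gaussian band on a smooth, curved background.
axis = np.arange(1160.0, 1801.0, 1.0)
true_band = 100.0 * np.exp(-0.5 * ((axis - 1566.0) / 10.0) ** 2)
background = 500.0 + 0.3 * (axis - 1160.0) - 2e-4 * (axis - 1160.0) ** 2
corrected, baseline = remove_baseline(axis, true_band + background)
print(f"band maximum after correction at {axis[np.argmax(corrected)]:.0f} cm^-1")
```

The clipped fit settles underneath the spectrum, so the corrected band keeps its position while the broad background is removed; a plain single-pass fit would instead be pulled upward by strong bands.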

Support vector machines (SVM) were used to develop a regression algorithm for spectroscopically predicting the aggregation level of the proteins. SVM is a supervised learning model that is built on structural risk minimization concepts and can efficiently perform non-linear regression by implicitly mapping the inputs into high-dimensional feature spaces through a kernel. A radial basis function (RBF) with a Gaussian envelope was chosen as the kernel, and the kernel parameters were optimized based on an automated grid search algorithm [37–39].
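For readers who want to experiment with the same class of model, the sketch below fits an RBF-kernel support vector regressor with a grid search using scikit-learn. It is not the authors' MATLAB implementation: the parameter grid, the feature standardization step, the three-fold shuffled cross-validation and the synthetic spectra are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def fit_rbf_svr(spectra, hmws_percent, seed=0):
    """Grid-search C, gamma and epsilon for an RBF-kernel support vector regressor."""
    # Feature scaling is an assumption of this sketch, not a step stated in the text.
    pipeline = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
    grid = {
        "svr__C": [1.0, 10.0, 100.0, 1000.0],
        "svr__gamma": [1e-4, 1e-3, 1e-2],
        "svr__epsilon": [0.1, 0.5, 1.0],
    }
    folds = KFold(n_splits=3, shuffle=True, random_state=seed)
    search = GridSearchCV(pipeline, grid, cv=folds,
                          scoring="neg_root_mean_squared_error")
    return search.fit(spectra, hmws_percent)


# Synthetic spectra: one band near 1566 cm^-1 grows with the HMWS level.
rng = np.random.default_rng(0)
levels = np.repeat([0, 2, 5, 10, 15, 20, 30, 40, 50], 6).astype(float)
axis = np.linspace(1160.0, 1800.0, 120)
band = np.exp(-0.5 * ((axis - 1566.0) / 15.0) ** 2)
spectra = 1.0 + np.outer(levels / 50.0, band)
spectra += rng.normal(0.0, 0.01, spectra.shape)

search = fit_rbf_svr(spectra, levels)
train_rmse = float(np.sqrt(np.mean((search.predict(spectra) - levels) ** 2)))
print(search.best_params_, f"training RMSE = {train_rmse:.2f}%")
```

The grid search simply tries every combination of the listed kernel and regularization parameters and keeps the one with the lowest cross-validated error; finer or wider grids are a matter of compute budget.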

To examine the predictive power of the SVM-derived regression algorithm, leave-many-out cross-validation and blinded sample tests were performed on the mixture samples in study (A). For the cross-validation tests on the mixture samples, spectra were normalized to the intensity of the 983 cm−1 peak, which is characteristic of the HP-SEC mobile phase background. Subsequently, 50% of the spectra from each type of mixture sample were randomly chosen to build the training set. The rest of the spectra were used for cross-validation of the regression model. The algorithm was iterated 100 times with different divisions of the spectra into training and test sets to avoid any potential bias. Moreover, for the blinded predictions, the validation set consisted of separate samples with aggregation levels unknown to the developed regression model. In study (B), the mAb samples were subjected to thermal treatment. These samples did not contain the mobile phase component and, hence, the acquired spectra were normalized to the 1643 cm−1 amide-I peak. Here, the algorithm developed in (A) was directly used for predicting the HMWS content in the pre- and post-thermal incubation samples without any alteration.
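The leave-many-out scheme described above (half of the spectra from each mixture level drawn at random for training, repeated 100 times) can be sketched as below, together with the root-mean-square error used to score predictions. The helper names `stratified_half_splits` and `rmse`, the random seed, and the choice of ten spectra per level in the demonstration are hypothetical.

```python
import numpy as np


def stratified_half_splits(levels, n_iter=100, seed=0):
    """Yield (train_idx, test_idx) pairs; half of each level's spectra go to training."""
    levels = np.asarray(levels)
    rng = np.random.default_rng(seed)
    for _ in range(n_iter):
        train = []
        for level in np.unique(levels):
            members = np.flatnonzero(levels == level)
            chosen = rng.choice(members, size=members.size // 2, replace=False)
            train.extend(chosen.tolist())
        train_idx = np.sort(np.array(train))
        # Everything not drawn for training is held out for testing.
        test_idx = np.setdiff1d(np.arange(levels.size), train_idx)
        yield train_idx, test_idx


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference values."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Ten hypothetical spectra per mixture level.
levels = np.repeat([0, 2, 5, 10, 15, 20, 30, 40, 50], 10)
splits = list(stratified_half_splits(levels, n_iter=100))
train_idx, test_idx = splits[0]
print(len(splits), train_idx.size, test_idx.size)
```

In use, a regression model would be refitted on `train_idx` and scored with `rmse` on `test_idx` inside the loop, and the per-iteration errors pooled or averaged.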

To ensure transferability of algorithms between the two studies, the 1160–1800 cm−1 fingerprint region, which is free from the HP-SEC mobile phase interference, was used for multivariate model development. The aggregation levels predicted from SVM-derived regression algorithms were validated against the results from HP-SEC measurements.
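The normalization and spectral-window selection described above amount to two small array operations. The sketch below is an illustration under stated assumptions: the peak intensity is taken as the maximum within a ±5 cm−1 window around the nominal position (the window width is not given in the text), and the function names and synthetic spectrum are hypothetical.

```python
import numpy as np


def normalize_to_peak(wavenumbers, spectrum, peak_cm1, half_width=5.0):
    """Divide a spectrum by its maximum intensity within peak_cm1 +/- half_width."""
    w = np.asarray(wavenumbers, dtype=float)
    s = np.asarray(spectrum, dtype=float)
    window = np.abs(w - peak_cm1) <= half_width
    return s / s[window].max()


def crop_region(wavenumbers, spectrum, low=1160.0, high=1800.0):
    """Keep only the points whose wavenumber lies in [low, high]."""
    w = np.asarray(wavenumbers, dtype=float)
    s = np.asarray(spectrum, dtype=float)
    mask = (w >= low) & (w <= high)
    return w[mask], s[mask]


# Synthetic spectrum: a mobile-phase band at 983 cm^-1 and a protein band at 1566 cm^-1.
axis = np.arange(600.0, 1801.0, 1.0)
spectrum = (4.0 * np.exp(-0.5 * ((axis - 983.0) / 4.0) ** 2)
            + 2.0 * np.exp(-0.5 * ((axis - 1566.0) / 10.0) ** 2))

normalized = normalize_to_peak(axis, spectrum, peak_cm1=983.0)   # study (A) style
w_crop, s_crop = crop_region(axis, normalized)
print(w_crop[0], w_crop[-1], round(float(s_crop.max()), 3))
```

For spectra without the mobile phase, as in study (B), the same function would be called with `peak_cm1=1643.0`; the cropped window is identical in both cases, which is what allows one model to be applied to both sample sets.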

Additionally, to better visualize potential correlations between the HMWS content and spectral changes, PCA was employed. PCA is a widely used dimensional reduction technique, which aims to capture the variance in the spectral dataset using only a few orthogonal components (known as principal components, PCs) [40]. Radviz and VizRank algorithms from Orange data mining software [41] were used together with the PC scores to elucidate trends within the spectral data.
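As a minimal illustration of the PCA step, the sketch below computes scores, loadings and explained-variance fractions from mean-centered spectra with a singular value decomposition. It covers only the decomposition, not the Radviz or VizRank visualization, and the synthetic spectra (a single band near 1566 cm−1 that grows with HMWS level) are an assumption made for demonstration.

```python
import numpy as np


def pca(spectra, n_components=4):
    """Return (scores, loadings, explained variance fractions) of mean-centered data."""
    X = np.asarray(spectra, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal loading vectors ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components]
    explained = (S ** 2 / np.sum(S ** 2))[:n_components]
    return scores, loadings, explained


# Synthetic spectra: six noisy replicates at each of nine HMWS levels.
rng = np.random.default_rng(0)
axis = np.arange(1160.0, 1801.0, 5.0)
levels = np.repeat([0, 2, 5, 10, 15, 20, 30, 40, 50], 6).astype(float)
band = np.exp(-0.5 * ((axis - 1566.0) / 15.0) ** 2)
spectra = 1.0 + np.outer(levels / 50.0, band)
spectra += rng.normal(0.0, 0.01, spectra.shape)

scores, loadings, explained = pca(spectra, n_components=4)
peak_cm1 = axis[np.argmax(np.abs(loadings[0]))]
print(f"PC1 explains {explained[0]:.1%}; largest loading near {peak_cm1:.0f} cm^-1")
```

Because the sign of each principal component is arbitrary, the loading is inspected through its absolute value; the scores along PC1 then order the samples by the intensity of the varying band.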

Furthermore, in order to gain a better understanding of the relationship between HMWS proportion changes and spectral variations, 2DCOS was employed [42,43]. 2DCOS is an emerging analytical tool that uncovers the specific spectral intensity fluctuations induced by an external perturbation (here, the change in aggregation level of the mAb samples). 2DCOS analysis was performed in MATLAB 2017a (Mathworks, Inc., Natick, MA, USA) and provides synchronous and asynchronous contour maps. The details of 2DCOS interpretation are noted in the Supplementary Information.
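A generic formulation of generalized 2D correlation analysis, following Noda's discrete Hilbert transform approach, is sketched below. It is not taken from the authors' MATLAB code or from the Supplementary Information. The sketch assumes the spectra are ordered along the perturbation and evenly spaced in it, and it uses the mean spectrum as the reference; the synthetic two-band data set is hypothetical.

```python
import numpy as np


def two_d_correlation(spectra):
    """Return (synchronous, asynchronous) maps for spectra ordered along the perturbation."""
    Y = np.asarray(spectra, dtype=float)
    m = Y.shape[0]
    dynamic = Y - Y.mean(axis=0)          # dynamic spectra relative to the mean spectrum
    j, k = np.indices((m, m))
    diff = k - j
    # Hilbert-Noda matrix: 0 on the diagonal, 1 / (pi * (k - j)) elsewhere.
    noda = np.zeros((m, m))
    off_diagonal = diff != 0
    noda[off_diagonal] = 1.0 / (np.pi * diff[off_diagonal])
    sync = dynamic.T @ dynamic / (m - 1)
    asyn = dynamic.T @ noda @ dynamic / (m - 1)
    return sync, asyn


# Synthetic example: two bands that grow in phase along nine perturbation steps.
axis = np.arange(1160.0, 1801.0, 2.0)
steps = np.linspace(0.0, 1.0, 9)


def gaussian_band(center, width=12.0):
    return np.exp(-0.5 * ((axis - center) / width) ** 2)


spectra = (np.outer(steps, gaussian_band(1566.0))
           + np.outer(0.5 * steps, gaussian_band(1342.0)))
sync, asyn = two_d_correlation(spectra)
i_1566 = int(np.argmin(np.abs(axis - 1566.0)))
i_1342 = int(np.argmin(np.abs(axis - 1342.0)))
print(f"synchronous (1566, 1342): {sync[i_1566, i_1342]:.4f}")
```

The synchronous map is symmetric and the asynchronous map antisymmetric about the diagonal. Bands that change strictly in phase, as in this synthetic case, give a positive synchronous cross peak and a vanishing asynchronous one.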

RESULTS AND DISCUSSION

Rapid and accurate determination of protein aggregation remains an unmet analytical need, addressing which is crucial to the development of more efficient and inexpensive manufacturing processes of protein-based biopharmaceuticals. While previous Raman spectroscopy studies have focused on single variants of well-characterized proteins [28–30] or on understanding aggregation mechanisms of specific mAb molecules [31], this technique is yet to be applied for quantifying aggregation levels, particularly in order to assess its feasibility for detecting low levels of aggregation in ADCs. The latter forms a particularly important, yet understudied, sample cohort. Additionally, in the second part of our investigation (study (B)), we seek to evaluate the robustness of Raman spectroscopy-based multivariate algorithms in determining changes induced by thermal incubation that mimic long-term storage conditions in real life.

Study A: Quantitative determination of aggregation levels in ADC mixture samples.

Based on the HP-SEC profile, peak fractionation was applied to separate the major ADC product (monomer) and HMWS, and subsequently a series of mixture samples consisting of increasing amounts of HMWS was established. Figure 1A shows a representative label-free Raman spectrum recorded from a clean ADC sample (i.e. without the presence of the HP-SEC mobile phase) after background fluorescence subtraction. The vibrational peaks in the range of 600–1800 cm−1 are identified in the spectrum corresponding to various amino acids and characteristic amide modes, such as those at 764 cm−1 (tryptophan), 829 cm−1 (tyrosine), 931 and 1069 cm−1 (proline), 1007 and 1031 cm−1 (phenylalanine), 1243 cm−1 (amide-III), 1342 cm−1 (CH deformation), 1381 cm−1 (CH3 band), 1418 cm−1 (CH2 bending), 1471 cm−1 (C=N stretching), 1566 cm−1 (amide-II), and 1643 cm−1 (amide-I) [44–46]. In Figure 1B, representative Raman spectra acquired from 0% HMWS and 50% HMWS standard samples that were obtained following peak fractionation are shown. Due to collection of the fractionated species in the HP-SEC mobile phase solution, the aggregation standard spectra were normalized to the intensity of the 983 cm−1 peak that is characteristic of the mobile phase background. Since the mobile phase does not exhibit any interference in the 1160–1800 cm−1 range (marked by the black dashed box in Figure 1B), we used this spectral subset for development of the multivariate regression models and the ensuing analysis. Several spectral differences have been found between the 0% and 50% HMWS standard samples, notably in the amide-II, CH2 bending and CH deformation regions. The amide-II band (purple dashed line), which is attributed to the secondary structure of the protein, represents C-N stretching vibrations in combination with N-H bending [46,47]. The CH2 bending and CH deformation features (yellow dashed lines) are attributed to the primary structure of proteins [46]. Crucially, other subtle, but reproducible, changes correlating to different protein aggregation levels may be revealed through the application of chemometric algorithms [32]. To this end, we developed an SVM-based regression model with the goal of quantifying protein aggregation level and subsequently employed PCA to tease out aggregation-specific differences in the spectral profiles.

Figure 1. Vibrational Raman spectra of the biologics.

(A) Label-free Raman spectra recorded from a representative mAb sample. Prominent Raman peaks are indicated in the biologic spectrum. (B) Mean spectra obtained from standard mAb samples with 0% (blue curve) and 50% (red curve) HMWS proportion after peak fractionation. The solid lines depict the mean spectrum with associated shadings representing the ±1 standard deviations (SD). The spectra were normalized to the intensity of the 983 cm−1 peak – characteristic of the mobile phase. The 1160–1800 cm−1 (black dashed box) fingerprint region was used for further multivariate model development. Spectral features of CH deformation, CH2 bending and amide-II modes (from left to right) are marked by yellow and purple dashed lines.

SVM was chosen as the supervised regression technique because it is particularly well suited to data sets in which the number of variables is large and the linearity of the response is undetermined, two key characteristics of the recorded vibrational spectra [48,49]. Figure 2A shows the boxplot of the leave-many-out cross-validation results when the SVM-derived model is applied to the acquired spectral dataset. The root-mean-square error (RMSE) was computed to be 1.8% with 100 iterations of equal test and training spectral dataset division. Figure 2B presents the boxplot of the blinded prediction results where separate samples with aggregation levels unknown to the developed regression model were used. The developed model maintains its accuracy for estimating HMWS content in the range of 0–30% with RMSE of ca. 4.5%. Notably, RS makes these predictions in near real-time and with minimal processing, including at the low aggregation levels of the mAbs that present singular quantification challenges for DLS.

Figure 2. Aggregation level predictions with label-free Raman spectroscopy.

The median values indicate the prediction results. (A) Graphical representation of prediction results obtained by application of the SVM-derived regression algorithm in leave-many-out cross-validation routine. (B) The SVM-derived regression results of protein aggregation in blinded experiments. The red dashed box highlights the predictions for 40% and 50% aggregations. The root-mean-square error (RMSE) for cross-validation is computed to be 1.8%. The RMSE for the blinded tests (for 0–30% aggregation levels) is 4.5%. (n.s.: not significant, *p < 0.05, **p < 0.01, ***p < 0.001).

However, the prediction error in the blinded tests increased when dealing with samples containing higher relative amounts of HMWS (40% or 50%, Figure 2B red dashed boxes). In particular, test samples with 40% and 50% HMWS content were spectroscopically estimated to have, on average, only ca. 28% and 38% HMWS proportions, respectively. (In contrast, the cross-validation results do not show significant deviation at these high aggregation levels.) A possible reason for the deviation is the larger number of samples with low aggregation levels in the training data, which may artificially skew the regression weights in favor of low aggregation levels. It is notable that the underestimation at these levels is likely not significant in the context of Raman-based quality control of the mAb products, as the overall assessment that these samples harbor large amounts of HMWS is correct and would offer the desired input to the analytical process. Nevertheless, to reveal the potential basis for these deviations at high levels of aggregation, we resorted to principal component analysis.

Figure 3A shows PC-scores based radial visualization plots for various aggregation standards. Points on the radial map represent sample spectral measurements from several different experimental batches, and the scores corresponding to the chosen loadings influence their position [41]. In comparison to the regular 2D/3D PC score plots, these radial visualization plots offer substantial advantages, notably their ability to utilize more than three PCs and to depict all of the clusters in a single plane. Evidently, distinct clustering patterns are observed for the data points possessing similar HMWS content. Further inspection reveals a trend in the PC scores with increasing HMWS content from the left to the right of the unit circle. The only major intra-class variability is seen for the samples with sub-10% (but not 0%) HMWS content. We attribute this spread to the difficulty in monitoring low concentrations of sub-visible particles that freely diffuse into/out of the laser’s focal volume.

Figure 3. Radial visualization plot highlights changes correlating with increasing HMWS proportion.

(A) Multidimensional radial visualization plot based on the PC scores shows the clustering behavior of six representative aggregation levels. (B) Corresponding PC 1 and 2 loadings. Black dashed lines indicate zero loading positions. Yellow and purple dashed boxes indicate (from left to right) the CH deformation, CH2 bending and amide-II modes.

Figure 3B shows the corresponding loadings of PC 1 to PC 4. The major peak in these loadings is located at 1566 cm−1, which is the amide-II band representing C-N stretching vibrations in combination with N-H bending [46,47]. Our observations are consistent with findings from previous studies where the vibrational information of the amide-II peak has been used to measure protein clustering [50–53]. Additionally, in the context of separating between the samples with 10–20% and 30% HMWS content, the CH deformation and CH2 bending, as seen in the PC1 loading, may play an essential role.

In order to gain a better understanding of spectral intensity fluctuation induced by changing aggregation levels, synchronous (Figure 4A) and asynchronous (Figure 4B) 2DCOS plots are presented here. In the synchronous plot shown in Figure 4A, the bands with the most obvious dynamic spectral variations reflecting aggregation changes are observed at 1342, 1381, 1418 and 1566 cm−1 (CH deformation, CH3 band, CH2 bending and amide-II, respectively). The positive cross peaks occurring at (1566, 1342) and (1566, 1418) indicate that both amide-II and CH-related bands change intensity in the same direction as the aggregation proportion increases. Besides positive cross peaks, a weak negative cross peak is noted around (1566, 1643) cm−1, which could be assigned to the amide-II / amide-I cross peak, indicating that their changes are in opposite directions. In the asynchronous plot (Figure 4B), there is a major positive cross peak located between the coordinates (1566, 1243) and (1566, 1500) cm−1 that correspond to amide-II, amide-III, C=N stretching and CH-related bands. However, due to the absence of obvious corresponding cross peaks around (1566, 1243) and (1566, 1471) cm−1 in the synchronous plot, the predominant order of the intensity variations of the amide-II and amide-III or C=N stretching bands cannot be determined at this stage. Since the corresponding amide-II and CH-related cross peaks share the same sign (positive) in both the synchronous and asynchronous plots, the aggregation-induced changes in amide-II intensity occur predominantly ahead of such changes in the CH-related spectral features.
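The sign-based reasoning used above follows Noda's sequential order rules: when the synchronous and asynchronous cross peaks at (ν1, ν2) share a sign, the change at ν1 precedes that at ν2, and when the signs differ the order is reversed. A minimal sketch of that rule is given below; the function name and the numerical cross-peak values are hypothetical and serve only to show the logic.

```python
def sequential_order(sync_value, async_value, nu1, nu2, tol=0.0):
    """Apply Noda's sign rules to one cross peak at (nu1, nu2)."""
    if abs(sync_value) <= tol or abs(async_value) <= tol:
        # The sign rules need a clear cross peak in both maps.
        return "undetermined"
    same_sign = (sync_value > 0) == (async_value > 0)
    first, second = (nu1, nu2) if same_sign else (nu2, nu1)
    return f"{first} cm^-1 changes before {second} cm^-1"


# Hypothetical cross-peak values, for illustration only.
print(sequential_order(0.8, 0.3, 1566, 1342))   # both positive
print(sequential_order(0.0, 0.4, 1566, 1243))   # no synchronous cross peak
```

The second call mirrors the situation described for the amide-III and C=N stretching bands, where the missing synchronous cross peak leaves the order unresolved.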

Figure 4.

(A) Synchronous and (B) asynchronous 2DCOS plots generated from mAb aggregation-induced Raman spectral intensity variations. The shaded regions above the 2DCOS plots highlight (from left to right) the CH deformation, CH3 band, CH2 bending and amide-II modes in the Raman spectrum.

The inclusion of the CH and CH2 peaks in accounting for this differential separation indicates that the mechanism of aggregation is driven, in part, by the attraction between hydrophobic patches of the protein molecules, involving noncovalent interactions [31]. As mentioned previously, we have not denatured the protein under investigation in this study and, as such, the protein is predominantly in its native state, resulting in the presence of surface-exposed hydrophobic amino acids, which likely act as nucleation sites for aggregation to occur via hydrophobic attraction between non-polar side chains [54]. A similar conclusion can also be drawn from Figure 1B, where the largest variance in our data is observed in the amide-II region with secondary variances distributed around the CH deformation and CH2 bending modes. The amide-I region is widely studied to assess protein secondary structure [55,56]. However, due to the broad band of H-O-H bending vibrations from water (as also reported by previous studies [57]), amide-I does not appear to be a major driver of our regression model for quantifying aggregation levels.

Study B: Quantification of aggregation levels of mAbs subjected to thermal stress.

Here, we sought to investigate the feasibility of the previously developed regression model in quantifying differential aggregation levels of the ADC sample, pre- and post-thermal treatment. This experiment is of significance in evaluating the suitability of RS in monitoring protein degradation over the long term and, specifically, in understanding the robustness of our algorithm to non-analyte specific buffer variations. The SVM model, developed in the aforementioned cross-validation study, was used without alteration to predict aggregate content in the pre- and post-stress ADC samples, which (unlike the samples in study (A)) do not contain the HP-SEC mobile phase.

Our unstressed ADC sample was an in-process intermediate from the purification process and was determined by HP-SEC to consist of 54% monomer and 45.9% HMWS (Figure 5A). After the one-month isothermal incubation period, the stressed sample displayed an increase in protein aggregation, with 53% of the components now recognized as HMWS (Figure 5B). The reduction of the peak of the major product and the corresponding gain in the HMWS peak underscore the expected conversion of monomer species into larger-than-monomer HMWS. Figure 5C presents the boxplot of the RS-based predictions of the HMWS content in the unstressed and stressed samples. The SVM-based regression model estimates the HMWS content in the unstressed and stressed samples to be 45.7% and 51.8%, respectively. The computed RMSE is ca. 1.4%, which is comparable to the prediction errors observed in study (A). This verifies that the mobile phase background signal does not affect the regression model performance, and that our SVM-derived regression algorithms are capable of accurately predicting mAb aggregation levels, despite the potential sample-to-sample variations. Our ability to distinguish between samples with relatively similar HMWS content paves the way for further quantitative studies to interrogate varying aggregation behavior of specific mAbs as well as the differential impact of other crucial physicochemical conditions on generation of protein particles.

Figure 5.

Size exclusion chromatography characterization of mAb specimen: (A) before isothermal incubation; and (B) after one-month isothermal incubation. The monomer/HMWS separation points are indicated by the red dashed lines. The included tables present the HP-SEC peak assignments and area coverage of each component. (C) Aggregation level predictions on pre-/post-isothermal incubation samples with label-free Raman spectroscopy. Independent prediction results were obtained by applying the SVM-based regression model from study A. The SVM-derived model estimated the HMWS component in the pre- and post-isothermal incubation samples to be 45.7% and 51.8%, respectively. The RMSE of prediction on unstressed/stressed protein aggregation is 1.4%. (n.s.: not significant, *p < 0.05, **p < 0.01, ***p< 0.001).

CONCLUSION

Limitations in detecting and quantifying protein particles complicate manufacturing and quality control for monoclonal antibody-based therapeutics. Here, we have shown that Raman spectroscopy in conjunction with support vector machine regression enables quantitative prediction of protein aggregation in mAbs with a high degree of accuracy and robustness. Notably, each measurement took less than three minutes, including spectral acquisition and data analysis, which is much faster than the HP-SEC measurements (~20 min per sample) that are extensively used for monitoring protein aggregation in the biopharmaceutical industry. The results of this proof-of-concept study should not be considered indicative of the best prediction performance, which is likely to improve after further optimization of the spectroscopic hardware and regression algorithm. In addition, we have uncovered the regions of spectral variance induced by protein particle formation and correlated these with the aggregation mechanisms through PCA and 2DCOS. Our present findings encourage further development of this promising technique with the goal of eventual application as a rapid, in-line aggregation monitoring tool. This has extensive implications for the rapidly emerging biopharmaceutical space, especially in facilities handling the production and processing of biologics, and may also benefit regulatory authorities by helping develop improved guidance parameters for the manufacturing of safe and effective protein therapeutics.

Highlights.

  • Demonstrate the application of spontaneous Raman spectroscopy coupled with multivariate data analysis for predicting aggregation level of monoclonal antibodies (mAbs) - with minimal sample preparation and in a near real-time manner.

  • Subtle, but consistent, spectral changes observed with increasing product aggregation are explained using principal component analysis (PCA) and 2D correlation analysis (2DCOS).

  • Support vector machine-derived model shows the ability to precisely determine aggregation levels in mAb samples that are subject to long-term storage-like conditions.

ACKNOWLEDGEMENT

This research was supported by MedImmune, LLC. I. B. also acknowledges the support from the National Institute of Biomedical Imaging and Bioengineering (2-P41-EB015871-31), and the National Institute of General Medical Sciences (DP2GM128198).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

CONFLICT OF INTEREST

The authors disclose no potential conflicts of interest.

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
