Abstract
The vast complexity of the proteome currently overwhelms any single analytical technology in capturing the full spectrum of proteoform diversity. In this study, we evaluated the complementarity of two cutting-edge proteomic technologies—single-molecule protein sequencing and individual ion mass spectrometry—for analyzing recombinant human IL-6 (rhIL-6) at the amino acid, peptide, and intact proteoform levels. For single-molecule protein sequencing, we employed the recently released Platinum instrument. Next-generation protein sequencing (NGPS) on Platinum utilizes cycles of N-terminal amino acid recognizer binding and aminopeptidase cleavage to enable parallelized sequencing of single peptide molecules. We found that NGPS produces single amino acid coverage of multiple key regions of IL-6, including two peptides within helices A and C, which harbor residues that reportedly impact IL-6 function. For top-down proteoform evaluation, we used individual ion mass spectrometry (I2MS), a highly parallelized Orbitrap-based charge detection MS platform. Single ion detection of gas-phase fragmentation products (I2MS2) gives significant sequence coverage in key regions of IL-6, including two regions within helices B and D that are involved in IL-6 signaling. Together, these complementary technologies delivered a combined 52% sequence coverage, offering a more complete view of IL-6 structural and functional diversity than either technology alone. This study highlights the complementarity of these protein detection methods to cover protein segments relevant to biological interactions more comprehensively.
Graphical Abstract

Supplementary Information
The online version contains supplementary material available at 10.1007/s00216-025-06120-7.
Keywords: Single-molecule protein sequencing, Next-generation protein sequencing, Platinum, Top-down mass spectrometry, Individual ion mass spectrometry, Proteoform
Introduction
Proteins are fundamental to nearly all biological processes, serving as key cellular components and prime drug targets [1, 2]. Protein-based therapies are among the top-selling pharmaceutical products [3]. The structure and function of proteins are determined by their primary amino acid sequence and post-translational modifications (PTMs), making the analysis of these features crucial for drug development and biopharmaceuticals.
Cytokines, such as interleukin-6 (IL-6), play a vital role in immune regulation, and engineered variants are widely used in therapeutic applications [4, 5]. In vivo, these essential proteins exist in multiple forms, or proteoforms [6], generated not only by genetic variation but also by alternative splicing [7], proteolysis [8], and various post-translational modifications (PTMs) [9, 10], all of which can affect their stability and interactions with biological targets. Unexpected amino acid substitutions, errors in protein synthesis, and variations in expression systems can give rise to a wide array of proteoforms of a recombinant protein with direct impact on potency and function [11]. Understanding and controlling these factors is essential for refining recombinant IL-6 formulations and optimizing their clinical efficacy.
Current approaches for therapeutic protein characterization, including intact mass analysis and multi-attribute methods (MAMs), have been deployed to great effect to support the growth of protein therapeutics in recent years. However, these approaches often do not completely characterize proteoforms, require significant expertise and overhead, and have limitations in sensitivity, mass range, and sample complexity [12, 13]—leading many to rely on western blot or other dated but accessible measures for characterization of expressed proteins. Several new technologies offer promising approaches to overcoming these challenges. Next-generation protein sequencing (NGPS) and individual ion mass spectrometry (I2MS) are highly parallelized, single-molecule approaches to analyze protein sequences and PTMs. NGPS uses engineered protein-based recognizers to sequence peptides with single amino acid resolution, while I2MS is a mass spectrometry technique that directly measures the intact mass of individual proteoform ions.
NGPS on the Platinum® and Platinum® Pro instruments marketed by Quantum-Si, Inc. employs fluorophore-labeled N-terminal amino acid (NAA) recognizers that reversibly bind cognate NAAs and aminopeptidases that reveal sequential NAAs to provide sequence and PTM information on immobilized Lys-C digested peptides [14]. In contrast to other recognizer-based methods, NGPS NAA-recognizer interactions produce multi-faceted kinetic signatures sensitive to adjacent amino acids and post-translational modifications, providing detailed sequence and PTM information about specific regions and enabling inferred protein-level identifications [14]. These capabilities are similar to conventional bottom-up proteomic workflows. However, Platinum aims to make detailed sequence characterization and protein identification available outside of traditional proteomics settings. Platinum is smaller and more affordable than commercially available mass spectrometers and does not require specialized lab space or expertise to analyze samples using kits and a cloud-based analysis suite available from Quantum-Si [15].
I2MS uses multiplexed Orbitrap-based charge detection mass spectrometry (CDMS) to directly measure the mass of individual proteoform ions, producing significantly simplified mass-domain spectra for mixtures of proteoforms without the challenges of deconvolution. The application of I2MS provides a > 500 × improvement in analytical sensitivity, > 10 × increase in mass range, and > 10 × higher resolution for the characterization of proteoforms over traditional intact mass and top-down mass spectrometry (TD-MS) approaches [16, 17]. These improvements enable input proteins to be larger, more dilute, and have more complex proteoform landscapes. Further, I2MS is compatible with automation, enabling automated sample processing and analysis [18]. Tandem-MS with the detection of individual ions (I2MS2) resolves fragment ions from the measured molecular ion to provide sequence analysis across the whole protein, confident protein identification, and localization of PTMs [16]. A version of this technology is marketed by Thermo Fisher Scientific as Direct Mass Technology mode on select MS instruments.
NGPS and I2MS utilize distinct fluorescence- or mass spectrometry–based techniques to characterize proteins and their proteoforms. In the context of biopharmaceutical quality control, there is an increasing demand for orthogonal methods—those that employ different physical principles to measure the same property—and for complementary approaches that provide a more comprehensive assessment of protein characteristics. Both I2MS and NGPS represent promising single-molecule technologies that meet these needs [15, 17]. Moreover, when used together, NGPS and I2MS can provide complementary insights, covering overlapping and distinct protein regions, thus enhancing overall sequence coverage and depth.
In this study, we used two independent technologies to analyze a recombinant human IL-6 (rhIL-6) sample: NGPS for single-molecule protein sequencing [14] and I2MS2 for intact mass profiling of proteoforms and readout of top-down fragmentation spectra [17]. By integrating these orthogonal methods for the first time, we show how they complement each other to provide more comprehensive coverage of key regions crucial for IL-6 function and therapeutic potential.
Materials and methods
Peptide sequencing on platinum
Experiments were conducted in accordance with the reagent kits and protocols described at https://www.quantum-si.com/resources/access/ [15]. Briefly, an aliquot of a stock solution of rhIL-6 sourced from AcroBiosystems (IL6-H4218) was diluted to a concentration of 1 µM in 100 µL Optima™ LC/MS grade water (Fisher Scientific, W64), reduced with TCEP, and cysteine residues were alkylated with chloroacetamide. Digestion of rhIL-6 with Lys-C was performed to generate peptides with C-terminal lysine residues. These peptides underwent diazotransfer and bioconjugation reactions to attach a macromolecular linker to the C-terminal lysines. After fluorescent quantification using included “Quant buffer,” conjugated peptides were immobilized in nanoscale reaction chambers on a semiconductor chip using the provided Loading solution. The chip was loaded into the Platinum instrument where loading quality control checks were performed. The sequencing mixture, consisting of aminopeptidases and six NAA recognizers that target 13 NAAs (R, L, I, V, F, Y, W, A, S, N, Q, D, E), was then added to the chip, and the sequencing run was initiated using the Platinum’s on-screen controls. During on-chip sequencing, fluorophore-labeled NAA recognizers reversibly bind cognate NAAs, producing recognition segments (RSs) and fluorescence properties that are captured by the semiconductor chip. To carry out the sequencing process, aminopeptidases cleave the peptide bond and expose the subsequent NAA for recognition [14]. This workflow was performed in duplicate for rhIL-6 with representative results included below.
Analysis of sequencing data; analysis versions
Data analysis was performed using Quantum-Si’s cloud-based Platinum Analysis Software platform. Primary Analysis v2.5.1, Peptide Alignment v2.3.0, and Protein Inference v2.5.2 workflows were used. Details can be found in the Platinum Analysis Software Data Sheet (February 2, 2024). The Primary Analysis workflow is the first step in processing data, which characterizes the apertures across the chip based on peptide loading, recognizer activity, recognizer reads, and recognizer read lengths. These results are used as input for downstream analysis workflows.
Peptide alignment v2.3.0 workflow
For the Peptide Alignment workflow, a reference sequence is required to call amino acids from the recognizer reads. Reads from the sequencing data were aligned to the FASTA reference of mature rhIL-6. Peptides were aligned based on the correspondence of observed recognition segments to the expected reference profile, using recognizer identity.
Peptide Alignment workflow also computes a false discovery rate (FDR) for each aligned peptide. This calculation is adapted from methods used in peptide identification by mass spectrometry [19], based on decoy peptide matching. Thus, FDR represents the relative number of alignments to the reference peptide sequence versus the total number of off-target alignments, Such as scrambled sequences with the same length as the target peptide. The standard cutoff for FDR is 0.1, or 10%.
Protein inference v2.5.2 workflow
A pre-defined reference set of 8076 human proteins was used to infer proteins from unknown samples and confirm the identity of samples. The proteins in this reference panel span 10–70 kDa and contain at least three in silico Lys-C-digested peptides with three unique, visible residues. Inferred proteins are ranked by their respective Inference Score. The Inference Score is a natural log calculation of the FDR associated with the inferred protein.
Sample preparation and data collection for I2MS and I2MS2 analysis
RhIL-6 (AcroBiosystems, IL6-H4218) was reconstituted in Optima™ LC/MS grade water (Fisher Scientific, W64), aliquoted, and stored at −80 °C according to the manufacturer’s recommendations until use. Samples were analyzed with the SampleStream Platform (Integrated Protein Technologies, Evanston, IL) [20] coupled to a modified Q-Exactive HF (Thermo Fisher Scientific; Bremen, Germany) mass spectrometer. Briefly, rhIL-6 was diluted to ~ 0.8 µM with water and transferred to a low-retention autosampler vial (Waters, 186009186). For each injection, 10 µL was buffer exchanged with SampleStream into 80 µL denaturing buffer (70:30 water:acetonitrile with 0.2% formic acid), deposited into a clean vial to mix, and then 75 µL was aspirated to infuse at ~ 0.1 µM. Relevant SampleStream parameters included a 125-µL focus volume, 225 µL/min focus flow rate, 60 °C flow cell temperature, and a 5-kDa molecular weight cutoff membrane. Source conditions included a custom nano-electrospray emitter (CoAnn Tech, TIP36007540-10), 1.8 kV spray voltage, 1.0 µL/min flow rate, and 320 °C inlet capillary temperature. Instrument parameters included RF 50%, 5e6 fixed AGC target, 120,000 resolution (at 200 m/z), 1 µscan, −1 kV central electrode voltage, 0.3 (arb) trapping gas pressure setting, 600–2500 m/z scan range, and a 78-min acquisition length (~ 4583 scans). The injection time for each acquisition was determined via automated ion control (AIC) to remain on the individual ion level [21].
For I2MS2 experiments, a charge state for each precursor was selected using The Fisher, a tool developed internally by the Kelleher group [22]. Briefly, ions within a 0.8 m/z isolation window width and 20 Da mass window width centered on a precursor mass were counted as ions corresponding to either the desired precursor or other species within an I2MS data file. The charge state that maximized the number of desired precursor ions and minimized the number of ions from different species was selected for isolation. Precursors were isolated and fragmented via higher energy collisional dissociation (HCD) normalized to charge state. Fragment ions were measured within a 150–2500 m/z scan range. Normalized collisional energy (NCE) and injection time were set manually to optimize fragmentation and ion counts. Fragmentation experiments were performed in duplicate with representative results included below.
I2MS and I2MS2data processing and analysis
The I2MS charge assignment and ion mass determination was performed as previously described [17, 23].
A fragment search against the precursor amino acid sequence and expected PTMs was carried out using TDValidator (Proteinaceous, Inc., Evanston, IL). Fragment ions were identified by matching their isotopic distributions to theoretical isotopic distributions generated using an averagine model [24] and the Mercury7 [25] algorithm. To make ions searchable in TDValidator, neutral mass I2MS spectra were transformed into theoretical + 1 (M + H) distributions. All fragment ions were identified within a ± 10 ppm tolerance of their theoretical values for the isotopic distribution (max PPM tolerance) and ± 5 ppm tolerance for isotopologues within the same distribution (sub PPM tolerance). Other search metrics included a 1.5 S/N cutoff and 0.01 score cutoff. Spectra were manually curated to remove poor fragment ion matches. To calculate P-scores, fragment monoisotopic masses were generated with the THRASH algorithm in TDValidator using a 1.5 S/N cutoff, + 1 charge (+ 30 charge for m/z domain data), 30,000 Da maximum mass, and 0.9 minimum RL value and uploaded to ProSight Lite (http://prosightlite.northwestern.edu/) with ± 10 ppm tolerances.
Results and discussion
To evaluate the ability of each proteomic technology to detect key regions of rhIL-6, we first performed NGPS of rhIL-6 using the Platinum instrument (Fig. 1a). The Platinum platform includes three components: kits for bottom-up protein processing and sequencing; a benchtop Platinum instrument that accommodates semiconductor chips for sequencing of polypeptides (Fig. 1b and c); and cloud-based software for analysis of sequencing data (Fig. 1d and e).
Fig. 1.
Next-generation protein sequencing (NGPS) of IL-6 with the Platinum instrument. a The Platinum instrument sequences single peptide molecules with single amino acid resolution. b Sequencing kits include semiconductor chips, aminopeptidases, and six dye-labeled NAA recognizers that reversibly bind 13 target NAAs. c Binding of dye-labeled NAA recognizers generates kinetic information indicating which amino acid is being detected. d Results of in silico Lys-C digestion of mature IL-6. Missing C-terminal residues 171–183 lack a lysine residue (K) and thus are not amenable for peptide capture and sequencing on Platinum. Colored boxes indicate potential recognition events. Gray boxes indicate amino acids not amenable to recognition. Red asterisks indicate automated alignments of sequencing data to reference sequence. e NGPS of rhIL-6 on Platinum detects five IL-6 peptides (dotted rectangles). NAA recognizers elicit kinetic signatures such as pulse duration (PD), which reflects the affinity between specific recognizers and NAAs. PD histograms represent the statistical distribution of kinetic data for all pulses associated with a specific residue and support inference of the corresponding amino acid. Values are reported as the median of the mean across reaction chambers
For on-chip sequencing, surface-immobilized peptides are exposed to a mixture of freely diffusing NAA recognizers and aminopeptidases (Fig. 1b) [14]. Six NAA recognizers, labeled with different fluorophores, reversibly bind 13 target NAAs (Fig. 1b) and elicit characteristic pulsing patterns upon binding to each NAA (Fig. 1c). The semiconductor chip converts fluorescence signals into digital readouts, enabling real-time sequencing of single peptide molecules in parallel [14]. Cycles of binding and cleavage proceed to sequentially reveal the order of NAAs and enable identification of the peptide sequence [14].
For analysis of sequencing data, only peptides that meet specific thresholds (see “Materials and methods,” section “Peptide Alignment v2.3.0 workflow”) are eligible for high-confidence alignment to the reference rhIL-6 sequence (red asterisks, Fig. 1d). Based on these criteria, three rhIL-6 peptides (V1PPGEDSK8, E41TCNK45, and M66AEK69) are ineligible for alignment. In addition, the C-terminal segment 171–183 (E171FLQSSLRALRQM183) does not contain K and thus is not amenable to conjugation and on-chip sequencing. NGPS sequences five rhIL-6 peptides (Fig. 1e) and determines the identity of 46/183 single amino acids, indicating ~ 25% amino acid level coverage within rhIL-6 (Fig. 1e).
To support high-confidence identification of IL-6, we also examined the false discovery rate (FDR) for each peptide and used Protein Inference analysis (see “Materials and methods,” section “Protein Inference v2.5.2 workflow”) to determine the specificity of IL-6 mapping relative to a protein panel. Platinum analysis outputs FDR for each sequenced peptide using a target-decoy approach that is analogous to methodologies employed in MS, with a cutoff set at 0.1 (10%). FDR scores for each peptide are Q27IRYILDGISALRK40 (FDR: 0.0), V120LIQFLQK127 (FDR: 0.0), D70GCFQSGFNEETCLVK85 (FDR: 0.02), E54ALAENNLNLPK65 (FDR: 0.07), and L150QAQNQWLQDMTTHLILRSFK170 (FDR: 0.09). Interestingly, an FDR score of 0.0 was computed for peptide V120LIQFLQK127, despite contiguous isobaric residues leucine (L) in position 2 and isoleucine (I) in position 3 (Fig. 1e). The recognizer for branched chain NAAs (LIV) exhibits differential pulse duration (PD) profiles for L (6.02 s) and I (0.77 s), discerning the order of L and I residues with the same mass. This example demonstrates that a single NAA-recognizer can differentiate NAAs with similar physicochemical properties on the basis of differences in PD. In addition to FDR, we also used the Protein Inference analysis software, “Protein Inference v2.5.2 workflow” which screens the sequencing data against a reference panel of 8076 proteins. IL-6 was identified as the top protein hit with 99.99% confidence (see Electronic Supplementary Material Fig. S1). These results demonstrate NGPS confidently identifies rhIL-6 with single amino acid resolution.
Next, we deployed the same rhIL-6 sample for intact mass measurement via I2MS. I2MS accurately identifies unmodified rhIL-6 (~ 20.8 kDa) and several higher molecular weight proteoforms up to ~ 22 kDa (Fig. 2a), consistent with reports from the manufacturer and others that glycosylation is expected for our HEK293-expressed rhIL-6 [26]. For comparison, a conventional composite m/z domain spectrum is provided in Electronic Supplementary Material Fig. S2. Human IL-6 has been shown to undergo a number of PTMs, including O- and N-linked glycosylation and phosphorylation, though few reports detail the composition or localization of IL-6 O-glycans [9, 27]. Using the intact masses of these proteoforms, the composition of the putative O-glycans was determined by searching the GlyGen database [28] for those glycans corresponding to the observed intact mass shift, and their chemical formula and mass were calculated using the publicly available NIST Glyco Mass Calculator [29] (see Electronic Supplementary Material Table S1). Thus, I2MS enables the observation of intact proteins and proteoforms, which is distinct from the data produced by NGPS.
Fig. 2.
I2MS detects intact rhIL-6 proteoforms, and I2MS2 covers broad regions of rhIL-6 primary structure. a The application of I2MS to TD-MS enables measurement of the intact mass of intact protein ions of canonical rhIL-6 (~ 20.8 kDa) labeled 1. Higher mass proteoforms (PFRs) labeled 2–13 indicate the addition of PTMs. b TD-MS provides broad sequence coverage of canonical rhIL-6 (PFR 1) by higher energy collision dissociation (HCD), achieving 39% sequence coverage. Shaded half cystines indicate cysteine residues involved in disulfide bonds
We then isolated and fragmented discrete rhIL-6 proteoforms via I2MS2. Using higher energy collision dissociation (HCD), I2MS2 achieves 39% fragmentation coverage of the unmodified form of rhIL-6, with shared and distinct regions identified by NGPS (Figs. 1e and 2b). Of the 12 specified higher mass species, 7 of the more abundant species showed satisfactory ion counts of the desired precursor with minimal co-isolation of other species (PFRs 4, 5, 8, 9, 11, 12, 13) using The Fisher, an in-house tool for determining I2MS2 isolation windows from the intact I2MS2 spectrum. I2MS2 results of these selected higher mass species are consistent with putative O-glycans with compositions of N-acetyl hexosamine, hexose, deoxyhexose, and sialic acid variably located towards the C-terminus in Helix D and the adjacent C-D loop. Localization of the O-glycans at specific serine or threonine residues along the base rhIL-6 sequence was predicted using the OGP repository [30]. Sites T6, T48, S49, T117, T147, T166, T170, T171, and T177 generated a probability for O-glycosylation greater than 0.5 and thus were used as potential O-glycan locations. Canonical rhIL-6 and selected higher mass species with their respective O-glycan composition and localization are Supported with fragment map P-scores ranging from 6.6E−25 to 2.3E−04 (see Electronic Supplementary Material Fig. S3-S10 and Online Resource 1). The O-glycan localization (T147, T166, and T177 across all glycoforms) which generated the smallest P-score for each proteoform is reported. O-glycosylation located towards the C-terminus of human IL-6 is consistent with that reported in lung adenocarcinoma cells isolated from malignant pleural effusion [27], thus demonstrating the high utility of I2MS and I2MS2 for measuring specific glycosylated proteoforms of IL-6 associated with disease.
Both NGPS and I2MS2 detect the E54ALAENNLNL63 sequence region (Figs. 1e and 2b). Also, while NGPS provides single amino acid resolution of the D70GCFQSGFNE79 fragment, only a single residue cleavage was observed within this peptide by I2MS2 (Figs. 1e and 2b). Similarly, NGPS reports the sequence of the V120LIQFLQK127 peptide (Fig. 1e), which also contains no residue cleavages by I2MS2 (Fig. 2b).
For the region encompassing Q27IRYILDGISAL38 (Figs. 1e and 2b), NGPS and I2MS2 provide complementary coverage. While NGPS enables single amino acid resolution of 11/12 amino acids (Fig. 1e), I2MS2 detects fragment ions on either side of the glycine (G) residue, which is not detected by NGPS due to the lack of a G recognizer (Fig. 1b). While NGPS does not currently provide a C recognizer, I2MS2 detects C-containing fragments C43NKSNMCESSK53 and C82LVKIITGLLEFEVYLEYLQNR103 (Fig. 2b). In addition to this 22-amino-acid-long internal fragment, I2MS2 also provides near complete amino acid coverage across N131LDAITTPDPTTNASLLTK149 (Fig. 2b). Both NGPS and I2MS2 sequence the C-terminal region of rhIL-6 encompassing L150QAQNQW156 (Figs. 1e and 2b), which contains the only tryptophan (W) in IL-6 [31, 32]. Interestingly, amino acids within this fragment have been implicated in IL-6 binding interactions with receptors.
Overall, our results demonstrate NGPS and TD-MS provide sequence information for overlapping and distinct regions within rhIL-6. To place these sequencing results in the context of IL-6 tertiary structure and function, we mapped the amino acid regions detected by NGPS (Fig. 1e) and I2MS2 (Fig. 2b) using an IL-6 crystal structure as a reference [33] (Fig. 3a).
Fig. 3.
NGPS and I2MS/I2MS2 cover hydrophobic and hydrophilic regions of IL-6 important for IL-6 interactions. a Mature human IL-6 schematic. Numbering is approximate based on a reported three-dimensional structure of human IL-6. b Kyte-Doolittle hydropathy plot reports surface-exposed regions. Positive scores correspond to hydrophobic regions. Negative scores correspond to hydrophilic regions. Amino acid sequence position is on the x-axis. Average hydropathy score is calculated for windows of 9 amino acids. c NGPS (Platinum) and I2MS provide broad coverage of IL-6, combining to resolve 52% of single amino acids with coverage reported in key regions known to impact IL-6 interactions (shaded in gray)
IL-6, a four-helix bundle cytokine that is subject to differential signal peptide cleavage and glycosylation [8, 10, 34], is a multi-functional protein that transmits cellular signaling via IL-6 receptor alpha (IL-6R) and beta (gp130) [35]. The molecular information for IL-6 binding and activity is enabled via adoption of a four-helix fold, with a mini-helix before Helix D [33]. Each alpha helix contains 20–30 amino acids, with long AB and CD loops that accommodate an up-up-down-down helical orientation (Fig. 3a). Previous studies indicate that multiple segments within IL-6 topology are necessary for biological interactions [36]. Hence, broad coverage of IL-6 sequence is key for elucidating residues important for IL-6 interactions, which occur via hydrophobic and hydrophilic interactions spread across different domains.
To determine if NGPS and I2MS provide broad coverage of the helices and loop regions, we generated a Kyte-Doolittle hydropathy plot [37] (Fig. 3b). Interestingly, NGPS and I²MS analyze broad regions of IL-6 that are hydrophobic (positive scores) and hydrophilic (negative scores) (Fig. 3b). NGPS resolves single amino acids within peptide Q27IRYILDGISAL38 (Fig. 3c), located within Helix A, which is one of the most hydrophobic regions in IL-6 (Fig. 3b). Further, NGPS also detects V120LIQFLQK127 located within a hydrophobic portion of Helix C (Fig. 3b).
The most hydrophobic portion of IL-6 encompasses C82LVKIITGLLEFEVYLEYLQNR103 within Helix B (Fig. 3b), an internal fragment that I2MS detects (Fig. 3c). While this fragment is nonpolar, other cysteine-containing segments lie within hydrophilic regions (Fig. 3b). I2MS also discerns N131LDAITTPDPTTNASLLTK149 located within an amphipathic region of the CD loop and the mini-helix that precedes Helix D (Fig. 3a, c). Both I2MS and NGPS cover amino acids within Helix D (Fig. 3c), a C-terminal region that contains a stretch of amphipathic residues (Fig. 3b). Overall, NGPS and I2MS sequence various regions of IL-6 with different amino acid compositions and properties, highlighting the complementarity of these orthogonal analytical methods (Fig. 3c).
For sequence structure-function analysis, we performed a literature survey to determine which sequenced regions are relevant to IL-6 interactions. Removal of amino acids within the Q27IRYILD33 segment of Helix A reduces or abolishes IL-6 activity [36, 38] (Fig. 3c). Within the AB loop, E54ALAENNLNL63 contains residues important for IL-6 binding to IL-6R [39] (Fig. 3c). Similarly, D70GCFQSGFNE79 harbors residues such as phenylalanine (F), both of which are relevant for IL-6 binding interactions (Fig. 3c) [39–41]. We demonstrate that NGPS resolves both F residues within D70GCFQSGFNE79 (Fig. 1e, Fig. 3c). NGPS also detects peptide V120LIQFLQK127 within Helix C (Fig. 1e, Fig. 3c). Notably, a segment including V120LIQFLQK127 is truncated in an IL-6 splice variant with altered signaling (Fig. 3c) [7]. The mini-helix and amphipathic D helix contain a stretch of leucine and polar Q/N residues that influence IL-6 binding [31, 32]. Therefore, both NGPS and I2MS detect regions of IL-6 relevant to its biological function and interactions.
Limitations of this study include the use of recombinant IL-6 and, as a proof-of-concept study, its small sample size. Future studies will aim to expand the scope of analysis for combining NGPS and I2MS2 across a broader range of target proteins and proteoforms. Because both Platinum and I2MS do not utilize any online separations, managing sample complexity and dynamic range will be a key consideration for extension to other sample types such as cell lysates, media, or biofluids. Provided the initial steps can be optimized appropriately, we see a straightforward path for the application of both technologies to immunoprecipitates and SDS-PAGE gel bands. Indeed, applications of I2MS to complex immunoprecipitated samples and direct sampling have recently been demonstrated [22, 42].
Conclusion
Taken together, our data demonstrate complementarity of single-molecule protein sequencing (NGPS on Platinum) and I2MS to cover key regions of IL-6. While NGPS provided single amino acid resolution for fragments like V120LIQFLQK127, which are essential for IL-6 receptor binding, I2MS enabled detection of larger proteoforms, providing critical information on PTMs, such as glycosylation, that affect IL-6’s stability and bioactivity. When combined, NGPS and I2MS cover many key regions of IL-6, achieving 52% combined sequence coverage of single amino acids. This sets up improvements in technology and informs future studies of how polymorphisms or mutations affect PTMs on endogenous IL-6. These complementary datasets underscore the potential of combining single-molecule protein sequencing and mass spectrometry to obtain a more comprehensive picture of IL-6 structural and functional diversity, which is vital for understanding its therapeutic potential.
Supplementary Information
Below is the link to the electronic supplementary material.
(DOCX 8.50 MB)
(XLSX 94.7 KB)
Acknowledgements
We thank researchers at Quantum-Si, particularly Brian Reed and Meredith Carpenter, and members of the Northwestern Proteomics Center of Excellence (PCE) for technical assistance.
Author contributions
Conception and project design: KAS, MAC, NLK. Data acquisition: KAS, TDF, AL, TS, EF, AS. Data analysis and interpretation of results: KAS, TDF, MW, HH, TS, EF, AS, MAC, NLK. Original draft preparation: KAS and MAC. All authors reviewed the results and approved the final version of the manuscript.
Funding
Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number RM1GM156535. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Data availability
Raw and processed individual ion mass spectrometry data files and processed Platinum data files are available in the MassIVE data repository and can be accessed via accession MSV000098537.
Declarations
Competing interests
KAS is a shareholder and former employee of Quantum-Si. NLK serves as a consultant to Thermo Fisher Scientific and is engaged in various entrepreneurial activities related to TD-MS. The other authors declare no competing interests.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Patterson SD, Aebersold RH. Proteomics: the first decade and beyond. Nat Genet. 2003;33(S3):311–23. [DOI] [PubMed] [Google Scholar]
- 2.Cohen P. Protein kinases — the major drug targets of the twenty-first century? Nat Rev Drug Discov. 2002;1(4):309–15. [DOI] [PubMed] [Google Scholar]
- 3. Chen F, Zhong H, Chan G, Ouyang D. A comprehensive analysis of biopharmaceutical products listed in the FDA’s purple book. AAPS PharmSciTech. 2024;25(5). [DOI] [PubMed]
- 4.Marshall SA, Lazar GA, Chirino AJ, Desjarlais JR. Rational design and engineering of therapeutic proteins. Drug Discov Today. 2003;8(5):212–21. [DOI] [PubMed] [Google Scholar]
- 5.Saxton RA, Glassman CR, Garcia KC. Emerging principles of cytokine pharmacology and therapeutics. Nat Rev Drug Discov. 2023;22(1):21–37. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Smith LM, Kelleher NL, Consortium for Top Down P. Proteoform: a single term describing protein complexity. Nat Methods. 2013;10(3):186–7. [DOI] [PMC free article] [PubMed]
- 7.Bihl MP, Heinimann K, Rüdiger JJ, Eickelberg O, Perruchoud AP, Tamm M, et al. Identification of a novel IL-6 isoform binding to the endogenous IL-6 receptor. Am J Respir Cell Mol Biol. 2002;27(1):48–56. [DOI] [PubMed] [Google Scholar]
- 8. Schumertl T, Lokau J, Rose-John S, Garbers C. Function and proteolytic generation of the soluble interleukin-6 receptor in health and disease. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research. 2022;1869(1):119143. [DOI] [PubMed]
- 9.Santhanam U, Ghrayeb J, Sehgal PB, May LT. Post-translational modifications of human interleukin-6. Arch Biochem Biophys. 1989;274(1):161–70. [DOI] [PubMed] [Google Scholar]
- 10.May LT, Shaw JE, Khanna AK, Zabriskie JB, Sehgal PB. Marked cell-type-specific differences in glycosylation of human interleukin-6. Cytokine. 1991;3(3):204–11. [DOI] [PubMed] [Google Scholar]
- 11.Wong HE, Huang C Jr., Zhang Z. Amino acid misincorporation in recombinant proteins. Biotechnol Adv. 2018;36(1):168–81. [DOI] [PubMed] [Google Scholar]
- 12.Smith LM, Kelleher NL. Proteoforms as the next proteomics currency. Science. 2018;359(6380):1106–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Carbonara K, Andonovski M, Coorssen JR. Proteomes are of proteoforms: embracing the complexity. Proteomes. 2021;9(3):38. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Reed BD, Meyer MJ, Abramzon V, Ad O, Ad O, Adcock P, et al. Real-time dynamic single-molecule protein sequencing on an integrated semiconductor device. Science. 2022;378(6616):186–92. [DOI] [PubMed] [Google Scholar]
- 15.Sittipongpittaya N, Skinner KA, Jeffery ED, Watts EF, Sheynkman GM. Protein sequencing with single amino acid resolution discerns peptides that discriminate tropomyosin proteoforms. J Proteome Res. 2025;24(8):3798–807. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Kafader JO, Durbin KR, Melani RD, Des Soye BJ, Schachner LF, Senko MW, et al. Individual ion mass spectrometry enhances the sensitivity and sequence coverage of top-down mass spectrometry. J Proteome Res. 2020;19(3):1346–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Kafader JO, Melani RD, Durbin KR, Ikwuagwu B, Early BP, Fellers RT, et al. Multiplexed mass spectrometry of individual ions improves measurement of proteoforms and their complexes. Nat Methods. 2020;17(4):391–4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Des Soye BJ, McGee JP, Hollas MAR, Forte E, Fellers RT, Melani RD, et al. Automated immunoprecipitation, sample preparation, and individual ion mass spectrometry platform for proteoforms. Anal Chem. 2024. [DOI] [PMC free article] [PubMed]
- 19.Käll L, Storey JD, MacCoss MJ, Noble WS. Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. J Proteome Res. 2007;7(1):29–34. [DOI] [PubMed] [Google Scholar]
- 20.Park HM, Winton VJ, Drader JJ, Manalili Wheeler S, Lazar GA, Kelleher NL, et al. Novel interface for high-throughput analysis of biotherapeutics by electrospray mass spectrometry. Anal Chem. 2020;92(2):2186–93. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.McGee JP, Senko MW, Jooss K, Des Soye BJ, Compton PD, Kelleher NL, et al. Automated control of injection times for unattended acquisition of multiplexed individual ion mass spectra. Anal Chem. 2022;94(48):16543–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.McGee JP, Su P, Durbin KR, Hollas MAR, Bateman NW, Maxwell GL, et al. Automated imaging and identification of proteoforms directly from ovarian cancer tissue. Nat Commun. 2023;14(1):6478. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Su P, Mcgee JP, Hollas MAR, Fellers RT, Durbin KR, Greer JB, et al. Standardized workflow for multiplexed charge detection mass spectrometry on orbitrap analyzers. Nat Protoc. 2025;20(6). [DOI] [PMC free article] [PubMed]
- 24.Senko MW, Beu SC, McLaffertycor FW. Determination of monoisotopic masses and ion populations for large biomolecules from resolved isotopic distributions. J Am Soc Mass Spectrom. 1995;6(4):229–33. [DOI] [PubMed] [Google Scholar]
- 25.Rockwood AL, Haimi P. Efficient calculation of accurate masses of isotopic peaks. J Am Soc Mass Spectrom. 2006;17(3):415–9. [DOI] [PubMed] [Google Scholar]
- 26.Tan E, Chin CSH, Lim ZFS, Ng SK. HEK293 cell line as a platform to produce recombinant proteins and viral vectors. Front Bioeng Biotechnol. 2021;9:796991. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Hung CH, Wu SY, Yao CD, Yeh HH, Lin CC, Chu CY, et al. Defective N-glycosylation of IL6 induces metastasis and tyrosine kinase inhibitor resistance in lung cancer. Nat Commun. 2024;15(1):7885. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.York WS, Mazumder R, Ranzinger R, Edwards N, Kahsay R, Aoki-Kinoshita KF, et al. GlyGen: computational and informatics resources for glycoscience. Glycobiology. 2020;30(2):72–3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. De Leoz ML, Stein S. NIST glyco mass calculator [Available from: https://www.nist.gov/static/glyco-mass-calc/#/.
- 30.Huang J, Wu M, Zhang Y, Kong S, Liu M, Jiang B, et al. OGP: a repository of experimentally characterized O-glycoproteins to facilitate studies on O-glycosylation. Genomics Proteomics Bioinformatics. 2021;19(4):611–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. Nishimura C, Ekida T, Masuda S, Futatsugi K, Itoh Si, Yasukawa K, et al. Chemical modification and 1H‐NMR studies on the receptor‐binding region of human interleukin 6. European Journal of Biochemistry. 1991;196(2):377–84. [DOI] [PubMed]
- 32.Simpson RJ, Hammacher A, Smith DK, Matthews JM, Ward LD. Interleukin-6: structure-function relationships. Protein Sci. 1997;6(5):929–55. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33. Somers W, Stahl M, Seehra JS. 1.9 A crystal structure of interleukin 6: implications for a novel mode of receptor dimerization and signaling. The EMBO journal. 1997;16(5):989–97. [DOI] [PMC free article] [PubMed]
- 34.Widjaja AA, Chothani SP, Cook SA. Different roles of interleukin 6 and interleukin 11 in the liver: implications for therapy. Hum Vaccin Immunother. 2020;16(10):2357–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Scheller J, Ettich J, Wittich C, Pudewell S, Floss DM, Rafii P. Exploring the landscape of synthetic IL-6-type cytokines. FEBS J. 2023;291(10):2030–50. [DOI] [PubMed] [Google Scholar]
- 36.Snouwaert JN, Kariya K, Fowlkes DM. Effects of site-specific mutations on biologic activities of recombinant human IL-6. J Immunol. 1991;146(2):585–91. [PubMed] [Google Scholar]
- 37.Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982;157(1):105–32. [DOI] [PubMed] [Google Scholar]
- 38. Brakenhoff JP, Hart M, Aarden LA. Analysis of human IL-6 mutants expressed in Escherichia coli. Biologic activities are not affected by deletion of amino acids 1–28. The Journal of Immunology. 1989;143(4):1175–82. [PubMed]
- 39.Toniatti C, Cabibbo A, Sporena E, Salvati AL, Cerretani M, Serafini S, et al. Engineering human interleukin-6 to obtain variants with strongly enhanced bioactivity. EMBO J. 1996;15(11):2726–37. [PMC free article] [PubMed] [Google Scholar]
- 40.Ehlers M, GrÖTzinger J, Fischer M, Bos HK, Brakenhoff JPJ, Rose-John S. Identification of single amino acid residues of human IL-6 involved in receptor binding and signal initiation. J Interferon Cytokine Res. 1996;16(8):569–76. [DOI] [PubMed] [Google Scholar]
- 41. Kalai M, Montero-Julian FlA, Grötzinger J, Fontaine Vr, Vandenbussche P, Deschuyteneer R, et al. Analysis of the human interleukin-6/human interleukin-6 receptor binding interface at the amino acid level: proposed mechanism of interaction. Blood. 1997;89(4):1319–33. [PubMed]
- 42.Drown BS, Gupta R, Mcgee JP, Hollas MAR, Hergenrother PJ, Kafader JO, et al. Precise readout of MEK1 proteoforms upon MAPK pathway modulation by individual ion mass spectrometry. Anal Chem. 2024;96(11):4455–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
(DOCX 8.50 MB)
(XLSX 94.7 KB)
Data Availability Statement
Raw and processed individual ion mass spectrometry data files and processed Platinum data files are available in the MassIVE data repository and can be accessed via accession MSV000098537.



