Abstract
Advances in liquid chromatography–mass spectrometry have significantly improved proteomic analyses of human plasma. However, information at the level of intact proteoforms remains limited due to the high dynamic range of protein abundance and the complexity of post-translational modifications. To address this challenge, we introduce soluble plasma proteoform analysis via acetonitrile precipitation (SPAP), a streamlined workflow for top-down mass spectrometry-based proteomics that isolates small, intact proteoforms from the acetonitrile-soluble plasma fraction, enabling direct measurement of proteoform diversity and post-translational modifications with high resolution. This simple and scalable method employs cold acetonitrile to precipitate abundant plasma proteins, thereby enriching the sample for lower-molecular-weight proteoforms. We first assessed the method’s performance using a reference plasma sample. To explore its clinical applicability, we applied SPAP to a cohort of 40 individuals, including 30 patients with liver cirrhosis and 10 healthy controls. In total, we report 3746 proteoforms from 255 proteins, including those with phosphorylation, truncation, and disulfide bond modifications. Reproducibility was confirmed with a coefficient of variation of <10% for the majority of enriched proteoforms, including those potentially associated with hemostasis, lipoprotein metabolism, cytoskeletal structure, and protease regulation. SPAP enabled effective stratification of the three cirrhosis stages, verifying previously published results and supporting the identification of candidate biomarkers. Although liver cirrhosis was used as a model system, the SPAP workflow is broadly applicable to human disease with proteoform-level resolution, offering a new path to stronger correlations in smaller cohorts and addressing key challenges in diagnostic and biomarker discovery.
Keywords: proteoforms, plasma, top-down proteomics, mass spectrometry, acetonitrile precipitation


Introduction
Plasma retains many of the essential characteristics of blood. Frequently used for clinical diagnostics, plasma is reasonably representative of the body and reflects various physiological and pathological states. However, unraveling the extent to which the plasma/serum proteome reflects and correlates with human phenotypes remains a significant challenge. This complexity arises from the vast dynamic range, spanning over 10 orders of magnitude, and the dominance of a small subset of highly concentrated proteins, such as immunoglobulins and albumin, which constitute 99% of the mass of plasma protein.
In the field of proteomics, most of the research related to plasma and serum relies on bottom-up proteomics (BUP). In this approach, proteins are enzymatically digested and identified based on their derived peptides, but crucial information about the original precursor proteoforms is lost. To address this limitation, some recent studies have shifted to a top-down proteomics (TDP) approach. This alternative method involves identifying and quantifying intact proteins directly from plasma or serum using either discovery or targeted strategies. Kline and colleagues were the first to identify over 1000 proteoforms from serum. Their approach combined several techniques, such as immunodepletion, gel fractionation, and gas-phase fractionation (GPF) of proteoform ions using high-field asymmetric waveform ion mobility spectrometry (FAIMS). Recently, we demonstrated the identification of >2000 proteoforms from plasma samples by employing gel fractionation and proteoform enrichment with engineered nanoparticles. Sun and co-workers also annotated proteoforms of biomarker-related proteins using nanoparticle proteoform enrichment combined with capillary electrophoresis (CE).
While existing plasma proteoform analyses are effective, they often involve multiple fractionation steps, which makes them cumbersome, poorly suited to high-throughput applications, and of limited utility for large clinical cohorts. To address this, we developed soluble plasma proteoform analysis via acetonitrile precipitation (SPAP), a simplified, scalable, and reproducible TDP workflow that leverages the acetonitrile-soluble fraction of plasma. This fraction, obtained by precipitating high-molecular-weight proteins with cold acetonitrile, is enriched in low-molecular-weight (LMW) proteins (<30 kDa) and naturally depleted of the most abundant plasma proteins. Using SPAP, we achieved high-throughput detection of intact proteoforms directly from this soluble fraction, enabling quantification of 3746 proteoforms across diverse functional classes. To illustrate the method’s utility, we applied it to a clinically stratified cohort of patients with liver cirrhosis, a disease marked by systemic dysregulation and lacking sensitive molecular staging tools. These findings demonstrate that the acetonitrile-soluble plasma fraction is a rich and readily obtained source of clinically relevant proteoforms.
Materials and Methods
Plasma Samples
Sample from a Healthy Individual
A plasma sample from a healthy individual was obtained from STEMCELL Technologies (Vancouver, BC, Canada) and used as a reference sample (Figure 1A). This reference material enabled optimization and evaluation of the method’s technical performance at the proteoform level, including reproducibility (CVs), label-free quantification, and top-down mass spectrometry workflows. Its availability in a larger volume (∼10 mL) made it suitable for method development prior to clinical application.
Figure 1. Workflow analysis for precipitated acetonitrile plasma samples. (A) Reference sample from plasma of a healthy individual. Each sample evaluated is a pool of plasma from 3 or 4 individuals. (B) Cirrhosis cohort of plasma samples, consisting of a control group and three groups of patients with different stages of cirrhosis (I, II, and III). (C) Experimental workflow followed for the analysis of acetonitrile-precipitated plasma samples; pHTN: portal hypertension.
Patients with Liver Cirrhosis and Controls
Peripheral blood samples were collected from 40 subjects, 30 with liver cirrhosis (stage I: n = 9, stage II: n = 10, and stage III: n = 11) and 10 healthy controls, at Northwestern Medicine. Plasma was isolated and stored at −80 °C until protein extraction was performed. Cirrhosis diagnosis and staging were confirmed through medical record review, including Fibroscan, magnetic resonance elastography, or biopsy, as described in our previous paper. Briefly, decompensation was defined by the presence of any of the following complications: hepatic encephalopathy, ascites, or gastrointestinal bleeding. Portal hypertension (pHTN) was diagnosed via imaging, thrombocytopenia, or endoscopic evidence of varices. Demographic, clinical, and laboratory data, including MELD-Na, were collected at the time of the blood draw for each subject. Samples from each condition (healthy and cirrhosis stages I, II, and III) were pooled into three groups using a random selection of individuals (Figure 1B). Finally, 12 pools of plasma samples, with three pools per condition, were processed for TDP analysis.
Study Design for Quantitative Analysis
A plasma sample (100 μL) obtained from the healthy individual (Figure 1A, our reference sample) was divided into six technical replicates for quantitative analysis. In contrast, only one technical replicate was evaluated for each of the 12 cirrhosis and control plasma pools (Figure 1B). After following the protocol for acetonitrile precipitation described below, we used 10 μL from each resuspended sample and diluted it with 60 μL of water with 0.1% formic acid prior to injection for LC-MS analysis.
Study Approval
This study adhered to NIH guidelines for human subject research. The Northwestern IRB granted a Waiver of Consent and approved the study under protocol STU00216399.
Chemicals and Reagents
Acetonitrile (ACN), water (LC-MS grade), and formic acid (FA) were purchased from Thermo Fisher Scientific.
Acetonitrile Precipitation of Plasma Samples
Acetonitrile (ACN) precipitation was performed using a protocol adapted from a published BUP study (Figure C, top). Briefly, 20 or 100 μL of plasma was centrifuged at 13,000 rpm for 30 min at room temperature to separate and discard the lipid layer. Then, an equal volume of cold ACN (stored at −20 °C) was added to the plasma, followed by vortexing for 15 s to initiate protein precipitation. Samples were incubated on ice for 1 h and vortexed for 1–2 min every 20 min. After incubation, samples were centrifuged at 13,000 rpm for 40 min to pellet the precipitated proteins. The resulting supernatant was transferred to a Protein LoBind tube and centrifuged again at 13,000 rpm for 15 min to remove any remaining particulates. This clarified supernatant was then transferred to a fresh LoBind tube, dried in a SpeedVac for 1 h, and resuspended in 20 or 100 μL of water containing 0.1% formic acid (FA), matching the original plasma volume. When needed, the protein concentration was measured using a bicinchoninic acid (BCA) protein assay kit (Thermo Fisher Scientific) following the manufacturer's instructions.
Top-Down Proteomics
Top-Down Liquid Chromatography–Mass Spectrometry
Precipitated samples from either 20 or 100 μL plasma samples were diluted 6-fold and analyzed under different acquisition conditions using an Ascend tribrid mass spectrometer. Proteins (10 μL per run) were separated using a Vanquish Neo UHPLC system (Thermo Fisher Scientific). Reversed-phase liquid chromatography was conducted on a MAbPac EASY-Spray column (150 mm length by 150 μm i.d., Thermo Fisher Scientific) and an in-house packed PLRP-S trap column (25 mm length by 150 μm i.d., Agilent). The total run time was 120 min, utilizing a gradient of mobile phase A (99.9% water, 0.1% FA) and mobile phase B (19.9% water, 80% ACN, 0.1% FA). The flow rate was maintained at 1 μL/min. The gradient profile was as follows: 5% B at 0 min, 20% B at 5 min, 70% B at 110 min, 99% B from 111 to 114 min, and 5% B from 115 to 120 min. The column outlet was connected inline to an EASY-Spray source and either an Orbitrap Eclipse or Orbitrap Ascend mass spectrometer (Thermo Fisher Scientific) (Figure 1C, bottom) operating in intact protein mode with 2 mTorr N2 pressure in the ion routing multipole (IRM). The transfer capillary temperature was set to 320 °C, the ion funnel RF was set to 60%, and a source CID of 15 V was applied.
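The gradient profile above can be read as a piecewise-linear program. The following is a minimal Python sketch, assuming linear ramps between the listed breakpoints (the text does not state the curve shape); the function names are illustrative and not part of any instrument software. It also converts %B to the approximate ACN content of the eluent, using the stated 80% ACN in mobile phase B.

```python
from bisect import bisect_right

# Breakpoints (time in min, %B) taken from the gradient described in the text.
GRADIENT = [(0, 5), (5, 20), (110, 70), (111, 99), (114, 99), (115, 5), (120, 5)]
ACN_FRACTION_IN_B = 0.80  # mobile phase B is 80% ACN; mobile phase A contains none


def percent_b(t_min):
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the 0-120 min run")
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:  # exactly at the final breakpoint
        return float(GRADIENT[-1][1])
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acn(t_min):
    """Approximate %ACN in the eluent at time t_min."""
    return ACN_FRACTION_IN_B * percent_b(t_min)
```

For example, `percent_b(57.5)` falls midway along the main 5 to 110 min ramp, which is where most proteoforms would be expected to elute under this sketch's assumptions.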
For the Orbitrap Eclipse (reference sample for quantitative analysis), MS1 spectra were recorded at a resolving power of 120,000 (at m/z 200), with a normalized AGC target of 1000%, a maximum injection time of 100 ms, and 1 μscan. The data-dependent top-N-2sec MS2 method employed 32 NCE for HCD to produce fragmentation spectra at a resolving power of 60,000 (at m/z 200), with a normalized AGC target of 2000%, a maximum injection time of 1200 ms, and 1 μscan. Precursors were isolated using a quadrupole with a 3 m/z isolation window, a dynamic exclusion duration of 60 s, and a threshold intensity of 1 × 10^4.
For the Orbitrap Ascend (reference sample for qualitative analysis and cirrhosis cohort), MS1 spectra were acquired at 120,000 resolving power (at m/z 200), a normalized AGC target of 500 or 1000%, a 100 ms maximum injection time, and 1 μscan. The data-dependent top-N-2sec MS2 method used 27 or 32 NCE for HCD to generate fragmentation spectra acquired at a 60,000 resolving power (at m/z 200), with a normalized AGC target of 800 or 2000%, an 800 or 1200 ms maximum injection time, and 1 μscan. Precursors were isolated with the quadrupole using a 3 m/z isolation window, a dynamic exclusion duration of 60 s, and an intensity threshold of 1 × 10^4. Raw files and tdReport files can be found in MassIVE (Accession MSV000098558). The search results in the .tdReport format can be viewed by using TDViewer, freely available at http://topdownviewer.northwestern.edu.
Top-Down Data Search and Analysis
The raw data files were analyzed using the publicly accessible standard workflow on TDPortal (https://portal.nrtdp.northwestern.edu, Code Set 4.0.0). This workflow conducts mass inference and searches a database of human proteoforms, derived from SwissProt (June 2020) with curated histones, maintaining a 1% false discovery rate (FDR) at the protein, isoform, and proteoform levels. A proteoform database was created from SwissProt human containing 2.4 million proteoform entries. A −15 ppm m/z spectral shift was applied to account for instrument calibration error (cirrhosis cohort). For quantitative proteoform analysis, a CSV intensity sheet was created using an isotopic fitting algorithm across the chromatogram to determine the intensities of all proteoforms identified in the study with 10% FDR confidence in each sample. This included those previously classified as “unidentified” and not meeting the 1% FDR identification threshold. The 3 initial searches used the same parameters and search space. As a result, we were able to combine the intermediate search result files (.pufdb) into a single .tdReport and apply a 1% FDR threshold across all the files.
Data derived from randomized biological and technical replicates were used to assess analytical performance and differentially expressed proteoforms. Missing values were filtered to remove proteoforms present in less than 50% of the samples from at least one biological condition (i.e., group). To stabilize variance and make the data more comparable across samples, intensities were log2-transformed and normalized by subtracting the median intensity of each sample from all its intensities and then dividing the result by the sample standard deviation. Variability associated with sample processing was captured by calculating the coefficient of variation (CV) between technical and biological replicates using the inverse of the log2-transformed intensities (i.e., intensities not log2-transformed). To assess differentially expressed proteoforms, the log2-transformed intensities were z-score-normalized across samples and used as a dependent variable in a hierarchical mixed linear regression model. Biological and technical replicates were set as random nested variates in the model, and p-values were adjusted using the Benjamini and Hochberg approach to control the false discovery rate (FDR) induced by multiple comparison tests. In all cases, adjusted p-values <0.05 were considered significant. The analysis was done on the RStudio platform using the lme4 package. Following a previous study, hemoglobin subunits were not included in downstream analyses as their presence may stem from sample handling or preparation.
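The normalization and CV calculations described above can be illustrated with a short sketch. The original analysis was carried out in R; the Python below is only an illustration of the arithmetic, not the authors' code, and it does not reproduce the mixed-model step. It assumes that the "sample standard deviation" is the n − 1 standard deviation of the log2 intensities within a sample, which the text does not specify; function names are hypothetical.

```python
import math
import statistics


def normalize_sample(raw_intensities):
    """Normalize one sample: log2-transform, subtract the sample median,
    then divide by the sample standard deviation (assumed n - 1 form)."""
    logs = [math.log2(x) for x in raw_intensities]
    median = statistics.median(logs)
    sd = statistics.stdev(logs)
    return [(v - median) / sd for v in logs]


def cv_percent(raw_replicates):
    """Coefficient of variation (%) of one proteoform across replicates,
    computed on linear-scale (not log2-transformed) intensities."""
    return 100.0 * statistics.stdev(raw_replicates) / statistics.mean(raw_replicates)


def cv_percent_from_log2(log2_replicates):
    """Same CV, starting from log2 intensities that are first back-transformed."""
    return cv_percent([2.0 ** v for v in log2_replicates])
```

Computing the CV on the linear scale matters because the standard deviation of log2 values is not directly comparable to a percentage of the mean intensity.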
Volcano plots were generated to display each proteoform as a function of estimated effect size (log2 fold change) and the statistical confidence of differences between the two groups (−log10 of adjusted p-value). Identified proteins were matched with the Human Protein Atlas database (https://www.proteinatlas.org/) using accession numbers for grouping proteins by abundance. Gene ontology (GO) analyses were performed with the DAVID online tool (https://davidbioinformatics.nih.gov/) and the “clusterProfiler” R package. Venn diagrams were generated using the BioVenn application (https://www.biovenn.nl/). Spearman correlation and related figures were generated using GraphPad Prism, version 10.2.3. Other figures were generated with custom R scripts (available on request).
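As an illustration of how the adjusted p-values and volcano-plot coordinates relate, the sketch below implements the Benjamini and Hochberg adjustment and assigns each proteoform its plot position. It is a minimal Python example rather than the R code used in the study; the default cutoffs mirror the q-value <0.05 and log2 fold change >1.5 thresholds given in the cirrhosis figure caption, and applying the fold-change cutoff to the absolute value is an assumption.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def volcano_point(log2_fc, adj_p, fc_cutoff=1.5, q_cutoff=0.05):
    """Return (x, y, label) for one proteoform on a volcano plot.

    x is the log2 fold change and y is -log10 of the adjusted p-value.
    The label marks proteoforms passing both cutoffs (assumed two-sided).
    """
    y = -math.log10(adj_p)
    if adj_p < q_cutoff and abs(log2_fc) > fc_cutoff:
        label = "up" if log2_fc > 0 else "down"
    else:
        label = "ns"
    return log2_fc, y, label
```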
Bottom-Up Proteomics
Sample Preparation and Mass Spectrometry Analysis
Samples were prepared for LC-MS according to an established protocol at the Northwestern University Proteomics Core. LC-MS/MS analyses were conducted using the Vanquish Neo UHPLC system (Thermo Fisher, VN-S10-A-01) coupled to an Orbitrap Exploris 240 mass spectrometer (Thermo Fisher, BRE725535). Peptide separation was performed using a C18 column (Ion Opticks, AUR3-15075C18-CSI, 15 cm × 75 μm, 1.7 μm) using 1 μg equivalent of the sample per injection. Peptide elution was performed at a nano flow rate of 200 nL/min on a 120 min gradient using mobile phase A (99.9% Optima LC/MS grade water and 0.1% formic acid) and mobile phase B (80% ACN, 19.9% Optima LC/MS grade water, and 0.1% FA). Gradient conditions were 0% B initially, 8% B at 1 min, 28% B at 86 min, 50% B at 106 min, 100% B at 107 min, and finally 0% B from 111 to 120 min to equilibrate the column for the next injection. Positive electrospray ionization was performed using a Nanospray Flex Ion Source (Thermo Fisher Scientific) at a spray voltage of 2200 V. MS1 (full scan) settings were a 350–1600 m/z scan range, a 60% RF lens, a 120k orbitrap resolution, a 300% normalized AGC target, a 25 ms maximum injection time, 1 microscan, and a 5.0 × 10^3 intensity threshold. Data-dependent acquisition (DDA) by TopN permitted fragmentation (MS2) of isolated precursor ions with charges 2+ to 5+ inclusive. MS2 settings were a 30 s dynamic exclusion, a 5 ppm mass tolerance (low and high), a 2 s cycle time, a 1.5 m/z isolation window, a 30% normalized collisional energy, a 15k orbitrap resolution, a scan range mode set to define a first mass of 200 m/z, a 100% normalized AGC target, and a 50 ms maximum injection time.
Bottom-Up Data Analysis
Protein presence in the MS raw data was determined using the Mascot search engine (version 2.5.1). MS/MS spectra were matched against the SwissProt database. Searches included carbamidomethyl cysteine as a fixed modification, with oxidized methionine, deamidated asparagine and aspartic acid, and an acetylated N-term as variable modifications. Up to two missed tryptic cleavages were allowed. FDRs of 1% at the peptide level and 3% at the protein level were applied. Identified peptides, proteins, and BUP metrics were visualized and exported using Scaffold software (version 5.0, Proteome Software, Inc., Portland, OR).
Results and Discussion
Dynamic Range and Reproducible Quantification of Plasma Proteoforms Unveiled through Acetonitrile Precipitation
We first identified 1022 proteoforms from 169 proteins (Tables S01 and S02) in the soluble fractions of precipitated plasma from the reference sample collected from a healthy individual (Figure 1A). A high analytical reproducibility was observed at chromatogram and intensity levels (Figures S1C and S2), with a 6% overall coefficient of variation (CV), including the precipitation process showing a <8% median CV (Figure 2A and Figure S1D). These results underscore the robustness of ACN precipitation in plasma, opening its application to probe proteoform biology.
Figure 2. Proteoform landscape of the precipitated plasma sample from the reference healthy individual (six technical replicates). (A) Technical coefficient of variation (CV) represented by a density plot. Most of the proteoforms had a CV <10% across technical replicates. (B) Waterfall plot of 1022 identified proteoforms (PFRs) from 169 proteins. Each black dot represents a proteoform, and each vertical boxplot encloses proteoforms from one specific protein. (C) Cellular component and (D) molecular functions enriched (adj. p < 0.05) by all proteins identified in healthy patient samples.
A plasma proteoform waterfall plot of 1022 proteoform intensities was generated using their normalized median intensity grouped by proteoform family (Figure 2B). The tubulin beta chain (TUBB) family ranked as the 34th most abundant, with proteoforms ranging from 17 to 24 in log2 intensity, whereas families ranked beyond 160 on the waterfall plot, such as EH domain containing 3 (EHD3), had log2 intensities of 15–18. An interesting example is the proteoform family of thymosin beta 4 X-linked (TMSB4X), which contains proteoforms with intensities spanning almost the whole dynamic range, across 9 steps on the log2 intensity scale (i.e., a range of 512-fold) (Figure 2B). This analysis demonstrates that the dynamic range observed within individual proteoform families (i.e., proteoforms derived from a single gene product) can span the overall dynamic range of all identified proteoforms. This suggests that the diversity in proteoform abundance is not limited to interprotein variation but is also a prominent feature within single protein families, highlighting the complexity of proteoform-level expression in the proteome.
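The conversion from a span on the log2 intensity scale to a fold range is simple arithmetic: a span of 9 log2 units corresponds to 2^9 = 512-fold. A minimal Python sketch follows; the function name is illustrative, and the intensities in the usage note are hypothetical, not values from the tables.

```python
def family_fold_range(log2_intensities):
    """Return (log2 span, fold range) for the proteoforms of one family."""
    span = max(log2_intensities) - min(log2_intensities)
    return span, 2.0 ** span
```

For a hypothetical family with log2 intensities of 15.0, 18.2, and 24.0, `family_fold_range` gives a span of 9 log2 units, i.e., a 512-fold range between its least and most intense proteoforms.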
The molecular weight (MW) distribution of all proteoforms from the healthy reference sample exhibited a bimodal curve, with the major peak at <10 kDa and another for ∼20–30 kDa soluble proteoforms (Figure S1A). Among the observed post-translational modifications (PTMs) on these ∼1000 proteoforms, the most prevalent were truncation (76%), N-terminal acetylation (10.6%), disulfide bonds (3.5%), and phosphorylation (3.4%) (Figure S1B).
Proteoform analysis of plasma and serum samples by LC-MS has consistently shown an enrichment of truncated forms, particularly those <15 kDa, across various methodologies. This trend differs from analyses of whole-cell lysates, suggesting that it may be an inherent feature of serum/plasma proteoforms observed across a growing cohort of people, in addition to potential sample preparation artifacts.
We categorized the proteoforms based on plasma protein concentrations reported by the Human Protein Atlas (HPA) (https://www.proteinatlas.org/humanproteome/blood+protein, Table S03). Notably, 70% of the proteoforms originate from proteins classified as medium or low abundance (Figure S3A). Among these, 20% (i.e., 183 proteoforms from 75 proteins) are considered low abundance (i.e., <10 ng/mL). Our findings reveal a clear relationship between the categorical protein abundance reported by the HPA and the number of proteoforms per protein. In Figure S3A, we observe a decline in the average number of proteoforms per protein for those categorized as low abundance. Notably, there is a large discrepancy between the overall average number of proteoforms per protein (6) and the average observed specifically for high-abundance proteins (23). This observation suggests that more proteoforms from low-abundance proteins will become detectable as technology improves.
Joint Bottom-Up and Top-Down Analysis of Precipitated Proteoforms
Relative and absolute high-throughput quantification of plasma proteins by MS has been frequently reported using BUP, leading to the creation of the largest plasma protein repository. To showcase the effectiveness of proteoform quantification through LFQ and explore the relationship between TDP and BUP, we analyzed a comparable soluble fraction using both methods (see Tables S04 and S05). Additionally, we used data from the human plasma protein database to perform correlations with our findings.
In this experiment, more than 76% of the proteins with at least one proteoform identified were also identified by BUP (Figure S3B). Since several metrics to determine protein abundance by MS are based on the number of peptides identified per protein, we compared the number of peptides per protein and the normalized peptide spectral count from BUP with three TDP metrics: the number of proteoforms per protein, the sum of PFR intensities in a proteoform family, and the intensity of the most intense proteoform per family. These metrics were then also correlated with the plasma protein concentrations reported in the HPA by BUP (Figure S3C).
In summary, we noted a moderate positive correlation between several metrics (Figure S3C). All metrics from TDP showed very high Pearson correlations with each other, notably the sum of all PFR intensities by proteoform family and the number of PFRs per protein (r = 0.97). Additionally, these TDP metrics showed moderate positive correlations with key BUP metrics (Spearman’s r = 0.42–0.50), a range commonly interpreted as moderate strength. The TDP metrics correlated moderately with the plasma concentrations of proteins in the Human Protein Atlas, with the normalized count from BUP showing the strongest correlation (r = 0.7). This alignment is expected, given that both data sets were generated using bottom-up proteomics (BUP) strategies, which enhances the comparability and interpretability of the observed correlation.
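Because both Pearson and Spearman coefficients are reported above, a brief sketch of how they differ may help. The Python below is an illustration using only the standard library, not the GraphPad Prism procedure used in the study: Pearson measures linear association on the values themselves, whereas Spearman is the Pearson coefficient computed on ranks (ties receive average ranks), so it captures any monotonic relationship.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

With a monotonic but nonlinear pair such as x = 1..5 and y = x squared, `spearman` returns 1 while `pearson` is slightly below 1, which is one reason rank-based correlation is often preferred for abundance metrics spanning several orders of magnitude.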
Importantly, the moderate correlation between TDP and BUP data is not a limitation but rather a reflection of these fundamentally different approaches. BUP infers protein abundance from peptide-level data, often aggregating signals across multiple proteoforms. In contrast, TDP quantifies intact proteoforms directly, capturing the diversity of post-translational modifications and sequence variants. This distinction explains why correlations between TDP and BUP are inherently lower and why the lowest correlations were observed when comparing TDP metrics to BUP-derived protein concentrations that do not account for proteoform heterogeneity.
We argue that this moderate correlation between BUP and TDP reflects a strength of proteoform-resolved quantification, which moves beyond the averaged signals of BUP to better reflect the natural complexity of protein expression. By capturing the distinct abundance profiles of individual proteoforms, some of which span orders of magnitude within a single protein family, TDP offers a complementary and more representative view of the proteome.
Mapping Functional Pathways and Cellular Origins of Proteoforms from the Soluble Plasma Fraction
Before delving into the biological interpretation of the identified proteoforms, it is important to acknowledge a limitation of top-down proteomics. Functional inference based on proteoform identification can be affected by both sample preparation artifacts and endogenous degradation. Although top-down proteomics enables the characterization of intact proteoforms, not all truncated or modified forms necessarily represent functional entities; some may arise from nonbiological processes like oxidation on the bench or proteolytic action during sample processing. Additionally, the bioinformatics analysis, while comprehensive, relies on conventional protein-level annotations and grouping strategies, which may obscure proteoform-specific biological roles. Therefore, the biological interpretations presented herein should be regarded as provisional and hypothesis-generating, rather than conclusive, given the current limitations in proteoform-specific functional annotation.
The proteoforms identified in the soluble fraction obtained from ACN-precipitated plasma were predicted to be mostly associated with the liver, pancreas, lungs, cerebrospinal fluid, and, as expected, blood cells (platelets, B cells, and T cells); see Table S06. For instance, liver-associated proteoforms are often involved in metabolism and detoxification, while those from blood cells participate in immune responses. Proteoforms from the pancreas are typically related to insulin production and glucose regulation, and those from the lungs are essential for respiratory functions. Additionally, cerebrospinal fluid proteoforms can be indicative of neurological health (Table S06). A large number of these proteoforms are predicted to be intracellularly located within the cytoplasm and cytoskeleton (Figure 2C and Table S07). Cytoplasmic proteins are known to constitute part of the plasma as tissue leakage markers. Notably, the actin binding, cadherin binding, and guanyl nucleotide binding proteins are all essential for maintaining cellular structure, facilitating communication, and regulating various cellular functions, ensuring proper cell behavior and tissue integrity (https://www.ncbi.nlm.nih.gov/books/NBK9893/).
These findings align with the predicted protein nature within the ACN-soluble fraction. Specifically, some small proteoforms remain soluble in native plasma due to their binding interactions with soluble carrier proteins (such as albumin). However, when the concentration of organic solvents increases, this interaction is disrupted. As a result, the globular proteins with nonamphipathic characteristics can precipitate out of solution. Thus, the examination of the soluble proteoforms may provide insights into the physiological state of various organs within the body. Upon examining other significantly enriched biological processes, we observed notable enrichments in key functional plasma pathways. Notably, the hemostasis pathway emerged prominently, representing intricate interactions among blood cells, vessels, and coagulation factors (Figure S1E). Additionally, lipoprotein metabolism was enriched in the acetonitrile-soluble fraction, with a specific focus on HDL-mediated lipid transport (Figure 2D and Figure S1E,F). The identification of 72 apolipoprotein proteoforms (including APOA1, 45 PFRs; APOA2, 11 PFRs; APOC1, 8 PFRs; APOC2, 4 PFRs; and APOC3, 4 PFRs) serves as a valuable representation of amphipathic proteoforms (Table S08).
Plasma Proteoform Dysregulation in Liver Cirrhosis
To probe proteoform dynamics across four populations in a 40-individual cohort related to liver cirrhosis, we compared this platform with the one used in a previous study, which relied on a surfactant (SDS)-based extraction and one-dimensional gel separation (i.e., the PEPPI workflow). These groups consisted of samples from the controls (10 subjects) and 30 patients at the three progressive stages of liver disease: compensated cirrhosis, compensated cirrhosis with portal hypertension, and full-blown decompensated cirrhosis (Figure 1B).
From the N = 40 cohort, we identified a total of 1935 proteoforms associated with 109 proteins (Tables S09 and S10). The proteoform waterfall plot exhibited a trend similar to that of the reference sample, indicating that the dynamic range of individual proteoform families reflects the dynamic range of the total identified proteoforms (Figure S4A). Remarkably, 637 proteoforms were found to be differentially expressed among the pooled samples from the four groups (Table S11). The quantitative analysis revealed clear differences among the four groups, as depicted in the principal component analysis (Figure 3A) and the volcano plots (Figure 3B). As in the plasma proteoform analysis of the reference sample, the PTMs represented in the differentially expressed proteoforms were truncation, N-terminal acetylation, and phosphorylation (Figure S4B).
Figure 3. Proteoform landscape of precipitated plasma samples from the cirrhosis cohort. (A) Principal component analysis based on the log2 intensity of all quantified proteoforms in the four groups of cirrhosis samples. (B) Volcano plots showing significant (q-value <0.05 and log2 fold change >1.5) proteoform changes (blue and red colors) between cirrhosis stages and vs control. Each dot represents a proteoform. (C) Top 10 molecular functions enriched by DEP from the current cirrhosis study. (D) Top 6 molecular functions enriched by DEP found in this study and a previous study.
Establishing the relationship between canonical and truncated proteoforms within a single TDP experiment remains challenging. For instance, canonical proteoforms may show stable abundance, while their truncated variants exhibit differential expression due to biological regulation or technical artifacts. Although such findings are not inherently incorrect, they may not fully reflect the underlying biological complexity.
Dysregulated proteoforms were associated with the cytoplasmic vesicle lumen, platelet granules, cell substrate junction, and lipoprotein particles. Regarding biological processes and pathway enrichment, we observed changes in enzymatically active roles and lipoprotein metabolic processes (Figure 3C).
A notable fraction of the cirrhosis proteoform signature found in our previous study was also seen in this experiment. Specifically, 40% of the proteoform families were detected in common (Table S13), and 80% of those were also found to be dysregulated in our study. Interestingly, nearly 70% of the observed changes showed the same directional trend when comparing proteoforms in this new cohort of cirrhosis patients (Table S14).
At the individual proteoform level, only 32 out of 167 differentially expressed proteoforms (DEPs) previously reported as part of the cirrhosis signature (Table S12) were detected in the current experiment, highlighting a degree of orthogonality between the two analytical approaches. Among these 32 shared proteoforms, 11 were statistically significant in this study when comparing only liver cirrhosis stages. Notably, 10 of these 11 proteoforms exhibited consistent directional changes, either upregulated or downregulated, as observed in the prior study (Table S15).
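The counts in this paragraph can be turned into fractions with a short sketch. The sign test at the end is an illustrative addition, not an analysis reported by the authors: it asks how likely at least 10 of 11 directional agreements would be if each direction were a coin flip. It assumes the 11 proteoforms behave independently, which is a simplification because proteoforms from the same family are likely correlated. Variable names are hypothetical.

```python
from math import comb

reported = 167     # DEPs in the earlier cirrhosis signature (Table S12)
detected = 32      # of those, detected in the current experiment
significant = 11   # of the detected, significant between cirrhosis stages
concordant = 10    # of the significant, same direction as in the prior study

detected_frac = detected / reported      # share of the signature re-detected
concordance = concordant / significant   # directional agreement

# One-sided sign test (illustrative assumption: independent proteoforms,
# 50% chance of agreeing in direction under the null hypothesis).
p_sign = sum(
    comb(significant, k) for k in range(concordant, significant + 1)
) / 2 ** significant

print(f"re-detected: {detected_frac:.1%}")
print(f"directional agreement: {concordance:.1%}")
print(f"sign-test p-value: {p_sign:.4f}")
```

Under these assumptions, roughly 19% of the earlier signature was re-detected, while about 91% of the significant shared proteoforms agreed in direction.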
While the number of overlapping differentially expressed proteoforms is modest, this high level of directional agreement suggests a strong degree of reproducibility in the application of TDP to the outbred population of liver cirrhosis patients. Such verification cohorts are necessary on the path toward clinical impact.
To further explore the biological relevance of these findings, enrichment analyses were performed separately for the DEPs identified in each experiment. Despite the limited overlap in individual proteoforms, both data sets revealed enrichment in similar biological processes, including peptidase activity and lipoprotein-related pathways, key processes previously reported in association with liver cirrhosis (Figure 3C). This convergence at the pathway level reinforces the functional consistency of the results. Moreover, this experiment contributed additional proteoforms to these enriched networks (green dots in Figure 3D), expanding the biological context. For instance, 41 proteoforms of APOC3 were identified as part of the lipoprotein particle receptor binding process (Table S16), further supporting the involvement of lipid metabolism in cirrhosis progression.
No APOC3 proteoforms had been identified with the prior analytical strategy. Apolipoprotein C3 (APOC3) is primarily found in triglyceride-rich lipoproteins such as chylomicrons and very low-density lipoproteins (VLDL), but it also influences the structure and function of HDL. Synthesized mainly in the liver, APOC3 matures into an N-terminal domain (1–40 AA) and a C-terminal domain (41–79 AA) after the removal of a 20 AA signal peptide. The C-terminal region is crucial for binding and interacting with LDL and VLDL, affecting the lipid-binding capacity and the metabolism of triglyceride-rich lipoproteins.
In liver cirrhosis, proteomic studies have shown an upregulation of APOC3 levels in the blood, linked to impaired lipid metabolism regulation. Interestingly, most APOC3 proteoforms exhibited decreased expression in advanced cirrhosis stages. Notably, 30 out of 41 proteoforms were truncated at the C-terminus, with 18 of them decreasing in decompensated cirrhosis patients compared to compensated patients (Figure S4C). Most protein families with differentially expressed proteoforms showed consistent trends between conditions, either predominantly upregulated or downregulated. However, for most proteins, a substantial proportion of proteoforms were not differentially expressed, underscoring the functional heterogeneity within these families (Figure S4D).
Truncated forms of APOC3 exhibited consistent differential expression patterns across disease stages (Table S16), suggesting a potential biological relationship. This observation, which appears reasonably consistent across a growing cohort, supports the idea that truncated proteoforms may reflect meaningful biological changes. If true, these observations may indicate altered LDL binding and suggest a potential role for the N-terminal domain in this behavior, particularly given that the canonical form of APOC3 did not show differential expression between conditions (Figure S5). However, this interpretation requires orthogonal validation, such as binding assays or structural characterization, to confirm the functional relevance of the observed proteoform differences. These truncated forms span from 1.9 to 8.7 kDa (Table S16). Figures S6 and S7 show examples of MS spectra from these APOC3 proteoforms, clearly illustrating the differences at the C-terminus.
Regarding proteoforms that enriched the peptidase activity process after ACN precipitation, 28 PFRs from eight protein-coding genes were uniquely observed to change. Notably, 23 of these proteoforms have not been previously identified by PEPPI. Interestingly, two members of the cysteine protease inhibitor family, cystatin A and cystatin C, were identified with a total of seven PFRs (cystatin A: 2, cystatin C: 5) (Table S11).
The two dysregulated PFRs from cystatin A showed contrasting behaviors. The N-terminally acetylated proteoform appeared downregulated in the early (compensated) stage of cirrhosis compared to the decompensated stage, while the unacetylated form was upregulated. Similar trends were observed when comparing cystatin A PFR expression between controls and patients with decompensated cirrhosis. In general, N-terminal acetylation can affect a protein's stability, localization, or interaction with other proteins, which may lead to regulatory mechanisms that differ from those of the unacetylated form. Although these expression levels might be regulated differently in response to cellular stress or damage associated with cirrhosis, to the best of our knowledge, a direct connection between cystatin A and cirrhosis has not been previously reported.
On the other hand, cystatin C has been previously observed to be upregulated in cirrhosis patients compared to healthy controls, primarily in association with kidney impairment. We observed increased levels of cystatin C proteoforms in cirrhosis stages compared to controls, including the canonical form and a well-known mutation at position 94 (L→Q) associated with cystatin C amyloid deposition in the brain. However, only some noncanonical proteoforms (mutated and N-terminally truncated forms) appear to be downregulated from the compensated to the decompensated stage, suggesting that these proteoforms could be better indicators of the progression of liver cirrhosis.
Enhanced Proteoform Detection in Plasma: Complementary Insights from Acetonitrile Precipitation
To enhance the identification of proteoforms from the plasma of our reference sample (healthy individual), we conducted a new qualitative analysis with ACN-precipitated samples, utilizing the same mass spectrometer employed for the cirrhosis cohort analysis. The latest Orbitrap hybrid (Ascend) mass spectrometer provides greater sensitivity than the Orbitrap Eclipse.
To compile a comprehensive set of identified proteoforms, we integrated the results from ACN-precipitated plasma samples, encompassing both the healthy human control and the liver cirrhosis cohort (Figure 4A and Tables S17 and S18). The FDR was controlled at both the proteoform and protein levels, maintaining it below 1% in total. We identified 3746 proteoforms from 255 proteoform families. We consistently identified more than 1000 proteoforms from more than 100 proteins in each study. Identification rates increased significantly with the use of the latest Ascend Tribrid mass spectrometer, reaching a maximum of 1935 proteoforms in the cirrhosis cohort. Furthermore, when analyzing the reference sample with the Ascend mass spectrometer, the number of proteins reported was higher (Table S19).
Figure 4. Acetonitrile precipitation results and the comparison with the in-house plasma data sets. (A) Bar plot representing the PFRs and proteins from acetonitrile plasma samples. (B, C) Bar plots comparing the identification of proteoforms (B) and proteins (C) in acetonitrile-treated plasma samples with data from two previous studies (aka Literature_TDP), categorized according to plasma protein abundance levels (high, medium, and low) reported in the Human Protein Atlas (HPA). (D) Venn diagram comparing proteoforms from precipitated plasma and results from our two previous studies (Literature_TDP). (E) Same comparison as (D) but at the protein level. Categorical abundance: high (>60 μg/mL), medium (10 ng/mL to 60 μg/mL), and low (<10 ng/mL). (F) Molecular weight distribution.
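The categorical abundance bins defined in the caption can be expressed as a small lookup. This is a minimal sketch, assuming concentrations are supplied in ng/mL and that both boundaries (10 ng/mL and 60 μg/mL) belong to the medium bin, as the caption wording suggests. The function name and the example concentrations are hypothetical.

```python
def abundance_category(conc_ng_per_ml):
    """Map a plasma concentration (ng/mL) to the bins used in the caption.

    high:   > 60 ug/mL (i.e., > 60,000 ng/mL)
    medium: 10 ng/mL to 60 ug/mL, boundaries assumed inclusive
    low:    < 10 ng/mL
    """
    if conc_ng_per_ml > 60_000:
        return "high"
    if conc_ng_per_ml >= 10:
        return "medium"
    return "low"


# Hypothetical concentrations in ng/mL, for illustration only.
examples = {"protein_a": 4.0e7, "protein_b": 850.0, "protein_c": 2.5}
categories = {name: abundance_category(c) for name, c in examples.items()}
print(categories)
```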
We conducted a qualitative comparison of all identified proteoforms following acetonitrile precipitation with our recently published data sets. Initially, we focused on proteoforms analyzed from plasma samples from our two latest top-down studies (Tables S20 and S21). Additionally, we compared the identified proteins from the proteoform analysis with the results from plasma proteins in the HPA to enhance our understanding of the system (Table S22).
We found substantial orthogonality between our data and the published set of proteoforms. Specifically, we identified over 3000 unique proteoforms in each set, and only 663 proteoforms overlapped (Figure 4D). At the protein level, we observed a similar trend: more than 190 proteins (77%) were unique to the ACN-precipitated samples and the remaining 23% were shared, while 85 proteins were uniquely reported in previous studies (Figure 4E). These differences between data sets could be explained by variations in mass distribution. As expected, the SPAP method enriched for lower-molecular-weight proteoforms (less than 20 kDa, Figure 4F). Additionally, we compared our findings to proteins reported in the HPA from plasma samples of healthy individuals. As anticipated, over 83% of the proteins we observed with at least one proteoform (222) were also identified in plasma by BUP (reported in the HPA); however, these proteins yielded more than 10 times as many proteoforms (3494 PFRs), pointing to the complexity of the plasma proteome at the proteoform level (Figure S8).
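The Venn-style comparison described here reduces to set arithmetic. The sketch below first shows the operation on toy identifiers and then checks two figures quoted in the text. It assumes that the 3746 proteoforms reported for the ACN-precipitated samples form the ACN side of the proteoform Venn diagram, which this section does not state explicitly. All identifiers and names are hypothetical.

```python
def venn_counts(set_a, set_b):
    """Return the sizes of the three regions of a two-set Venn diagram."""
    shared = set_a & set_b
    return {
        "only_a": len(set_a - shared),
        "shared": len(shared),
        "only_b": len(set_b - shared),
    }


# Toy proteoform identifiers, for illustration only.
acn_ids = {"PFR1", "PFR2", "PFR3", "PFR4"}
literature_ids = {"PFR3", "PFR4", "PFR5"}
toy = venn_counts(acn_ids, literature_ids)

# Counts quoted in the text.
acn_total = 3746      # proteoforms after ACN precipitation (assumed Venn input)
overlap = 663         # proteoforms shared with the previous studies
acn_unique = acn_total - overlap

hpa_proteins = 222    # observed proteins also reported in the HPA
hpa_pfrs = 3494       # proteoforms identified from those proteins
pfrs_per_protein = hpa_pfrs / hpa_proteins

print(toy)
print(acn_unique, round(pfrs_per_protein, 1))
```

Under that assumption, the ACN-unique count is consistent with "over 3000 unique proteoforms", and the ratio of proteoforms to HPA-reported proteins is consistent with "more than 10 times".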
We compared the reported abundance of proteins with proteoforms identified after ACN precipitation to those with proteoforms previously reported. We observed that the number of proteoforms from proteins reported as low-abundance proteins was markedly higher after ACN precipitation (317 vs 52) (Figure 4B and Table S23). This difference was also particularly noticeable in the number of low-abundance proteins, which was almost five times higher (109 vs 24, see Figure 4C). This trend is commonly observed when the most abundant proteins are depleted, significantly broadening the spectrum of potential functions of the identified proteins.
Limitations of This Precipitation-Based Plasma Proteoform Analysis
Our comparison involved distinct plasma samples from healthy controls and liver cirrhosis patients, contributing to biological heterogeneity and variability in proteoform detection. Initially, we used a reference plasma sample from a single healthy individual. While this approach simplifies the plasma population and provides a clear baseline for method optimization, it limits the generalizability of the findings. This limitation is particularly evident when analyzing clinical samples containing contaminants (e.g., hemoglobin), which can compromise the number of identifications. However, the use of a well-characterized reference sample enables robust evaluation of technical performance at the proteoform level and may reduce variability introduced by biological heterogeneity. We then applied this strategy to samples from 40 individuals, including 30 with liver cirrhosis.
We reported the total number of proteoforms and proteins identified after ACN precipitation, following a conservative FDR cutoff (1% at the proteoform and protein level), controlling the global FDR. Additionally, we cross-referenced the protein annotations with a public data set of over 4000 human plasma proteins (accessed on August 10, 2023) that provided additional context. This analysis was limited to low-molecular-weight (LMW) proteoforms (<40 kDa), a known constraint of top-down proteomics using an LC-MS workflow. As in previous TDP analyses of plasma, many truncation forms were identified, complicating the relationship with canonical forms (>40 kDa). Consistent trends in the fold changes of some truncated forms were observed under various conditions, as seen with APOC3 in this study.
However, functional interpretation of truncated proteoforms remains uncertain in the absence of orthogonal validation. Future studies should prioritize the development of proteoform-aware enrichment tools and experimental strategies to elucidate the biological relevance of these proteoforms.
Conclusions
This study demonstrates that the SPAP approach is a robust, scalable, and cost-effective top-down proteomics workflow for plasma analysis. By leveraging the ACN-soluble fraction, SPAP circumvents key barriers in plasma top-down proteomics, effectively depleting high-abundance proteins while enriching for low-molecular-weight, intact proteoforms amenable to LC-MS analysis. Applied to a clinically stratified liver cirrhosis cohort, SPAP enabled the identification of 3746 proteoforms from 255 unique proteins, including those with post-translational modifications and sequence variations relevant to cirrhosis biology. The integration of high-resolution proteoform analysis with a simplified sample preparation workflow addresses longstanding limitations in plasma-based top-down proteomics. Although liver cirrhosis was used here as a model system, the approach is broadly applicable to other diseases where proteoform-level resolution is critical. These findings position SPAP as a powerful tool for advancing translational proteomics and enhancing clinical research in diagnostics, disease stratification, and therapeutic monitoring.
Supplementary Material
Acknowledgments
This study was supported by the National Institute of General Medical Sciences of the National Institutes of Health under P41GM108569 (N.L.K.) and Transplant Innovation Endowment Grant (D.P.L.). P.P. received support from the Steven J. Stryker, MD, Gastrointestinal Surgery Research and Education Endowment. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Glossary
Abbreviations
- ACN: acetonitrile
- PFR: proteoform
- DEP: differentially expressed proteoform
- HDL: high-density lipoprotein
- LC-MS: liquid chromatography–mass spectrometry
- TDP: top-down proteomics
- BUP: bottom-up proteomics
- HPA: Human Protein Atlas
- PEPPI: passively eluting proteins from polyacrylamide gels as intact species for MS
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jasms.5c00289.
(Tables S01–S23) Proteoform and protein identifications from healthy and cirrhosis cohorts, abundance categorization, BUP identifications, tissue localization, cellular component analysis, apolipoprotein PFRs, differentially expressed proteoforms, DEP signature comparison, proteoform family comparisons, APOC3 proteoforms, accession numbers, plasma protein database, and protein abundances (XLSX)
(Figures S1–S8) precipitated plasma sample analysis, ACN precipitation reproducibility, proteoform–protein abundance association, cirrhosis cohort results, APOC3 truncated proteoform sequences, MS spectra of APOC3 proteoforms, and ACN precipitation comparison with in-house data sets (PDF)
#Department of Medicine, Division of Nephrology, University of Illinois College of Medicine, Chicago, Illinois 60612, United States
&Thermo Fisher Scientific, San Jose, California 95134, United States
The authors declare the following competing financial interest(s): N.L.K. and J.B.G. are involved in entrepreneurial activities in top-down proteomics and consultations for Thermo Fisher Scientific. R.D.M. is a current Thermo Fisher Scientific employee. A.S. is also affiliated with Lund University. The other authors have declared that no conflict of interest exists.
References
- Anderson N. L., Anderson N. G. The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell Proteomics. 2002;1(11):845–867. doi: 10.1074/mcp.R200007-MCP200.
- Muthusamy B., Hanumanthu G., Suresh S., Rekha B., Srinivas D., Karthick L., Vrushabendra B. M., Sharma S., Mishra G., Chatterjee P., Mangala K. S., Shivashankar H. N., Chandrika K. N., Deshpande N., Suresh M., Kannabiran N., Niranjan V., Nalli A., Prasad T. S., Arun K. S., Reddy R., Chandran S., Jadhav T., Julie D., Mahesh M., John S. L., Palvankar K., Sudhir D., Bala P., Rashmi N. S., Vishnupriya G., Dhar K., Reshma S., Chaerkady R., Gandhi T. K., Harsha H. C., Mohan S. S., Deshpande K. S., Sarker M., Pandey A. Plasma Proteome Database as a resource for proteomics research. Proteomics. 2005;5(13):3531–3536. doi: 10.1002/pmic.200401335.
- Keshishian H., Burgess M. W., Gillette M. A., Mertins P., Clauser K. R., Mani D. R., Kuhn E. W., Farrell L. A., Gerszten R. E., Carr S. A. Multiplexed, Quantitative Workflow for Sensitive Biomarker Discovery in Plasma Yields Novel Candidates for Early Myocardial Injury. Mol. Cell Proteomics. 2015;14(9):2375–2393. doi: 10.1074/mcp.M114.046813.
- Geyer P. E., Kulak N. A., Pichler G., Holdt L. M., Teupser D., Mann M. Plasma Proteome Profiling to Assess Human Health and Disease. Cell Syst. 2016;2(3):185–195. doi: 10.1016/j.cels.2016.02.015.
- Geyer P. E., Voytik E., Treit P. V., Doll S., Kleinhempel A., Niu L., Müller J. B., Buchholtz M. L., Bader J. M., Teupser D., Holdt L. M., Mann M. Plasma Proteome Profiling to detect and avoid sample-related biases in biomarker studies. EMBO Mol. Med. 2019;11(11):e10427. doi: 10.15252/emmm.201910427.
- Almeida N., Rodriguez J., Pla Parada I., Perez-Riverol Y., Woldmar N., Kim Y., Oskolas H., Betancourt L., Valdés J. G., Sahlin K. B., Pizzatti L., Szasz A. M., Kárpáti S., Appelqvist R., Malm J., Domont G. B., Nogueira F. C. S., Marko-Varga G., Sanchez A. Mapping the Melanoma Plasma Proteome (MPP) Using Single-Shot Proteomics Interfaced with the WiMT Database. Cancers. 2021;13(24):6224. doi: 10.3390/cancers13246224.
- Smith L. M., Kelleher N. L. Proteoform: a single term describing protein complexity. Nat. Methods. 2013;10(3):186–187. doi: 10.1038/nmeth.2369.
- Cheon D. H., Nam E. J., Park K. H., Woo S. J., Lee H. J., Kim H. C., Yang E. G., Lee C., Lee J. E. Comprehensive Analysis of Low-Molecular-Weight Human Plasma Proteome Using Top-Down Mass Spectrometry. J. Proteome Res. 2016;15(1):229–244. doi: 10.1021/acs.jproteome.5b00773.
- Tran J. C., Doucette A. A. Gel-eluted liquid fraction entrapment electrophoresis: an electrophoretic method for broad molecular weight range proteome separation. Anal. Chem. 2008;80(5):1568–1573. doi: 10.1021/ac702197w.
- Shen Y., Liu T., Tolić N., Petritis B. O., Zhao R., Moore R. J., Purvine S. O., Camp D. G., Smith R. D. Strategy for degradomic-peptidomic analysis of human blood plasma. J. Proteome Res. 2010;9(5):2339–2346. doi: 10.1021/pr901083m.
- Seckler H. D. S., Fornelli L., Mutharasan R. K., Thaxton C. S., Fellers R., Daviglus M., Sniderman A., Rader D., Kelleher N. L., Lloyd-Jones D. M., Compton P. D., Wilkins J. T. A Targeted, Differential Top-Down Proteomic Methodology for Comparison of ApoA-I Proteoforms in Individuals with High and Low HDL Efflux Capacity. J. Proteome Res. 2018;17(6):2156–2164. doi: 10.1021/acs.jproteome.8b00100.
- Melani R. D., Gerbasi V. R., Anderson L. C., Sikora J. W., Toby T. K., Hutton J. E., Butcher D. S., Negrao F., Seckler H. S., Srzentic K., Fornelli L., Camarillo J. M., LeDuc R. D., Cesnik A. J., Lundberg E., Greer J. B., Fellers R. T., Robey M. T., DeHart C. J., Forte E., Hendrickson C. L., Abbatiello S. E., Thomas P. M., Kokaji A. I., Levitsky J., Kelleher N. L. The Blood Proteoform Atlas: A reference map of proteoforms in human hematopoietic cells. Science. 2022;375(6579):411–418. doi: 10.1126/science.aaz5284.
- Kline J. T., Belford M. W., Boeser C. L., Huguet R., Fellers R. T., Greer J. B., Greer S. M., Horn D. M., Durbin K. R., Dunyach J. J., Ahsan N., Fornelli L. Orbitrap Mass Spectrometry and High-Field Asymmetric Waveform Ion Mobility Spectrometry (FAIMS) Enable the in-Depth Analysis of Human Serum Proteoforms. J. Proteome Res. 2023;22(11):3418–3426. doi: 10.1021/acs.jproteome.3c00488.
- Forte E., Sanders J. M., Pla I., Kanchustambham V. L., Hollas M. A. R., Huang C. F., Sanchez A., Peterson K. N., Melani R. D., Huang A., Polineni P., Doll J. M., Dietch Z., Kelleher N. L., Ladner D. P. Top-Down Proteomics Identifies Plasma Proteoform Signatures of Liver Cirrhosis Progression. Mol. Cell Proteomics. 2024;23:100876. doi: 10.1016/j.mcpro.2024.100876.
- Huang C. F., Hollas M. A., Sanchez A., Bhattacharya M., Ho G., Sundaresan A., Caldwell M. A., Zhao X., Benz R., Siddiqui A., Kelleher N. L. Deep Profiling of Plasma Proteoforms with Engineered Nanoparticles for Top-Down Proteomics. J. Proteome Res. 2024;23:4694. doi: 10.1021/acs.jproteome.4c00621.
- Zhu G., Sadeghi S. A., Mahmoudi M., Sun L. Deciphering nanoparticle protein coronas by capillary isoelectric focusing-mass spectrometry-based top-down proteomics. Chem. Commun. (Camb) 2024;60(81):11528–11531. doi: 10.1039/D4CC02666G.
- Sadeghi S. A., Ashkarran A. A., Wang Q., Zhu G., Mahmoudi M., Sun L. Mass Spectrometry-Based Top-Down Proteomics in Nanomedicine: Proteoform-Specific Measurement of Protein Corona. ACS Nano. 2024;18(38):26024–26036. doi: 10.1021/acsnano.4c04675.
- Prates J., Martins G., López-Fernández H., Lodeiro C., Capelo J. L., Santos H. M. Modulating the protein content of complex proteomes using acetonitrile. Talanta. 2018;182:333–339. doi: 10.1016/j.talanta.2018.01.057.
- Das L., Murthy V., Varma A. K. Comprehensive Analysis of Low Molecular Weight Serum Proteome Enrichment for Mass Spectrometric Studies. ACS Omega. 2020;5(44):28877–28888. doi: 10.1021/acsomega.0c04568.
- Kay R., Barton C., Ratcliffe L., Matharoo-Ball B., Brown P., Roberts J., Teale P., Creaser C. Enrichment of low molecular weight serum proteins using acetonitrile precipitation for mass spectrometry based proteomic analysis. Rapid Commun. Mass Spectrom. 2008;22(20):3255–3260. doi: 10.1002/rcm.3729.
- LeDuc R. D., Fellers R. T., Early B. P., Greer J. B., Shams D. P., Thomas P. M., Kelleher N. L. Accurate Estimation of Context-Dependent False Discovery Rates in Top-Down Proteomics. Mol. Cell Proteomics. 2019;18(4):796–805. doi: 10.1074/mcp.RA118.000993.
- Ntai I., Toby T. K., LeDuc R. D., Kelleher N. L. A Method for Label-Free, Differential Top-Down Proteomics. Methods Mol. Biol. 2016;1410:121–133. doi: 10.1007/978-1-4939-3524-6_8.
- Benjamini Y., Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society: Series B (Methodological) 1995;57(1):289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x.
- R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023. https://www.R-project.org/ (accessed.
- RStudio: Integrated Development for R; RStudio: Boston MA, USA, 2020. http://www.rstudio.com/ (accessed.
- Bates D., Mächler M., Bolker B., Walker S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software. 2015;67(1):1–48. doi: 10.18637/jss.v067.i01.
- Huang D. W., Sherman B. T., Lempicki R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009;4(1):44–57. doi: 10.1038/nprot.2008.211.
- Wu T., Hu E., Xu S., Chen M., Guo P., Dai Z., Feng T., Zhou L., Tang W., Zhan L., Fu X., Liu S., Bo X., Yu G. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation (Camb) 2021;2(3):100141. doi: 10.1016/j.xinn.2021.100141.
- Hulsen T., de Vlieg J., Alkema W. BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics. 2008;9:488. doi: 10.1186/1471-2164-9-488.
- Szczepanski A. P., Zhao Z., Sosnowski T., Goo Y. A., Bartom E. T., Wang L. ASXL3 bridges BRD4 to BAP1 complex and governs enhancer activity in small cell lung cancer. Genome Med. 2020;12(1):63. doi: 10.1186/s13073-020-00760-3.
- Kline J. T., Belford M. W., Huang J., Greer J. B., Bergen D., Fellers R. T., Greer S. M., Horn D. M., Zabrouskov V., Huguet R., Boeser C. L., Durbin K. R., Fornelli L. Improved Label-Free Quantification of Intact Proteoforms Using Field Asymmetric Ion Mobility Spectrometry. Anal. Chem. 2023;95(23):9090–9096. doi: 10.1021/acs.analchem.3c01534.
- Tu C., Rudnick P. A., Martinez M. Y., Cheek K. L., Stein S. E., Slebos R. J., Liebler D. C. Depletion of abundant plasma proteins and limitations of plasma proteomics. J. Proteome Res. 2010;9(10):4982–4991. doi: 10.1021/pr100646w.
- Uhlen M., Oksvold P., Fagerberg L., Lundberg E., Jonasson K., Forsberg M., Zwahlen M., Kampf C., Wester K., Hober S., Wernerus H., Björling L., Ponten F. Towards a knowledge-based Human Protein Atlas. Nat. Biotechnol. 2010;28(12):1248–1250. doi: 10.1038/nbt1210-1248.
- Pontén F., Jirström K., Uhlen M. The Human Protein Atlas--a tool for pathology. J. Pathol. 2008;216(4):387–393. doi: 10.1002/path.2440.
- Blein-Nicolas M., Zivy M. Thousand and one ways to quantify and compare protein abundances in label-free bottom-up proteomics. Biochim. Biophys. Acta. 2016;1864(8):883–895. doi: 10.1016/j.bbapap.2016.02.019.
- Prion S., Haerling K. A. Making Sense of Methods and Measurement: Spearman-Rho Ranked-Order Correlation Coefficient. Clinical Simulation in Nursing. 2014;10(10):535–536. doi: 10.1016/j.ecns.2014.07.005.
- Evans, J. D. Straightforward Statistics for the Behavioral Science; Brooks/Cole, 1997.
- Liotta L. A., Ferrari M., Petricoin E. Clinical proteomics: written in blood. Nature. 2003;425(6961):905. doi: 10.1038/425905a.
- Louis D. N., Perry A., Wesseling P., Brat D. J., Cree I. A., Figarella-Branger D., Hawkins C., Ng H. K., Pfister S. M., Reifenberger G., Soffietti R., von Deimling A., Ellison D. W. The 2021 WHO Classification of Tumors of the Central Nervous System: a summary. Neuro Oncol. 2021;23(8):1231–1251. doi: 10.1093/neuonc/noab106.
- Packard C. J., Pirillo A., Tsimikas S., Ference B. A., Catapano A. L. Exploring apolipoprotein C-III: pathophysiological and pharmacological relevance. Cardiovasc. Res. 2024;119(18):2843–2857. doi: 10.1093/cvr/cvad177.
- Jong M. C., Hofker M. H., Havekes L. M. Role of ApoCs in lipoprotein metabolism: functional differences between ApoC1, ApoC2, and ApoC3. Arterioscler Thromb Vasc Biol. 1999;19(3):472–484. doi: 10.1161/01.ATV.19.3.472.
- Arnesen T. Towards a functional understanding of protein N-terminal acetylation. PLoS Biol. 2011;9(5):e1001074. doi: 10.1371/journal.pbio.1001074.
- McTiernan N., Kjosas I., Arnesen T. Illuminating the impact of N-terminal acetylation: from protein to physiology. Nat. Commun. 2025;16(1):703. doi: 10.1038/s41467-025-55960-5.
- Pomacu M. M., Trasca M. D., Padureanu V., Buga A. M., Andrei A. M., Stanciulescu E. C., Banita I. M., Radulescu D., Pisoschi C. G. Interrelation of inflammation and oxidative stress in liver cirrhosis. Exp Ther Med. 2021;21(6):602. doi: 10.3892/etm.2021.10034.
- Krones E., Fickert P., Zitta S., Neunherz S., Artinger K., Reibnegger G., Durchschein F., Wagner D., Stojakovic T., Stadlbauer V., Fauler G., Stauber R., Zollner G., Kniepeiss D., Rosenkranz A. R. The chronic kidney disease epidemiology collaboration equation combining creatinine and cystatin C accurately assesses renal function in patients with cirrhosis. BMC Nephrol. 2015;16:196. doi: 10.1186/s12882-015-0188-0.
- Abrahamson M., Jonsdottir S., Olafsson I., Jensson O., Grubb A. Hereditary cystatin C amyloid angiopathy: identification of the disease-causing mutation and specific diagnosis by polymerase chain reaction based analysis. Hum. Genet. 1992;89(4):377–380. doi: 10.1007/BF00194306.
- Shuken S. R., McAlister G. C., Barshop W. D., Canterbury J. D., Bergen D., Huang J., Huguet R., Paulo J. A., Zabrouskov V., Gygi S. P., Yu Q. Deep Proteomic Compound Profiling with the Orbitrap Ascend Tribrid Mass Spectrometer Using Tandem Mass Tags and Real-Time Search. Anal. Chem. 2023;95(41):15180–15188. doi: 10.1021/acs.analchem.3c01701.
- Uhlén M., Karlsson M. J., Hober A., Svensson A. S., Scheffel J., Kotol D., Zhong W., Tebani A., Strandberg L., Edfors F., Sjöstedt E., Mulder J., Mardinoglu A., Berling A., Ekblad S., Dannemeyer M., Kanje S., Rockberg J., Lundqvist M., Malm M., Volk A. L., Nilsson P., Månberg A., Dodig-Crnkovic T., Pin E., Zwahlen M., Oksvold P., von Feilitzen K., Häussler R. S., Hong M. G., Lindskog C., Ponten F., Katona B., Vuu J., Lindström E., Nielsen J., Robinson J., Ayoglu B., Mahdessian D., Sullivan D., Thul P., Danielsson F., Stadler C., Lundberg E., Bergström G., Gummesson A., Voldborg B. G., Tegel H., Hober S., Forsström B., Schwenk J. M., Fagerberg L., Sivertsson Å. The human secretome. Sci. Signal. 2019;12(609):eaaz0274. doi: 10.1126/scisignal.aaz0274.