Author manuscript; available in PMC: 2025 Dec 14.
Published in final edited form as: J Proteome Res. 2025 Nov 10;24(12):6079–6090. doi: 10.1021/acs.jproteome.5c00626

Integrative Analysis of Fucosylated Tetra Glycoforms in Hepatocellular Carcinoma: A NanoLC-PRM-MS/MS and Machine Learning Approach

Chithravel Vadivalagan 1, Jie Zhang 2, Jianliang Dai 3, Suyu Liu 4, Amit G Singal 5, Kevin Bass 6, Neehar D Parikh 7, David M Lubman 8
PMCID: PMC12701774  NIHMSID: NIHMS2122328  PMID: 41212027

Abstract

Hepatocellular carcinoma (HCC), commonly associated with cirrhosis, is a major cause of cancer-related mortality due to its poor prognosis. Herein, we investigated fucosylated glycoforms of serum haptoglobin (Hp) as potential biomarkers for HCC of metabolic dysfunction-associated steatotic liver disease (MASLD) and alcohol-related liver disease (ALD) etiologies. We analyzed 119 patient samples, including 60 with cirrhosis and 59 with HCC. Isolated Hp protein was digested using trypsin and Glu-C, and site-specific N-glycans were quantified using PRM with LC-HCD-MS/MS. Differential analysis revealed significant variations in fucosylated tetra-antennary glycoforms at N241 (VVLHPN241YSQVD and VVLHPN241YSQVDIGLIK), particularly in distinguishing cirrhosis from HCC (P < 0.05). A combined analysis of identified tetra-antennary fucosylated markers, along with AFP, gender, and age, demonstrated improved AUC. Tetra-antennary glycoforms exhibited an AUC of 0.871 (95% CI: 0.80–0.93) when incorporated into the AFP + age + gender + marker panel, compared to 0.756 for AFP alone, with a sensitivity of 0.763 at a specificity of 0.80. Three-fold cross-validation was further used to assess the performance of the optimal biomarker panel. Thus, a combination of fucosylated tetra-antennary glycoforms may serve as important markers for distinguishing HCC from cirrhosis.

Keywords: PRM-HCD, machine learning, N-glycans, tetra-antennary, glycoforms, HCC, biomarker


INTRODUCTION

Hepatocellular carcinoma (HCC) is the most common primary hepatic malignancy and the fourth most common worldwide cause of cancer-related mortality. While most HCC in Western countries occurs in the setting of cirrhosis, the epidemiology of cirrhosis has shifted from viral hepatitis to metabolic dysfunction associated steatotic liver disease (MASLD) and alcohol associated liver disease (ALD).1,2 Surveillance for HCC in high-risk patients has been associated with improved detection of HCC at an early stage, when treatment is most effective in improving survival.3,4 Due to the high risk of HCC in patients with cirrhosis, international guidelines recommend routine screening for all patients with cirrhosis, regardless of liver disease cause.5 The continued high proportion of patients detected at late stages underscores the need for improved risk stratification, surveillance strategies, and early detection tools.1,6,7

The diagnosis of HCC presents significant challenges due to multiple factors. Early stage tumors are often asymptomatic, making detection difficult outside of screening programs. Serum AFP is the most widely used biomarker for HCC in clinical practice, but its sensitivity when combined with ultrasound is only about 60%, missing over one-third of early stage cases.6 AFP-L3, a variant with high affinity for Lens culinaris agglutinin, has been FDA-approved for HCC diagnosis. However, its sensitivity remains limited, with an overall sensitivity of around 50–60%.8,9 An improved biomarker with higher sensitivity and specificity is essential for the early stage diagnosis of HCC.10

Protein glycosylation changes could serve as predictive, therapeutic, and diagnostic markers for HCC.11 Alterations in glycan structures may be associated with specific proteins or peptides.12,13 Several investigators have reported that structural changes in the N-glycosylation of haptoglobin at specific peptide positions are potential biomarkers for the early detection of HCC.14–19 Fucosylated glycoforms have emerged as potential biomarkers for the early detection of HCC.20 Their distinct glycan structures enable the differentiation of HCC from other liver diseases, such as cirrhosis, thus improving diagnostic accuracy and facilitating timely intervention.21,22 PRM-based target acquisition in MS/MS offers a significant approach for identifying subtle but important glycosylation and fucosylation changes in the same peptide.14,23–25

Based on its effective performance, PRM is particularly useful for analyzing the well-established glycosylation site, N241, in Hp. According to previous studies, patients with HCC had a noticeably higher level of fucosylated glycan structures in their serum Hp than those with cirrhosis.26–28 Our earlier research showed that samples associated with MASLD show notable differences in fucosylated glycan structures. Prior work revealed unique structural manifestations, such as mono- or bifucosylation and tri- or tetra-antennary structures. These results emphasize the need for more precise structure predictions that take into consideration a variety of etiologies. The use of PRM was standardized in our earlier work and showed potential for tetra-antennary glycan detection. To provide greater insight for biomarker investigations, validation with larger cohorts and targeted capture of site-specific N241 glycans in serum Hp are required.

In the present study, our focus is to enhance biomarker discovery and optimize the selection of tetra-antennary glycan forms for differentiating HCC from cirrhosis. To improve the accuracy of glycan form predictions, we employed a double digestion strategy (Trypsin/Glu-C) to cleave y+ ions at the N241 site and utilized a quantitative workflow (NanoLC-PRM-MS/MS) to detect changes in single amino acid variations at this site,29,30 specifically targeting N241-glycan forms in serum Hp. These glycan forms are being investigated as potential biomarkers for early stage HCC detection. The study involved a cohort of 59 patients with HCC (63.8% early stage per Milan Criteria) and 60 patients with cirrhosis, who were receiving medical care across different sites in the United States. We employed Skyline quantification to assess the differential expression of site-specific glycoform heterogeneity within a 40-target window.

Given the limitations of AFP and the emerging promise of glycoproteomics, we sought to determine whether a novel diagnostic panel targeting malignancy-associated tetra-antennary glycoforms could outperform AFP for the detection of early stage and MASLD-associated HCC. Our objective was to rigorously assess the diagnostic performance of this glycoform panel, both independently and in combination with AFP, while accounting for age as a potential confounder. Altered glycans in the N241 site glycopeptides, along with age and AFP, were incorporated into a machine-learning algorithm to develop such early stage biomarkers for HCC.

MATERIALS AND METHODS

Study Design

The expression of site-specific N241-glycopeptides of Hp was measured in serum samples from patients with HCC and cirrhosis utilizing a NanoLC-PRM-MS/MS methodology, allowing for target-based quantification of glycopeptide heterogeneity between disease groups. The diagnosis of cirrhosis and HCC was based upon guidelines from the American Association for the Study of Liver Diseases (AASLD).31 The serum samples were obtained from patients followed at UT Southwestern Medical Center (Dallas, TX) and the University of Michigan (Ann Arbor, MI) health systems. Hp was isolated from 20 μL of patient serum using an HPLC-based anti-Hp column.15 To enrich glycopeptides, Hp was double digested with trypsin/Glu-C, targeting y+ ions at site N241, followed by a 3 kDa filter to recover glycopeptides. Glycopeptides were then evaluated on an Orbitrap Fusion Lumos Tribrid MS via a nanoLC-DDA/PRM-MS/MS process.

Data interpretation and quantification were performed with Byos for target prediction and Skyline for quantification, integrating findings from 40 key PRM targets and examining probable positive ion isolations (Table S1). The study enhanced a machine learning approach for precise target prediction and quantification by integrating traditional quantification methods, thereby boosting its clinical relevance and reliability.32,33 This method enabled the detection of site-specific N241-glycopeptides to find glycopeptide biomarkers that distinguish distinct etiologies of HCC and cirrhosis.

Materials and Reagents

MS grade Trypsin and Glu-C were obtained from Promega (Madison, USA) and Thermo Fisher Scientific (WI, USA), respectively. Amicon Ultra 0.5 mL 3 kDa centrifugal filters were purchased from Millipore-Sigma-Aldrich (Milwaukee, USA). The C18 NanoLC-Trap column (nanoviper) was sourced from ThermoFisher Scientific (WI, USA). All other chemicals and reagents used in this study were purchased from Sigma-Aldrich (MO, USA).

Patient Serum Samples

The patient cohort was sourced from the University of Michigan (UMich), Ann Arbor, MI and UT Southwestern Medical Center (UTSW), Dallas, TX. It consisted of individuals with cirrhosis (n = 60) and HCC (n = 59), representing multiple etiologies, but categorized primarily into two groups: MASLD and ALD. All control patients with cirrhosis had no HCC on imaging obtained within 3 months of sample collection. Samples were processed per EDRN protocols, and then aliquoted and stored at −80 °C without any prior freeze–thaw cycles. Institutional Review Board (IRB) approval was obtained at both UTSW and UMich, and the samples were transferred to UMich under a material transfer agreement between the institutions.

The study protocol was reviewed and approved by the Institutional Review Boards of UT Southwestern Medical Center (Protocol ID: STU092013–010) and the University of Michigan (IRB: HUM00112432). All participants provided written informed consent prior to participation, and all collected data and samples were anonymized immediately after collection. Participation in the study was entirely voluntary, and individuals could withdraw at any time without penalty. The samples used in the current study were already archived and no samples were collected for the current work.

A detailed overview of the clinical characteristics can be found in Table 1. All HCC patients included in this study had underlying cirrhosis, the target population for HCC surveillance. This study aimed to enhance early diagnosis through screening, supported by evidence confirming that glycosylation patterns of Hp, particularly in fucosylation and sialylation, differ significantly between cirrhosis and HCC.

Table 1.

Clinical Data and Sampling Characteristics of Study Participants^a

| variable | level | overall | cirrhosis | HCC | p-value |
|---|---|---|---|---|---|
| n | | 119 | 60 | 59 | |
| race | asian | 1 (0.8) | 1 (1.7) | 0 (0.0) | 0.619 |
| | black | 1 (0.8) | 0 (0.0) | 1 (1.7) | |
| | white | 116 (97.5) | 59 (98.3) | 57 (96.6) | |
| | other | 1 (0.8) | 0 (0.0) | 1 (1.7) | |
| ethnicity | hispanic | 60 (50.4) | 30 (50.0) | 30 (50.8) | 1.000 |
| | non-hispanic | 59 (49.6) | 30 (50.0) | 29 (49.2) | |
| gender | female | 47 (39.5) | 28 (46.7) | 19 (32.2) | 0.134 |
| | male | 72 (60.5) | 32 (53.3) | 40 (67.8) | |
| age | median [IQR] | 62 [54, 67] | 59 [47, 65] | 65 [61, 68] | <0.001 |
| etiology | MASLD | 43 (36.1) | 20 (33.3) | 23 (39.0) | 0.570 |
| | other | 76 (63.9) | 40 (66.7) | 36 (61.0) | |
| stage | A | 29 (50.0) | 0 | 29 (50.0) | 1.000 |
| | B | 8 (13.8) | 0 | 8 (13.8) | |
| | C | 16 (27.6) | 0 | 16 (27.6) | |
| | D | 5 (8.6) | 0 | 5 (8.6) | |
| INR | median [IQR] | 1.20 [1.10, 1.30] | 1.20 [1.10, 1.30] | 1.20 [1.10, 1.30] | 0.406 |
| sodium | median [IQR] | 138 [136, 140] | 139 [136, 140] | 137.50 [136, 140] | 0.397 |
| creatinine | median [IQR] | 0.83 [0.68, 1.01] | 0.84 [0.70, 1.01] | 0.81 [0.66, 1.00] | 0.649 |
| albumin | median [IQR] | 3.50 [3.00, 3.95] | 3.45 [2.98, 4.00] | 3.60 [3.15, 3.90] | 0.622 |
| bilirubin | median [IQR] | 1.30 [0.80, 2.25] | 1.25 [0.80, 2.40] | 1.30 [0.80, 2.15] | 0.962 |
| AFP (ng/mL) | median [IQR] | 4.00 [2.62, 9.97] | 3.10 [2.30, 4.90] | 7.00 [3.90, 217.90] | <0.001 |

^a Categorical variables described by count (%) and continuous variables described by median [IQR]. P-values were obtained from Wilcoxon rank-sum tests for continuous variables and from Fisher’s exact tests for categorical variables.

Purification of Hp

The Hp purification protocol utilized in this study followed a previously published method. Serum Hp was extracted from 20 μL of serum obtained from patients with HCC and cirrhosis. The purity of Hp for each sample was determined through SDS-PAGE and silver staining, and the amount of Hp recovered was estimated to be approximately 3 μg per patient. The Hp samples then underwent double digestion and were subsequently analyzed using NanoLC-PRM-MS/MS. Our prior work14 contains comprehensive details of this HPLC technique, which used a Beckman Coulter HPLC system (Fullerton, CA) with a UV detector.

Enzymatic Digestion and Peptide Enrichment

We targeted site-specific N241 “y+” ions and employed a double enzymatic digestion strategy using Trypsin and Glu-C, followed by glycopeptide enrichment, based on a slight modification of methods outlined in previous studies.14,15 Briefly, purified Hp (~3 μg) was dissolved in 50 mM ABC buffer. To reduce disulfide bonds, 1.0 μL of 200 mM dithiothreitol (DTT) was added, followed by alkylation with 1.0 μL of 500 mM iodoacetamide (IAA). Trypsin (1.5 μL of 400 μg/μL) was then added to the sample, which was incubated at 37 °C overnight. The reaction was subsequently terminated by heating at 90 °C for 15 min. Afterward, 1.6 μL of 400 μg/μL Glu-C was introduced, and the sample was incubated at 37 °C overnight, with the reaction again terminated by heating at 90 °C for 15 min.

The resulting digested glycopeptide samples were resuspended in ABC buffer, and glycopeptides were enriched using a 3 kDa filter as previously described. Additionally, the study tested alternative enrichment techniques, including HILIC TopTips and TiO2, though the tetra-antennary glycans were primarily identified using the 3 kDa enrichment method. The enriched glycopeptides were vacuum-dried and subsequently reconstituted in 20 μL of a solvent mixture consisting of H2O/acetonitrile 98:2 (v/v) with 0.1% formic acid (FA) before being subjected to NanoLC-PRM-MS/MS analysis.

NanoLC-HCD-DDA/PRM-MS/MS

The workflow is based on the method outlined in our previous study,15 with slight modifications. Nanobased LC separation was performed using the UltiMate 3000 RSLCnano System (Thermo Fisher Scientific, Germany), coupled with NanoLC-HCD-DDA/PRM-MS/MS on an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific, USA). Glycopeptides were separated on a trap column (Acclaim PepMap RSLC: 75 μm × 50 cm, nanoViper, C18, 2 μm, 100 Å, Thermo Fisher) maintained at 46 °C, with gradient elution at a flow rate of 300 nL/min. The study identified targeted precursors using a DDA survey scan with a PRM window setting, focusing on the key glycopeptides N241, N184, and N207, which are major glycopeptides from Hp. Forty precursors were identified based on the DDA run and previous reports from the lab.15

The MS was set to DDA mode with an MS1 scan range of m/z 150–2000, and MS1 data were acquired in the Orbitrap (90 ms injection time), followed by HCD-MS/MS acquisition with stepped collision energies of 19%, 25%, 30% and 40%. In PRM mode, as in DDA detection, two distinct collision energies were used to fragment the glycopeptides. The PRM analysis was conducted using 40 predefined precursor ions (Table S1), as outlined above. Target precursor ions from Hp included potential glycopeptides with mono-, bi-, tri-, and tetra-fucosylated glycans at sites N184, N207, and N241, along with those carrying sialylated bi-, tri-, and tetra-antennary patterns.34 The MS raw data obtained from the Orbitrap through the NanoLC-HCD-PRM-MS/MS workflow have been deposited in the Japan Proteome STandard Repository (jPOST; https://repository.jpostdb.org/). The resulting accession numbers are PXD064014 for ProteomeXchange and JPST003811 for jPOST.35

Interpreting Relative Data Quantification

The human Hp sequence (P00738), comprising 406 amino acids, was obtained from UniProt. DDA spectra were analyzed using Byonic software (version 5.2.31) according to the parameters outlined by Lin et al.16 Glycopeptides were manually inspected for the presence of characteristic oxonium ions, including m/z values: 204.09 (HexNAc), 292.10 (NeuAc), 274.09 (NeuAc-H2O), 366.14 (HexHexNAc), 512.20 (HexHexNAcFuc), and 657.23 (HexNAcHexNeuAc).
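The oxonium ion values listed above can be reproduced from monosaccharide residue masses. The following is a minimal sketch, not part of the published workflow: the residue masses are standard monoisotopic values rather than numbers taken from the manuscript, and the function name `oxonium_mz` is hypothetical. It assumes each diagnostic ion is a singly protonated glycan fragment.

```python
# Standard monoisotopic residue masses (Da); assumed reference values,
# not reported in the manuscript.
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
}
PROTON = 1.007276  # proton mass (Da)
WATER = 18.010565  # H2O mass (Da)


def oxonium_mz(composition, water_losses=0):
    """Return the m/z of a singly protonated glycan fragment (oxonium ion)."""
    mass = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return mass + PROTON - water_losses * WATER


# Keys are the m/z values reported in the text; values are computed here.
reported = {
    204.09: oxonium_mz({"HexNAc": 1}),
    292.10: oxonium_mz({"NeuAc": 1}),
    274.09: oxonium_mz({"NeuAc": 1}, water_losses=1),
    366.14: oxonium_mz({"Hex": 1, "HexNAc": 1}),
    512.20: oxonium_mz({"Hex": 1, "HexNAc": 1, "Fuc": 1}),
    657.23: oxonium_mz({"Hex": 1, "HexNAc": 1, "NeuAc": 1}),
}

for expected, computed in reported.items():
    print(f"reported {expected:.2f}  computed {computed:.4f}")
```

Each computed value agrees with the reported m/z to within 0.01, which is the kind of check applied during manual inspection of the spectra.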

These ions correspond to specific glycan structures, confirming the composition of the glycopeptides. Quantification of glycopeptides at the target site N241 was performed using the Skyline (64 bit) 24.1.0.199 platform, focusing on 21 selected precursor ions (Table S1) from PRM results. For DDA target prediction analysis, oxonium ions (listed above) and other reliable b/y ions were used for identification, with y+ ions employed for quantification. Skyline parameters, including peptide, transition, and library settings, were configured as previously described by Yu et al.15

Statistical Analysis

Categorical variables were presented as frequencies and percentages, while continuous variables were summarized using medians and interquartile ranges (IQRs). Group comparisons for categorical variables were conducted using Fisher’s exact test, and continuous variables were compared between case and control groups using the Wilcoxon rank-sum test. Receiver operating characteristic (ROC) analysis was performed for individual biomarkers, as well as the marker combinations. The area under the ROC curve (AUC) was estimated along with 95% confidence intervals (CIs) calculated from 1000 bootstrap replicates. The multivariate logistic regression model was used to develop optimal marker combinations, and their performance was compared to a model using AFP alone.
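As an illustration of the AUC estimate and its bootstrap confidence interval, the sketch below computes the AUC as the probability that a case scores higher than a control and derives a percentile interval from 1000 resamples. The analysis in this study was run in R; this Python version is only a sketch under stated assumptions. The percentile method and separate resampling of cases and controls are assumptions, since the text does not specify the bootstrap variant, and the marker values are synthetic, not study data.

```python
import random


def auc(scores_pos, scores_neg):
    """AUC as P(case score > control score); ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling cases and controls separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        pos = rng.choices(scores_pos, k=len(scores_pos))
        neg = rng.choices(scores_neg, k=len(scores_neg))
        stats.append(auc(pos, neg))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic marker values (NOT study data): cases are shifted upward on average.
rng = random.Random(42)
hcc = [rng.gauss(1.0, 1.0) for _ in range(59)]
cirrhosis = [rng.gauss(0.0, 1.0) for _ in range(60)]

point = auc(hcc, cirrhosis)
low, high = bootstrap_auc_ci(hcc, cirrhosis)
print(f"AUC = {point:.3f} (95% CI: {low:.3f}, {high:.3f})")
```

The pairwise definition used here is equivalent to the Wilcoxon rank-sum (Mann-Whitney) statistic scaled to the range 0 to 1.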

To identify optimal biomarker panels, we used a machine learning based approach. Logistic regression models were applied to evaluate and combine candidate biomarkers. An exhaustive search was conducted across all possible combinations of up to five markers, and each combination was ranked by its estimated area under the ROC curve (AUC). This automated, data-driven approach systematically evaluated every candidate panel, thereby reducing subjective bias and avoiding reliance on manual or hypothesis-driven selection. The most informative marker sets were prioritized for further interpretation and comparison with AFP.
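The exhaustive panel search described above can be sketched as follows. This is an illustrative Python outline rather than the R code used in the study: the gradient-descent fit stands in for a standard logistic regression routine, the marker names and values are synthetic placeholders, and the helper names (`fit_logistic`, `rank_panels`) are hypothetical. Panels are ranked here by in-sample AUC, which is optimistic and is why a separate resampling step is needed to judge robustness.

```python
import itertools
import math
import random


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Plain gradient-descent logistic regression; returns [intercept, *coefs]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            z = w[0] + sum(c * v for c, v in zip(w[1:], row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


def auc(scores, y):
    """Probability that a random case outscores a random control."""
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_panels(data, y, max_size=5):
    """Fit one model per marker subset and rank subsets by in-sample AUC."""
    results = []
    names = sorted(data)
    for k in range(1, min(max_size, len(names)) + 1):
        for panel in itertools.combinations(names, k):
            X = [[data[m][i] for m in panel] for i in range(len(y))]
            w = fit_logistic(X, y)
            scores = [w[0] + sum(c * v for c, v in zip(w[1:], row)) for row in X]
            results.append((auc(scores, y), panel))
    return sorted(results, reverse=True)


# Synthetic, standardized marker values (NOT study data); 1 = HCC, 0 = cirrhosis.
rng = random.Random(1)
y = [1] * 30 + [0] * 30
data = {
    "marker_A": [rng.gauss(1.0 * label, 1.0) for label in y],
    "marker_B": [rng.gauss(0.5 * label, 1.0) for label in y],
    "AFP": [rng.gauss(0.8 * label, 1.0) for label in y],
    "age": [rng.gauss(0.3 * label, 1.0) for label in y],
}
ranked = rank_panels(data, y)
for score, panel in ranked[:3]:
    print(f"{score:.3f}  {' + '.join(panel)}")
```

With four candidate markers the search covers 15 panels; the number of panels grows quickly with the candidate list, so capping the panel size at five keeps the enumeration tractable.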

Because this was a biomarker discovery study and no independent validation cohort was available, we used repeated 3-fold cross-validation to assess robustness. In each iteration, the data set was randomly partitioned into three approximately equal subsets: two subsets (two-thirds of the data) were used for model training, and the remaining subset (one-third) was used for testing. This process was repeated so that each subset served once as the testing set. Predictive performance was summarized by AUC with 95% confidence intervals estimated from 1000 bootstrap replicates. All p-values were reported as raw values without adjustment for multiple comparisons. A p-value <0.05 was considered statistically significant. Statistical analyses were performed using R software (version 4.4.1; R Foundation for Statistical Computing, Vienna, Austria).
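The partitioning step of 3-fold cross-validation can be sketched as below. This is a minimal Python illustration, not the R code used in the study: the function name `three_fold_splits` is hypothetical, the split is a simple random one without stratification by diagnosis (an assumption, since the text does not say), and repetition would be obtained by calling it with different seeds.

```python
import random


def three_fold_splits(n_samples, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold is the test set exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::3] for i in range(3)]  # three approximately equal subsets
    for k in range(3):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


splits = list(three_fold_splits(119))  # 119 = cohort size stated in the text
for train, test in splits:
    print(len(train), len(test))
```

For 119 samples the test folds hold 40, 40, and 39 patients, so every patient is used for testing once and for training twice.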

RESULTS AND DISCUSSION

Patient Sampling and Characteristics

A total of 119 patients were included in the study, comprising 60 individuals with cirrhosis and 59 with HCC. There were 37 patients with BCLC A and B (early/intermediate stage) and 21 with late-stage HCC (Stages C and D). Based on disease etiology, patients were classified into two primary groups: MASLD and non-MASLD cases. A comparative evaluation of glycopeptide expression indicated significant differences between cirrhosis and HCC patients, suggesting possible biomarker candidates for early HCC identification. Table 1 summarizes the study cohort’s clinical characteristics, including demographics, liver disease etiology, and biochemical markers. The median age at diagnosis was 59 years (IQR: 47–65) for the cirrhosis group and 65 years (IQR: 60.5–68) for the HCC group (p < 0.001).

The majority of patients were Child class A with similar proportions of Child Pugh class A and B between cases and controls. We did not observe any significant differences in other clinical parameters, including median INR, sodium, creatinine, albumin, and bilirubin levels, between cirrhosis and HCC patients. However, AFP levels were significantly higher in HCC patients, with a median AFP of 3.10 ng/mL in cirrhosis patients and 7.00 ng/mL in HCC patients (p < 0.001).

Quantification of Hp N241 Glycopeptide as a Potential Clinical Biomarker Using nanoLC-HCD-DDA/PRM-MS/MS

This study quantifies the differential expression of N241 glycopeptides to assess their potential for early diagnosis in multiple etiologies of cirrhosis and HCC. Using NanoLC-PRM-MS/MS, we targeted the quantification of 20 glycopeptides identified in our previous studies14,15 and in the current DDA acquisition.

Forty precursors for site-specific Hp glycopeptides were selected15 and subsequently used for PRM quantification. Among them, 20 targets were discovered at the N241 glycopeptide site with different glycoforms. Notably, ten glycoforms typically included completely sialylated and fucosylated glycan motifs, indicating considerable differences between cirrhosis and HCC. Serum Hp from patients with HCC and cirrhosis exhibits four asparagine (N)-linked glycosylation sites (N184, N207 and/or N211, and N241) on its β-chain, as reported by Zhu et al.14

In accordance with previous studies, enzymatic double digestion using Trypsin and Glu-C specifically targets cleavage at the y+/b+ site of N241, facilitating improved site-specific glycopeptide analysis. The glycopeptide enrichment stage is crucial for isolating bi-, tri-, and tetra-antennary glycans, as this study primarily focuses on N241 tetra-antennary glycoforms. Three distinct enrichment strategies were evaluated (3 kDa filter, HILIC TopTips, and Titanium dioxide). Our results demonstrate that the 3 kDa filter was the most effective in capturing tetra-antennary glycoforms.

In recent decades, mass spectrometry has been widely used because of its accuracy in identifying structural differences and proteome changes.36–39 Targeted acquisition offers significant advantages for site-specific quantitative analysis over DDA-MS, which scans the whole mass range.15 The DDA analysis also revealed many overlapping masses. However, the focused PRM acquisition provided greater sensitivity by assigning more time to target ions while concurrently monitoring precursors, fragment ions, and MS2 product ion spectra of precursors.40 Based on the accuracy of PRM in detecting target precursors, this study examined site-specific glycopeptides at N241 in Hp obtained from patients’ sera.

Representative MS/MS spectra of the glycopeptide VVLHPN241YSQVDIGLIK are presented in Figure 1, highlighting oxonium ions (e.g., HexNAc+ and NeuAc+), peptide backbone fragments (b+/y+ ions), and glycan-specific fragments. Identification of all N-glycopeptides was performed using Byonic, with assignments confirmed through MS/MS spectra analysis, manual verification of retention times, m/z values, peptide fragmentation patterns, and glycan-specific ions. Notably, glycosylation at N241 demonstrated a higher degree of α2,3-sialylation compared to N184 and N207/211, likely due to its increased hydrophilicity.41 Given the analytical performance in Byos, we selected N241 site-specific glycoforms in Hp from patients with HCC and cirrhosis for quantitative analysis. NanoLC-PRM-MS/MS spectra were validated using Skyline, targeting 20 m/z derived from DDA analysis and previous studies, as described above.

Figure 1.

N-glycopeptides of VVLHPN241YSQVDIGLIK were analyzed using representative MS/MS spectra containing the glycan A4G4F2S4, a complex glycan structure with multiple hexoses and fucose residues. The fragmentation patterns revealed detailed information about both the peptide backbone and the attached glycan. Specifically, the a, b, and y ions from the peptide backbone, which reflect cleavage at various peptide bonds, were thoroughly examined, providing insights into the sequence-specific fragmentation behavior. Additionally, oxonium ions, characteristic of the glycan fraction, were observed and mapped to aid in the structural elucidation of the glycan composition. The glycosidic fragment ions originating from the glycan A4G4F2S4 were also carefully analyzed, offering a deeper understanding of the glycosidic linkages and their stability under the MS/MS conditions.

Characterization of 241N-Glycopeptides

The 241N-glycopeptide was comprehensively characterized using NanoLC-PRM-MS/MS. This analysis targeted 40 specific m/z ions incorporating both glycan and peptide fragments within a single MS/MS spectrum. A more recent innovation is Stepped Collision Energy (SCE) HCD-MS/MS, which creates complementary glycan and peptide fragments by utilizing various collision energies in HCD-MS/MS.12,23 Further validation was conducted using Byos software with a mass tolerance of 10 ppm. A targeted approach was applied to 20 m/z ions corresponding to the N241 glycosylation site, quantified using Skyline software. The representative glycoforms, their corresponding y/b+ ions, m/z values, and ion charge numbers are illustrated in Figure S1.
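To make the 10 ppm tolerance concrete, the sketch below converts a neutral mass to the m/z of a multiply protonated ion and tests whether an observed m/z falls within tolerance of the theoretical value. The function names and the neutral mass of 5000 Da are hypothetical and chosen only for illustration; they are not entries from Table S1 or settings exported from Byos.

```python
PROTON = 1.007276  # proton mass (Da)


def mz_from_mass(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON) / charge


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True when the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical neutral mass, for illustration only (not from Table S1).
neutral = 5000.0
for z in (3, 4):
    print(z, round(mz_from_mass(neutral, z), 4))

theoretical = mz_from_mass(neutral, 4)
print(within_tolerance(1251.0170, theoretical))  # about 7.8 ppm off
print(within_tolerance(1251.0300, theoretical))  # about 18 ppm off
```

At m/z near 1250, a 10 ppm window corresponds to roughly ±0.0125 m/z units, which indicates how tightly a precursor must match to be accepted.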

The study highlights that fucosylated glycan structures play a significant role in differentiating HCC from cirrhosis. Mono-, bi-, tri-, and tetra-fucosylated glycans were identified across bi-, tri-, and tetra-antennary structures, with tetra-antennary glycoforms showing the most significant differences. Tri- and tetra-fucosylated glycans were higher in early stage HCC than in controls, but mono-, tri- and tetra-fucosylated glycans were higher in late-stage HCC. Oh et al.34 reported that FUT8 mediates core fucosylation (Fucα1–6GlcNAc), while FUT3, FUT4, and FUT7 regulate α1–fucosylation. Aberrant fucosylation of Hp has been identified as a potential biomarker for liver, gastric, and lung cancers, as well as alcoholic liver disease.

Furthermore, we found that early stage HCC had considerably greater tetra-sialylation than controls.42 A representative MS/MS spectrum of the N241_A4G4F2S4 glycopeptide is presented in Figure 1, demonstrating a complex glycan structure with multiple hexoses and fucose residues. Additionally, oxonium ion fragments were detected, confirming glycopeptide identification and aiding in glycan and peptide characterization. The major oxonium ions observed included m/z 512.20 (HexHexNAcFuc) and m/z 803.30 (FucHexNAcHexNeuAc), as depicted in Figure 2. The spectrum confirms glycan assignment accuracy with annotated fucose ions and high concordance in mass error plots.43 The identified glycoforms, represented by the peptides VVLHPN241YSQVD and VVLHPN241YSQVDIGLIK, support the presence of fully sialylated and fucosylated glycans.

Figure 2.

N-glycopeptides of the sequence VVLHPN241YSQVDIGLIK, featuring the glycan A4G4F2S4, were analyzed through representative MS/MS spectra. The spectra thoroughly characterized the b/y ions and oxonium ions from the peptide backbone, as well as the glycosidic fragments, offering a comprehensive understanding of the fragmentation patterns.

N241 Site Specific Fuco- and Sialylated Glycoforms Heterogeneity in Multi Etiological HCC and Cirrhosis

We performed a comprehensive identification of fuco- and sialylated glycoforms in serum Hp with high confidence using Byos. In DDA of total serum Hp, we detected 1149 (Table S3) glycoform hits associated with three major site-specific glycopeptides (N184, N207/211, and N241). PRM acquisition identified 34 unique forward peptides, with 3096 (Table S4) spectra matched to forward peptides. The estimated spectrum-level false discovery rate (FDR) for true protein identifications was 0.1%. Modification analysis revealed both fixed and variable modifications, including carbamidomethylation (+57.021464 @ C, fixed), oxidation (+15.994915 @ M/W), dethiomethylation (−48.003371 @ M), deamidation (+0.984016 @ N/Q), glutamine-to-pyroglutamate conversion (−17.026549 @ N-terminal Q), glutamate-to-pyroglutamate conversion (−18.010565 @ N-terminal E), ammonia loss (−17.026549 @ N), and dioxidation (+31.989829 @ W).

In HCD spectra of N-linked glycopeptides, carbamidomethylation of Met residues results in the formation of a sulfonium ether, which carries a fixed positive charge that removes the typical Y1 fragment and instead produces a Y1–48 Da ion that can be mistaken for Met sulfoxide side chain loss.41 The identified glycoforms predominantly contained bi-, tri-, and tetra-antennary fuco- and sialylated N-glycans. Charge states of +2, +3, and +4 were observed, with most glycopeptides from the N241 site predominantly exhibiting charge states of +3 and +4. For targeted PRM acquisition, precursor ions with charge states of +3 and +4 were selected. Interestingly, the study revealed that N241 glycopeptides were significantly dominated by +4 ions, which were subsequently used for the final quantitative analysis. The ten finalized ions were further subjected to differential heterogeneity assessment and machine learning analysis.

The major glycoforms observed included the bi-, tri-, and tetra-antennary glycans (A2G2F1S2, A3G3F3S2, A3G3F2S2, A3G3F4S2, A4G4F3S2, A4G4F1S3, A4G4F3S3, A4G4F2S4, A3G3F3S2.1, and A3G3F3S2.2) (Table S2). These structures exhibit differential expression patterns between HCC and cirrhosis. Among the ten glycoforms analyzed, N241_A3G3F4S2, A4G4F2S4, A3G3F3S2.1, and A3G3F3S2.2 demonstrated elevated levels in HCC compared to cirrhosis. However, several of these site-specific glycopeptides did not reach statistical significance, and their detection was variable among HCC patients; nevertheless, they were retained for exploratory analyses. Notably, in cirrhosis associated with both MASLD and non-MASLD etiologies, the diagnostic performance of A4G4F1S3 and A4G4F3S3 tetra-antennary glycoforms at site N241 was significantly elevated compared to all early- and late-stage HCC cases. This difference was statistically significant (p < 0.05). The details of site-specific glycoform selection follow the approach described by Oh et al.34

Relative Quantification of N241 Glycoforms in Cirrhosis and HCC

Skyline-calibrated quantification was employed to determine the absolute molecular quantities and relative abundances of site-specific N241 glycopeptides in serum Hp across all study participants. Each sample underwent NanoLC-PRM-MS/MS analysis, and the abundance of site-specific N241 glycoforms was quantified in individual patients to evaluate its levels in cirrhosis and HCC cohorts. The relative abundance of all site-specific N241 glycoforms in serum Hp for each patient included in this study is provided in Table S2.

At site N241, the predominant glycoform is the triantennary trifucosylated glycan A3G3F3, accompanied by A3G3F2 and A3G3F4. Notably, all triantennary glycoforms exhibit bisialylated structures, including A3G3F3S2, A3G3F2S2, A3G3F4S2, A3G3F3S2.1, and A3G3F3S2.2. The tetra-antennary N-glycoforms (A4G4F3S2, A4G4F1S3, A4G4F3S3, and A4G4F2S4), which include bi-, tri-, and tetra-sialic acids along with many fucose residues, were found to be 40% abundant overall. Interestingly, HCC has a considerably greater number of tetra-antennary glycoforms than cirrhosis (p < 0.05). These glycoforms feature a loss of fucose residues in the terminal region, contributing to well-defined and structurally complete configurations, as corroborated by the Oxford structural formula and Skyline. Protein residues inside a rounded pocket are commonly used to identify the fucose moiety coupled to GlcNAc, while the asparagine moiety bound to GlcNAc is left exposed to the solvent. Structural analyses of α-fucosidases have demonstrated that the fucose residue is positioned within a well-defined active site, while the adjacent GlcNAc interacts with aromatic residues in a neighboring subsite. This structural organization highlights the enzyme’s specificity for fucosylated substrates.34,44

Together with the two missed-cleavage forms (A3G3F3S2.1 and A3G3F3S2.2), the triantennary N-glycoforms were the most prevalent at site N241, accounting for around 50% of the total distribution. The tetra-antennary N-glycoforms followed, making up 40% of the overall glycoform distribution. The biantennary glycoforms were the least prevalent, with just 10% of the total, according to the PRM study, which was based on about 40 precursor windows. All bi-, tri-, and tetra-antennary glycoforms at the three distinct glycosylation sites N184, N207, and N241 were included in the overall PRM settings. However, site N241 carried the most glycoforms, mostly triantennary, followed by tetra-antennary N-glycoforms.
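The class-level distribution described above can be illustrated with a minimal sketch. It assumes the glycoform naming used in this article (the digit after the leading "A" gives the number of antennae); the intensities are hypothetical placeholders chosen for illustration, not values from Table S2, and the function names are our own.

```python
import re
from collections import defaultdict

CLASS_NAMES = {2: "biantennary", 3: "triantennary", 4: "tetra-antennary"}

def antenna_count(glycoform: str) -> int:
    """Read the number of antennae from a name such as 'A4G4F3S3' or 'A3G3F3S2.1'."""
    match = re.match(r"A(\d)G\d+F\d+S\d+", glycoform)
    if match is None:
        raise ValueError(f"unrecognized glycoform name: {glycoform}")
    return int(match.group(1))

def class_distribution(intensities):
    """Return the percentage of total intensity carried by each antennary class."""
    total = sum(intensities.values())
    sums = defaultdict(float)
    for name, value in intensities.items():
        sums[CLASS_NAMES[antenna_count(name)]] += value
    return {cls: 100.0 * s / total for cls, s in sums.items()}

# Hypothetical intensities (arbitrary units), for illustration only.
example = {
    "A2G2F1S2": 10.0,
    "A3G3F3S2": 30.0,
    "A3G3F3S2.1": 20.0,
    "A4G4F3S3": 25.0,
    "A4G4F2S4": 15.0,
}
shares = class_distribution(example)
for cls, pct in shares.items():
    print(f"{cls}: {pct:.1f}%")
```

Missed-cleavage forms such as A3G3F3S2.1 are counted with their parent class here, mirroring how the text groups them with the triantennary glycoforms.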

Diagnostic Performance of Tetra-Antennary Glycoforms at Site N241 for Total Cohort, Early, and MASLD

We performed ROC analysis for Hp site-specific N241-glycopeptides to differentiate the total cohort, early stage HCC, and MASLD-HCC from patients with cirrhosis. The AUC values of N241-tetra-glycoforms, in combination with AFP, gender, and age, are summarized in Table 2a–d. Additionally, we analyzed all ten targeted biomarkers in this study, with detailed results presented in Tables S5–S8. The primary focus of this study was the analysis of tetra-antennary glycoforms as combinational markers, with summarized findings in Table 3.
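For readers less familiar with ROC analysis, the AUC of a single marker can be sketched as the probability that a randomly chosen HCC sample scores higher than a randomly chosen cirrhosis sample (the Mann-Whitney formulation). This is a generic illustration, not the software used in the study; the function name and the toy values are assumptions.

```python
def auc_mann_whitney(cases, controls):
    """AUC as P(case > control) + 0.5 * P(case == control) over all pairs."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Toy marker values (hypothetical, arbitrary units).
hcc_values = [3, 4, 5]
cirrhosis_values = [1, 2, 3]
auc_value = auc_mann_whitney(hcc_values, cirrhosis_values)
print(round(auc_value, 3))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation; a marker that is lower in HCC would give a value below 0.5 under this orientation.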

Table 2.

Diagnostic Performance of Tetra-Antennary Markers at Site N241 for All HCC, Early HCC, and MASLD (a)

all HCC (N = 59) vs cirrhosis (N = 60)
early stage (N = 37) vs cirrhosis (N = 60)
MASLD (N = 23) vs cirrhosis (N = 60)
markers AUC 95% CI p-value AUC 95% CI p-value AUC 95% CI p-value
a. Diagnostic Performance of Individual Markers at Site N241 for All HCC, Early HCC, and MASLD
AFP 0.755 (0.665,0.836) <0.0001 0.739 (0.632,0.838) 0.0001 0.848 (0.715,0.954) 0.0001
gender 0.572 (0.488,0.656) 0.1320 0.558 (0.456,0.650) 0.2948 0.617 (0.492,0.751) 0.0955
age 0.695 (0.595,0.782) 0.0003 0.738 (0.635,0.827) 0.0001 0.624 (0.45,0.792) 0.1759
A4G4F3S2 0.563 (0.461,0.663) 0.2360 0.552 (0.438,0.665) 0.3874 0.559 (0.384,0.728) 0.5174
A4G4F1S3 0.631 (0.522,0.731) 0.0132 0.601 (0.486,0.718) 0.0959 0.502 (0.335,0.684) 0.9903
A4G4F3S3 0.658 (0.559,0.753) 0.0030 0.667 (0.553,0.777) 0.0059 0.665 (0.506,0.826) 0.0660
A4G4F2S4 0.573 (0.472,0.674) 0.1593 0.582 (0.462,0.691) 0.1706 0.574 (0.404,0.748) 0.4065
b. Diagnostic Performance of Glycopeptide Markers at Site N241 in the Combination of AFP with All HCC, early HCC, and MASLD
AFP 0.755 (0.665,0.836) Reference 0.739 (0.632,0.838) Reference 0.848 (0.715,0.954) Reference
AFP + A4G4F3S2 0.796 (0.718,0.870) 0.5040 0.781 (0.678,0.875) 0.5658 0.841 (0.698,0.957) 0.9152
AFP + A4G4F1S3 0.763 (0.675,0.842) 0.9114 0.743 (0.633,0.845) 0.9602 0.845 (0.700,0.952) 0.9565
AFP + A4G4F3S3 0.766 (0.686,0.846) 0.8660 0.756 (0.652,0.857) 0.826 0.862 (0.720,0.961) 0.8890
AFP + A4G4F2S4 0.802 (0.723,0.877) 0.4358 0.789 (0.691,0.880) 0.4888 0.836 (0.693,0.945) 0.8732
c. Diagnostic Performance of Glycopeptide Markers at Site N241 in the Combination of Gender with All HCC, Early HCC, and MASLD
AFP 0.755 (0.665,0.836) Reference 0.739 (0.632,0.838) Reference 0.848 (0.715,0.954) Reference
Gender + A4G4F3S2 0.662 (0.555,0.755) 0.1592 0.634 (0.515,0.75) 0.1955 0.634 (0.461,0.812) 0.0439
Gender + A4G4F1S3 0.602 (0.495,0.709) 0.0217 0.598 (0.484,0.711) 0.076 0.708 (0.542,0.874) 0.1671
Gender + A4G4F3S3 0.614 (0.511,0.718) 0.0375 0.591 (0.475,0.708) 0.0707 0.684 (0.514,0.853) 0.1185
Gender + A4G4F2S4 0.717 (0.621,0.807) 0.543 0.712 (0.599,0.818) 0.7214 0.746 (0.587,0.889) 0.3009
d. Diagnostic Performance of Glycopeptide Markers at Site N241 in the Combination of Age with All HCC, Early HCC, and MASLD
AFP 0.755 (0.665,0.836) Reference 0.739 (0.632,0.838) Reference 0.848 (0.715,0.954) Reference
Age + A4G4F3S2 0.718 (0.618,0.806) 0.5663 0.751 (0.651,0.841) 0.8743 0.621 (0.447,0.8) 0.0414
Age + A4G4F1S3 0.712 (0.616,0.805) 0.5073 0.755 (0.654,0.836) 0.8285 0.622 (0.447,0.797) 0.0408
Age + A4G4F3S3 0.704 (0.602,0.793) 0.4347 0.740 (0.641,0.831) 0.9925 0.626 (0.452,0.807) 0.0426
Age + A4G4F2S4 0.736 (0.631,0.823) 0.7641 0.781 (0.691,0.866) 0.5603 0.657 (0.493,0.816) 0.0795
(a) P-values in Table 2a are based on the Wilcoxon test. P-values in Table 2b–d are based on comparisons of the AUC of the combination panels vs AFP only.

Table 3.

Diagnostic Performance of Combinational Marker Panels with Tetra-Glycoforms (PPV, positive predictive value; NPV, negative predictive value)

panel AUC 95% CI P-value specificity sensitivity accuracy PPV NPV
Total HCC from Cirrhosis
AFP + Age + A2G2F1S2 + A4G4F3S3 0.895 (0.832,0.946) 0.0066 0.800 0.797 0.795 0.797 0.793
Early-HCC from Cirrhosis
AFP + Age + A2G2F1S2 + A4G4F3S3 + A3G3F3S2.2 0.911 (0.856,0.960) 0.0049 0.800 0.865 0.821 0.727 0.902
MASLD-HCC from Cirrhosis
AFP + Age + Gender2 + A2G2F1S2 + A4G4F3S3 0.942 (0.850,1.000) 0.2244 0.800 1.000 0.902 0.852 1.000

The four tetra-glycoforms of the N241 glycopeptide exhibited varying AUC values across the total cohort, early HCC, and MASLD-HCC groups. Among them, A4G4F3S3 demonstrated the highest AUC, ranging from 0.658 (95% CI: 0.559–0.753) to 0.667 (95% CI: 0.553–0.777), and A4G4F1S3 showed moderate AUC values, with the highest being 0.631 (95% CI: 0.522–0.731). In contrast, A4G4F3S2 and A4G4F2S4 exhibited lower AUCs, with A4G4F3S2 ranging from 0.552 (95% CI: 0.438–0.665) to 0.563 (95% CI: 0.461–0.663) and A4G4F2S4 from 0.573 (95% CI: 0.472–0.674) to 0.582 (95% CI: 0.462–0.691), suggesting varying potential for distinguishing HCC from cirrhosis (Table 2a).
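The 95% confidence intervals quoted with each AUC can be approximated in several ways. The article does not state which method was used, so the following is only a sketch of one common choice, a percentile bootstrap; the data are simulated and all names are illustrative.

```python
import random

def auc(cases, controls):
    """Rank-based AUC: P(case > control), with ties counted as 0.5."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def bootstrap_auc_ci(cases, controls, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample each group with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        resampled_cases = rng.choices(cases, k=len(cases))
        resampled_controls = rng.choices(controls, k=len(controls))
        stats.append(auc(resampled_cases, resampled_controls))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Simulated marker values (hypothetical): cases shifted upward by one unit.
data_rng = random.Random(42)
sim_cases = [data_rng.gauss(1.0, 1.0) for _ in range(30)]
sim_controls = [data_rng.gauss(0.0, 1.0) for _ in range(30)]
point_estimate = auc(sim_cases, sim_controls)
ci_low, ci_high = bootstrap_auc_ci(sim_cases, sim_controls)
print(f"AUC = {point_estimate:.3f}, 95% CI: {ci_low:.3f} to {ci_high:.3f}")
```

With group sizes as small as those in the MASLD subgroup, intervals of this kind are wide, which is consistent with the broad ranges reported in Table 2.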

In the total and early stage cohorts, the N241-tetra-glycoforms in combination with AFP demonstrated numerically improved AUC values. The total cohort exhibited AUC values ranging from 0.763 (95% CI: 0.675–0.842) to 0.802 (95% CI: 0.723–0.877), compared with 0.755 for AFP alone. For the early stage cohort, AUC values ranged from 0.743 (95% CI: 0.633–0.845) to 0.789 (95% CI: 0.691–0.880), compared with 0.739 for AFP alone, while for MASLD, values ranged from 0.836 (95% CI: 0.693–0.945) to 0.862 (95% CI: 0.720–0.961), close to the 0.848 obtained with AFP alone (Table 2b). None of these differences from AFP alone reached statistical significance (P > 0.05). In addition to AFP, we also incorporated other factors, such as age and gender, to better evaluate the performance of tetra-glycoforms in distinguishing HCC from cirrhosis. Interestingly, combinations with age yielded AUC values that differed significantly from AFP alone in MASLD (P < 0.05).

All three groups were included in the combined analysis with age and gender. In the total cohort, the Gender + A4G4F1S3 and Gender + A4G4F3S3 combinations gave AUC values of 0.602 (95% CI: 0.495–0.709) and 0.614 (95% CI: 0.511–0.718), respectively, both significantly different from AFP alone (P < 0.05). In the MASLD-HCC group, incorporating age with the four tetra-glycoforms gave AUC values ranging from 0.621 (95% CI: 0.447–0.800) to 0.657 (95% CI: 0.493–0.816); the difference from AFP alone was significant (P < 0.05) for three of the four combinations, the exception being Age + A4G4F2S4 (P = 0.0795) (Table 2c,d).

Multi-Marker Panels with Tetra-Glycoforms for HCC Diagnosis: Performance Across Cohorts

Building on our previous study,15 we aimed to identify the most effective multimarker panel for distinguishing HCC from cirrhosis. Multimarker panels were evaluated using AFP, gender, and age as anchor markers. The optimal panel was selected based on the highest estimated AUC values in combination with tetra-glycoforms. The performance of the selected tetra-glycoform panels is summarized in Table 3, while Figure S2 illustrates the corresponding ROC curves.

Multimarker panels incorporating tetra-glycoforms exhibited varying AUC values across different cohorts, including the total cohort, early HCC, and MASLD-HCC groups. In the MASLD cohort, tetra-glycoforms in combination panels demonstrated improved AUC values compared to AFP, with an AUC of 0.942 (95% CI: 0.850–1.000; P = 0.2244). For patients with early stage HCC, the AUC was 0.911 (95% CI: 0.856–0.960; P = 0.0049). In the total cohort, the AUC was 0.895 (95% CI: 0.832–0.946; P = 0.0066) (Table 3).

The diagnostic performance of tetra-glycoforms in combination panels was critically assessed across different patient groups. The sensitivity was determined to be 0.797 for the total cohort with accuracy 0.795 (PPV 0.797; NPV 0.793), 0.865 for early stage cases with accuracy 0.821 (PPV 0.727; NPV 0.902), and 1.000 for MASLD with accuracy 0.902 (PPV 0.852; NPV 1.000), all evaluated at a specificity of 0.80. In contrast, AFP exhibited markedly lower sensitivities of 0.586 for the total cohort, 0.557 for early stage cases, and 0.817 for MASLD. These findings highlight the superior diagnostic efficacy of tetra-glycoforms, emphasizing their potential clinical utility in enhancing detection accuracy compared to AFP. The observed disparity underscores the need for further validation and integration of tetra-glycoforms in diagnostic protocols to improve early detection and disease management.
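The relationship between the quantities reported above (sensitivity, specificity, accuracy, PPV, and NPV) can be made explicit with a small sketch based on the four cells of a confusion matrix. The counts used here are hypothetical and chosen only for illustration; they are not the study's counts.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp: HCC called positive      fn: HCC called negative
    tn: cirrhosis called negative fp: cirrhosis called positive
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts: 50 HCC and 60 cirrhosis samples.
metrics = diagnostic_metrics(tp=40, fn=10, tn=48, fp=12)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the ratio of HCC to cirrhosis samples in the cohort, so values obtained in a case-control design may not carry over to a surveillance population with a different prevalence.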

To further refine performance, we also analyzed multimarker panel combinations without age as a variable. For all three comparisons (total cohort, early stage, and MASLD), the difference from AFP alone did not reach significance (P > 0.05), with AUC values of 0.834 (95% CI: 0.754–0.903), 0.825 (95% CI: 0.738–0.898), and 0.891 (95% CI: 0.770–0.983), respectively (Table 4). These findings highlight the potential of tetra-glycoform-based multimarker panels for improving HCC detection across different clinical contexts.

Table 4.

AUCs of Marker Panels for the Tetra-Antennary Glycoform Markers and Variables (a)

all HCC vs cirrhosis
early HCC vs cirrhosis
MASLD All HCC vs Cirrhosis
marker panel AUC (95%CI) Sens80 p-value AUC (95%CI) Sens80 p-value AUC (95%CI) Sens80 p-value
AFP 0.756(0.667–0.837) 0.586 reference 0.74(0.631–0.84) 0.557 reference 0.850(0.716–0.958) 0.817 reference
marker-combination 0.702(0.607–0.79) 0.424 0.3999 0.702(0.597–0.801) 0.387 0.6218 0.609(0.432–0.775) 0.348 0.0237
AFP + Age + Gender2+Marker-combination 0.871(0.8–0.931) 0.763 0.0336 0.887(0.815–0.942) 0.703 0.0190 0.901(0.785–0.981) 0.739 0.5228
AFP + Gender2+ marker-combination 0.834(0.754–0.903) 0.644 0.1676 0.825(0.738–0.898) 0.649 0.2100 0.891(0.77–0.983) 0.739 0.6200
(a) Marker-combination: A4G4F3S2 + A4G4F1S3 + A4G4F3S3 + A4G4F2S4; Sens80: the sensitivity when specificity was fixed at 80%; P-values of marker panels were obtained by comparing the ROC curves of the marker panels to that of AFP alone (reference).
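The Sens80 quantity defined in the footnote above can be sketched as follows: pick the lowest cutoff at which at least 80% of cirrhosis samples score at or below it, then report the fraction of HCC samples above that cutoff. This is one reasonable convention under the assumption that higher scores indicate HCC; the study's exact thresholding rule is not described, and the scores below are hypothetical.

```python
import math

def sensitivity_at_specificity(cases, controls, target_spec=0.80):
    """Return (threshold, sensitivity, achieved specificity).

    A sample is called positive when its score is strictly above the threshold.
    """
    ordered = sorted(controls)
    # Minimum number of controls that must be called negative.
    k = math.ceil(target_spec * len(ordered) - 1e-9)
    threshold = ordered[k - 1] if k > 0 else float("-inf")
    specificity = sum(1 for s in controls if s <= threshold) / len(controls)
    sensitivity = sum(1 for s in cases if s > threshold) / len(cases)
    return threshold, sensitivity, specificity

# Hypothetical panel scores.
control_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
case_scores = [5, 9, 10, 11, 12]
cutoff, sens80, spec = sensitivity_at_specificity(case_scores, control_scores)
print(cutoff, sens80, spec)
```

Because the cutoff is set from the cirrhosis group alone, ties or small control groups can push the achieved specificity slightly above the 80% target.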

Evaluating the Influence of Demographic Variables on Diagnostic Accuracy and Panel Performance in HCC Diagnosis Using Three-Fold Cross-Validation

To assess the robustness of our biomarker panels and the risk of overfitting, we employed a 3-fold cross-validation machine learning approach. Using this approach, we were able to assess how demographic factors, specifically age and gender, affect diagnostic performance across disease stages and subpopulations (Table 5 and Figure 3). We acknowledge that the sample size in the current study is modest, and larger cohorts will be needed to further establish the robustness of our findings. We plan to test and validate the performance of the optimal biomarker combinations in future studies. In the present work, to mitigate the limitations of sample size, we used cross-validation. The results from 3-fold cross-validation demonstrate that incorporating demographic information such as age and gender generally improved the predictive performance of biomarker panels, especially in early stage disease cohorts.
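The general procedure can be sketched as below: samples are split into three stratified folds, a model is fitted on two folds, and AUC is computed on both the training folds and the held-out fold. The article does not specify the classifier, so the plain logistic regression used here is an assumption, as are the simulated two-feature data and all function names.

```python
import math
import random

def auc(scores, labels):
    """Rank-based AUC: P(case score > control score), ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > d else 0.5 if c == d else 0.0
               for c in cases for d in controls)
    return wins / (len(cases) * len(controls))

def stratified_folds(labels, k=3, seed=0):
    """Assign indices to k folds, keeping the class balance in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def linear_score(w, x):
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Logistic regression fitted by batch gradient descent (assumed model)."""
    w = [0.0] * (len(X[0]) + 1)  # intercept followed by one weight per feature
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = max(-30.0, min(30.0, linear_score(w, xi)))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def cross_validate(X, y, k=3, seed=0):
    """Return a list of (training AUC, test AUC) pairs, one per fold."""
    results = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        w = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        train_auc = auc([linear_score(w, X[i]) for i in train_idx],
                        [y[i] for i in train_idx])
        test_auc = auc([linear_score(w, X[i]) for i in test_idx],
                       [y[i] for i in test_idx])
        results.append((train_auc, test_auc))
    return results

# Simulated data (hypothetical): 30 controls and 30 cases, two features each.
rng = random.Random(1)
X = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
     + [[rng.gauss(1, 1), rng.gauss(1, 1)] for _ in range(30)])
y = [0] * 30 + [1] * 30
fold_aucs = cross_validate(X, y, k=3)
for train_auc, test_auc in fold_aucs:
    print(f"training AUC = {train_auc:.3f}, test AUC = {test_auc:.3f}")
```

A training AUC that is clearly higher than the test AUC in such a comparison is the usual sign of overfitting, which is why both are reported side by side.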

Table 5.

Summary of Estimated AUC Values Based on 3-Fold Cross-Validation (a)

combinations with demographic information
combinations without demographic information
data set training AUC test AUC panel training AUC test AUC panel
all HCC vs Cirrhosis 0.897 0.877 AFP + Age + A2G2F1S2+A4G4F3S3 0.847 0.827 AFP + A2G2F1S2+A4G4F3S3
early HCC vs Cirrhosis 0.918 0.868 AFP + Age + A2G2F1S2+A4G4F3S3+A3G3F3S2.2 0.849 0.799 AFP +A2G2F1S2+A4G4F3S3+A3G3F3S2.2
MASLD All HCC vs Cirrhosis 0.954 0.886 AFP + Age + Gender2+A2G2F1S2+A4G4F3S3 0.913 0.867 AFP + A2G2F1S2+A4G4F3S3
(a) 3-fold cross-validation was conducted to evaluate the performance of the optimal combination panels in Table 3.

Figure 3.

ROC curves of Hp N241 glycopeptide markers in combination with tetra-glycoforms for differentiating the total cohort, early stage patients, and MASLD from cirrhosis. Blue line (AFP); green line (markers); brown line (without age + markers); and black line (age + markers), as specified in Table 4.

The 3-fold cross-validation results show that adding demographic data, including age and gender, often enhanced the biomarker panels' predictive capabilities, particularly in cohorts with early stage disease. Across the cohorts, the 4- and 5-marker panels that included demographic variables achieved higher AUC values than their nondemographic counterparts; in the early stage cohort, the test AUC reached 0.868, compared with 0.799 without demographic information (Table 5). The inclusion of demographic variables raised both training and test AUCs, demonstrating the importance of demographic parameters in early diagnosis. This performance boost was especially noticeable in early stage patients.

In the total cohort, although demographic inclusion had less impact, the overall performance remained high, with the top-performing panel (AFP + Age + A2G2F1S2 + A4G4F3S3) reaching a training AUC of 0.897 and a test AUC of 0.877, suggesting robust biomarker signatures in more advanced disease. Interestingly, in the MASLD subgroup, the panel without demographic variables performed nearly as well as the one including them (test AUC of 0.867 vs 0.886). In the whole MASLD cohort, the 5-marker panel with demographic characteristics had the highest test AUC of 0.886, comparable to the result in early stage patients (0.868). These findings collectively underscore the potential of biomarker panels, particularly when paired with demographic data, to enhance diagnostic accuracy across disease stages. They also imply that carefully chosen biomarker combinations could have adequate predictive value on their own in some subpopulations, such as MASLD. This method supports a tailored approach to biomarker panel development, optimizing combinations based on patient subgroups and disease stages to achieve the best clinical performance.

Tetra-Glycoform Biomarkers on the Path to Clinical Translation

The current standard of care for HCC surveillance is semiannual ultrasound and AFP in patients with cirrhosis.31 Surveillance is poorly utilized in clinical practice due to barriers associated with obtaining imaging and suboptimal sensitivity and specificity.6,45,46 Blood-based biomarker approaches, including the tetra-glycoform markers we have identified, could address surveillance barriers and improve the performance of current modalities.10 Further validation of the tetra-glycoform markers is needed prior to implementation, including Phase III and IV biomarker validation studies.47 These studies would rigorously evaluate the performance, feasibility, and reproducibility of these markers prior to implementation in clinical practice.

CONCLUSION

This study demonstrates that tetra-antennary glycoforms, specifically bi-, tri-, and tetra-fucosylated glycopeptides derived from serum Hp, effectively distinguish HCC from cirrhosis across diverse etiologies. These distinctions arise from subtle glycan structural modifications, assessed using NanoLC-PRM-MS/MS. Notably, four tetra-glycoforms (A4G4F3S2, A4G4F1S3, A4G4F3S3, and A4G4F2S4) at site N241 play a crucial role in this differentiation, as the fucose residue is strategically positioned within an active site, and the adjacent GlcNAc interacts with neighboring residues, contributing to structural specificity.

The diagnostic utility of these tetra-glycoforms was further evaluated through ROC analysis, revealing improved HCC discrimination compared to AFP alone. The combined fucosylated tetra-glycoform panel achieved an AUC of 0.871 (95% CI: 0.800–0.931, P = 0.0336) in the total cohort, 0.887 (95% CI: 0.815–0.942, P = 0.0190) in early stage HCC, and 0.901 (95% CI: 0.785–0.981, P = 0.5228) in MASLD-HCC. Sensitivity at 0.80 specificity was 0.763 for the total cohort, 0.703 for early stage cases, and 0.739 for MASLD-HCC. The panel thus outperformed AFP alone in the total and early stage cohorts (AFP sensitivities of 0.586 and 0.557, respectively), whereas in MASLD-HCC AFP alone reached a sensitivity of 0.817.

To further optimize diagnostic performance, we evaluated multimarker panel combinations excluding age as a variable to assess its influence. For all three comparisons, the difference from AFP alone did not reach significance (P > 0.05), with AUC values of 0.834, 0.825, and 0.891 (Table 4). These findings underscore the potential of tetra-glycoform-based multimarker panels for enhancing HCC detection across various etiologies, supporting their clinical utility in early and accurate diagnosis.

Fucosylated tetra-antennary glycoforms of serum haptoglobin were significantly altered in patients with HCC compared to those with cirrhosis. While these glycoforms alone were not evaluated as independent diagnostic tools, their incorporation into a panel with AFP, age, and gender improved discrimination between HCC and cirrhosis. These results suggest that glycoforms may serve as complementary biomarkers to enhance current diagnostic approaches, though further validation is required before clinical implementation.

Supplementary Material

supp figures and Tables 5-8
supp Table 1
supp Table 2
supp Table 3
supp Table 4

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jproteome.5c00626.

Table-S5: Diagnostic Performance of Individual Markers at Site N241 for all HCC, Early HCC, and NASH
Table-S6: Diagnostic Performance of Glycopeptide Markers at Site N241 in the combination of AFP with all HCC, Early HCC, and NASH
Table-S7: Diagnostic Performance of Glycopeptide Markers at Site N241 in the combination of Age with all HCC, Early HCC, and NASH
Table-S8: Diagnostic Performance of Glycopeptide Markers at Site N241 in the combination of Gender with all HCC, Early HCC, and NASH
Figure S1: The figure illustrates the representative intensity profiles of 15 targeted PRM precursors corresponding to site-specific N241 glycoforms identified in our analysis. The data depict relative glycoform abundances, emphasizing variations in glycoform composition and intensity among the analyzed samples.
Figure S2: ROC curves of Hp N241-glycopeptide markers in combination with tetra-glycoforms differentiating the total cohort, early stage, and MASLD from cirrhosis. Blue line (AFP); green line (markers), the marker combination as specified in Table 3. (PDF)

Table-S1: Skyline isolation list of 40 precursors used for PRM window setting. The list includes 20 target Site N241 glycoforms representing predicted Skyline targets (XLSX)

Table-S2: Clinical Information of 119 Samples and Intensities of Twenty Target Glycoforms (XLSX)

Table-S3: Representative Data set for DDA Analysis (XLSX)

Table-S4: Representative Data set for PRM analysis (XLSX)

ACKNOWLEDGMENTS

We gratefully acknowledge the support of the National Cancer Institute for this work through grants R01-CA160254-11 (D.M.L.) and R01-CA160254-11 S (D.M.L.). Additionally, D.M.L. acknowledges support from the Maud T. Lane Professorship. Drs. Singal and Parikh are supported by NCI U01 CA271887 and U01 CA283935. We also acknowledge Drs. Yu Lin, Jianhui Zhu, Hye Kyong Kweon, Komal Abhange and Natan Y Lubman for their valuable help during protocol standardization.

Footnotes

The authors declare the following competing financial interest(s): Amit Singal has served as a consultant or on advisory boards for Bayer, FujiFilm Medical Sciences, Exact Sciences, HelioGenomics, Roche, Glycotest, Abbott, DELFI, IMCare, and Universal Dx.

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jproteome.5c00626

Contributor Information

Chithravel Vadivalagan, Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan 48109, United States.

Jie Zhang, Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan 48109, United States.

Jianliang Dai, Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, United States.

Suyu Liu, Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, United States.

Amit G. Singal, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas 75390, United States

Kevin Bass, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas 75390, United States.

Neehar D. Parikh, Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan 48109, United States

David M. Lubman, Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan 48109, United States

Data Availability Statement

The MS raw data obtained from Orbitrap through NanoLC-HCD_PRM-MS/MS workflow have been deposited in the Japan Proteome STandard Repository (jPOST; https://repository.jpostdb.org/). The data accession numbers are PXD064014 for ProteomeXchange and JPST003811 for jPOST.

REFERENCES

  • (1) Singal AG; Lampertico P; Nahon P. Epidemiology and surveillance for hepatocellular carcinoma: New trends. J. Hepatol. 2020, 72 (2), 250–261.
  • (2) Singal AG; Kanwal F; Llovet JM. Global trends in hepatocellular carcinoma epidemiology: implications for screening, prevention and therapy. Nat. Rev. Clin. Oncol. 2023, 20 (12), 864–884.
  • (3) Singal A; Parikh N; Rich N; John B; Pillai A. Hepatocellular carcinoma surveillance and staging. In Hepatocellular Carcinoma: Translational Precision Medicine Approaches, 2019; pp 27–51.
  • (4) Singal AG; Zhang E; Narasimman M; Rich NE; Waljee AK; Hoshida Y; Yang JD; Reig M; Cabibbo G; Nahon P; et al. HCC surveillance improves early detection, curative treatment receipt, and survival in patients with cirrhosis: a meta-analysis. J. Hepatol. 2022, 77 (1), 128–139.
  • (5) Park S; Davis AM; Pillai AA. Prevention, diagnosis, and treatment of hepatocellular carcinoma. JAMA 2024, 332 (12), 1013–1014.
  • (6) Tzartzeva K; Obi J; Rich NE; Parikh ND; Marrero JA; Yopp A; Waljee AK; Singal AG. Surveillance imaging and alpha fetoprotein for early detection of hepatocellular carcinoma in patients with cirrhosis: a meta-analysis. Gastroenterology 2018, 154 (6), 1706–1718.
  • (7) Wolf E; Rich NE; Marrero JA; Parikh ND; Singal AG. Use of hepatocellular carcinoma surveillance in patients with cirrhosis: a systematic review and meta-analysis. Hepatology 2021, 73 (2), 713–725.
  • (8) Marrero JA; Feng Z; Wang Y; Nguyen MH; Befeler AS; Roberts LR; Reddy KR; Harnois D; Llovet JM; Normolle D; et al. α-fetoprotein, des-γ carboxyprothrombin, and lectin-bound α-fetoprotein in early hepatocellular carcinoma. Gastroenterology 2009, 137 (1), 110–118.
  • (9) Parikh ND; Mehta AS; Singal AG; Block T; Marrero JA; Lok AS. Biomarkers for the early detection of hepatocellular carcinoma. Cancer Epidemiol. Biomarkers Prev. 2020, 29 (12), 2495–2503.
  • (10) Parikh ND; Tayob N; Singal AG. Blood-based biomarkers for hepatocellular carcinoma screening: Approaching the end of the ultrasound era? J. Hepatol. 2023, 78 (1), 207–216.
  • (11) Wang Y; Chen H. Protein glycosylation alterations in hepatocellular carcinoma: function and clinical implications. Oncogene 2023, 42 (24), 1970–1979.
  • (12) Mechref Y; Peng W; Gautam S; Ahmadi P; Lin Y; Zhu J; Zhang J; Liu S; Singal AG; Parikh ND; et al. Mass spectrometry based biomarkers for early detection of HCC using a glycoproteomic approach. Adv. Cancer Res. 2023, 157, 23–56.
  • (13) Sun S; Zhang H. Identification and validation of atypical N-glycosylation sites. Anal. Chem. 2015, 87 (24), 11948–11951.
  • (14) Zhu J; Huang J; Zhang J; Chen Z; Lin Y; Grigorean G; Li L; Liu S; Singal AG; Parikh ND; et al. Glycopeptide biomarkers in serum haptoglobin for hepatocellular carcinoma detection in patients with nonalcoholic steatohepatitis. J. Proteome Res. 2020, 19 (8), 3452–3466.
  • (15) Lin Y; Zhu J; Zhang J; Dai J; Liu S; Arroyo A; Rose M; Singal AG; Parikh ND; Lubman DM. Glycopeptides with Sialyl Lewis antigen in serum haptoglobin as candidate biomarkers for nonalcoholic steatohepatitis hepatocellular carcinoma using a higher-energy collision-induced dissociation parallel reaction monitoring-mass spectrometry method. ACS Omega 2022, 7 (26), 22850–22860.
  • (16) Lin Y; Zhu J; Pan L; Zhang J; Tan Z; Olivares J; Singal AG; Parikh ND; Lubman DM. A panel of glycopeptides as candidate biomarkers for early diagnosis of NASH hepatocellular carcinoma using a stepped HCD method and PRM evaluation. J. Proteome Res. 2021, 20 (6), 3278–3289.
  • (17) Zhang R; Zhu J; Lubman DM; Mechref Y; Tang H. GlycoHybridSeq: automated identification of N-linked glycopeptides using electron transfer/high-energy collision dissociation (EThcD). J. Proteome Res. 2021, 20 (6), 3345–3352.
  • (18) Gutierrez Reyes CD; Huang Y; Atashi M; Zhang J; Zhu J; Liu S; Parikh ND; Singal AG; Dai J; Lubman DM; et al. PRM-MS quantitative analysis of isomeric N-glycopeptides derived from human serum haptoglobin of patients with cirrhosis and hepatocellular carcinoma. Metabolites 2021, 11 (8), 563.
  • (19) Lin Y; Lubman DM. The role of N-glycosylation in cancer. Acta Pharm. Sin. B 2024, 14 (3), 1098–1110.
  • (20) Somers N; Butaye E; Grossar L; Pauwels N; Geerts A; Raevens S; Lefere S; Devisscher L; Meuris L; Callewaert N; et al. Glycomics as prognostic biomarkers of hepatocellular carcinoma: A systematic review. Oncol. Lett. 2024, 29 (1), 24.
  • (21) Sotoudeheian MJ. Value of Mac-2 Binding Protein Glycosylation Isomer (M2BPGi) in Assessing Liver Fibrosis in Metabolic Dysfunction-Associated Liver Disease: A Comprehensive Review of its Serum Biomarker Role. Curr. Protein Pept. Sci. 2025, 26 (1), 6–21.
  • (22) Pompach P; Brnakova Z; Sanda M; Wu J; Edwards N; Goldman R. Site-specific glycoforms of haptoglobin in liver cirrhosis and hepatocellular carcinoma. Mol. Cell. Proteomics 2013, 12 (5), 1281–1293.
  • (23) Sanda M; Benicky J; Goldman R. Low collision energy fragmentation in structure-specific glycoproteomics analysis. Anal. Chem. 2020, 92 (12), 8262–8267.
  • (24) Lih TM; Jiao L; Chen L; Woo J; Wang Y; Zhang H. AUTO-SP: automated sample preparation for analyzing proteins and protein modifications. Anal. Chem. 2025, 97, 16751–16758.
  • (25) Zhu J; Warner E; Parikh ND; Lubman DM. Glycoproteomic markers of hepatocellular carcinoma-mass spectrometry based approaches. Mass Spectrom. Rev. 2019, 38 (3), 265–290.
  • (26) Dalal K; Dalal B; Bhatia S; Shukla A; Shankarkumar A. Analysis of serum Haptoglobin using glycoproteomics and lectin immunoassay in liver diseases in Hepatitis B virus infection. Clin. Chim. Acta 2019, 495, 309–317.
  • (27) Zhu J; Lin Z; Wu J; Yin H; Dai J; Feng Z; Marrero J; Lubman DM. Analysis of serum haptoglobin fucosylation in hepatocellular carcinoma and liver cirrhosis of different etiologies. J. Proteome Res. 2014, 13 (6), 2986–2997.
  • (28) Wang M; Grauzam S; Bayram MF; Dressman J; DelaCourt A; Blaschke C; Liang H; Scott D; Huffman G; Black A; et al. Spatial omics-based machine learning algorithms for the early detection of hepatocellular carcinoma. Commun. Med. 2024, 4 (1), 258.
  • (29) Kawahara R; Chernykh A; Alagesan K; Bern M; Cao W; Chalkley RJ; Cheng K; Choo MS; Edwards N; Goldman R; et al. Community evaluation of glycoproteomics informatics solutions reveals high-performance search strategies for serum glycopeptide analysis. Nat. Methods 2021, 18 (11), 1304–1316.
  • (30) Xu X; Yin K; Wu R. Systematic Investigation of the Trafficking of Glycoproteins on the Cell Surface. Mol. Cell. Proteomics 2024, 23 (5), 100761.
  • (31) Singal AG; Llovet JM; Yarchoan M; Mehta N; Heimbach JK; Dawson LA; Jou JH; Kulik LM; Agopian VG; Marrero JA; et al. AASLD Practice Guidance on prevention, diagnosis, and treatment of hepatocellular carcinoma. Hepatology 2023, 78 (6), 1922–1965.
  • (32) Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 2019, 25 (1), 44–56.
  • (33) Rajkomar A; Dean J; Kohane I. Machine learning in medicine. N. Engl. J. Med. 2019, 380 (14), 1347–1358.
  • (34) Oh MJ; Lee SH; Kim U; An HJ. In-depth investigation of altered glycosylation in human haptoglobin associated cancer by mass spectrometry. Mass Spectrom. Rev. 2023, 42 (2), 496–518.
  • (35) Okuda S; Yoshizawa AC; Kobayashi D; Takahashi Y; Watanabe Y; Moriya Y; Hatano A; Takami T; Matsumoto M; Araki N; et al. jPOST environment accelerates the reuse and reanalysis of public proteome mass spectrometry data. Nucleic Acids Res. 2025, 53 (D1), D462–D467.
  • (36) Timp W; Timp G. Beyond mass spectrometry, the next step in proteomics. Sci. Adv. 2020, 6 (2), No. eaax8978.
  • (37) Noor Z; Ahn SB; Baker MS; Ranganathan S; Mohamedali A. Mass spectrometry–based protein identification in proteomics: a review. Briefings Bioinf. 2021, 22 (2), 1620–1638.
  • (38) Tamara S; den Boer MA; Heck AJ. High-resolution native mass spectrometry. Chem. Rev. 2022, 122 (8), 7269–7326.
  • (39) Meissner F; Geddes-McAlister J; Mann M; Bantscheff M. The emerging role of mass spectrometry-based proteomics in drug discovery. Nat. Rev. Drug Discovery 2022, 21 (9), 637–654.
  • (40) Hunter CL; Bons J; Schilling B. Perspectives and opinions from scientific leaders on the evolution of data-independent acquisition for quantitative proteomics and novel biological applications. Aust. J. Chem. 2023, 76, 379–398.
  • (41) Darula Z; Medzihradszky KF. Carbamidomethylation side reactions may lead to glycan misassignments in glycopeptide analysis. Anal. Chem. 2015, 87 (12), 6297–6302.
  • (42) Kohansal-Nodehi M; Swiatek-de Lange M; Kroeniger K; Rolny V; Tabarés G; Piratvisuth T; Tanwandee T; Thongsawat S; Sukeepaisarnjaroen W; Esteban JI; et al. Discovery of a haptoglobin glycopeptides biomarker panel for early diagnosis of hepatocellular carcinoma. Front. Oncol. 2023, 13, 1213898.
  • (43) Mackay S; Hitefield NL; Oduor IO; Roberts AB; Burch TC; Lance RS; Cunningham TD; Troyer DA; Semmes OJ; Nyalwidhe JO. Site-specific intact N-linked glycopeptide characterization of prostate-specific membrane antigen from metastatic prostate cancer cells. ACS Omega 2022, 7 (34), 29714–29727.
  • (44) Klontz EH; Li C; Kihn K; Fields JK; Beckett D; Snyder GA; Wintrode PL; Deredge D; Wang L-X; Sundberg EJ. Structure and dynamics of an α-fucosidase reveal a mechanism for highly efficient IgG transfucosylation. Nat. Commun. 2020, 11 (1), 6204.
  • (45) Atiq O; Tiro J; Yopp AC; Muffler A; Marrero JA; Parikh ND; Murphy C; McCallister K; Singal AG. An assessment of benefits and harms of hepatocellular carcinoma surveillance in patients with cirrhosis. Hepatology 2017, 65 (4), 1196–1205.
  • (46) Parikh ND; Tayob N; Al-Jarrah T; Kramer J; Melcher J; Smith D; Marquardt P; Liu P-H; Tang R; Kanwal F; et al. Barriers to surveillance for hepatocellular carcinoma in a multicenter cohort. JAMA Network Open 2022, 5 (7), No. e2223504.
  • (47) Pepe MS; Etzioni R; Feng Z; Potter JD; Thompson ML; Thornquist M; Winget M; Yasui Y. Phases of biomarker development for early detection of cancer. J. Natl. Cancer Inst. 2001, 93 (14), 1054–1061.
