iScience. 2026 Jan 21;29(2):114751. doi: 10.1016/j.isci.2026.114751

Integrating thermal liquid biopsy, clinical data, and mass spectrometry for early diagnosis and biomarker discovery in colorectal cancer

Sonia Hermoso-Durán 1,2,3,4, David Ortega-Alarcon 1,2, Astrid Z Johansen 5, Mattew J McKay 6, Julia S Johansen 5,7,8, Sonia Vega 1, Claus L Feltoft 7, Troels Gammeltoft Dolin 7,8, Jakob Lykke 9, Nicolas Fraunhoffer 10, Oscar Sanchez-Gracia 11, Pablo F Garrido 1,12,13, Ángel Lanas 2,3,14,15, Mark P Molloy 6, Adrian Velazquez-Campoy 1,2,3,16,, Olga Abian 1,2,3,16,17,∗∗
PMCID: PMC12907641  PMID: 41704751

Summary

Early detection of colorectal cancer is essential to improving survival, yet current diagnostic tools show limited performance. This study aimed to enhance diagnostic accuracy by integrating clinical variables with thermogram profiles obtained through serum-based thermal liquid biopsy and analyzed using machine learning models. We evaluated 328 patients with colorectal cancer and 355 symptomatic individuals with non-organ-specific cancer signs but negative diagnostic evaluations, to reproduce clinically relevant decision settings. The combined model showed improved classification performance compared with the use of clinical variables alone, particularly in patients with early-stage disease. In addition, proteomic analysis of samples stratified by thermogram patterns identified proteins associated with survival, including fibrinogen-like protein 1, supporting the biological relevance of these thermodynamic profiles. Together, these findings indicate that integrating serum thermogram information with routine clinical data can modestly strengthen diagnostic assessment and help identify biologically meaningful patient subgroups, offering a promising non-invasive approach to colorectal cancer evaluation.

Subject areas: health technology, medical specialty, medical tests, procedure, process


Highlights

  • Thermal liquid biopsy improves early detection of colorectal cancer

  • Integrating TLB with clinical data improves diagnosis, especially in stages I–III CRC

  • Proteomics identifies FGL1 as a novel serum biomarker linked to prognosis

  • Combining TLB, clinical data, and proteomics supports non-invasive, personalized diagnosis



Introduction

Colorectal cancer (CRC) is the third most common cancer worldwide. It caused 903,853 deaths in 2022, making it the second deadliest cancer after lung cancer.1 Incidence and mortality are expected to increase in the coming years; risk factors include increasing age, diabetes, obesity, smoking, alcohol abuse, and low physical activity. The incidence of early-onset CRC in individuals under the age of 50 has also increased in the Western world, with relative increases of 81%.2,3,4,5,6

Chronic inflammation is common in aging7,8 and is a key feature of cancer development.9,10,11 Elevated levels of circulating inflammatory biomarkers, such as C-reactive protein (CRP), interleukin-6 (IL-6), and chitinase-3-like protein 1 (CHI3L1 or YKL-40), have been linked to increased cancer risk, including CRC.12 CRP is mainly produced by hepatocytes, whereas IL-6 and YKL-40 are secreted by inflammatory, stromal, and cancer cells.13,14,15 High plasma levels of IL-6 and YKL-40 are associated with poor overall survival in metastatic CRC.16,17,18,19

The mortality and morbidity of CRC have declined due to population screening programs, demonstrating that early detection significantly improves prognosis.20 Surgical intervention in early-stage CRC is often curative, reinforcing the need for effective diagnostic tools. However, CRC diagnosis remains complex, particularly in early stages, when symptoms are nonspecific and mild.

Carcinoembryonic antigen (CEA) is the only recommended biomarker for monitoring patients with CRC in the current ESMO guidelines (Argiles). However, CEA cannot be used for screening of CRC since it lacks specificity, as it is also elevated in other cancers and in benign conditions. CEA levels are also elevated in smokers, which can lead to false positives.21,22,23 Despite this limitation, CEA remains valuable for patient monitoring, as its concentration correlates with tumor burden, and fluctuations reflect treatment response or cancer recurrence.

The search for new diagnostic tools with fewer false positives is a key challenge in CRC management. Thermal liquid biopsy (TLB) has emerged as a promising complementary tool, demonstrating diagnostic utility in other malignancies.24,25,26,27,28

A recent study from our group (under review), focusing on pancreatic cancer, demonstrated that using symptomatic controls without disease is crucial to obtaining clinically relevant results. While this approach reduces differences between control and cancer groups, it better reflects real-world clinical scenarios, making the results more applicable in practice. Additionally, this study highlighted the versatility of machine learning algorithms, which allow the integration of clinical variables and can even provide prognostic insights.

Since serum thermograms reflect the combined effect of all circulating proteins and their interactions, a novel application of TLB could involve grouping patients based on their thermogram profiles and performing proteomic analyses to identify differentially expressed proteins. Although combining differential scanning calorimetry (DSC) with mass spectrometry (MS) is uncommon, previous studies have explored this approach.29

This study aimed to leverage the MICA symptomatic control cohort and compare it with CRC patients from the REBECCA study, an open Danish cohort focused on CRC biomarker discovery for diagnosis and prognosis.30 Using a machine learning-based methodology developed by our group,31 this study had three objectives: (1) to develop a diagnostic classification model based on clinical variables and evaluate whether integrating TLB improves model performance; (2) to evaluate the association between model outputs and clinical variables, identifying potential sources of false positives/negatives and assessing their prognostic value in CRC patients; and (3) to perform MS-based proteomic analyses of TLB-stratified subgroups, aiming to discover new biomarkers for CRC diagnosis and prognosis.

Results

Patient cohort description

A total of 683 serum samples were analyzed, with a median patient age of 67 years [57; 74] and a median body mass index (BMI) of 24.77 kg/m² [21.76; 27.60] (n.a. = 148). The cohort was divided into a control group (n = 355) and a CRC patient group (n = 328). The control group consisted of individuals without a cancer diagnosis at inclusion or during follow-up, while the CRC group included patients with histologically confirmed disease. The median age was higher in CRC patients (69 years [63; 75]) than in controls (64 years [52; 72]), and the proportion of males was also higher in the CRC group (52.7% vs. 39.2%). However, no significant differences were observed in BMI distribution between groups (Table S1, Figures S1A and S1B).

Regarding lifestyle factors, data on smoking and alcohol consumption were available for most patients. No association was found between smoking status and CRC diagnosis, but a higher proportion of alcohol abuse was observed in the control group (Table S1 and Figure S1C). Functional status, assessed by the ECOG-PS scale (Eastern Cooperative Oncology Group Performance Status scale), showed a greater proportion of CRC patients with ECOG-PS = 1–3, indicating a higher degree of physical impairment (Table S1 and Figure S1D).

Among CRC patients, 56.7% had undergone surgery, while 43.3% had metastatic disease at the time of blood sample collection (Figure S2). The majority of metastases were located in the liver (82.4%), followed by the lungs (38.0%) and peritoneum (23.9%). Serum concentrations of CEA, CRP, IL-6, and YKL-40 differed significantly between groups (Table S1 and Figure S1E).

Survival analysis showed that patients resected for low-stage CRC had a median recurrence-free survival (RFS) of 82.8 months and overall survival (OS) of 90.3 months, whereas patients with metastatic CRC had a significantly shorter progression-free survival (PFS) of 7.5 months and OS of 19.1 months.

Model based on clinical variables (iClin model)

Classification model

A machine learning-based classification model (iClin model) was developed to distinguish symptomatic control patients from CRC patients using clinical variables. The model was trained and validated in 570 patients (62.1% controls) with complete data on age (dichotomized ≤/> 50 years) and the natural logarithm of CEA, CRP, IL-6, and YKL-40. To address class imbalance, a balanced training subset of 151 patients per group was used.
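The balanced-sampling step described above can be sketched as follows. This is a minimal illustration, not the authors' code. The function name `balanced_training_subset`, the seed, and the toy label list are assumptions: the split of about 354 controls and 216 CRC patients is back-calculated from the reported 62.1% of 570. Using the remaining patients for validation is also an assumption.

```python
import random

def balanced_training_subset(labels, n_per_group, seed=0):
    """Return sorted indices of a random subset with n_per_group samples per class."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train = []
    for label, indices in by_class.items():
        if len(indices) < n_per_group:
            raise ValueError(f"class {label!r} has only {len(indices)} samples")
        train.extend(rng.sample(indices, n_per_group))  # sampling without replacement
    return sorted(train)

# Toy cohort (assumed counts): 570 patients, roughly 62.1% controls.
labels = ["control"] * 354 + ["CRC"] * 216
train_idx = balanced_training_subset(labels, n_per_group=151)

# Assumption: patients not drawn for training form the validation set.
train_set = set(train_idx)
validation_idx = [i for i in range(len(labels)) if i not in train_set]
```

Sampling the same number of patients from each group keeps the majority class from dominating the fit, while the unused patients remain available for validation at their natural proportions.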

Feature selection with Lasso regularization retained only age and CEA concentration, with CEA having the strongest predictive weight (Figure 1A). The model showed statistically significant differences between groups (p < 0.001, Figure 1B) and achieved an area under the curve (AUC) of 0.74 (95% confidence interval (CI): 0.66–0.81), indicating moderate classification performance (Figure 1C).
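For readers less familiar with the AUC, it can be read as the probability that a randomly chosen CRC patient receives a higher model score than a randomly chosen control. The sketch below computes it that way on invented toy scores. The function name `roc_auc` and all values are hypothetical and unrelated to the study data, and the confidence interval is not computed here.

```python
def roc_auc(scores_neg, scores_pos):
    """AUC as P(positive score > negative score), counting ties as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model outputs (not study data).
control_scores = [-1.2, -0.8, -0.4, 0.1]
crc_scores = [-0.5, 0.2, 0.6, 0.9]
auc = roc_auc(control_scores, crc_scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of a few hundred patients. Rank-based implementations give the same value more efficiently.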

Figure 1.


Performance of the iClin model in differentiating control patients (symptomatic) from CRC patients

(A) Standardized coefficients of the iClin model for each predictor variable.

(B) Comparison of the iClin model’s numerical response between control and CRC patients in the validation set. The continuous horizontal line represents the standard cut-off threshold, while the dashed line represents the Youden index threshold. Controls are shown in black, and CRC patients in blue.

(C) Receiver operating characteristic (ROC) curve and area under the curve (AUC) and the 95% confidence interval of the iClin model in the validation set.

(D) Classification results using the standard cut-off threshold. The top panel presents the contingency table, while the bottom panel summarizes the model’s performance metrics (sensitivity, specificity, PPV, and NPV).

(E) Classification results using the Youden index threshold. The top panel presents the contingency table, while the bottom panel summarizes the model’s performance metrics.

Notes: Variables in (A) include the natural logarithm of CEA and age dichotomized as ≤/> 50 years.

Abbreviations: Acc, accuracy; AUC, area under the ROC curve; CEA, carcinoembryonic antigen; CRC, colorectal cancer; NPV, negative predictive value; PPV, positive predictive value; Sens, sensitivity; Spec, specificity.

Using both the standard cut-off (zero) and the Youden index threshold (0.297), the model demonstrated higher specificity and negative predictive value (NPV) than sensitivity and positive predictive value (PPV), suggesting it is more effective in ruling out CRC than in detecting it (Figures 1D and 1E).
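To make the two thresholds concrete, the sketch below derives sensitivity, specificity, PPV, and NPV from a contingency table at a given cut-off, and picks the cut-off that maximizes the Youden index J = sensitivity + specificity − 1. The function names, the convention that a score above the threshold counts as CRC, and the toy scores are all assumptions made for illustration.

```python
def confusion_metrics(scores_neg, scores_pos, threshold):
    """Classify score > threshold as positive and return the four metrics."""
    tp = sum(s > threshold for s in scores_pos)
    fn = len(scores_pos) - tp
    fp = sum(s > threshold for s in scores_neg)
    tn = len(scores_neg) - fp

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sens": ratio(tp, tp + fn),
        "spec": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

def youden_threshold(scores_neg, scores_pos):
    """Return (threshold, J) maximizing J over the observed scores."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores_neg) | set(scores_pos)):
        m = confusion_metrics(scores_neg, scores_pos, t)
        j = m["sens"] + m["spec"] - 1.0
        if j > best_j:  # keep the first threshold reaching the maximum
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical model outputs (not study data).
control_scores = [-1.2, -0.8, -0.4, 0.1]
crc_scores = [-0.5, 0.2, 0.6, 0.9]

standard = confusion_metrics(control_scores, crc_scores, threshold=0.0)
youden_t, youden_j = youden_threshold(control_scores, crc_scores)
at_youden = confusion_metrics(control_scores, crc_scores, youden_t)
```

In this toy example, moving from the zero cut-off to the Youden threshold raises specificity without changing sensitivity. In real data, the direction of the trade-off depends on how the score distributions overlap.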

Correlation of iClin model outcomes with clinical variables

To explore potential sources of false classifications, the iClin model’s outcomes were analyzed in relation to clinical variables, using the standard cut-off threshold for consistency.

False positives in controls were associated with smoking, alcohol abuse, and higher ECOG-PS scores (1–3), while false negatives in CRC patients were more frequent among non-smokers, those with CRP ≤10 mg/L, low comorbidity (Charlson comorbidity index (CCI) = 0–2), and prior tumor resection. Additionally, significant differences in CRP, IL-6, and YKL-40 concentrations were observed between patients classified by the model (Tables S2 and S3 and Figures S3A–S3I).

Regarding prognosis, iClin model scores were significantly associated with OS in metastatic CRC patients, suggesting potential prognostic value (Table S4 and Figure S3J). Additionally, CEA levels correlated with OS in operated patients, while CRP levels correlated with OS in metastatic patients (Table S5).

Model based on TLB thermograms (iTLB model)

Classification model

To integrate TLB data, an independent iTLB model was developed to identify temperature pairs relevant for classification. This model was later combined with clinical features in a hybrid iTLB+iClin model.

The iTLB model was trained using area-normalized thermograms from 683 serum samples (355 controls, 328 CRC patients, Figure S4). Six key temperature pairs were selected (Figures 2A and 2B). The model’s numerical output significantly differed between groups (p < 0.001, Figure 2C) and achieved an AUC of 0.75 (95% CI: 0.68–0.81) (Figure 2D).
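Area normalization rescales each thermogram so that the area under the heat-capacity curve equals one, which makes curve shapes comparable across samples. The sketch below shows one way to do this with the trapezoidal rule. The integration method, the function names, and the toy curve are assumptions, since this section does not detail the normalization procedure. Baseline correction is not shown.

```python
def trapezoid_area(x, y):
    """Area under y(x) by the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def area_normalize(temperatures, heat_capacity):
    """Scale a thermogram so that its integral over temperature equals 1."""
    area = trapezoid_area(temperatures, heat_capacity)
    if area == 0:
        raise ValueError("thermogram has zero area; cannot normalize")
    return [cp / area for cp in heat_capacity]

# Hypothetical thermogram: temperature in °C, excess heat capacity in arbitrary units.
temperatures = [45.0, 55.0, 65.0, 75.0, 85.0]
heat_capacity = [0.0, 2.0, 4.0, 2.0, 0.0]

normalized = area_normalize(temperatures, heat_capacity)
```

After normalization, differences between groups reflect the shape of the unfolding profile rather than differences in overall signal magnitude.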

Figure 2.


Results of the iTLB model for differentiating thermograms between control and CRC patient groups

(A) Absolute values of the iTLB model coefficients for each predictor variable.

(B) Mean area-normalized thermograms for each group (controls in black, CRC in blue), with gray dots indicating the selected temperature pairs used in the iTLB model.

(C) Comparison of the iTLB model’s numerical response between control and CRC patients. The solid horizontal line represents the standard cut-off threshold, while the dashed line represents the Youden index threshold.

(D) Receiver operating characteristic (ROC) curve and area under the curve (AUC) with the 95% confidence interval of the iTLB model in the validation set.

(E) Classification results using the standard cut-off threshold. The top panel shows the contingency table for classification outcomes, while the bottom panel displays the performance metrics (accuracy, sensitivity, specificity, PPV, and NPV).

(F) Classification results using the Youden index threshold. The top panel presents the contingency table, and the bottom panel summarizes the performance metrics.

Abbreviations: Acc, accuracy; a.u., arbitrary units; AUC, area under the ROC curve; CP, heat capacity; CRC, colorectal cancer; iTLB, intelligent Thermal Liquid Biopsy; NPV, negative predictive value; PPV, positive predictive value; Sens, sensitivity; Spec, specificity.

Comparing the standard cut-off threshold with the Youden index threshold (−0.024), the latter increased sensitivity and NPV, reducing false negatives but lowering specificity and PPV (Figures 2E and 2F).

Correlation of iTLB model outcomes with clinical variables

The iTLB model showed classification performance comparable to the iClin model, despite relying solely on thermogram data. To assess its clinical relevance, its numerical output was analyzed in relation to clinical variables and prognosis.

Sex and smoking status were associated with iTLB scores, with lower scores in female controls and non-smokers, and higher false-negative rates among non-smokers. Additionally, the model correlated with CEA, CRP, IL-6, and YKL-40 levels, where patients below the cut-off threshold had lower biomarker concentrations. False classifications were more frequent in patients with CEA and CRP values within the reference range.

Regarding prognostic implications, the iTLB model was significantly associated with OS in patients with metastatic CRC, suggesting its potential for identifying subgroups with different prognosis.

Full statistical results are presented in Tables S6–S9, with significant differences illustrated in Figure S5.

Model based on iClin variables and iTLB thermograms (iTLB+iClin model)

Classification model

To improve classification performance, a hybrid model (iTLB+iClin) was developed, integrating clinical variables (age, CEA concentration) and thermogram features (selected temperature pairs). The model was trained and validated on 570 patients (62% controls) with complete data.

Feature selection identified age and CEA, along with two temperature pairs, as the most predictive variables, with age showing the highest coefficient (Figure 3A). The model’s numerical output significantly differed between groups (p < 0.001, Figure 3B), achieving an AUC of 0.84 (95% CI: 0.78–0.89), outperforming both iClin and iTLB models (Figure 3C).

Figure 3.


Performance of the iTLB+iClin model in differentiating symptomatic control patients from CRC patients

(A) Absolute values of the iTLB+iClin model coefficients for each predictor variable.

(B) Comparison of the iTLB+iClin model’s numerical response between control (black) and CRC patients (blue) in the validation set. The continuous horizontal line represents the standard cut-off threshold, while the dashed line represents the Youden index threshold.

(C) Receiver operating characteristic (ROC) curve and AUC with the 95% confidence interval of the iTLB+iClin model in the validation set.

(D) Classification results using the standard cut-off threshold. The top panel presents the contingency table, while the bottom panel summarizes the performance metrics (accuracy, sensitivity, specificity, PPV, and NPV).

(E) Classification results using the Youden index threshold. The top panel presents the contingency table, while the bottom panel summarizes the performance metrics.

Notes: Variables in (A) include the natural logarithm of CEA, while age is dichotomized as ≤/> 50 years.

Abbreviations: Acc, accuracy; AUC, area under the ROC curve; CEA, carcinoembryonic antigen; CRC, colorectal cancer; iTLB, intelligent Thermal Liquid Biopsy; NPV, negative predictive value; PPV, positive predictive value; Sens, sensitivity; Spec, specificity.

Using the Youden index threshold (0.255) instead of the standard cut-off (zero) improved specificity and PPV at the cost of a slight decrease in sensitivity (Figures 3D and 3E).

Correlation of iTLB+iClin model outcomes with clinical variables

The iTLB+iClin model outperformed the iClin and iTLB models in distinguishing CRC patients from controls. To assess its clinical relevance, its results were analyzed in relation to clinical and biochemical variables, focusing on potential misclassifications and prognostic value.

False classifications were linked to lifestyle factors, with smokers and alcohol abusers more likely to be classified as false positives, while non-smokers in the CRC group had higher false-negative rates.

Regarding biochemical markers, the model correlated with CRP, IL-6, and YKL-40. False negatives in CRC patients had CRP levels within the reference range, while higher IL-6 and YKL-40 levels were observed in patients with scores above the cut-off threshold, regardless of whether the standard or the Youden index threshold was used.

In terms of clinical severity, higher scores were associated with greater disease severity (ECOG-PS = 1–3, CCI ≥3, American Society of Anesthesiologists (ASA) physical status classification system = II-III), while false negatives were more common in patients with lower severity scores. Additionally, the model showed significant differences based on tumor resection status, suggesting a link between surgical intervention and thermogram-based classification.

Finally, a significant association with OS in operated CRC patients highlights the model’s potential prognostic utility.

Complete statistical results are provided in Tables S10–S13, with significant differences illustrated in Figures 4 and S6.

Figure 4.


Statistically significant associations between the iTLB+iClin model and overall survival in operated CRC patients

(A) Survival analysis based on the iTLB+iClin model dichotomized using the standard cut-off threshold.

(B) Survival analysis based on the iTLB+iClin model dichotomized using the Youden index threshold.

Abbreviations: CRC, colorectal cancer; iTLB, intelligent thermal liquid biopsy.

Diagnostic capacity of the iTLB+iClin model for early stages of CRC

To compare the diagnostic performance of the developed models, the CEA biomarker was used as a reference and compared with both the iClin and iTLB+iClin models. Additionally, the improvement achieved by incorporating TLB data into the iClin model was evaluated.

Overall, the iTLB+iClin model showed the highest classification performance, with a significantly greater AUC compared to CEA alone and iClin alone (p < 0.05, Figure 5A). When analyzing subgroups, the iTLB+iClin model provided the largest improvement in operated CRC patients (i.e., stages I-III), where its AUC was significantly higher than that of the iClin model (p < 0.05, Figure 5B). For patients with metastatic CRC, no significant differences were found between the three models (Figure 5C).

Figure 5.


Receiver operating characteristic (ROC) curve analysis comparing the diagnostic capacity of the CEA biomarker (black), iClin model (light blue), and iTLB+iClin model (dark blue) in distinguishing between

(A) CRC patients and controls.

(B) Operated CRC patients and controls.

(C) Metastatic CRC patients and controls.

(D) Early-stage CRC (stage I–II) patients and controls.

(E) Advanced-stage CRC (stage III–IV) patients and controls.

Notes: p values were obtained using the DeLong test.

Abbreviations: AJCC, American Joint Committee on Cancer (8th edition); CRC, colorectal cancer; iTLB, intelligent thermal liquid biopsy.

In terms of early-stage CRC (stage I-II), the diagnostic capacities of the three models were similar, with no statistically significant differences (Figure 5D). Conversely, in advanced-stage CRC (stage III–IV), the iTLB+iClin model significantly outperformed both CEA and iClin alone, indicating greater diagnostic utility in later stages (Figure 5E).

A complementary stratified analysis for CRC stages I, II, and III individually, comparing the performance of CEA, iClin, and iTLB+iClin models, is provided in Figure S7. This figure includes receiver operating characteristic curves and diagnostic metrics (accuracy, sensitivity, specificity, PPV, and NPV), further illustrating the contribution of TLB-based models at each stage.

Proteomic analysis to identify new biomarkers according to TLB

Given the association between iTLB model outcomes and OS, as well as the high predictive weight of CEA in the iTLB+iClin model, a proteomic analysis was conducted to identify potential biomarkers responsible for the thermogram differences. This was particularly relevant for the iTLB model, which is based on DSC technology and does not directly identify specific proteins contributing to the observed thermograms.

A high-resolution MS-based proteomic analysis was performed on 100 randomized serum samples (50 from operated CRC patients and 50 from metastatic CRC patients), with 67% of samples showing iTLB <0. The aim was to identify proteins differing between patients with iTLB <0 and iTLB >0, which could help explain OS differences observed in previous analyses.

Global proteomic findings

A total of 451 proteins were identified, which were filtered to 408 proteins based on presence in at least 50% of samples. Among these, three proteins exhibited statistically significant differences between groups and had notable log2 fold-change (FC) values: CRP and amyloid proteins A-1 and A-2. Three additional proteins were detected when we reduced the log2FC threshold to 0.75: fibrinogen-like protein-1 (FGL1); immunoglobulin heavy chain domains 1–3; and B cell receptor-associated protein 31. These findings are visualized in the volcano plot in Figure 6A.
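The presence filter and the fold-change calculation can be sketched as follows. This is an illustration on invented values, not the authors' pipeline. The protein labels, intensities, and function names are hypothetical. The sign convention (log2 of the iTLB < 0 group mean over the iTLB > 0 group mean, so negative values mean higher levels in the iTLB > 0 group) is inferred from the Figure 6 caption. The statistical test behind the significance calls is not reproduced.

```python
import math

def filter_by_presence(intensities, min_fraction=0.5):
    """Keep proteins detected (value is not None) in at least min_fraction of samples."""
    return {
        protein: values
        for protein, values in intensities.items()
        if sum(v is not None for v in values) >= min_fraction * len(values)
    }

def log2_fold_change(values, itlb_negative):
    """log2(mean in iTLB < 0 group / mean in iTLB > 0 group), ignoring missing values."""
    neg = [v for v, g in zip(values, itlb_negative) if g and v is not None]
    pos = [v for v, g in zip(values, itlb_negative) if not g and v is not None]
    return math.log2((sum(neg) / len(neg)) / (sum(pos) / len(pos)))

# Hypothetical data: four samples, the first two with iTLB < 0.
itlb_negative = [True, True, False, False]
intensities = {
    "protein_A": [10.0, 10.0, 40.0, 40.0],  # higher in the iTLB > 0 group
    "protein_B": [8.0, None, None, None],   # detected in 25% of samples, filtered out
    "protein_C": [4.0, 4.0, 4.0, 4.0],      # no difference between groups
}

kept = filter_by_presence(intensities)
fold_changes = {p: log2_fold_change(v, itlb_negative) for p, v in kept.items()}
flagged = [p for p, fc in fold_changes.items() if abs(fc) >= 0.75]
```

Lowering the absolute log2FC cut-off, as done in the text, enlarges the `flagged` list, which is why additional proteins appear at 0.75.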

Figure 6.


Volcano plots of proteomic analysis in CRC patients. Volcano plots showing differences in protein concentrations between groups classified by the iTLB model

(A) Proteomic analysis of 100 serum samples, including both operated CRC patients and metastatic CRC patients.

(B) Subset of 50 serum samples from operated CRC patients.

(C) Subset of 50 serum samples from patients with metastatic CRC.

Notes: In each plot, proteins with significantly higher relative concentrations in the iTLB > 0 group (FC < 0) are highlighted in red, while those higher in the iTLB < 0 group (FC > 0) are shown in blue. In (A), proteins exceeding the threshold of FC ± 0.75 are also shown in black.

Abbreviation: FC, fold change.

Independent proteomic analyses in operated and metastatic CRC patients

When the proteomic analysis was performed separately in the two patient groups, the findings remained largely consistent with the global analysis. However, in operated CRC patients, differences in FGL1 and the immunoglobulin heavy chain domain 1-3 were more pronounced, exceeding the log2FC < −1 threshold. Additionally, other significant proteins exclusive to the operated CRC group emerged (Figure 6B).

A literature review revealed that FGL1 is produced by cancer cells and its elevated levels in peripheral blood are associated with poor prognosis and treatment resistance (anti-PD-1/B7H1).32,33 Similarly, elevated keratin-20 (K20) levels post-surgery have been linked to worse survival.34 In contrast, FERMT3 expression has been associated with better survival in gastric adenocarcinoma.35,36

In patients with metastatic CRC, an inverse relationship was found between CRP and amyloid protein A-1/A-2 levels and OS (higher relative concentration correlates with worse OS). Conversely, neural Wiskott-Aldrich syndrome protein (WASP) showed a direct relationship with OS (higher relative concentration correlates with better OS) (Figure 6C). Notably, WASP has been implicated in metastasis through its role in cell migration and extracellular matrix remodeling,37 and increased WASP expression has been linked to better CRC prognosis.38

Validation of key proteins in serum

Among the identified proteins, only CRP had available serum concentration data. Statistically significant differences were found in CRP levels according to the iTLB model, in both operated and metastatic CRC patients (Figures 7A and 7B).

Figure 7.


Differences in median protein concentrations according to the iTLB model

(A) CRP levels in operated CRC patients, dichotomized by the standard cut-off threshold.

(B) CRP levels in metastatic CRC patients, dichotomized by the standard cut-off threshold.

(C) FGL1 levels in operated CRC patients, measured using ELISA, dichotomized by the standard cut-off threshold.

Abbreviations: CRP, C-reactive protein; FGL1, fibrinogen-like protein-1; iTLB, intelligent thermal liquid biopsy.

Additionally, FGL1 levels were measured using ELISA (Elabscience, E-El-H1667) in the 50 operated CRC patients analyzed by proteomics. Significant differences in median FGL1 concentrations were observed based on iTLB model classification, confirming the MS findings for this protein (Figure 7C).

To further explore the prognostic relevance of FGL1, we performed Kaplan-Meier survival analyses in both operated and metastatic CRC patients stratified by median FGL1 serum levels. As shown in Figure S8, higher FGL1 concentrations were associated with poorer overall survival in operated CRC patients, whereas no significant differences were observed in the metastatic group. These findings suggest that FGL1 may hold prognostic value particularly in resected patients, supporting its potential role as a dual diagnostic and prognostic biomarker.
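The median-split survival comparison can be sketched with a small Kaplan-Meier estimator. The code below is a minimal illustration on invented data. The function name, FGL1 values, follow-up times, and event flags are hypothetical, and no significance test (such as a log-rank test) is included.

```python
from statistics import median

def kaplan_meier(times, events):
    """Return [(time, survival)] at each event time; events: 1 = death, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical patients: serum FGL1 (arbitrary units), follow-up in months, death flag.
fgl1 = [5.0, 12.0, 7.0, 20.0, 9.0, 15.0]
months = [60, 20, 55, 10, 48, 30]
died = [0, 1, 0, 1, 1, 1]

cutoff = median(fgl1)
high = [(t, e) for v, t, e in zip(fgl1, months, died) if v > cutoff]
low = [(t, e) for v, t, e in zip(fgl1, months, died) if v <= cutoff]

curve_high = kaplan_meier([t for t, _ in high], [e for _, e in high])
curve_low = kaplan_meier([t for t, _ in low], [e for _, e in low])
```

Each curve drops only at observed deaths, while censored patients reduce the number at risk without lowering the survival estimate.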

Discussion

This study focused on developing classification and predictive tools for CRC, aiming to improve the diagnosis of early-stage disease. Detecting CRC in early stages is crucial, as surgical intervention at this point has curative potential. Previous research in our group has explored the role of TLB, demonstrating its potential for diagnosis and patient follow-up.27,28,31,39,40,41,42 Based on these findings, we investigated the applicability of TLB in CRC.

Two key conclusions from the pancreatic cancer study (under review, “Transforming Pancreatic Cancer Diagnostics: A Comprehensive Analysis of Biomarkers, Thermal Liquid Biopsy and Integrated Approaches”) were incorporated into this work. First, the importance of using symptomatic controls was reinforced. Although results are less pronounced than with asymptomatic controls, they are more clinically relevant, improving the translational potential of the methodology. For this reason, the MICA cohort (symptomatic patients without a cancer diagnosis) was used. Second, integrating thermogram analysis with clinical variables had been shown to be a viable approach for classification models, even allowing classification using clinical variables alone. Based on this, we applied the same methodology to CRC to assess whether adding TLB to a clinical model (iClin) could enhance its diagnostic performance.

This study included 683 patients, divided nearly equally between controls (MICA study) and CRC patients (REBECCA cohort). Consistent with previous data, CRC patients were older, with a median age of approximately 70 years, which aligns with the reported average age of CRC patients in Denmark.43 Additionally, a higher proportion of male patients was observed, a trend consistent with GLOBOCAN 2022 data, which reports a higher incidence of CRC in men. Surprisingly, the proportion of CRC patients who did not report alcohol abuse was higher than expected, which appears to be at odds with established evidence linking alcohol consumption and CRC risk.20 This discrepancy could arise because the REBECCA study recruited from a well-educated patient population in the Capital Region of Denmark.

Three models were developed (iClin, iTLB, and iTLB+iClin). In the iTLB model, groups were balanced, while in iClin and iTLB+iClin, missing data resulted in unbalanced groups (around 60% controls). Model training with unbalanced groups can introduce classification bias, as models tend to favor the dominant class. Several strategies exist to mitigate this, such as up-sampling or down-sampling, where cases are replicated or removed to balance the dataset.44 Instead, this study employed a different strategy, randomly selecting the same number of patients per group for training, ensuring balanced model adjustment.

Another important consideration is how performance metrics are affected by unbalanced groups. Positive predictive value (PPV) and negative predictive value (NPV) depend on disease prevalence in the evaluated sample, so unbalanced groups can bias these values. Sensitivity, specificity, and AUC, by contrast, are not affected by prevalence. Therefore, model comparison focused on these parameters rather than PPV and NPV.
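The dependence of PPV and NPV on prevalence follows directly from Bayes' rule, as the short sketch below shows. The sensitivity, specificity, and prevalence values are hypothetical round numbers chosen for illustration, not results from this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

# Same hypothetical test (sensitivity 0.70, specificity 0.80) at two prevalences.
ppv_balanced, npv_balanced = predictive_values(0.70, 0.80, prevalence=0.50)
ppv_rare, npv_rare = predictive_values(0.70, 0.80, prevalence=0.10)

print(f"prevalence 50%: PPV={ppv_balanced:.2f}, NPV={npv_balanced:.2f}")
print(f"prevalence 10%: PPV={ppv_rare:.2f}, NPV={npv_rare:.2f}")
```

With sensitivity and specificity held fixed, lowering prevalence from 50% to 10% reduces PPV and raises NPV, which is why these two metrics are less suitable for comparing models across groups with different class proportions.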

The iClin model was developed using clinical variables, selecting age, CEA, CRP, IL-6, and YKL-40 due to their known association with CRC and inflammation.9,10,11,12 Age was included as a key predictor, as it is the primary risk factor for CRC1 and is used in CRC screening programs.45,46 CEA, despite lacking specificity, remains a widely used CRC biomarker.

Feature selection using Lasso regularization retained only age (dichotomized as ≤/> 50 years) and CEA concentration, indicating that CRP, IL-6, and YKL-40 did not significantly contribute to classification. The strong association of the iClin model with smoking can be explained by CEA variability in smokers.47 Additionally, the association with alcohol abuse in controls is likely due to the overlap between smoking and alcohol consumption in these patients. The correlation between iClin model scores and clinical severity scales (ECOG-PS, CCI, and ASA) suggests that higher scores reflect worse general health status, which in turn correlates with shorter PFS and OS.

The iTLB model, based solely on temperature pairs extracted from TLB thermograms, showed that TLB alone is insufficient for CRC diagnosis or prognosis, similar to other currently available clinical tools. Since no single non-invasive test serves as a gold standard for CRC diagnosis, TLB should be considered as a complementary tool rather than a standalone diagnostic method. Interestingly, the iTLB model’s association with sex and smoking habits underscores the need for further investigation into how TLB varies by demographic and lifestyle factors.

The iTLB+iClin model incorporated two clinical variables (age, CEA) and two temperature pairs from the iTLB model. While temperature pairs had lower absolute coefficients than clinical variables, they were not eliminated, indicating they added value to the model. The associations found in iTLB+iClin mirrored those in iClin and iTLB, reinforcing the robustness of these relationships.

At first glance, the iClin and iTLB+iClin models appeared similar, suggesting that TLB did not substantially enhance the clinical model. However, detailed analysis revealed a clear improvement in sensitivity, which increased from 52% to over 70% without compromising specificity. Moreover, the AUC comparison confirmed that integrating TLB improved overall CRC classification. The model showed a modest increase in AUC for stage I-II patients (from 0.68 to 0.71), but this difference did not reach statistical significance. Notably, the strongest diagnostic gain was observed in stage III patients, suggesting that TLB integration may be particularly valuable in detecting more advanced but still potentially curable cases. These findings should be interpreted with caution, and further validation in larger cohorts stratified by stage will be essential to confirm these trends. Nevertheless, the trend toward improved sensitivity highlights the potential of TLB to complement existing diagnostic tools rather than replace them, supporting its role as an adjunctive strategy for enhancing early CRC detection.

While the iTLB+iClin model demonstrated a marked improvement in sensitivity, particularly in early-stage CRC (from 52% to over 70%), the study has some limitations in this regard. First, the proportion of patients with early-stage disease (stage I-II) was relatively small, which may limit the assessment of model performance in this critical subgroup. Second, the current analysis does not include a direct comparison with conventional screening tools such as fecal immunochemical testing (FIT) or colonoscopy. Although the study focuses on symptomatic patients rather than a screening setting, future research should explore whether TLB-based models could complement or enhance standard diagnostic pathways. Finally, prospective longitudinal studies will be necessary to determine whether improved early detection using iTLB+iClin leads to better clinical outcomes, such as increased survival and reduced recurrence. These aspects will be crucial to establish the real-world clinical utility and cost-effectiveness of the proposed approach.

Despite these encouraging results, one limitation of the present study is the absence of an external validation cohort. Although the iTLB+iClin model showed robust performance across the combined MICA and REBECCA cohorts, the generalizability of these findings to other populations remains to be demonstrated. Future studies should include independent cohorts from different geographical and clinical settings to confirm the model’s reproducibility and strengthen its translational potential. Ideally, these validations should encompass a wider diversity of ethnic and demographic backgrounds. Moreover, while the current follow-up period limits the assessment of long-term prognostic implications, preliminary associations with overall survival in metastatic CRC suggest potential predictive value that warrants further investigation in prospective longitudinal studies. Addressing these aspects will be essential for consolidating the clinical applicability of the proposed integrated diagnostic model.

In addition, although the model was internally validated using repeated random down-sampling and multiple randomizations, cross-cohort validation between the REBECCA and MICA datasets was not performed, which represents an additional limitation. The two datasets are structurally different: REBECCA includes confirmed CRC patients, whereas MICA contains symptomatic but cancer-negative individuals, with partially non-overlapping clinical variables. These differences make cross-cohort validation methodologically inappropriate in the current iteration of the study. Future work will address this aspect by training and testing the model across cohorts or applying it to independent external datasets to further evaluate its reproducibility and generalizability. Moreover, alternative strategies for managing class imbalance, such as class weighting or synthetic data augmentation, may also be explored in larger and more heterogeneous populations to optimize predictive performance.

Additionally, we acknowledge that the exploratory nature of the proteomic findings requires cautious interpretation. A multiple-testing correction was applied, and candidate biomarkers were selected based on biological plausibility, consistent trends across datasets, and validation by ELISA. The use of symptomatic controls (MICA) improves translational value, though their representativeness is inherently linked to the referral context in which the cohort was defined. These aspects are discussed as part of the study’s limitations.

While TLB has been explored as a diagnostic tool, this study proposes an additional application: identifying biomarkers associated with thermogram patterns. Given the iTLB model’s correlation with survival, 100 randomized serum samples (50 from operated patients and 50 from patients with metastatic CRC) underwent proteomic analysis using LC-MS, comparing protein expression between groups defined by iTLB classification (score below vs. above zero).

Among the identified proteins, CRP emerged as a potential prognostic marker, a finding confirmed by available serum concentration data. Additionally, FGL1 was identified as another potential biomarker, less studied than CRP but possibly implicated in cancer prognosis and treatment response. While further research is needed to confirm these findings and elucidate FGL1’s role in CRC, this result underscores the potential of TLB in stratifying patients into clinically relevant subgroups, facilitating the identification of novel biomarkers.

Although the identification of FGL1 and CRP as candidate biomarkers is supported by both mass spectrometry and ELISA validation, the current findings should be interpreted as confirmatory rather than novel discoveries. CRP is a well-established inflammatory marker, and FGL1 has recently been associated with immune modulation and cancer progression in various tumor types, including colorectal cancer. The current analysis, based on a limited set of 100 serum samples from operated and metastatic CRC patients, reinforces their potential relevance in the context of TLB-based stratification, but statistical power and generalizability remain limited. In particular, the role of FGL1 in early-stage CRC warrants further exploration in larger, stratified cohorts including stage I–II patients. Moreover, biological mechanisms linking FGL1 to CRC progression—such as immune evasion or metastatic dissemination—require dedicated functional studies. It is also important to acknowledge that serum levels of both markers may be influenced by confounding factors such as systemic inflammation or liver disease, which were not specifically controlled for in this study. Functional enrichment and network analyses (e.g., Reactome or STRING) could offer deeper biological insight but were beyond the scope of this diagnostic-oriented study and are planned for future research. Beyond diagnosis, these results suggest that the ability of TLB-based models to stratify patients according to prognosis—especially in relation to overall survival—could represent a clinically meaningful application, potentially surpassing its role in early detection.

From a translational perspective, the feasibility of implementing TLB in clinical practice merits further consideration. The method relies on DSC, a label-free technique that requires minimal sample preparation and is based on standardized, automatable instrumentation. While the current DSC workflow is relatively low-throughput, recent advances in instrument design and automation may support its scalability for diagnostic laboratories. In terms of cost, TLB does not require expensive reagents or antibodies, offering a potentially affordable alternative for biomarker-based stratification. Importantly, the integration of TLB into existing diagnostic workflows would not replace standard tools such as colonoscopy or FIT, but rather complement them, particularly in symptomatic patients, high-risk groups, or cases requiring prognostic stratification or treatment monitoring. Given that TLB provides a global thermodynamic signature of serum protein content, it may also be useful in identifying systemic alterations beyond specific molecular biomarkers. Overall, the results of this study reinforce the value of TLB as an adjunctive, non-invasive strategy within a personalized oncology framework, rather than as a standalone diagnostic solution.

Limitations of the study

While the integrated model showed improved sensitivity for early-stage colorectal cancer (CRC), the limited number of stage I–II cases constrains the evaluation of diagnostic performance in this critical subgroup. Additionally, although the model was internally validated using down-sampling and multiple randomizations, external validation in independent cohorts remains essential to assess generalizability. The REBECCA and MICA cohorts differ in design and available variables, precluding direct cross-cohort validation; harmonization or application to new datasets is planned for future studies.

The exploratory proteomic analysis, while supported by mass spectrometry and ELISA, was based on a limited subset (n = 100) and should be considered hypothesis-generating. The roles of candidate biomarkers such as FGL1 warrant further investigation in larger, stratified populations, and functional studies.

Resource availability

Lead contact

Requests for further information and resources should be directed to and will be fulfilled by the lead contact, Olga Abian (oabifra@unizar.es).

Materials availability

This study did not generate new unique reagents.

Data and code availability

  • Proteomics data have been deposited at PRIDE (ProteomeXchange Consortium) under the accession number PXD064655 and are publicly available.

  • This paper does not report original custom code. All analyses were performed using standard publicly available R packages (including switchBox, ncvreg, caret, and survival), as specified in the STAR Methods section.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Acknowledgments

This research was funded by the Ministerio de Ciencia e Innovación (MCIN)/Agencia Estatal de Investigación (AEI)/10.13039/501100011033 and “ERDF A way of Making Europe” (PID2021-127296OB-I00 to A.V.C.); Ministerio de Ciencia, Innovación y Universidades [Juan de la Cierva Contract JDC2023-052992-I to D.O.A.]; Instituto de Salud Carlos III and co-funded by the European Union (ESF, “Investing in your future” and European Regional Development Fund-ERDF) (FIS projects PI21/00394 and PI25/00661 to O.A., and PFIS contract FI19/00146 to S.H.D.); Diputación General de Aragón (PROY_B08_24 to O.A.; Translational Digestive Pathology Group B25_23R to A.L.); and the Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD). M.P.M. acknowledges funding support from Bowel Cancer Australia.

This article is based upon work from COST Action “Identification of biological markers for prevention and translational medicine in pancreatic cancer (TRANSPAN),” CA21116, supported by COST (European Cooperation in Science and Technology). Created in BioRender. Abian, O. (2026) https://BioRender.com/orwo9ce.

Author contributions

Conceptualization, A.V.-C. and O.A.; methodology, S.H.-D., M.M., M.P.M., A.V.-C., and O.A.; software, N.F., O.S.-G., and P.F.G.; validation, S.H.-D., A.V.-C., and O.A.; formal analysis, N.F., O.S.-G., and P.F.G.; investigation, S.H.-D., D.O.-A., S.V., M.M., and M.P.M.; resources, A.Z.J., J.S.J., C.L.F., T.G.D., and J.L.; data curation, N.F., O.S.-G., and P.F.G.; writing-original draft, S.H.-D., A.V.-C., and O.A.; writing-review and editing, S.H.-D., A.V.-C., O.A., and Á.L.; supervision, A.V.-C. and O.A.; project administration, A.V.-C. and O.A.; funding acquisition, A.V.-C., O.A., and Á.L.

Declaration of interests

The authors declare no competing interests.

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Biological samples
Serum samples from REBECCA and MICA cohort patients | Copenhagen University Hospital – Herlev and Gentofte | Approved by Danish Ethics Committees (REBECCA: H-2-2013-078; MICA: H-7-2014-011)

Chemicals, peptides, and recombinant proteins
Phosphate-Buffered Saline (PBS) | Sigma-Aldrich | Cat# D8537
CEA Immunochemiluminescence Assay | Siemens Denmark | Catalog no. L2KG12
CRP Ultra Immunoturbidimetric Kit | SENTINEL Diagnostics | Catalog no. 11508 UD-2.0/02

Critical commercial assays
Human IL-6 Quantikine HS ELISA Kit | R&D Systems, UK | Catalog no. HS600
Human YKL-40 ELISA Kit | Quidel Corporation, USA | Catalog no. NC2207193

Deposited data
LC-MS/MS proteomics data | PRIDE / ProteomeXchange | Accession PXD064655
Thermogram raw and processed data | This paper | Available upon request

Software and algorithms
Origin 7.0 | OriginLab | https://www.originlab.com/
R (v4.3.2) | R Project for Statistical Computing | https://www.r-project.org/
DIA-NN (v1.8.1) | Demichev et al., 2020 | https://github.com/vdemichev/DiaNN
ProteoWizard (msConvert) | ProteoWizard Project | https://proteowizard.sourceforge.io
Perseus (v2.0.11.0) | MaxQuant Project | https://maxquant.net/perseus

Experimental model and study participant details

This study included two independent prospective cohorts of human participants recruited at Copenhagen University Hospital - Herlev and Gentofte, Denmark, and followed according to approved ethical protocols.

The first cohort comprised 340 patients (aged 35–88 years) with histologically confirmed stage I-IV colorectal cancer (CRC), enrolled between July 2014 and December 2021 in the REBECCA cohort study (“Biomarkers in patients with colorectal cancer – can they provide new information of the disease, effects of treatment, adverse events and prognosis?”). Patients were followed until May 2023, with a maximum follow-up of 5 years or until death. This cohort included both operated patients (stage I-III) and metastatic patients (stage IV). For all participants, the following variables were collected: age, sex, BMI (categorized), smoking and alcohol habits, Eastern Cooperative Oncology Group Performance Status (ECOG-PS), and serum levels of CEA, CRP, IL-6, and YKL-40. Additional clinical data were retrieved for CRC patients, including TNM stage, Charlson Comorbidity Index (CCI), ASA physical status, resection margin status (R0-R2), and time to recurrence or death. In stage IV CRC, data were collected regarding primary tumor resection status and time from blood sampling to death or last follow-up.

The second cohort comprised 414 symptomatic individuals (aged 18–98 years) included in the MICA cohort study (“New biomarkers in patients referred because of suspected serious illness – are they giving new diagnostic information?”) between July 2016 and December 2022. These patients were referred to the Diagnostic Cancer Patient Pathway due to non-specific but potentially serious symptoms (e.g., weight loss, anemia, fatigue) and were later confirmed not to have cancer after systematic diagnostic work-up and follow-up. This cohort served as a real-world symptomatic control group. For all MICA participants, the same clinical variables were collected as in REBECCA, including age, sex, BMI, smoking and alcohol habits, ECOG-PS, and serum biomarkers.

All participants provided written informed consent before inclusion. The study protocols were approved by the Danish Regional Ethics Committees and the Danish Data Protection Agencies:

  • REBECCA: VEK H-2-2013-078; HEH-2014-044, I-suite 02771; PRIVACY P-2019-614

  • MICA: H-7-2014-011; HEH-2014-105, I-suite 03330; PRIVACY P-2020-578

All procedures were conducted in accordance with the Declaration of Helsinki. Patient data and samples were pseudonymized using internal codes to ensure confidentiality.

Sex and age were recorded for all participants. Although sex was considered during statistical analysis, the models were not stratified by sex or gender. Future studies should explore the potential impact of sex and gender on the performance and generalizability of the models.

For the primary diagnostic analyses, a total of 683 serum samples were included, corresponding to 355 symptomatic control patients from the MICA cohort and 328 CRC patients from the REBECCA cohort. Patients were allocated to the control group if they had no cancer diagnosis at inclusion or during follow-up, and to the CRC group if colorectal cancer was histologically confirmed.

Method details

Sample collection and processing

Peripheral venous blood samples were collected from all participants in both cohorts at Copenhagen University Hospital - Herlev and Gentofte (Denmark). For patients with CRC included in the REBECCA cohort, blood was obtained prior to surgery (for stage I-III) or before first-line chemotherapy (for stage IV). For symptomatic individuals from the MICA cohort, blood was collected at the time of referral to the diagnostic cancer pathway.

Blood samples were processed within 2 hours of collection. Serum was separated by centrifugation and immediately aliquoted and stored at −80 °C until analysis. All samples were labelled with anonymized internal codes and handled according to standard procedures approved by the Danish Ethics Committees.

Biochemical assays and clinical variables

Serum concentrations of CEA, CRP, IL-6, and YKL-40 were measured using standardized and validated assays performed at the Department of Clinical Biochemistry at Herlev and Gentofte University Hospital. CEA and CRP analyses were performed on fresh serum samples as part of the routine diagnostic workflow, whereas IL-6, YKL-40, and thermogram measurements were conducted on biobanked frozen samples. CEA was quantified using an immunochemiluminescence assay (Immulite 2000 GI-MA, Siemens, Denmark; catalog no. L2KG12), with a measurement range of 1–10,000 μg/L, intra-assay coefficient of variation (CV) below 5%, and inter-assay CV below 6%. Elevated CEA levels were defined as >5 μg/L. CRP was measured using a high-sensitivity immunoturbidimetric assay (SENTINEL CRP Ultra (UD), 11508 UD-2.0/02), with a detection range of 0.3–640 mg/L, intra-assay CV <3%, and inter-assay CV <15%. Elevated CRP was defined as >10 mg/L. IL-6 and YKL-40 were measured in duplicate using two-site sandwich ELISAs (IL-6: R&D Systems, UK, catalog no. HS600; YKL-40: Quidel Corporation, USA). The lower limit of detection for IL-6 was 0.01 ng/L, with intra- and inter-assay CVs ≤8% and ≤11%, respectively. For YKL-40, the detection limit was 20 μg/L, with intra- and inter-assay CVs below 5% and 6%, respectively. Elevated IL-6 and YKL-40 were defined as >5 ng/L and >200 μg/L, respectively.

The following clinical variables were extracted from the REBECCA and MICA cohort databases: age, sex, body mass index (BMI), smoking status, alcohol consumption, Eastern Cooperative Oncology Group Performance Status (ECOG-PS), and serum biomarker levels. BMI was categorized as underweight (<18.5 kg/m²), normal weight (18.5–24.9 kg/m²), overweight (25–29.9 kg/m²), and obese (≥30 kg/m²). Alcohol abuse was defined as >7 units/week for females and >14 units/week for males (1 unit ≈ 12 g of alcohol). Smoking status was classified as never, former, or current smoker.
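The cut-offs stated in the two preceding paragraphs can be encoded compactly. The following Python sketch is only an illustration: the function and dictionary names are hypothetical, and the handling of BMI values between 24.9 and 25 (or 29.9 and 30), which the text leaves unspecified, is an assumption that assigns them to the lower category.

```python
# Cut-offs for "elevated" serum markers as stated in the text.
ELEVATED_CUTOFFS = {
    "CEA": 5.0,      # ug/L
    "CRP": 10.0,     # mg/L
    "IL6": 5.0,      # ng/L
    "YKL40": 200.0,  # ug/L
}


def elevated_markers(values):
    """Return {marker: True/False}; True means strictly above the cut-off."""
    return {
        name: values[name] > cutoff
        for name, cutoff in ELEVATED_CUTOFFS.items()
        if name in values
    }


def bmi_category(bmi):
    """Categorize BMI (kg/m^2); boundary gaps go to the lower class (assumption)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


def alcohol_abuse(units_per_week, sex):
    """Apply the sex-specific weekly limit (>7 units female, >14 units male)."""
    limits = {"female": 7, "male": 14}
    if sex not in limits:
        raise ValueError("sex must be 'female' or 'male'")
    return units_per_week > limits[sex]


print(elevated_markers({"CEA": 7.2, "CRP": 3.0}))
print(bmi_category(27.4), alcohol_abuse(10, "female"))
```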

Additional variables were collected for patients with colorectal cancer (CRC) from the REBECCA cohort. These included TNM stage (I-IV), subclassification as operated (stage I-III) or metastatic (stage IV), Charlson Comorbidity Index (CCI, unadjusted for age), American Society of Anesthesiologists (ASA) physical status classification, and resection margin status (R0: complete resection with negative margins; R1: microscopic residual disease; R2: macroscopic residual disease). Longitudinal follow-up data were recorded, including time from blood sampling to recurrence, death, or last clinical follow-up. In stage IV patients, the resection status of the primary tumor (in situ vs resected) and the interval between blood collection and initiation of first-line chemotherapy were also documented. All variables were collected prospectively using standardized institutional protocols with appropriate ethical oversight and data quality control.

Thermal liquid biopsy (TLB) analysis

Serum thermograms were obtained using high-sensitivity differential scanning calorimetry (Auto-PEAQ-DSC; MicroCal, Malvern-Panalytical). Each thermogram represents the excess heat capacity (Cp) of the serum sample as a function of temperature, reflecting the global thermal denaturation behavior of serum proteins.

Samples were diluted 1:25 in filtered phosphate-buffered saline (PBS), and 400 μL were used for each assay. Measurements were performed with a scan rate of 1 °C/min, from 10 °C to 95 °C.

After acquisition, thermograms were processed using custom scripts implemented in Origin 7 (OriginLab). The data were baseline-corrected, the buffer signal was subtracted, and the analysis was restricted to the 40–95 °C range. Thermograms were interpolated at 0.25 °C intervals and normalized by area to reduce systematic variability due to differences in protein concentration.
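The restriction, interpolation, and area-normalization steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Origin scripts: it assumes the input is already baseline-corrected and buffer-subtracted, that temperatures are ascending and cover 40–95 °C, and that linear interpolation and the trapezoid rule are acceptable (the source does not state which schemes were used). All function names and the synthetic Gaussian trace are hypothetical.

```python
import math


def interpolate(temps, cp, grid):
    """Piecewise-linear interpolation of cp(temps) onto grid (temps ascending)."""
    out = []
    j = 0
    for t in grid:
        # Advance to the segment [temps[j], temps[j + 1]] that contains t.
        while j < len(temps) - 2 and temps[j + 1] < t:
            j += 1
        t0, t1 = temps[j], temps[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(cp[j] + w * (cp[j + 1] - cp[j]))
    return out


def trapezoid_area(x, y):
    """Area under y(x) by the trapezoid rule."""
    return sum((y[i] + y[i + 1]) / 2 * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def preprocess_thermogram(temps, cp, t_min=40.0, t_max=95.0, step=0.25):
    """Resample onto a regular grid in [t_min, t_max] and scale to unit area."""
    n_points = int(round((t_max - t_min) / step)) + 1
    grid = [t_min + i * step for i in range(n_points)]
    resampled = interpolate(temps, cp, grid)
    area = trapezoid_area(grid, resampled)
    return grid, [value / area for value in resampled]


# Synthetic single-peak trace (not real data), sampled every 1 °C from 10 to 95 °C.
raw_t = [10.0 + i for i in range(86)]
raw_cp = [math.exp(-((t - 65.0) / 8.0) ** 2) for t in raw_t]
grid, norm = preprocess_thermogram(raw_t, raw_cp)
print(len(grid), round(trapezoid_area(grid, norm), 6))
```

Normalizing to unit area makes thermograms comparable in shape regardless of total protein concentration, which is the stated purpose of that step.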

Mass spectrometry-based proteomic analysis

For the discovery proteomic analysis, we selected a subset of 100 CRC patients from the REBECCA cohort, including 50 operated patients (stages I–III) and 50 patients with metastatic disease (stage IV). Within each clinical group, samples were randomly selected and then classified according to the iTLB model output (iTLB < 0 vs iTLB > 0) for downstream comparative analyses.

Serum samples were randomized prior to analysis to minimize technical variability. Protein extraction and digestion were performed using a 96-well 3D-printed device and StageTip-based protocols adapted for high-throughput processing. Following digestion, approximately 1 μg of peptides per sample was injected into a Dionex Ultimate 3000 nanoUHPLC system, coupled online to a Q-Exactive HF-X Orbitrap mass spectrometer (Thermo Fisher Scientific) via a nanospray electrospray ionization (ESI) source.

Peptide separation was carried out on a 50 cm × 75 μm C18AQ fused silica column (1.9 μm particles, with ∼10 μm tip) at 60 °C. A linear gradient elution was applied from 5% to 40% acetonitrile over 45 minutes at a flow rate of 300 nL/min. Peptides were ionized at 2.3 kV and analyzed using data-independent acquisition (DIA) mode. The MS1 scan was recorded over a range of 350–1400 m/z at 60,000 resolution (AGC target: 3 × 10^6; maximum injection time: 50 ms), followed by three DIA events encompassing 35 overlapping mass windows between 350 and 1200 m/z (MS2 resolution: 15,000; AGC target: 1 × 10^6).

Raw data were converted to mzML format and demultiplexed using ProteoWizard. Peptide and protein identification were performed using DIA-NN software in library-free mode against a human UniProt FASTA database. Identification criteria included a maximum of two missed tryptic cleavages and allowance for three variable modifications (N-terminal methionine cleavage, carbamidomethylation, and methionine oxidation). Peptides were considered within a length range of 7–30 amino acids, with precursor charges from +1 to +4, precursor mass range of 300–1800 m/z, and fragment ion range of 200–1800 m/z. A false discovery rate (FDR) threshold of 1% was applied at the peptide level. Run-to-run alignment and a two-pass neural network classifier were used to improve quantification consistency across samples.

Quantification and statistical analysis

All statistical analyses were performed in R (v4.3.2, October 31, 2023) unless otherwise specified. The statistical methodology, sample sizes, performance metrics, and software tools used for each analysis are described below and in the corresponding figure legends.

Data preprocessing and model development

Three classification models were developed to predict colorectal cancer (CRC) status:

  1. iClin model: based on clinical and biochemical variables only,

  2. iTLB model: based solely on thermal liquid biopsy (TLB) data,

  3. iTLB+iClin model: combining both clinical variables and TLB features.

Clinical predictors included age (dichotomized at 50 years) and serum concentrations of CEA, IL-6, CRP, and YKL-40, used as raw or log-transformed values. Preprocessing included removal of variables with zero variance or high correlation (Pearson r > 0.9), and normalization by mean-centering and scaling.
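The three preprocessing steps (zero-variance removal, correlation filtering, and centering/scaling) can be sketched as follows. This Python illustration makes several assumptions not stated in the source: the correlation threshold is applied to the absolute value of Pearson's r, the first variable of a highly correlated pair is the one kept, and the sample standard deviation is used for scaling. The authors report using R packages such as caret; the helper names here are hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def preprocess(columns, r_max=0.9):
    """columns: dict name -> list of floats. Returns dict of standardized columns."""
    # 1. Drop variables with zero variance.
    kept = {name: vals for name, vals in columns.items() if sample_sd(vals) > 0}
    # 2. Greedy filter: skip a variable if |r| > r_max with one already selected.
    selected = []
    for name in kept:
        if all(abs(pearson(kept[name], kept[s])) <= r_max for s in selected):
            selected.append(name)
    # 3. Mean-center and scale to unit standard deviation.
    return {
        name: [(x - mean(kept[name])) / sample_sd(kept[name]) for x in kept[name]]
        for name in selected
    }


# Toy data (not from the study): "b" nearly duplicates "a"; "c" is constant.
demo = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [5, 5, 5, 5], "d": [4, 1, 3, 2]}
scaled = preprocess(demo)
print(list(scaled))
```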

Thermogram features consisted of discordant temperature pairs selected via a K-Top-Scoring-Pair (KTSP) strategy. Logistic regression models were trained using the ncvreg package with a mixture of L1 and L2 penalties (α = 0.5), incorporating internal cross-validation for feature selection and regularization.
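To make the pair-based features concrete, the sketch below scores every pair of thermogram positions by how differently the ordering "value at position i < value at position j" occurs in the two classes, which is the standard top-scoring-pair criterion. It is a plain-Python illustration under stated assumptions: the selected pairs are required to be disjoint, each pair is encoded as a 0/1 indicator for the downstream regression, and the toy matrix is invented. The authors report using the R package switchBox, whose exact options are not described in the text.

```python
from itertools import combinations


def pair_scores(X, y):
    """Score pairs (i, j) by |P(x_i < x_j | y=0) - P(x_i < x_j | y=1)|, best first."""
    n_features = len(X[0])
    scores = []
    for i, j in combinations(range(n_features), 2):
        freq = []
        for label in (0, 1):
            rows = [row for row, lab in zip(X, y) if lab == label]
            freq.append(sum(row[i] < row[j] for row in rows) / len(rows))
        scores.append((abs(freq[0] - freq[1]), i, j))
    return sorted(scores, reverse=True)


def top_k_pairs(X, y, k, disjoint=True):
    """Pick the k best-scoring pairs; optionally forbid reusing a position."""
    chosen, used = [], set()
    for _, i, j in pair_scores(X, y):
        if disjoint and (i in used or j in used):
            continue
        chosen.append((i, j))
        used.update((i, j))
        if len(chosen) == k:
            break
    return chosen


def pair_features(row, pairs):
    """Encode one sample as 0/1 indicators of x_i < x_j for each selected pair."""
    return [1 if row[i] < row[j] else 0 for i, j in pairs]


# Toy data: four "temperature" positions, two samples per class (not real data).
X_demo = [[1, 2, 5, 6], [0, 3, 4, 7], [2, 1, 5, 6], [3, 0, 4, 7]]
y_demo = [0, 0, 1, 1]
pairs = top_k_pairs(X_demo, y_demo, k=1)
print(pairs, [pair_features(row, pairs) for row in X_demo])
```

Because each feature depends only on the relative ordering of two values within the same thermogram, it is insensitive to monotone rescaling of the signal, which is one common motivation for pair-based classifiers.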

Due to class imbalance between controls and CRC patients, models were trained on balanced subsets (n = 100 per group), randomly resampled in 100 iterations. This conservative approach minimized sampling bias without introducing synthetic data.
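The repeated balanced down-sampling can be sketched as below. The group sizes (355 controls, 328 CRC patients), the subset size (100 per group), and the number of iterations (100) come from the text; the function name, the use of integer sample identifiers, and the random seed are illustrative assumptions.

```python
import random


def balanced_subsets(control_ids, case_ids, n_per_group=100, n_iter=100, seed=0):
    """Yield (controls, cases) pairs, each drawn without replacement."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    for _ in range(n_iter):
        yield rng.sample(control_ids, n_per_group), rng.sample(case_ids, n_per_group)


controls = list(range(355))          # stand-in identifiers for MICA controls
cases = list(range(1000, 1000 + 328))  # stand-in identifiers for REBECCA patients
subsets = list(balanced_subsets(controls, cases))
print(len(subsets), len(subsets[0][0]), len(subsets[0][1]))
```

Each iteration would then fit and evaluate a model on its own balanced subset, so that the reported performance reflects the spread across resamples rather than a single draw.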

Model validation and performance evaluation

Models were trained on 70% of the data and validated on the remaining 30%. For comparisons between models, the full dataset was also used. A classification cutoff of 0 was applied to model scores, and Youden’s index was used for sensitivity analysis.
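As an illustration of the two thresholding rules mentioned above, the following Python sketch computes sensitivity and specificity at a fixed cutoff of 0 and finds the cutoff that maximizes Youden's index, J = sensitivity + specificity − 1. The convention that a score strictly greater than the cutoff is called CRC, the function names, and the toy scores are assumptions made for this example.

```python
def sens_spec(scores, labels, cutoff=0.0):
    """Sensitivity and specificity when score > cutoff is classified as CRC (1)."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score > cutoff else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, labels):
    """Return the observed score that maximizes J = sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sens_spec(scores, labels, c)) - 1)


# Toy model scores (not from the study): label 1 = CRC, label 0 = control.
demo_scores = [-1.2, -0.4, 0.3, 0.5, 0.8, 1.5]
demo_labels = [0, 0, 0, 1, 1, 1]
print(sens_spec(demo_scores, demo_labels))   # fixed cutoff of 0
print(youden_cutoff(demo_scores, demo_labels))
```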

Diagnostic performance was evaluated using sensitivity, specificity, positive and negative predictive values (PPV and NPV), and area under the ROC curve (AUC). ROC curves were compared using DeLong’s test.
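The AUC has a simple probabilistic reading: it is the probability that a randomly chosen CRC patient receives a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and toy data are illustrative. DeLong's test, which the authors used to compare ROC curves, builds on the same pairwise comparisons but is not implemented here.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy scores (not from the study): one case scores below one control.
toy_scores = [-1.2, -0.4, 0.3, 0.1, 0.8, 1.5]
toy_labels = [0, 0, 0, 1, 1, 1]
print(round(auc(toy_scores, toy_labels), 3))
```

Because the statistic only uses rank order within case-control pairs, it does not change when the ratio of cases to controls changes, which is consistent with the authors' choice to compare models on AUC rather than on predictive values.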

Statistical analysis of clinical variables

Categorical variables were summarized as counts and percentages; continuous variables as mean ± standard deviation (SD) or median with interquartile range [Q1–Q3], depending on data distribution. Normality was assessed with Kolmogorov-Smirnov or Shapiro-Wilk tests.

  • Group comparisons for continuous variables used unpaired t-tests (if normal with equal variances, verified via Bartlett’s test) or Wilcoxon rank-sum tests.

  • Categorical variables were compared with Pearson’s Chi-square test or Fisher’s exact test (n ≤ 20).

  • Survival analysis was performed using Kaplan-Meier estimators, with group differences evaluated by Log-Rank or Gehan-Breslow tests.

All tests were two-sided, and a p-value < 0.05 was considered statistically significant.
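For readers unfamiliar with the Kaplan-Meier estimator used in the survival analyses, the following Python sketch shows the product-limit calculation on toy data. It is an illustration only: the authors report using the R survival package, and the function name, the event coding (1 = event, 0 = censored), and the follow-up times are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time (product-limit estimate)."""
    curve = []
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ev in zip(times, events) if ti == t and ev == 1)
        n_leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy follow-up data in months (not from the study).
km_times = [5, 8, 8, 12, 15, 20]
km_events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(km_times, km_events):
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up and then leave the risk set without lowering the curve, which is how the estimator uses incomplete follow-up.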

Proteomic data analysis

Label-free protein quantification from DIA-MS data was performed using DIA-NN and analyzed in Perseus (v2.0.11.0). Proteins with at least 50% valid values in one group were retained. Peak areas were log2-transformed and missing values imputed from a normal distribution. Differential expression between groups was tested using two-sided t-tests with a false discovery rate (FDR) correction (p < 0.05). Volcano plots were used to identify candidate biomarkers.
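Two of these steps, the valid-value filter and the multiple-testing correction, can be sketched in plain Python. The text does not state which FDR procedure was applied in Perseus, so the Benjamini-Hochberg adjustment shown here is an assumption, as are the function names and the representation of missing values as `None`. Imputation is omitted because its parameters are not given in the source.

```python
import math


def keep_protein(values_a, values_b, min_valid=0.5):
    """Keep a protein if at least min_valid of values are present in either group."""
    def valid_fraction(values):
        return sum(v is not None for v in values) / len(values)
    return valid_fraction(values_a) >= min_valid or valid_fraction(values_b) >= min_valid


def log2_transform(values):
    """Log2-transform peak areas, leaving missing values (None) untouched."""
    return [math.log2(v) if v is not None else None for v in values]


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values (not from the study).
toy_p = [0.01, 0.04, 0.03, 0.20]
print([round(q, 4) for q in benjamini_hochberg(toy_p)])
print(keep_protein([1.0, None, None, None], [2.0, 3.0, None, None]))
```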

Additional resources

This study is based on two prospective observational cohorts (REBECCA and MICA) and does not correspond to a registered clinical trial. Therefore, no clinical trial registry number applies.

Published: January 21, 2026

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2026.114751.

Contributor Information

Adrian Velazquez-Campoy, Email: adrianvc@unizar.es.

Olga Abian, Email: oabifra@unizar.es.

Supplemental information

Document S1. Figures S1–S8 and Tables S1–S13

References

  1. Keum N., Giovannucci E. Global burden of colorectal cancer: emerging trends, risk factors and prevention strategies. Nat. Rev. Gastroenterol. Hepatol. 2019;16:713–732. doi: 10.1038/s41575-019-0189-8.
  2. Biller L.H., Schrag D. Diagnosis and Treatment of Metastatic Colorectal Cancer. JAMA. 2021;325:669–685. doi: 10.1001/jama.2021.0106.
  3. Eng C., Yoshino T., Ruíz-García E., Mostafa N., Cann C.G., O’Brian B., Benny A., Perez R.O., Cremolini C. Colorectal cancer. Lancet. 2024;404:294–310. doi: 10.1016/S0140-6736(24)00360-X.
  4. Brown J.C., Ma C., Shi Q., Couture F., Kuebler P., Kumar P., Tan B., Krishnamurthi S., Chang V., Goldberg R.M., et al. Inflammation, physical activity, and disease-free survival in stage III colon cancer: Cancer and Leukemia Group B–Southwest Oncology Group 80702 (Alliance). J. Natl. Cancer Inst. 2024;116:2032–2039. doi: 10.1093/jnci/djae203.
  5. Patel S.G., Karlitz J.J., Yen T., Lieu C.H., Boland C.R. The rising tide of early-onset colorectal cancer: a comprehensive review of epidemiology, clinical features, biology, risk factors, prevention, and early detection. Lancet Gastroenterol. Hepatol. 2022;7:262–274. doi: 10.1016/S2468-1253(21)00426-X.
  6. Ben-Aharon I., van Laarhoven H.W.M., Fontana E., Obermannova R., Nilsson M., Lordick F. Early-Onset Cancer in the Gastrointestinal Tract Is on the Rise—Evidence and Implications. Cancer Discov. 2023;13:538–551. doi: 10.1158/2159-8290.CD-22-1038.
  7. Ferrucci L., Fabbri E. Inflammageing: chronic inflammation in ageing, cardiovascular disease, and frailty. Nat. Rev. Cardiol. 2018;15:505–522. doi: 10.1038/s41569-018-0064-2.
  8. Franceschi C., Campisi J. Chronic Inflammation (Inflammaging) and Its Potential Contribution to Age-Associated Diseases. J. Gerontol. A Biol. Sci. Med. Sci. 2014;69:S4–S9. doi: 10.1093/gerona/glu057.
  9. Hanahan D., Weinberg R.A. Hallmarks of Cancer: The Next Generation. Cell. 2011;144:646–674. doi: 10.1016/j.cell.2011.02.013.
  10. Sharma B.R., Kanneganti T.-D. Inflammasome signaling in colorectal cancer. Transl. Res. 2023;252:45–52. doi: 10.1016/j.trsl.2022.09.002.
  11. Terzić J., Grivennikov S., Karin E., Karin M. Inflammation and Colon Cancer. Gastroenterology. 2010;138:2101–2114.e5. doi: 10.1053/j.gastro.2010.01.058.
  12. Michels N., van Aart C., Morisse J., Mullee A., Huybrechts I. Chronic inflammation towards cancer incidence: A systematic review and meta-analysis of epidemiological studies. Crit. Rev. Oncol. Hematol. 2021;157 doi: 10.1016/j.critrevonc.2020.103177.
  13. Hart P.C., Rajab I.M., Alebraheem M., Potempa L.A. C-Reactive Protein and Cancer-Diagnostic and Therapeutic Insights. Front. Immunol. 2020;11 doi: 10.3389/fimmu.2020.595835.
  14. Jones S.A., Jenkins B.J. Recent insights into targeting the IL-6 cytokine family in inflammatory diseases and cancer. Nat. Rev. Immunol. 2018;18:773–789. doi: 10.1038/s41577-018-0066-7.
  15. Zhao T., Su Z., Li Y., Zhang X., You Q. Chitinase-3 like-protein-1 function and its role in diseases. Signal Transduct. Target. Ther. 2020;5:201. doi: 10.1038/s41392-020-00303-7.
  16. Wang Z., Wu P., Wu D., Zhang Z., Hu G., Zhao S., Lai Y., Huang J. Prognostic and clinicopathological significance of serum interleukin-6 expression in colorectal cancer: a systematic review and meta-analysis. OncoTargets Ther. 2015;8:3793–3801. doi: 10.2147/OTT.S93297.
  17. Vainer N., Dehlendorff C., Johansen J.S. Systematic literature review of IL-6 as a biomarker or treatment target in patients with gastric, bile duct, pancreatic and colorectal cancer. Oncotarget. 2018;9:29820–29841. doi: 10.18632/oncotarget.25661.
  18. Bian B., Li L., Yang J., Liu Y., Xie G., Zheng Y., Zeng L., Zeng J., Shen L. Prognostic value of YKL-40 in solid tumors: a meta-analysis of 41 cohort studies. Cancer Cell Int. 2019;19:259. doi: 10.1186/s12935-019-0983-y.
  19. Tarpgaard L.S., Guren T.K., Glimelius B., Christensen I.J., Pfeiffer P., Kure E.H., Sorbye H., Ikdahl T., Yilmaz M., Johansen J.S., Tveit K.M. Plasma YKL-40 in Patients with Metastatic Colorectal Cancer Treated with First Line Oxaliplatin-Based Regimen with or without Cetuximab: RESULTS from the NORDIC VII Study. PLoS One. 2014;9 doi: 10.1371/journal.pone.0087746.
  20. Berbecka M., Berbecki M., Gliwa A.M., Szewc M., Sitarz R. Managing Colorectal Cancer from Ethology to Interdisciplinary Treatment: The Gains and Challenges of Modern Medicine. Int. J. Mol. Sci. 2024;25:2032. doi: 10.3390/ijms25042032.
  21. Argilés G., Tabernero J., Labianca R., Hochhauser D., Salazar R., Iveson T., Laurent-Puig P., Quirke P., Yoshino T., Taieb J., et al. Localised colon cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann. Oncol. 2020;31:1291–1305. doi: 10.1016/j.annonc.2020.06.022.
  22. Konishi T., Shimada Y., Hsu M., Tufts L., Jimenez-Rodriguez R., Cercek A., Yaeger R., Saltz L., Smith J.J., Nash G.M., et al. Association of Preoperative and Postoperative Serum Carcinoembryonic Antigen and Colon Cancer Outcome. JAMA Oncol. 2018;4:309–315. doi: 10.1001/jamaoncol.2017.4420.
  23. Goldstein M.J., Mitchell E.P. Carcinoembryonic Antigen in the Staging and Follow-up of Patients with Colorectal Cancer. Cancer Invest. 2005;23:338–351. doi: 10.1081/CNV-58878.
  24. Zapf I., Fekecs T., Ferencz A., Tizedes G., Pavlovics G., Kálmán E., Lőrinczy D. DSC analysis of human plasma in breast cancer patients. Thermochim. Acta. 2011;524:88–91. doi: 10.1016/j.tca.2011.06.019.
  25. Todinova S., Krumova S., Kurtev P., Dimitrov V., Djongov L., Dudunkov Z., Taneva S.G. Calorimetry-based profiling of blood plasma from colorectal cancer patients. Biochim. Biophys. Acta. 2012;1820:1879–1885. doi: 10.1016/j.bbagen.2012.08.001.
  • 26.Rai S., Pan C., Cambon A., Chaires J.B., Garbett N.C. Group classification based on high-dimensional data: application to differential scanning calorimetry plasma thermogram analysis of cervical cancer and control samples. Open Access Med. Stat. 2013;1:1. doi: 10.2147/OAMS.S40069. [DOI] [Google Scholar]
  • 27.Vega S., Garcia-Gonzalez M.A., Lanas A., Velazquez-Campoy A., Abian O. Deconvolution analysis for classifying gastric adenocarcinoma patients based on differential scanning calorimetry serum thermograms. Sci. Rep. 2015;5 doi: 10.1038/srep07988. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Rodrigo A., Ojeda J.L., Vega S., Sanchez-Gracia O., Lanas A., Isla D., Velazquez-Campoy A., Abian O. Thermal Liquid Biopsy (TLB): A Predictive Score Derived from Serum Thermograms as a Clinical Tool for Screening Lung Cancer Patients. Cancers (Basel) 2019;11 doi: 10.3390/cancers11071012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Garbett N.C., Merchant M.L., Helm C.W., Jenson A.B., Klein J.B., Chaires J.B. Detection of cervical cancer biomarker patterns in blood plasma and urine by differential scanning calorimetry and mass spectrometry. PLoS One. 2014;9 doi: 10.1371/journal.pone.0084710. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Dolin T.G., Christensen I.J., Lund C.M., Bojesen S.E., Lykke J., Nielsen D.L., Larsen J.S., Johansen J.S. Preoperative plasma vitamin D in patients with localized colorectal cancer: Age-dependent association with inflammation, postoperative complications, and survival. Eur. J. Surg. Oncol. 2023;49:244–251. doi: 10.1016/j.ejso.2022.08.040. [DOI] [PubMed] [Google Scholar]
  • 31.Hermoso-Durán S., Fraunhoffer N., Millastre-Bocos J., Sanchez-Gracia O., Garrido P.F., Vega S., Lanas Á., Iovanna J., Velázquez-Campoy A., Abian O. Development of a Machine-Learning Model for Diagnosis of Pancreatic Cancer from Serum Samples Analyzed by Thermal Liquid Biopsy. Adv. Intell. Syst. 2024;7 doi: 10.1002/aisy.202400308. [DOI] [Google Scholar]
  • 32.Hara H., Yoshimura H., Uchida S., Toyoda Y., Aoki M., Sakai Y., Morimoto S., Shiokawa K. Molecular cloning and functional expression analysis of a cDNA for human hepassocin, a liver-specific protein with hepatocyte mitogenic activity. Biochim. Biophys. Acta. 2001;1520:45–53. doi: 10.1016/S0167-4781(01)00249-4. [DOI] [PubMed] [Google Scholar]
  • 33.Li C.Y., Cao C.Z., Xu W.X., Cao M.M., Yang F., Dong L., Yu M., Zhan Y.Q., Gao Y.B., Li W., et al. Recombinant human hepassocin stimulates proliferation of hepatocytes in vivo and improves survival in rats with fulminant hepatic failure. Gut. 2010;59:817–826. doi: 10.1136/gut.2008.171124. [DOI] [PubMed] [Google Scholar]
  • 34.Li W.X., Xiao H.W., Hong X.Q., Niu W.X. Predictive value of CK20 in evaluating the efficacy of treatment and prognosis after surgery for colorectal cancer. Genet. Mol. Res. 2015;14:5823–5829. doi: 10.4238/2015.May.29.14. [DOI] [PubMed] [Google Scholar]
  • 35.Bąk-Romaniszyn L., Świerzko A.S., Sokołowska A., Durko Ł., Mierzwa G., Szala-Poździej A., Małecka-Panas E., Cedzyński M. Mannose-binding lectin (MBL) in adult patients with inflammatory bowel disease. Immunobiology. 2020;225 doi: 10.1016/j.imbio.2019.10.008. [DOI] [PubMed] [Google Scholar]
  • 36.Nakagawa T., Kawasaki N., Ma Y., Uemura K., Kawasaki T. Antitumor Activity of Mannan-Binding Protein. Methods Enzymol. 2003;363:26–33. doi: 10.1016/S0076-6879(03)01041-3. [DOI] [PubMed] [Google Scholar]
  • 37.Hidalgo-Sastre A., Desztics J., Dantes Z., Schulte K., Ensarioglu H.K., Bassey-Archibong B., Öllinger R., Engleiter T., Rayner L., Einwächter H., et al. Loss of Wasl improves pancreatic cancer outcome. JCI Insight. 2020;5 doi: 10.1172/jci.insight.127275. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Molinie N., Gautreau A. The Arp2/3 Regulatory System and Its Deregulation in Cancer. Physiol. Rev. 2018;98:215–238. doi: 10.1152/physrev.00006.2017. [DOI] [PubMed] [Google Scholar]
  • 39.Millastre J., Hermoso-Durán S., Solórzano M.O.d., Fraunhoffer N., García-Rayado G., Vega S., Bujanda L., Sostres C., Lanas Á., Velázquez-Campoy A., Abian O. Thermal Liquid Biopsy: A Promising Tool for the Differential Diagnosis of Pancreatic Cystic Lesions and Malignancy Detection. Cancers (Basel) 2024;16:4024. doi: 10.3390/cancers16234024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Annesi F., Hermoso-Durán S., Rizzuti B., Bruno R., Pirritano D., Petrone A., Del Giudice F., Ojeda J., Vega S., Sanchez-Gracia O., et al. Thermal liquid biopsy (TLB) of blood plasma as a potential tool to help in the early diagnosis of multiple sclerosis. J. Pers. Med. 2021;11 doi: 10.3390/jpm11040295.
  • 41. Hermoso-Durán S., García-Rayado G., Ceballos-Laita L., Sostres C., Vega S., Millastre J., Sánchez-Gracia O., Ojeda J.L., Lanas Á., Velázquez-Campoy A., et al. Thermal liquid biopsy (TLB) focused on benign and premalignant pancreatic cyst diagnosis. J. Pers. Med. 2021;11:1–19. doi: 10.3390/jpm11010025.
  • 42. Velazquez-Campoy A., Vega S., Sanchez-Gracia O., Lanas A., Rodrigo A., Kaliappan A., Hall M.B., Nguyen T.Q., Brock G.N., Chesney J.A., et al. Thermal liquid biopsy for monitoring melanoma patients under surveillance during treatment: A pilot study. Biochim. Biophys. Acta. Gen. Subj. 2018;1862:1701–1710. doi: 10.1016/j.bbagen.2018.04.020.
  • 43. Dolin T.G., Mikkelsen M., Jakobsen H.L., Nordentoft T., Pedersen T.S., Vinther A., Zerahn B., Vistisen K.K., Suetta C., Nielsen D., et al. Geriatric assessment and intervention in older vulnerable patients undergoing surgery for colorectal cancer: a protocol for a randomised controlled trial (GEPOC trial). BMC Geriatr. 2021;21:88. doi: 10.1186/s12877-021-02045-9.
  • 44. Mohammed R., Rawashdeh J., Abdullah M. 2020 11th International Conference on Information and Communication Systems (ICICS) IEEE; 2020. Machine Learning with Oversampling and Undersampling Techniques: Overview Study and Experimental Results; pp. 243–248.
  • 45. Rex D.K., Boland C.R., Dominitz J.A., Giardiello F.M., Johnson D.A., Kaltenbach T., Levin T.R., Lieberman D., Robertson D.J. Colorectal Cancer Screening: Recommendations for Physicians and Patients From the U.S. Multi-Society Task Force on Colorectal Cancer. Gastroenterology. 2017;153:307–323. doi: 10.1053/j.gastro.2017.05.013.
  • 46. Ladabaum U., Dominitz J.A., Kahi C., Schoen R.E. Strategies for Colorectal Cancer Screening. Gastroenterology. 2020;158:418–432. doi: 10.1053/j.gastro.2019.06.043.
  • 47. Herbeth B., Bagrel A. A study of factors influencing plasma CEA levels in an unselected population. Oncodev. Biol. Med. 1980;1:191–198.

Associated Data

Supplementary Materials

Document S1. Figures S1–S8 and Tables S1–S13

Data Availability Statement

  • Proteomics data have been deposited at PRIDE (ProteomeXchange Consortium) under the accession number PXD064655 and are publicly available.

  • This paper does not report original custom code. All analyses were performed using standard publicly available R packages (including switchBox, ncvreg, caret, and survival), as specified in the STAR Methods section.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from iScience are provided here courtesy of Elsevier
