2025 Aug 26;158(1):243–256. doi: 10.1002/ijc.70106

Integrative omics approaches to uncover liquid‐based cancer‐predicting biomarkers in Lynch syndrome

Minta Kärkkäinen 1, Tero Sievänen 1,2, Tia‐Marje Korhonen 1, Joonas Tuomikoski 3, Kirsi Pylvänäinen 4, Sami Äyrämö 3, Toni T Seppälä 5, Jukka‐Pekka Mecklin 1,4, Eija K Laakkonen 1, Tiina Jokela 1,6,7
PMCID: PMC12588554  PMID: 40856265

Abstract

Lynch syndrome is a genetic cancer‐predisposing syndrome caused by pathogenic mutations in DNA mismatch repair (path_MMR) genes. Due to the elevated cancer risk, novel screening methods, alongside current surveillance techniques, could enhance cancer risk stratification. Here we show how bi‐omics integration could be utilized to pinpoint potential cancer‐predicting biomarkers in Lynch syndrome. We studied which blood‐based circulating microRNAs and metabolites could predict Lynch syndrome cancer occurrence within a 5.8‐year prospective surveillance period. We used single‐ and bi‐omics bioinformatic analyses and identified omics‐level patterns and associations across these biological layers. Lasso Cox regression was used to highlight the most promising cancer‐predicting biomarkers. Our findings revealed distinct circulating metabolite landscapes among path_MMR variant carriers and a circulating microRNA co‐expression module significantly associated with future cancer incidence. These microRNAs regulate cancer‐related pathways, including the PI3K/Akt signaling pathway. Additionally, a metabolite module consisting of ApoB‐containing lipoproteins (low‐, intermediate‐, and very low‐density lipoproteins) showed distinct levels across path_MMR variants. Notably, three biomarkers (hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and triglycerides in high‐density lipoprotein particles [HDL_TG]) significantly predicted cancer risk, achieving a Harrell's concordance index (C‐index) of 0.76 (p = .0007). Elevated levels of these biomarkers indicated increased cancer risk. Internal validation of the model yielded a C‐index of 0.72. The bi‐omics approach and the identified biomarkers offer promising insights for future studies regarding cancer risk identification in Lynch syndrome.

Keywords: cancer risk prediction, Lynch syndrome, omics integration, systemic biomarkers


What's new?

Lynch syndrome (LS) significantly increases lifetime cancer risk, necessitating innovative approaches for risk stratification. Current risk assessment methods, however, fail to integrate data from different biological systems, which may be influenced by lifestyle factors. Here, bioinformatics was used to identify biomarkers linked to LS‐related cancer risk at two system levels: circulating microRNAs (c‐miRNA) and circulating metabolites (c‐Metab). Three biomarkers—hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and triglycerides in HDL particles—derived from c‐miRNA and c‐Metab strongly predicted cancer risk. The c‐miRNAs linked to future cancer potentially regulate cancer‐related pathways. Further investigation is warranted to validate biomarkers of LS cancer risk identified through omics integration.



Abbreviations

BS: Brier Score
C‐index: concordance index
c‐Metab: circulating metabolite
c‐miRNA: circulating microRNA
CRC: colorectal cancer
HDL: high‐density lipoprotein
HIF‐1: hypoxia‐inducible factor‐1
HR: hazard ratio
HSD: Tukey's honest significant difference
IAE: integrated absolute error
IBS: integrated Brier score
IDL: intermediate‐density lipoprotein
ISE: integrated squared error
Lasso: least absolute shrinkage and selection operator
LDL: low‐density lipoprotein
LS: Lynch syndrome
LSRFi: Finnish Lynch Syndrome Research Registry
ME: module eigengene
MMR: mismatch repair gene
path_MMR: pathogenic mutation in the mismatch repair gene
PI3K: phosphoinositide 3‐kinase
PLSD: Prospective Lynch Syndrome Database
sPLS: sparse partial least squares regression
TNF: tumor necrosis factor
TOM: topological overlap matrix
VLDL: very low‐density lipoprotein
WGCNA: weighted correlation network analysis

1. INTRODUCTION

Lynch syndrome (LS) is estimated to affect approximately 1 in every 300 people worldwide. 1 This hereditary cancer risk syndrome significantly elevates the lifetime risk of cancer. LS is primarily caused by pathogenic mutations in the mismatch repair (path_MMR) genes, including MLH1, MSH2, MSH6, and PMS2. These mutations impair the DNA mismatch repair process, leading to an increased risk of cancers of the endometrium, ovaries, stomach, small bowel, bile duct, pancreas, and upper urinary tract, with colorectal cancer (CRC) being the most common type in LS. 1 , 2 Due to the high cancer risk, vigilant surveillance and innovative strategies for early detection and precise risk stratification are pivotal for effective risk mitigation in LS. 1 However, not all LS carriers develop cancer, indicating that other factors, such as lifestyle choices, can affect cancer risk. 3

Epigenetics and energy metabolism are key systemic molecular mechanisms through which lifestyle choices impact cancer risk. 4 , 5 One cancer hallmark is dysregulated epigenetics. 6 Epigenetic changes facilitate malignant transformation, for example, by affecting cell cycle regulation, hypoxia responses, and other processes through central signaling pathways like tumor necrosis factor (TNF) alpha, phosphoinositide 3‐kinase (PI3K)/Akt, and hypoxia‐inducible factor‐1 (HIF‐1). 7 , 8 Circulating microRNAs (c‐miRNAs) are blood‐based epigenetic modulators that regulate gene expression of multiple target tissues. 9 LS carriers have a distinct c‐miRNA landscape compared to healthy non‐carriers, potentially affecting LS cancer risk and carcinogenesis. 10 Cancer cells can also alter energy metabolism to generate more metabolic substrates, such as increasing cholesterol uptake to support membrane synthesis and rapid proliferation. 11 One hallmark of cancer is high glucose consumption, which promotes tumor growth and helps to adapt to new environments during metastasis. 12 Dysfunction in energy metabolism supports cancer development by fueling rapid proliferation, promoting genetic instability, evading apoptosis, and modifying the tumor microenvironment. 11 , 13 Systemic energy metabolism fluctuations can be examined by analyzing circulating metabolites (c‐Metabs). Our recent work demonstrated that LS carriers have a distinct c‐Metab profile compared to the non‐carrier control cohorts. 14 In summary, cancer cells manipulate both epigenetics and energy metabolism and fluctuations in c‐miRNAs and c‐Metabs hold potential as biomarkers for assessing cancer susceptibility. 9 , 11

Omics integration approaches provide a better biological assessment of cancer risk compared to relying solely on biomarkers from a single system or omics. 15 The LS cohort is ideal for biomarker research due to regular cancer screenings, such as colonoscopies. 16 Currently, cumulative cancer risk for distinct organs in the LS population is assessed primarily through the prospective Lynch syndrome database's (PLSD) cumulative risk model. This model accounts for the mutation in MMR genes, age, sex, and previous cancer history. 2 , 16 Liquid biopsies offer a non‐invasive and sensitive method for assessing cancer risk, enabling further stratification that incorporates lifestyle factors and personalized risk assessment for high‐risk individuals. Previous studies have shown that c‐miRNAs and c‐Metabs have potential as biomarkers for predicting cancer risk. 17 , 18 , 19 Our previous study demonstrated the potential of c‐miRNA biomarkers in predicting cancer incidence in the LS cohort. 17 Liquid‐based biomarker identification offers a promising, non‐invasive method for more precise cancer risk assessment and personalized stratification in LS.

Here, we used a bi‐omics analysis framework to study how individual and integrated c‐miRNA and c‐Metab data are associated with LS cancer risk. A Lasso Cox survival model was applied to the integrated bi‐omics data to identify cancer risk‐associated biomarkers. This study presents promising systemic biomarkers associated with cancer risk in LS.

2. MATERIALS AND METHODS

2.1. Clinical data

The study cohort consisted of 116 Finnish cancer‐free LS carriers whose c‐miRNA expression profiles and c‐Metab levels were analyzed from serum samples. Sample collection started in 2018 and lasted until 2020. For this study, the subjects had been under surveillance for 5.8 years (until June 2024), and they continue to remain under surveillance. The clinical data were obtained from the nationwide Finnish Lynch Syndrome Research Registry (LSRFi; www.lynchsyndrooma.fi). The data included age, sex, path_MMR variant, body‐mass index (BMI), previous cancer diagnoses including specific diagnosis date and cancer type/organ, and whether the study subject had cancer during the surveillance (referred to as status). The status was categorized as follows: LS carriers who remained cancer‐free throughout the follow‐up period were classified as “healthy,” while LS carriers diagnosed with cancer during follow‐up were classified as “future cancer.” All data analyses for the study were conducted using the R programming language (v. 4.4.1).

2.2. Sample collection

Venous blood samples from LS carriers were collected in a fasted state during their surveillance colonoscopy visits, which confirmed a cancer‐free status at the time of sampling. Samples were taken from the antecubital vein to standard serum tubes (455092, Greiner). To separate serum, the whole blood samples were allowed to clot for 30 min at room temperature, centrifuged at 1800g for 10 min, and aliquoted. The aliquoted samples were stored at −80°C until analyzed.

2.3. High‐throughput microRNA sequencing

C‐miRNA isolations from blood serum were carried out using the affinity column‐based miRNeasy Serum/Plasma Advanced Kit (217204, Qiagen) according to the manufacturer's instructions. Small‐RNA library preparations were performed with the QIAseq miRNA Library Preparation Kit (1103679, Qiagen) according to the manufacturer's instructions using multiplexing adapters. The small‐RNA libraries were sequenced with NextSeq 500 (Illumina) using the NextSeq 500/550 High Output Kit v. 2.5 with 75 cycles (15057934, Illumina) to produce 75‐base pair single‐end reads, with a target mean sequencing depth of >5M reads per sample as recommended by the manufacturer (Qiagen). More details are described previously 10 and in the Materials and Methods section of Data S1, Supporting Information.

2.4. C‐miRNA data processing, alignment, and normalization

Sequencing output was provided in FASTQ format. The reads were trimmed to 22 bp to enrich for miRNA sequences and then quality filtered with the FastX‐toolkit. Subsequently, the pre‐processed reads were mapped to the human mature miRNA reference (miRBase v.22) with the Bowtie alignment tool. The sequencing coverage and quality statistics for each sample are summarized in Table S1. Lowly expressed miRNAs were filtered out (miRNAs with count summary <1 in 50% of the samples). The remaining miRNAs were normalized with the median of ratios method and variance stabilized using the DESeq2 package. 20 The potential batch effect was removed using the ComBat function of the sva package in R. 21 More details are described previously 10 , 20 and in the Materials and Methods section of Data S1.
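The median of ratios idea mentioned above can be illustrated with a minimal sketch. This is not the DESeq2 implementation used in the study (which was run in R); the function names and the toy input are assumptions made for illustration only. Each sample's size factor is the median, over miRNAs, of the ratio between its count and the geometric mean of that miRNA across samples.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Return one size factor per sample.

    counts: list of samples, each a list of raw counts in the same miRNA order.
    miRNAs with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts)
    n_features = len(counts[0])
    log_geo_means = []
    for j in range(n_features):
        column = [counts[i][j] for i in range(n_samples)]
        if all(c > 0 for c in column):
            log_geo_means.append(sum(math.log(c) for c in column) / n_samples)
        else:
            log_geo_means.append(None)
    factors = []
    for i in range(n_samples):
        log_ratios = [
            math.log(counts[i][j]) - log_geo_means[j]
            for j in range(n_features)
            if log_geo_means[j] is not None
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize_counts(counts):
    """Divide every count by the size factor of its sample."""
    factors = median_of_ratios_size_factors(counts)
    return [[c / f for c in row] for row, f in zip(counts, factors)]
```

With this scaling, a sample that was sequenced twice as deeply as another but has the same relative composition ends up with identical normalized counts.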

2.5. Metabolomics analysis

Metabolites were analyzed with a targeted proton nuclear magnetic resonance (1H‐NMR) spectroscopy platform (Nightingale Health Ltd., Helsinki, Finland; biomarker quantification version 2020). The technical details of the method are reported previously 14 and in the Materials and Methods section of Data S1. The platform quantifies 250 metabolite measures. Of these, 170 metabolites were selected for downstream analyses (Table S2). Eighty lipoprotein lipid ratios were excluded from the analyses because their information overlaps with that of the absolute lipid concentrations. The selected 170 metabolites were Box‐Cox transformed 22 using the MASS package in R to bring the data closer to a normal distribution for downstream analyses. The lambda parameter of the Box‐Cox transformation was estimated from the data for each variable separately.
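As a rough illustration of estimating a Box‐Cox lambda per variable, the sketch below picks the lambda that maximizes the profile log‐likelihood over a grid. The grid range of -2 to 2 in steps of 0.1 and the function names are assumptions; the study itself used the MASS package in R, whose exact settings are not stated here.

```python
import math


def boxcox_transform(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def boxcox_loglik(values, lam):
    """Profile log-likelihood of lambda for an intercept-only normal model."""
    n = len(values)
    y = boxcox_transform(values, lam)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    return -0.5 * n * math.log(var_y) + (lam - 1.0) * sum(math.log(v) for v in values)


def estimate_lambda(values, grid=None):
    """Grid search for the lambda with the highest profile log-likelihood."""
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]  # assumed grid: -2.0 to 2.0
    return max(grid, key=lambda lam: boxcox_loglik(values, lam))
```

Applying `estimate_lambda` to each metabolite column separately mirrors the per‐variable estimation described above; a lambda near 0 corresponds to a log transformation.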

2.6. Principal coordinate, PERMANOVA, and ANOSIM analyses

Principal coordinate analysis (PCoA) of the Euclidean distances calculated from both omics datasets was performed using the base stats package in R. Permutational multivariate analysis of variance (PERMANOVA) was used to test whether cohorts' centroids and dispersion in the PCoA distance matrix significantly differ from each other. Analysis of similarities (ANOSIM) was used to determine whether there is more similarity within the cohorts than between cohorts. PERMANOVA and ANOSIM statistical procedures were performed using the vegan package. 23
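To make the ANOSIM statistic concrete, the following sketch computes R from a distance matrix and group labels and attaches a permutation p‐value. It is a simplified illustration, not the vegan implementation used in the study; the function names, the number of permutations, and the seed are assumptions. R compares the mean rank of between‐group distances with the mean rank of within‐group distances and approaches 1 when samples within a group are more similar to each other than to samples in other groups.

```python
import random
from itertools import combinations


def rank_values(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    ranks = rank_values([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    m = len(pairs)
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2.0)


def anosim_permutation_p(dist, groups, n_perm=999, seed=0):
    """Permutation p-value obtained by shuffling the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```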

2.7. Bi‐omics covariance

To detect associations between c‐miRNAs and c‐Metabs, omics integration and dimensionality reduction were done with R package mixOmics. 24 Sparse partial least squares (sPLS) regression was used to find a set of variables that best explain the covariance between c‐miRNAs and c‐Metabs. Cross‐validation was used to optimize the number of features and components for sPLS. Data visualization included a correlation circle plot showing the relationships between the features and components, and a heatmap visualizing similarity patterns between selected c‐miRNAs and c‐Metabs.

2.8. Weighted correlation network analysis

The weighted correlation network analysis (WGCNA) was used to construct both the c‐miRNA and c‐Metab co‐expression networks, using the R package WGCNA. 25 The technical procedure is described in detail in Data S3. The analysis yielded a gene expression similarity matrix that was converted into an adjacency matrix using soft‐thresholding power to ensure the scale‐free topology criterion (Figure S1). The adjacency matrix was transformed into a topological overlap matrix (TOM) and hierarchical clustering was applied to the TOM to group highly interconnected entities into modules. An eigengene (ME) was calculated for each module, representing the first principal component of the module's expression profile or level. The eigengene summarized the overall expression pattern/level of the module and was used as a representative measure of module activity.
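The two matrix steps described above (soft‐thresholded adjacency, then topological overlap) can be sketched as follows. This is an illustrative unsigned‐network version, not the WGCNA package code; the function names are hypothetical, and the soft‐thresholding power is passed in as a parameter because the value chosen in the study is not given in this section.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(features, power):
    """Unsigned adjacency: |correlation| raised to the soft-thresholding power."""
    p = len(features)
    adj = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                adj[i][j] = abs(pearson(features[i], features[j])) ** power
    return adj


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    p = len(adj)
    k = [sum(adj[i][u] for u in range(p) if u != i) for i in range(p)]
    tom = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(p) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom
```

Hierarchical clustering would then be run on the dissimilarity 1 - TOM to obtain modules; that step and the eigengene calculation are omitted from this sketch.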

2.9. Module–phenotype association analyses

Once modules were identified, their differences between phenotypic traits were assessed. A t test was used to compare whether the mean of the module differed between healthy and future cancer groups. Tukey's honest significant difference (HSD) was used to make multiple comparisons between path_MMR variant group means. A linear mixed‐effects model (LMM) approach using lme4 26 package in R was used to assess module–path_MMR variant associations. The model included path_MMR variant as a fixed effect and age and BMI as random effects to account for individual‐level variability. The fixed effect estimates the average influence of the path_MMR variant on modules while accounting for potential confounding due to age and BMI.

2.10. Predictive c‐miRNA target gene and pathway analyses

Target gene predictions were executed using the miRWalk 27 tool to investigate potential biological roles of the module c‐miRNAs associated with future cancer status. The selected set of predicted miRNA‐target genes exclusively included those targeting the 3′‐untranslated region. To enhance the reliability of predictions, only those miRNA‐target genes that were both included and verified in the TargetScan, miRDB, and miRTarBase databases were retained for subsequent gene set enrichment analysis (GSEA). The GSEA encompassed the evaluation of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Pathways with Benjamini–Hochberg‐adjusted p‐values of ≤.05 were deemed enriched.
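The Benjamini–Hochberg adjustment used for the enrichment threshold can be written in a few lines. The sketch below is a generic implementation of the standard procedure, with a hypothetical function name, and is not taken from the study's code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A pathway would then be called enriched when its adjusted value is at most 0.05.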

2.11. Identification of potential cancer predictive biomarkers and evaluation of their predictive capacity using Lasso Cox regression

Least absolute shrinkage and selection operator (Lasso) 28 regularized Cox regression was used to identify the most promising cancer‐predictive biomarkers, using the glmnet and survival packages in R (Data S3). Given the large set of predictive variables, and to avoid modeling problems such as high dimensionality or multicollinearity, the number of both c‐miRNAs and c‐Metabs was reduced before biomarker identification. For c‐miRNAs, only those in the WGCNA modules (Table S4) that showed the strongest association with future cancer status were retained. Overlapping c‐Metabs were excluded, leaving 63 metabolites for the analysis (Table S5). This reduction was done to eliminate redundancy, as the original list (Table S2) contained metabolites that overlapped extensively, either as subsets or specific components of broader categories.
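For orientation, Lasso‐regularized Cox regression can be written in its standard textbook form as below. The notation is an assumption introduced here (x_i is the biomarker vector of subject i, δ_i the event indicator, R(t_i) the set of subjects still at risk at time t_i, and no tied event times are assumed); software such as glmnet may scale the likelihood term differently, so the numerical value of λ is not comparable across conventions.

```latex
\ell(\beta) = \sum_{i:\,\delta_i = 1} \left[ x_i^{\top}\beta
  - \log \sum_{j \in R(t_i)} \exp\!\left( x_j^{\top}\beta \right) \right],
\qquad
\hat{\beta}(\lambda) = \arg\min_{\beta}
  \left\{ -\ell(\beta) + \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert \right\}
```

Larger values of λ shrink more coefficients exactly to zero, which is what allows the penalty to act as a variable selector; the hazard ratio for a retained biomarker k is exp(β_k).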

The response variable in the Cox regression model was the time to cancer diagnosis after serum sampling, measured in years, or for the healthy, the time until the final update from LSRFi in June 2024. The optimal value for the regularization parameter (lambda) was selected to shrink all predictors except the five most significant biomarkers from both the c‐miRNA and c‐Metab datasets (Figure S2). The resulting 10 biomarkers (5 c‐miRNAs and 5 c‐Metabs) were used to fit the initial Cox regression model on the entire study sample. The impact of a previous cancer history was evaluated by incorporating it into the model and by assessing its impact on future cancer status using a chi‐square test. The ANOVA test was used to study whether each biomarker significantly contributed to the model fit. Biomarkers with the highest chi‐square values were considered promising and retained in the final model. The final Cox regression model included three biomarkers to predict whether LS carriers would remain healthy over a 5.8‐year surveillance period. The model performance was evaluated using Harrell's concordance index (C‐index), hazard ratios (HRs), and 95% confidence intervals (CIs).
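Harrell's C‐index can be illustrated with a short sketch. This is a simplified version written for clarity (pairs with tied times are ignored), not the routine from the survival package used in the study; the function and argument names are assumptions. A pair of subjects is comparable when the one with the shorter follow‐up time had an event, and it is concordant when that subject also has the higher predicted risk.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the earlier event has the higher risk.

    times: follow-up time per subject
    events: truthy if the subject had the event (cancer), falsy if censored
    risk_scores: model output where higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against observed event times.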

Proportional hazards assumptions were tested using Schoenfeld residuals (Figure S3). Model performance was assessed by internal validation: the data were randomly split five times into training (50%) and validation (50%) sets, stratified by cancer status, and in each split the model was trained on the training set and its performance was tested on the validation set. The performance was evaluated using the C‐index, Brier Score (BS), Integrated Brier Score (IBS), integrated absolute error (IAE), and integrated squared error (ISE) on validation data that was not part of the model training, utilizing the SurvMetrics R package. 29 Additionally, the model's performance was tested for CRC risk prediction by excluding other cancer types from the dataset.
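The repeated, status‐stratified 50/50 split can be sketched as follows. The function names and seeds are hypothetical, and how the original analysis handled odd group sizes is not stated; here the extra subject goes to the validation set.

```python
import random


def stratified_half_split(status, seed):
    """Split sample indices 50/50 within each status group."""
    rng = random.Random(seed)
    train, valid = [], []
    for label in sorted(set(status)):
        idx = [i for i, s in enumerate(status) if s == label]
        rng.shuffle(idx)
        half = len(idx) // 2  # assumption: odd remainder goes to validation
        train.extend(idx[:half])
        valid.extend(idx[half:])
    return sorted(train), sorted(valid)


def repeated_splits(status, n_repeats=5, base_seed=0):
    """Generate independent stratified splits, one per repeat."""
    return [stratified_half_split(status, base_seed + r) for r in range(n_repeats)]
```

Each returned pair of index lists would be used to fit the Cox model on the training half and to compute the C‐index and error metrics on the validation half, with the metrics averaged over the five repeats.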

3. RESULTS

3.1. The cohort characteristics

The descriptive characteristics of the study subjects are presented in Table 1. All study subjects were healthy at the time of serum sampling, which took place at the start of the surveillance period. Of the study subjects, 82 (71%) carried MLH1, 16 (14%) MSH2, 17 (15%) MSH6, and 1 (<1%) PMS2 path_MMR variants. Due to there being only one PMS2 carrier, this study participant was excluded from path_MMR–omics association analyses. During the 5.8‐year surveillance, 17 developed cancer (6 females, 11 males) and 99 remained healthy. No loss to follow‐up occurred. On average, cancer was diagnosed 1.84 years after the sample was collected. Of the study cohort, 45 (39%) had had cancer before surveillance started and 71 (61%) had no previous cancer. The mean age of the study participants was 55.6 years and the mean BMI was 27.1. The most prevalent cancer type was CRC (47%).

TABLE 1.

Descriptive characteristics of the study subjects.

| Variable | Total cohort | Cancer during surveillance (future cancer) | Cancer‐free after surveillance (healthy) |
| --- | --- | --- | --- |
| N (total = 116) | 116 | 17 | 99 |
| Sex, N (%) | | | |
| Female | 59 (51%) | 6 (35%) | 53 (54%) |
| Male | 57 (49%) | 11 (65%) | 46 (46%) |
| Age at the time of sampling, years, mean ± SD | 55.6 ± 13.4 | 56.8 ± 14.9 | 55.3 ± 13.2 |
| Body mass index, kg/m2, mean ± SD | 27.1 ± 5.6 | 27.8 ± 4.6 | 26.9 ± 5.8 |
| path_MMR, N (%) | | | |
| MLH1 | 82 (71%) | 14 (82%) | 68 (69%) |
| MSH2 | 16 (14%) | 2 (12%) | 14 (14%) |
| MSH6 | 17 (15%) | 1 (6%) | 16 (16%) |
| PMS2 | 1 (<1%) | 0 (0%) | 1 (1%) |
| Cancer history, N (%) | | | |
| Yes | 45 (39%) | 7 (41%) | 38 (38%) |
| No | 71 (61%) | 10 (59%) | 61 (62%) |
| Cancer‐free time after sampling, mean ± SD | 4.55 ± 1.38 | 1.84 ± 1.42 | 5.01 ± 0.65 |
| Cancer type during surveillance, N | | | |
| Bladder | | 1 | |
| Breast | | 1 | |
| Colorectal | | 8 | |
| Esophageal | | 1 | |
| Glioma | | 1 | |
| Gastric | | 1 | |
| Prostate | | 2 | |
| Sebaceous gland | | 1 | |
| Spinocellular | | 1 | |

3.2. path_MMR variant carriers exhibit distinct metabolomic landscapes

The PCoA dimension reduction, PERMANOVA, and ANOSIM analyses were used to identify overall omics‐level differences between path_MMR variants and between the healthy and future cancer groups. We did not observe a significant difference in the c‐miRNA landscape between different path_MMR groups (Figure 1A). However, the ANOSIM test revealed more similarity in the c‐Metab landscape within each path_MMR group than between the groups (Figure 1B). Specifically, groups with path_MLH1 and path_MSH2 had distinct c‐Metab profiles, indicating that these variants differ more than others in the metabolomic landscape (Figure 1B). There was no significant difference in either the c‐miRNA or the c‐Metab landscape between healthy and future cancer groups (Figure 1C,D). In conclusion, path_MMR variants showed distinct whole c‐Metab level landscapes, but we did not observe these differences in c‐miRNA landscapes.

FIGURE 1.

Overview of the two omics data used in the study. Panels (A) and (B) show how the omics landscapes differ between carriers of different path_MMR variants, while panels (C) and (D) compare landscapes between future cancer vs. healthy groups. V1 and V2 represent the first and second principal coordinate axes from the PCoA, capturing the main sources of variation in the data. Points on the plot represent samples and their distance from each other reflects their dissimilarity based on the distance matrix used in the PCoA. The color gradient represents the density of samples, with red indicating higher density and blue lower density. PERMANOVA and ANOSIM tests show whether there are significant differences in the composition of the omics landscapes between the groups. Panels (E) and (F) show the correlation matrix between the omics datasets, providing an integrated view of the relationships across different omics layers. In the correlation circle plot (E), c‐miRNAs are depicted in blue and c‐Metabs in orange. Species that are close to each other in the circles have positive correlations. Species that are on the opposite sides of the circle have negative correlations. The heatmap (F) presents the strongest positive correlations as red and negative correlations as blue.

3.3. Cholesterol‐related c‐Metabs associate with groups of c‐miRNAs

We also studied how these omics datasets associate with one another using sPLS. C‐miRNAs formed two major clusters with positive within‐cluster correlation but no or negative correlation with c‐Metabs (Figure 1E). In addition, distinct clusters were identified with c‐miRNAs correlating with ApoA1 (HDL) and ApoB (LDL and VLDL) containing lipoprotein characteristics. Five c‐miRNAs correlated positively with the amount of free cholesterol, phospholipids, and total lipids within S‐sized HDL particles (Figure 1F). They also correlated positively with a variety of ApoB‐containing lipoprotein particles, such as triglycerides of the S‐sized LDL particles and concentration, total lipids, phospholipids, and free cholesterol of the total or XL‐sized VLDL particles as well as the total concentration of triglycerides (Figure 1F). Another set of three c‐miRNAs correlated negatively with these same ApoB‐containing lipoprotein characteristics but had low correlations with HDL characteristics. These results indicate that some c‐miRNAs are associated with lipid metabolism.

3.4. C‐miRNA and c‐Metab co‐expression networks associate with future cancer and path_MMR variant

The WGCNA was used to detect c‐miRNAs whose expression levels correlate with each other, suggesting shared regulatory mechanisms. After identifying co‐expressed c‐miRNA modules, we studied the associations between these modules and phenotypes of interest. WGCNA detected 11 c‐miRNA co‐expression modules (Figure 2A and Table S3). The c‐miRNAs in the grey module did not belong to any co‐expression group. The pink module's eigengene (MEpink), calculated from all c‐miRNA expression levels within the module (hsa‐miR‐101‐3p, hsa‐miR‐182‐5p, hsa‐miR‐183‐5p, hsa‐miR‐25‐3p, hsa‐miR‐4732‐3p, hsa‐miR‐532‐5p, hsa‐miR‐660‐5p, hsa‐miR‐7‐5p, and hsa‐miR‐93‐5p), had a significantly higher expression level in future cancer than the healthy group (Figure 2B–D). We used miRWalk to study the future cancer‐associated c‐miRNA module's target genes and the pathways they associate with. KEGG pathway analysis revealed that the module's c‐miRNAs are connected to biological pathways associated with various cancer mechanisms, including pathways in gastric cancer, prostate cancer, miRNAs in cancer, HIF‐1 signaling, TNF signaling, and PI3K‐Akt signaling (Figure 2E and Data S4). Results indicate that future cancer‐associated c‐miRNAs potentially regulate cancer‐related pathways.

FIGURE 2.

WGCNA of c‐miRNA expression levels. (A) The cluster dendrogram shows c‐miRNA co‐expression modules. (B) T test reveals significant module eigengene (ME) differences between the healthy and future cancer group where MEs summarize the expression profiles of all genes within a given module. A positive t‐value suggests that the healthy group has a higher mean ME value compared to the future cancer group, while a negative t‐value indicates the opposite. Larger t‐values indicate stronger differences. (C) Boxplot shows the distribution of the MEpink among the groups. The y‐axis represents the ME values for the pink module (MEpink), derived from WGCNA. (D) Lists c‐miRNAs within the pink module. (E) Pathway analysis shows the pink module's c‐miRNA target genes' biological roles. The y‐axis indicates significantly enriched biological roles (BH ≤0.05) and the x‐axis the number of c‐miRNA target genes related to the pathway. Stars highlight the central signaling pathways associated with malignant transformation.

We also studied whether c‐miRNA co‐expression modules differ between path_MMR variants (MLH1, MSH2, and MSH6) by utilizing Tukey's multiple comparison test and found no significant differences (Table S6). The results suggest that path_MMR status is not a strong factor in regulating c‐miRNA co‐expression levels.

The WGCNA was also performed on c‐Metab data to study whether the modules exhibited different levels between healthy and future cancer groups or between path_MMR variants. The pipeline detected six modules (Figure S4A and Table S7) but none of these differed significantly between healthy and future cancer groups (Figure S4B). However, the MEturquoise had significantly different levels between MLH1 and MSH6 variant carriers (Figure 3A). The MEturquoise mainly consisted of ApoB‐containing lipoproteins (LDL, VLDL, IDL) with variable particle sizes and the cholesterol and triglycerides they carry and sphingomyelins (Figure 3B). Figure 3C illustrates the different lipoprotein classes, their sizes, and how high‐ and low‐blood cholesterol levels may influence cancer progression. The LMM revealed that the MEturquoise was significantly lower with MSH2 (t value = −2.00) and MSH6 (t value = −2.25) path_MMR carriers than with MLH1 carriers (Figure S5A). The boxplot of the MEturquoise showed that the MLH1 carriers have higher module metabolite levels compared to MSH2 and MSH6 (Figure S5B). The analysis highlights distinct profiles of LDL, VLDL, and IDL particles across these path_MMR variants, where the particles have higher levels with MLH1 carriers.

FIGURE 3.

WGCNA c‐Metab‐module eigengene associations with path_MMR variants. (A) Tukey's multiple comparison test reveals module‐level differences (with adjusted p‐values) between path_MMR variants. The test compares the difference between the means of the two groups, where positive values suggest the first group has a larger mean, while negative values indicate the second group has a larger mean. (B) The table lists metabolites included in module turquoise, which showed a significant difference between MLH1 and MSH6 variants. (C) Schematic figure of the role of lipoproteins in cancer progression.

3.5. Hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and triglycerides in HDL particles are potential cancer risk biomarkers in LS

We applied Lasso Cox regression to determine significant c‐miRNA and c‐Metab predictors of cancer occurrence. The 10 top predictive features were hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, hsa‐miR‐182‐5p, hsa‐miR‐4732‐3p, hsa‐miR‐148b‐3p, HDL_TG, Tyrosine, Glucose, Acetate, and GlycA (Figure S6). Previous cancer history had no significant impact on future cancer status (Tables S8 and S9). The model was simplified by including only the strongest predictive biomarkers. Among these, hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and HDL_TG significantly predicted future cancer occurrence, where elevated levels of these three biomarkers indicated an increased cancer risk (Figure 4A). Additionally, we compared biomarker distributions between the groups, finding that hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and HDL_TG levels were significantly higher in the future cancer group compared to the healthy group (Figure 4B).

FIGURE 4.

Cox proportional hazards model results and predictive accuracy on validation data. (A) Forest plot displays the hazard ratios (HR), confidence intervals (CI), model Concordance Index (C‐index), and significance levels for each predictor variable, based on the model trained using the entire dataset. HRs for each predictor were above 1, indicating an increased cancer risk, with significant 95% CIs. (B) Boxplot illustrates the distribution of key covariates between the healthy and future cancer groups. The significance of mean differences between the groups was assessed using a t test. (C) Presents receiver‐operating characteristic (ROC) curves of the internal validation results. (D) Internal validation results of the Cox regression model show average performance metrics across five iterations: C‐index, Brier Score (BS), Integrated Brier Score (IBS), Integrated Absolute Error (IAE), and Integrated Squared Error (ISE). A high C‐index and low values for the other metrics indicate good prediction accuracy.

Because the future cancer group included distinct cancer types (Table 1), we first evaluated the predictive capacity of the identified biomarkers for all LS‐related cancers. The model incorporating all 10 biomarkers had a C‐index of 0.82 (p = .0028) (Figure S6). A reduced model using only three biomarkers (hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and HDL_TG) achieved a C‐index of 0.76 (p = .0007) (Figure 4A). Additionally, we tested the model's performance specifically on CRC, as it is the predominant cancer type in LS. When incorporating all 10 biomarkers, the model's C‐index was 0.9 (p = .021) (Figure S7A). Interestingly, in this CRC model, glucose was a significant predictor in addition to hsa‐miR‐101‐3p (Figure S7A), indicating that it could potentially also work as a CRC‐predicting biomarker in LS. The reduced CRC prediction model had a C‐index of 0.8 (p = .04) (Figure S7B). However, in this model only hsa‐miR‐101‐3p was a significant predictor (Figure S7B). In summary, although the full 10‐biomarker models for all LS‐related cancers and for CRC had higher overall accuracy, most biomarkers were not individually significant. The reduced model, focusing on hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and HDL_TG, highlighted their potential to predict cancer risk across all LS‐related cancers.

The validation ROC curves consistently showed discrimination well above the chance level (0.5), supporting the potential of these biomarkers to predict overall cancer risk in LS (Figure 4C). In the validation, the model had an average C‐index of 0.72, with low BS (0.102), IBS (0.087), IAE (0.10), and ISE (0.003) (Figures 4D and S8). All iterations had prediction accuracy ranging from moderate to good (Figure S8A–F). High levels of hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and HDL_TG were associated with increased cancer risk, positioning them as possible systemic biomarkers for predicting future cancer occurrence in LS.
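
As a rough illustration of what the Brier Score reported above measures, the sketch below computes a simplified Brier score at a fixed time horizon in Python. It is not the SurvMetrics implementation used in the study: subjects censored before the horizon are dropped rather than handled with inverse‐probability‐of‐censoring weights, and the function name `brier_score`, the horizon, and all toy values are hypothetical.

```python
def brier_score(times, events, surv_probs, horizon):
    """Simplified Brier score at `horizon` (no censoring weights).

    times      -- follow-up time per subject
    events     -- 1 if cancer was observed, 0 if censored
    surv_probs -- predicted probability of being cancer-free at `horizon`
    """
    total = 0.0
    n_used = 0
    for t, e, s in zip(times, events, surv_probs):
        if t > horizon:
            status = 1.0   # known to be cancer-free at the horizon
        elif e:
            status = 0.0   # cancer observed at or before the horizon
        else:
            continue       # censored by the horizon: status unknown, skipped
        total += (status - s) ** 2
        n_used += 1
    if n_used == 0:
        raise ValueError("no subject with known status at the horizon")
    return total / n_used


# Hypothetical toy data with a 5-year horizon.
times = [1.0, 2.0, 6.0, 6.0]
events = [1, 0, 0, 1]
surv_probs = [0.2, 0.7, 0.9, 0.6]
bs = brier_score(times, events, surv_probs, horizon=5.0)
```

Lower values indicate better calibrated predictions; a model that always predicts 0.5 scores 0.25. The Integrated Brier Score averages this quantity over a range of horizons.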

4. DISCUSSION

This study aimed to identify biomarkers that could predict cancer risk in LS carriers using a liquid‐based bi‐omics data integration approach. We investigated both c‐miRNAs and c‐Metabs at the single‐omics and bi‐omics correlation levels. Notably, we found that MLH1 and MSH2 carriers were more similar in their c‐Metab landscape within their respective groups than between the groups. We also observed correlations between c‐miRNA clusters and lipoprotein variables. The WGCNA revealed omics‐level co‐expression modules. The c‐miRNA co‐expression module (MEpink) was upregulated in the future cancer group compared to the healthy group, suggesting its potential as a biomarker for predicting future cancer occurrence. Additionally, a c‐Metab cluster (MEturquoise) had significantly distinct levels between path_MMR variant groups. It consisted of lipid‐related metabolites, primarily cholesterol‐ and ApoB‐containing lipoprotein particles, such as LDL, IDL, and VLDL. We identified three significant biomarkers (hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and HDL_TG) that reflected LS cancer risk. These biomarkers were derived from two distinct systems biology omics layers, c‐miRNA and c‐Metab, emphasizing the power of a bi‐omics approach in uncovering potential indicators of cancer susceptibility in LS carriers.

4.1. Cancer‐associated c‐miRNAs regulate common cancer‐related pathways

The WGCNA detected a c‐miRNA module (MEpink) significantly associated with future cancer status. Interestingly, the target gene pathway analysis revealed that some of the module c‐miRNAs regulate the PI3K/Akt signaling pathway. PI3K/Akt regulates cell growth, division, metabolism, protein synthesis, and survival. 30 MEpink's c‐miRNAs also regulate, for instance, the HIF‐1 signaling pathway, which controls proliferation, apoptosis, and glucose metabolism and promotes angiogenesis and anaerobic metabolism. 31 , 32 Elevated levels of HIF‐1 are linked to tumor metastasis, poor patient prognosis, and tumor resistance to therapy. 31 Tumor cells use the HIF‐1 pathway to overcome hypoxic stress by activating survival pathways that secure essential biological processes such as cell proliferation. 31 In our results, the c‐miRNAs within this module were upregulated in the future cancer group. Additionally, our results revealed that high glucose levels were associated with elevated CRC risk. This finding indicates a possible link between c‐miRNAs associated with future cancer risk and elevated glucose levels.

4.2. path_MMR variants and their potential association with lipid metabolism

The WGCNA and LMM also revealed that the c‐Metab module, MEturquoise, showed higher levels in MLH1 carriers compared to MSH2 and MSH6 variant carriers, suggesting that mutations in MLH1 may influence lipid metabolism. The path_MMR variant significantly affects the cancer risk of LS carriers, with the highest risk associated with MLH1. 1 MEturquoise mainly consisted of cholesterol within LDL, IDL, and VLDL particles, in addition to the concentrations of these particles. Disorders of lipid metabolism are associated with a higher risk of tumor development because they promote cancer cell growth and metastatic lesion development. 13 Cancer cells alter their metabolism to gain the energy they need for proliferation and growth, for instance, by using LDL as a cholesterol carrier. 13 Stimulation of the PI3K/Akt/mTOR signaling pathway causes transcription of the sterol regulatory element‐binding proteins, which contribute to cholesterol uptake and promote cancer cell growth. 33 Our previous study showed significant similarities between the c‐Metab profiles of healthy LS carriers and CRC patients, suggesting shared metabolic patterns that could contribute to LS cancer susceptibility. 14 In support, the current study showed that the MLH1 variant might be associated with dysregulated lipid metabolism, potentially impacting LS carcinogenesis.

4.3. Identified biomarkers' associations to cancer

The three identified cancer‐predicting biomarkers, hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and HDL_TG, have all been studied in the context of cancer risk and occurrence. In our models, hsa‐miR‐101‐3p was a significant cancer risk predictor for all LS‐related cancers and for CRC. Hsa‐miR‐101‐3p has been reported to inhibit the proliferation, migration, and invasion of cancer cells. 34 However, studies disagree on whether hsa‐miR‐101‐3p acts as a tumor suppressor or as an oncogene. 34 , 35 In vivo studies have indicated that it acts as an oncogene. 34 For instance, hsa‐miR‐101 indicates poor prognosis in ovarian cancer patients, and its overexpression promotes ovarian cancer sphere formation. 36 We also found that hsa‐miR‐183‐5p was an indicator of increased cancer risk. Previous research has reported that hsa‐miR‐183 is a potential biomarker for CRC, breast cancer, lung cancer, and hepatocellular carcinoma. 37 , 38 , 39 , 40 Sanjabi et al. demonstrated that it was overexpressed in CRC samples when compared to healthy controls. 37 Macedo et al. 39 showed that overexpressing hsa‐miR‐183 altered the proliferation and migration of MDA‐MB‐231 and MDA‐MB‐468 breast cancer cell lines. In silico analysis further identified retinoblastoma 1 (RB1), a well‐known tumor suppressor protein, as a downstream target of hsa‐miR‐183. In summary, both identified c‐miRNAs indicated elevated cancer risk, aligning with previous findings. Functional assays, such as those using LS cancer organoid cultures, could clarify their roles in LS cancer etiology. However, further validation is needed to support the findings of this study.

The third cancer‐predicting biomarker we identified was HDL_TG. HDLs are dense, small (5–17 nm) ApoA1‐containing lipoprotein particles composed of phospholipids, cholesteryl esters, triglycerides, free cholesterol, and sphingolipids. The best‐characterized function of HDL is to carry cholesterol from the peripheral tissues to the liver. 41 HDL travels in the circulation and takes up free cholesterol and phospholipids from peripheral tissues and ApoB‐containing lipoproteins by exchanging its triglycerides for the cholesterol of these particles. 41 Cancer cells require a lot of energy and use, for instance, cholesterol to meet their energy needs for proliferation, migration, and metastatic activities. 13 Cancer cells impair reverse cholesterol transport. 13 It has been reported that HDL‐related components, like lipid transfer proteins (for instance cholesteryl ester transfer protein), correlate with distinct cancers. 42 HDL triglyceride has been shown to correlate positively and significantly with triglyceride in VLDL, and the concentrations of triglyceride and cholesterol in HDL are negatively correlated. 43 In summary, our results could indicate that increased HDL triglycerides reflect a more favorable tumor microenvironment due to impaired reverse cholesterol transport, in which cholesteryl esters are carried by LDL and VLDL, which provide cholesterol to cancer cells.

4.4. The prediction model accuracy

The Cox proportional hazards model built in this study was used to test how well the identified biomarkers could predict the cancer risk of LS carriers. The Cox proportional hazards model has been widely used for biomarker identification and for predicting clinical outcomes of distinct cancers, such as survival, recurrence, or risk. 44 , 45 , 46 , 47 Previous studies have reported predictive performance metrics for Lasso Cox regression models, such as a C‐index of ~0.7 in pancreatic ductal adenocarcinoma and an AUC of 0.756 for predicting gastric cancer recurrence. 46 , 48 However, these models did not incorporate an omics integration approach. Tyagi et al. used a multi‐omics strategy to identify prognostic molecular features in prostate cancer, achieving an AUC of 0.67. 49 In comparison, our model, with a C‐index of 0.76 for all cancer types and 0.8 for CRC, performed within or slightly above the range reported for these models. In the internal validation, the average C‐index on validation data was 0.72. This suggests that the predictive capacity of these biomarkers, using the selected Lasso Cox regression method, is good. Nevertheless, our model cannot serve as a reliable predictive tool in clinical practice without further validation to confirm its performance. Even so, the prediction model for all LS‐related cancers had significant 95% CIs for each biomarker, indicating that all three biomarkers significantly predicted future cancer occurrence and, therefore, have the potential to work as LS cancer risk‐reflecting biomarkers.
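
The link between a Cox coefficient, its hazard ratio, and a "significant" 95% CI can be illustrated with a short Python sketch. The coefficient and standard error below are hypothetical and are not taken from the study; the helper name `hazard_ratio_ci` is likewise an assumption, and the interval is the usual Wald interval on the log‐hazard scale.

```python
import math


def hazard_ratio_ci(beta, se, z=1.96):
    """Convert a Cox coefficient and its standard error to HR and Wald CI.

    beta -- estimated log-hazard coefficient for one biomarker
    se   -- standard error of beta
    z    -- normal quantile (1.96 gives an approximate 95% interval)
    """
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return hr, lower, upper


# Hypothetical coefficient for a standardized biomarker.
hr, ci_low, ci_high = hazard_ratio_ci(beta=0.5, se=0.2)

# The effect is significant at the 5% level when the interval excludes 1.
is_significant = ci_low > 1.0 or ci_high < 1.0
```

With these hypothetical inputs the HR is about 1.65 and the whole interval lies above 1, which is the pattern described for the three biomarkers: an HR above 1 with a CI that does not cross 1.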

4.5. Strengths of the study

The strength of this study is its unique integration of bi‐omics data, leveraging two systemic biomolecular levels to improve cancer prediction in LS. The dataset is remarkable, as there exist only a few prospective follow‐up surveillances where participants almost certainly develop cancer. Due to high cancer risk, LS provides a good platform for identifying potential biomarkers. The biomarkers identified in this study may offer insights into the biological mechanisms underlying LS‐associated cancers if further validated. With more research, specific c‐miRNAs or c‐Metabs could potentially contribute to detecting cancer risk and identifying novel therapeutic targets. Ultimately, reliable cancer risk predictors may support improved clinical decision‐making for LS carriers, who face high lifetime cancer risks.

4.6. Limitations

Our study contained high‐dimensional variables, leading to challenges such as highly correlated biomarkers, which cause multicollinearity. To address this, we used Lasso regularization, a penalized regression method that identifies predictive features from high‐dimensional data and handles collinearity by retaining only one of the correlated predictors. 28 The small sample size and class imbalance could explain the moderate prediction accuracy of the model validation (C‐index 0.72). The model could be further validated through external validation by collecting more data, but time and budget constraints limited this effort, and the use of synthetic data proved insufficient for this purpose. Also, we estimated HRs at the end of the surveillance period and assumed a constant effect of the biomarkers on the HR over time. In addition, the chosen methodology or data analysis approach significantly influences the results. For comparison, in our previous pilot study, we identified LS‐specific differentially expressed c‐miRNAs as potential cancer‐predicting biomarkers. 17 Here, we used different approaches to identify key LS cancer‐predicting biomarkers. Integrating two omics datasets offered a more comprehensive view of system‐level responses to cancer development compared to single‐omics approaches. Additionally, our findings reflect the Finnish LS population, where MLH1 pathogenic variants are more common due to population‐specific genetics. 50 Thus, caution is needed when generalizing to populations with a different distribution of MMR gene mutations (e.g., MSH2, MSH6, PMS2). Validation in more diverse cohorts is needed to confirm broader applicability. In summary, the good model accuracy supports the suitability of the Lasso Cox regression for our data, and the identification of biomarkers across multiple omics layers highlights their potential to inform cancer risk. However, larger, more diverse cohorts and improved predictive accuracy are needed before clinical application in LS cancer risk estimation.
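
To show what the Lasso penalty adds to a Cox model, the following Python sketch evaluates a Lasso‐penalized Cox objective: the negative partial log‐likelihood plus an L1 penalty on the coefficients. It only evaluates the objective and does not fit a model, it assumes no tied event times, and the 1/n scaling, the function name `lasso_cox_objective`, the penalty strength `lam`, and the toy data are all assumptions made for illustration rather than the settings used in the study.

```python
import math


def lasso_cox_objective(times, events, X, beta, lam):
    """Negative Cox partial log-likelihood (scaled by 1/n) plus L1 penalty.

    times  -- follow-up time per subject
    events -- 1 if cancer was observed, 0 if censored
    X      -- list of covariate rows (one row of biomarker values per subject)
    beta   -- candidate coefficient vector
    lam    -- penalty strength; larger values push more coefficients to zero
    """
    n = len(times)
    # Linear predictor (log relative hazard) for each subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    neg_loglik = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        # Risk set: subjects still under observation at subject i's event time.
        risk_sum = sum(math.exp(eta[j]) for j in range(n) if times[j] >= times[i])
        neg_loglik -= eta[i] - math.log(risk_sum)
    l1_penalty = lam * sum(abs(b) for b in beta)
    return neg_loglik / n + l1_penalty


# Hypothetical toy data: three subjects, two standardized biomarkers.
times = [1.0, 2.0, 3.0]
events = [1, 0, 1]
X = [[0.8, -0.2], [0.1, 0.4], [-0.5, 0.3]]

obj_null = lasso_cox_objective(times, events, X, beta=[0.0, 0.0], lam=0.5)
obj_unpenalized = lasso_cox_objective(times, events, X, beta=[1.0, -1.0], lam=0.0)
obj_penalized = lasso_cox_objective(times, events, X, beta=[1.0, -1.0], lam=0.5)
```

Because the penalty grows with the absolute size of every coefficient, a fitting routine that minimizes this objective tends to set the coefficients of redundant, correlated predictors exactly to zero, which is the variable‐selection behavior described above.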

5. CONCLUSIONS

In this study, we identified potential biomarkers that may help predict cancer risk in LS. Our findings establish a framework for future studies to validate circulating biomarkers of LS‐related cancer risk through omics integration. By evaluating the association between these biomarkers and cancer risk, our method could eventually support more personalized cancer surveillance in LS. However, all identified biomarkers—hsa‐miR‐101‐3p, hsa‐miR‐183‐5p, and triglycerides in HDL—require rigorous external validation before consideration for clinical application or as additions to existing tools like the PLSD model.

AUTHOR CONTRIBUTIONS

Minta Kärkkäinen: Conceptualization; methodology; investigation; formal analysis; visualization; project administration; writing – original draft. Tero Sievänen: Formal analysis; investigation; writing – review and editing. Tia‐Marje Korhonen: Formal analysis; supervision; writing – review and editing. Joonas Tuomikoski: Formal analysis; writing – review and editing. Kirsi Pylvänäinen: Data curation. Sami Äyrämö: Formal analysis; writing – review and editing. Toni T. Seppälä: Data curation; funding acquisition; investigation; resources; writing – review and editing. Jukka‐Pekka Mecklin: Data curation; investigation; resources; writing – review and editing. Eija K. Laakkonen: Conceptualization; data curation; funding acquisition; investigation; resources; supervision; writing – review and editing. Tiina Jokela: Conceptualization; data curation; formal analysis; funding acquisition; investigation; methodology; project administration; supervision; writing – original draft.

CONFLICT OF INTEREST STATEMENT

Toni T. Seppälä reports consultation fees from Mehiläinen, Nouscom, Orion, Tillots Pharma, and Amgen, being a co‐owner and CEO of Healthfund Finland Ltd. and a position in the Clinical Advisory Board and as a minor shareholder of Lynsight Ltd. The other authors do not have a conflict of interest.

ETHICS STATEMENT

Informed consent was obtained from each participant. The Helsinki and Uusimaa Health Care District (HUS/155/2021) and the Central Finland Health Care District Ethics Committee (KSSHP D# 1U/2018 and 1/2019 and KSSHP 3/2016) approved the data collection and data usage of LSRFi.

Supporting information

Data S1. Supporting Information.

IJC-158-243-s001.pdf (1.3MB, pdf)

Data S2. Supporting Information.

IJC-158-243-s004.xlsx (1.4MB, xlsx)

Data S3. Supporting Information.

IJC-158-243-s002.pdf (509.2KB, pdf)

Data S4. Supporting Information.

IJC-158-243-s003.xlsx (21.8KB, xlsx)

ACKNOWLEDGMENTS

We would like to acknowledge all study participants, the LSRFi, Central Finland Hospital Nova's pathology laboratory staff, and staff of the Sports and Health Science faculty laboratory members who participated in data collection. We thank Nightingale Health Ltd. for analyzing metabolomics data. We would also like to acknowledge CSC for providing the computational resources that facilitated the analyses performed in this study. In addition, we thank BioRender for providing the tools to create scientific illustrations. Tiina Jokela reports grants from European Commission Union Marie Skłodowska‐Curie Individual Fellowships, Mary and George C. Ehrnrooth Foundation, and K. Albin Johanssons Foundations during the conduct of the study. Toni T. Seppälä reports grants from Finnish Medical Foundation, Emil Aaltonen Foundation, Jane and Aatos Erkko Foundation, Relander Foundation, and Cancer Foundation Finland during the conduct of the study. Eija K. Laakkonen reports grants from Päivikki and Sakari Sohlberg Foundation during the conduct of the study. Joonas Tuomikoski reports personal fees from Seppo Säynäjäkankaan's Foundation. Open access publishing facilitated by Jyvaskylan yliopisto, as part of the Wiley ‐ FinELib agreement.

Kärkkäinen M, Sievänen T, Korhonen T‐M, et al. Integrative omics approaches to uncover liquid‐based cancer‐predicting biomarkers in Lynch syndrome. Int J Cancer. 2026;158(1):243‐256. doi: 10.1002/ijc.70106

DATA AVAILABILITY STATEMENT

Upon publication, the miRNA sequencing data generated in this study is available in NCBI Sequence Read Archive under Bioproject PRJNA1220303. All R scripts used for computational analyses are publicly available at: https://doi.org/10.5281/zenodo.16032566. Other data that support the findings of this study are available from the corresponding author upon request.

REFERENCES

  • 1. Dominguez‐Valentin M, Sampson JR, Seppälä TT, et al. Cancer risks by gene, age, and gender in 6350 carriers of pathogenic mismatch repair variants: findings from the prospective Lynch syndrome database. Genet Med. 2020;22(1):15‐25. doi: 10.1038/s41436-019-0596-9
  • 2. Bhattacharya P, Leslie SW, McHugh TW. Lynch syndrome (hereditary nonpolyposis colorectal cancer). StatPearls. StatPearls Publishing; 2024.
  • 3. Power RF, Doherty DE, Horgan R, et al. Modifiable risk factors for cancer among people with lynch syndrome: an international, cross‐sectional survey. Hered Cancer Clin Practice. 2024;22(1):10. doi: 10.1186/s13053-024-00280-w
  • 4. Shankar E, Gupta K, Gupta S. Dietary and lifestyle factors in epigenetic regulation of cancer. In: Bishayee A, Bhatia D, eds. Epigenetics of Cancer Prevention. Academic Press; 2019:361‐394.
  • 5. Locasale JW. Diet and exercise in cancer metabolism. Cancer Discov. 2022;12(10):2249‐2257. doi: 10.1158/2159-8290.CD-22-0096
  • 6. Yu X, Zhao H, Wang R, et al. Cancer epigenetics: from laboratory studies and clinical trials to precision medicine. Cell Death Discovery. 2024;10(1):28. doi: 10.1038/s41420-024-01803-z
  • 7. Kanwal R, Gupta S. Epigenetic modifications in cancer. Clin Genet. 2012;81(4):303‐311. doi: 10.1111/j.1399-0004.2011.01809.x
  • 8. Li T, Mao C, Wang X, Shi Y, Tao Y. Epigenetic crosstalk between hypoxia and tumor driven by HIF regulation. J Exp Clin Cancer Res. 2020;39(1):224. doi: 10.1186/s13046-020-01733-5
  • 9. Peng Y, Croce CM. The role of MicroRNAs in human cancer. Signal Transduct Target Ther. 2016;1(1):15004. doi: 10.1038/sigtrans.2015.4
  • 10. Sievänen T, Korhonen TM, Jokela T, et al. Systemic circulating microRNA landscape in Lynch syndrome. Int J Cancer. 2023;152(5):932‐944. doi: 10.1002/ijc.34338
  • 11. Martínez‐Reyes I, Chandel NS. Cancer metabolism: looking forward. Nat Rev Cancer. 2021;21(10):669‐680. doi: 10.1038/s41568-021-00378-6
  • 12. Subramaniam S, Jeet V, Clements JA, Gunter JH, Batra J. Emergence of MicroRNAs as key players in cancer cell metabolism. Clin Chem. 2019;65(9):1090‐1101. doi: 10.1373/clinchem.2018.299651
  • 13. Xu H, Zhou S, Tang Q, Xia H, Bi F. Cholesterol metabolism: new functions and therapeutic approaches in cancer. Biochim Biophys Acta Rev Cancer. 2020;1874(1):188394. doi: 10.1016/j.bbcan.2020.188394
  • 14. Jokela T, Karppinen J, Kärkkäinen M, et al. Circulating metabolome landscape in Lynch Syndrome. Cancer Metab. 2023;12:4. doi: 10.21203/rs.3.rs-3561844/v1
  • 15. Hasin Y, Seldin M, Lusis A. Multi‐omics approaches to disease. Genome Biol. 2017;18(1):83. doi: 10.1186/s13059-017-1215-1
  • 16. Dominguez‐Valentin M, Haupt S, Seppälä TT, et al. Mortality by age, gene and gender in carriers of pathogenic mismatch repair gene variants receiving surveillance for early cancer diagnosis and treatment: a report from the prospective Lynch syndrome database. EClinicalMedicine. 2023;58:101909. doi: 10.1016/j.eclinm.2023.101909
  • 17. Sievänen T, Jokela T, Hyvärinen M, et al. Circulating miRNA signature predicts cancer incidence in Lynch syndrome: a pilot study. Cancer Prev Res. 2024;17(6):243‐254. doi: 10.1158/1940-6207.CAPR-23-0368
  • 18. Wang W, Rong Z, Wang G, Hou Y, Yang F, Qiu M. Cancer metabolites: promising biomarkers for cancer liquid biopsy. Biomarker Res. 2023;11(1):66. doi: 10.1186/s40364-023-00507-3
  • 19. Hayes J, Peruzzi PP, Lawler S. MicroRNAs in cancer: biomarkers, functions and therapy. Trends Mol Med. 2014;20(8):460‐469. doi: 10.1016/j.molmed.2014.06.005
  • 20. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA‐seq data with DESeq2. Genome Biol. 2014;15(12):550. doi: 10.1186/s13059-014-0550-8
  • 21. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high‐throughput experiments. Bioinformatics. 2012;28(6):882‐883. doi: 10.1093/bioinformatics/bts034
  • 22. Daimon T. Box–cox transformation. In: Lovric M, ed. International Encyclopedia of Statistical Science. Springer; 2011:176‐178. doi: 10.1007/978-3-642-04898-2_152
  • 23. Oksanen J, Blanchet FG, Kindt R, et al. Vegan: Community Ecology Package. R Package Version 2.2‐1. 2015.
  • 24. Rohart F, Gautier B, Singh A, Lê Cao KA. mixOmics: an R package for omics feature selection and multiple data integration. PLoS Comput Biol. 2017;13(11):e1005752. doi: 10.1371/journal.pcbi.1005752
  • 25. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9(1):559. doi: 10.1186/1471-2105-9-559
  • 26. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed‐effects models using lme4. J Stat Softw. 2015;67(1):1‐48. doi: 10.18637/jss.v067.i01
  • 27. Sticht C, De La Torre C, Parveen A, Gretz N. miRWalk: an online resource for prediction of microRNA binding sites. PLoS One. 2018;13(10):e0206239. doi: 10.1371/journal.pone.0206239
  • 28. Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med. 1997;16(4):385‐395.
  • 29. Zhou H, Wang H, Wang S, Zou Y. SurvMetrics: an R package for predictive evaluation metrics in survival analysis. R J. 2023;14:252‐263. doi: 10.32614/RJ-2023-009
  • 30. Hemmings BA, Restuccia DF. PI3K‐PKB/Akt pathway. Cold Spring Harb Perspect Biol. 2012;4(9):a011189. doi: 10.1101/cshperspect.a011189
  • 31. Masoud GN, Li W. HIF‐1α pathway: role, regulation and intervention for cancer therapy. Acta Pharm Sin B. 2015;5(5):378‐389. doi: 10.1016/j.apsb.2015.05.007
  • 32. Semenza GL. Targeting HIF‐1 for cancer therapy. Nat Rev Cancer. 2003;3(10):721‐732. doi: 10.1038/nrc1187
  • 33. Deng CF, Zhu N, Zhao TJ, et al. Involvement of LDL and ox‐LDL in cancer development and its therapeutical potential. Front Oncol. 2022;12:803473. doi: 10.3389/fonc.2022.803473
  • 34. Liu N, Yang C, Gao A, Sun M, Lv D. MiR‐101: an important regulator of gene expression and tumor ecosystem. Cancers (Basel). 2022;14(23):5861. doi: 10.3390/cancers14235861
  • 35. Varambally S, Cao Q, Mani RS, et al. Genomic loss of microRNA‐101 leads to overexpression of histone methyltransferase EZH2 in cancer. Science. 2008;322(5908):1695‐1699. doi: 10.1126/science.1165395
  • 36. Cui TX, Kryczek I, Zhao L, et al. Myeloid‐derived suppressor cells enhance stemness of cancer cells by inducing MicroRNA101 and suppressing the corepressor CtBP2. Immunity. 2013;39(3):611‐621. doi: 10.1016/j.immuni.2013.08.025
  • 37. Sanjabi F, Nekouian R, Akbari A, Mirzaei R, Fattahi A. Plasma miR‐183‐5p in colorectal cancer patients as potential predictive lymph node metastasis marker. J Cancer Res Ther. 2022;18(4):921‐926.
  • 38. Zaporozhchenko IA, Morozkin ES, Skvortsova TE, et al. Plasma miR‐19b and miR‐183 as potential biomarkers of lung cancer. PLoS One. 2016;11(10):e0165261. doi: 10.1371/journal.pone.0165261
  • 39. Macedo T, Silva‐Oliveira RJ, Silva VAO, Vidal DO, Evangelista AF, Marques MMC. Overexpression of mir‐183 and mir‐494 promotes proliferation and migration in human breast cancer cell lines. Oncol Lett. 2017;14(1):1054‐1060. doi: 10.3892/ol.2017.6265
  • 40. Liang Z, Gao Y, Shi W, et al. Expression and significance of microRNA‐183 in hepatocellular carcinoma. Sci World J. 2013;2013:381874. doi: 10.1155/2013/381874
  • 41. Bailey A, Mohiuddin SS. Biochemistry, high density lipoprotein. StatPearls. StatPearls Publishing; 2024.
  • 42. Pirro M, Ricciuti B, Rader DJ, Catapano AL, Sahebkar A, Banach M. High density lipoprotein cholesterol and cancer: marker or causative? Prog Lipid Res. 2018;71:54‐69. doi: 10.1016/j.plipres.2018.06.001
  • 43. Barter PJ, Connor WE. The transport of triglyceride in the high‐density lipoproteins of human plasma. J Lab Clin Med. 1975;85(2):260‐272.
  • 44. Jardillier R, Koca D, Chatelain F, Guyon L. Prognosis of lasso‐like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi‐dimensional pre‐screening. BMC Cancer. 2022;22(1):1045. doi: 10.1186/s12885-022-10117-1
  • 45. Calabrese F, Lunardi F, Pezzuto F, et al. Are there new biomarkers in tissue and liquid biopsies for the early detection of non‐small cell lung cancer? J Clin Med. 2019;8(3):414. doi: 10.3390/jcm8030414
  • 46. Feng Y, Yang J, Duan W, Cai Y, Liu X, Peng Y. LASSO‐derived prognostic model predicts cancer‐specific survival in advanced pancreatic ductal adenocarcinoma over 50 years of age: a retrospective study of SEER database research. Front Oncol. 2023;13:1336251. doi: 10.3389/fonc.2023.1336251
  • 47. Du XJ, Yang XR, Wang QC, Lin GL, Li PF, Zhang WF. Identification and validation of a five‐gene prognostic signature based on bioinformatics analyses in breast cancer. Heliyon. 2023;9(2):e13185. doi: 10.1016/j.heliyon.2023.e13185
  • 48. Huang B, Ding F, Li Y. A practical recurrence risk model based on Lasso‐Cox regression for gastric cancer. J Cancer Res Clin Oncol. 2023;149(17):15845‐15854. doi: 10.1007/s00432-023-05346-1
  • 49. Tyagi N, Roy S, Vengadesan K, Gupta D. Multi‐omics approach for identifying CNV‐associated lncRNA signatures with prognostic value in prostate cancer. Non‐coding RNA Res. 2024;9(1):66‐75. doi: 10.1016/j.ncrna.2023.10.001
  • 50. Sipilä LJ, Aavikko M, Ravantti J, et al. Detection of a major Lynch syndrome‐causing MLH1 founder variant in a large‐scale genotyped cohort. Fam Cancer. 2024;23(4):647‐652. doi: 10.1007/s10689-024-00400-4



Articles from International Journal of Cancer are provided here courtesy of Wiley
