Construction of an artificially intelligent model for accurate detection of HCC by integrating clinical, radiological, and peripheral immunological features

Yangyang Wang; Shengqiang Chi; Yu Tian; Xueyao Li; Hang Zhang; Yiting Xu; Chao Huang; Yiwei Gao; Gaowei Jin; Qihan Fu; Wanyue Cao; Cao Chen; Haonan Ding; Yuquan Zhang; Yupeng Hong; Junjian Li; Xu Sun; Enliang Li; Yuhua Zhang; Weiyun Yao; Runtian Liu; Yongfei Hua; Haifeng Huang; Minghui Xu; Bo Zhang; Weifeng Tao; Tianxing Yang; Yuming Gao; Xiaoguang Wang; Cheng Lin; Jingsong Li; Qi Zhang; Tingbo Liang

doi:10.1097/JS9.0000000000002281

. 2025 Jan 29;111(4):2942–2952. doi: 10.1097/JS9.0000000000002281

Yangyang Wang ^a,^b,^c, Shengqiang Chi ^d,^e, Yu Tian ^e, Xueyao Li ^d, Hang Zhang ^d, Yiting Xu ^a,^b,^c, Chao Huang ^d, Yiwei Gao ^d, Gaowei Jin ^a,^b,^c, Qihan Fu ^b,^c,^f, Wanyue Cao ^a,^b,^c, Cao Chen ^a,^b,^c, Haonan Ding ^a,^b,^c, Yuquan Zhang ^a,^b,^c, Yupeng Hong ^g, Junjian Li ^h, Xu Sun ⁱ, Enliang Li ^j, Yuhua Zhang ^k, Weiyun Yao ^l, Runtian Liu ^m, Yongfei Hua ⁿ, Haifeng Huang ^o, Minghui Xu ^p, Bo Zhang ^q, Weifeng Tao ^r, Tianxing Yang ^s, Yuming Gao ^t, Xiaoguang Wang ^u, Cheng Lin ^v, Jingsong Li ^d,^e,^*, Qi Zhang ^a,^b,^c,^w,^x,^*, Tingbo Liang ^a,^b,^c,^w,^x,^*

^aDepartment of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China

^bMOE Joint International Research Laboratory of Pancreatic Diseases, Hangzhou, China

^cZhejiang Provincial Key Laboratory of Pancreatic Disease, the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China

^dResearch Center for Data Hub and Security, Zhejiang Lab, Hangzhou, China

^eThe Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China

^fDepartment of Oncology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China

^gDepartment of Oncology, Zhejiang Provincial People’s Hospital, Affiliated People’s Hospital, Hangzhou Medical College, Hangzhou, China

^hDepartment of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital, Wenzhou Medical University, Wenzhou, China

ⁱDepartment of General Surgery, Huzhou Central Hospital, Zhejiang University School of Medicine, Huzhou, China

^jDepartment of Hepatobiliary and Pancreatic Surgery, The Second Affiliated Hospital of Nanchang University, Nanchang, China

^kDepartment of Hepatobiliary and Pancreatic Surgery, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer, Chinese Academy of Sciences, Hangzhou, China

^lDepartment of Surgery, Changxing People’s Hospital, Huzhou, China

^mDepartment of Hepatobiliary and Pancreatic Surgery, The Second Hospital of Hebei Medical University, Shijiazhuang, China

ⁿDepartment of General Surgery, Ningbo Medical Center Lihuili Eastern Hospital, Ningbo, China

^oDepartment of General Surgery, Shengzhou People’s Hospital, Shengzhou, China

^pDepartment of General Surgery, Haining People’s Hospital, Haining, China

^qDepartment of General Surgery, Shenzhen University Luohu People’s Hospital, Shenzhen, China

^rDepartment of General Surgery, Shangyu People’s Hospital of Shaoxing, Shangyu, China

^sDepartment of Medical Oncology, Sanmen People’s Hospital, Taizhou, China

^tDepartment of General Surgery, Jixi County People’s Hospital, Jixi, China

^uDepartment of General Surgery, Jiaxing Second People’s Hospital, Jiaxing, China

^vZhejiang Puluoting Health Technology Co Ltd, Hangzhou, China

^wZhejiang Clinical Research Center of Hepatobiliary and Pancreatic Diseases, Hangzhou, China

^xCancer Center of Zhejiang University, Hangzhou, China

Corresponding authors. Address: Department of Hepatobiliary and Pancreatic Surgery, The First Affiliated Hospital, Zhejiang University School of Medicine, No. 79 Qingchun Road, Hangzhou 310003, China. Tel: +86 0571 87236601. E-mail: liangtingbo@zju.edu.cn (T. Liang); qi.zhang@zju.edu.cn (Q. Zhang); ljs@zju.edu.cn (J. Li).

PMCID: PMC12175786 PMID: 39878177

Abstract

Background:

Integrating comprehensive information on hepatocellular carcinoma (HCC) is essential to improve its early detection. We aimed to develop a model with multimodal features (MMF) using artificial intelligence (AI) approaches to enhance the performance of HCC detection.

Materials and methods:

A total of 1092 participants were enrolled from 16 centers. These participants were allocated into the training, internal validation, and external validation cohorts. Peripheral blood specimens were collected prospectively and subjected to mass cytometry analysis. Clinical and radiological data were obtained from electronic medical records. Various AI methods were employed to identify pertinent features and construct single-modal models with optimal performance. The XGBoost algorithm was utilized to amalgamate these models, integrating multimodal information and facilitating the development of a fusion model. Model evaluation and interpretability were demonstrated using the SHapley Additive exPlanations method.

Results:

We constructed the electronic health record model from clinical features, the BioScore model from peripheral immunological features, and the RadiomicScore and DLScore models from radiological features. Subsequently, these single-modal models were amalgamated to develop an all-in-one MMF model. The MMF model exhibited enhanced performance compared to models comprising only single-modal features in detecting HCC. This superiority in performance was confirmed through the internal and external validation cohorts, yielding area under the curve (AUC) values of 0.985 and 0.915, respectively. Additionally, the MMF model improved the detection ability in subpopulations of HCCs that were negative for alpha-fetoprotein and those with small size, with AUC values of 0.974 and 0.996, respectively.

Conclusions:

Integrating MMF improved the performance of the model for HCC detection.

Keywords: artificial intelligence, early detection, fusion model, hepatocellular carcinoma, multimodal

Introduction

Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related death worldwide^[1]. Despite the development of various therapies for HCC, the prognosis remains poor. A significant number of patients are diagnosed at advanced stages, which precludes surgical intervention and results in reduced survival rates^[2]. In contrast, patients with early-stage HCC can undergo curative treatments, such as hepatectomy and liver transplantation, resulting in a 5-year survival rate of 70%–80%^[3]. Therefore, enhancing the rate of early detection is the most effective strategy to improve the prognosis of HCC.

Currently, B-mode ultrasound, computed tomography (CT), magnetic resonance imaging, and tumor markers such as alpha-fetoprotein (AFP) are used to detect HCC^[4]. However, imaging approaches lack the sensitivity to detect small lesions^[5], and AFP is positive in only 60%–70% of patients with HCC^[6,7]. Recently, novel liquid biopsy techniques, such as plasma cell-free DNA, have advanced HCC detection, achieving a sensitivity of 96.8%^[8]. Additionally, researchers have increasingly focused on the tumor microenvironment (TME), which contains complex information about tumors and their interactions with the host organism, playing critical roles in tumor pathogenesis. Meanwhile, cytometry by time of flight (CyTOF) can deeply profile the immune response of patients and has become a significant tool for studying TME^[9,10]. We have also developed an iPBIScore model using CyTOF based on peripheral blood specimens to achieve better efficacy^[11]. However, these efforts are primarily based on single-modal data. Since cancer is considered a systemic disease, early detection of HCC based on single-modal features remains unsatisfactory. Therefore, incorporating comprehensive systemic information on patients with HCC will probably enhance early detection capabilities. It is also important to properly integrate comprehensive information encompassing “system and local” and “direct and indirect” data. However, this concept has not yet been fully validated. The main limitations are the lack of cohorts with multimodal information and the absence of effective fusion methods for multimodal features.

Artificial intelligence (AI) techniques have been widely applied in clinical diagnosis. To mine information from imaging, radiomics has been developed for HCC detection; however, its performance remains unsatisfactory^[12]. Deep learning (DL) is an evolving approach that autonomously extracts quantitative representations from radiological images. Combining DL and radiomics has shown promise, particularly in predicting lymph node metastasis in pancreatic neuroendocrine tumors and gastric cancer^[13,14]. Therefore, combining DL with radiomic features holds promise for achieving significant performance improvements in the early detection of HCC. Furthermore, AI methods are expected to integrate multimodal data to improve the detection performance of certain cancers in the general population^[15,16]. Specifically, combined lung and ovarian cancer models have been proposed to facilitate early detection^[17,18]. Therefore, the prospect of achieving a combined model for early detection of HCC through AI methods is encouraging.

In this study, we aimed to validate that integrating multimodal information can yield superior performance compared to single-modal approaches in the early detection of HCC. To achieve this, we developed a fusion model that combines multimodal information (MMF model) from electronic health records (EHR), contrast-enhanced CT, and CyTOF data using AI algorithms.

Methods

Study design and participants

This multicenter study enrolled 1092 participants from 16 centers in China between October 2019 and July 2022. Peripheral blood specimens were prospectively collected under the consent of a previous study (blank for double-blind policy)^[11]. HCC cases were confirmed by pathological or clinical diagnosis according to the guidelines on the diagnosis and treatment of primary liver cancer (2017 edition)^[19]. The inclusion criteria were: (1) primary HCC; (2) benign liver diseases, including but not limited to, hemangiomas, hepatic cysts, focal nodular hyperplasia, and cirrhosis; (3) available CyTOF analyses^[11]. Participants were excluded for: (1) previous treatment for HCC or benign focal liver diseases; (2) diseases and treatments related to blood or the immune system, which were detailed in the previous study^[11]. Participants from our center were randomly divided into the training and the internal validation cohorts in an 8:2 ratio (n = 706 and 177, respectively), while participants from other centers were recruited for external validation (n = 209, Supplementary Table 1, http://links.lww.com/JS9/D842; Fig. 1). The EHR, contrast-enhanced CT images, and CyTOF data were collected. This study was approved by the ethical committee of our center. This work has been reported in line with the STROCSS criteria^[20].

Figure 1. — Flowchart of participant enrollment.

EHR feature selection and model construction

The variables collected from the EHR included, but were not limited to, age, sex, body mass index, hepatitis B virus (HBV) infection, AFP, alanine transaminase, aspartate aminotransferase, total bilirubin, albumin, red blood cell count, and white blood cell count. Continuous variables were discretized to reduce the influence of outliers and to enhance model stability. The Boruta algorithm was employed to select significant variables for HCC detection by comparing the importance of selected features with random features. The selected features were incorporated into eleven machine learning methods, including least absolute shrinkage and selection operator (LASSO), Ridge, Elastic Net, partial least squares regression generalized linear model (plsRglm), Support Vector Machine, Linear Discriminant Analysis, NaiveBayes, Gradient Boosting Machine (GBM), LightGBM, XGBoost, and glmBoost, to construct EHR-based models in the training cohort. The algorithms and hyperparameters with optimal performance were selected using five-fold cross-validation.
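The five-fold cross-validation used for model selection can be sketched as follows. This is a minimal, dependency-free illustration rather than the code used in the study: the helper name `stratified_k_fold`, the stratification by class, the fixed seed, and the toy labels are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, valid_idx) pairs that keep class proportions similar.

    Hypothetical helper for illustration; the study's actual splitting code
    is not described in the text.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        valid = sorted(folds[i])
        train = sorted(idx for j in range(k) if j != i for idx in folds[j])
        yield train, valid


# Toy labels: 1 = HCC, 0 = benign liver disease (made-up, not study data).
toy_labels = [1] * 15 + [0] * 5
splits = list(stratified_k_fold(toy_labels, k=5))
```

Each candidate algorithm and hyperparameter setting would be fitted on `train` and scored on `valid` for every split, and the setting with the best average score retained.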

Immunological feature selection and model construction

The peripheral immunological status of participants was assessed using CyTOF analysis^[11]. Initially, features with significant differences between participants with HCC and those with benign liver diseases were identified, while those with a pairwise correlation coefficient greater than 0.8 were excluded. The selected immunological features, along with AFP, were combined to train a BioScore model. The optimal LightGBM algorithm and its hyperparameters were determined from eleven machine learning algorithms through five-fold cross-validation. The prediction scores generated by the peripheral immunological feature-based model for the participants were defined as BioScore.
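The correlation filter described above can be illustrated with a short sketch. It assumes a greedy rule that keeps the first feature of each highly correlated pair and compares the absolute value of the Pearson coefficient with 0.8; the text does not state which member of a pair was excluded, and the marker names and values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.8):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    `features` maps a feature name to its list of values. Keeping the first
    feature of each correlated pair is an assumption made for this sketch.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up marker intensities for five participants (not study data).
toy = {
    "marker_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "marker_b": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly 2 * marker_a
    "marker_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(toy)
```

Here `marker_b` is dropped because it is almost perfectly correlated with `marker_a`, while `marker_c` is retained.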

Radiomic feature selection and model construction

Contrast-enhanced abdominal CT scans were performed prior to any treatments, with the arterial phase (AP) and portal venous phase (PVP) images extracted. The liver was annotated using ITK-SNAP (version 3.8.0), and the features of the whole liver were mined and analyzed as previously described^[21]. PyRadiomics (https://github.com/Radiomics/pyradiomics) was used to extract radiomic features from the AP and PVP of CT images^[22]. Standardization of the feature extraction algorithm was performed according to the guidelines proposed by the Image Biomarker Standardization Initiative^[23]. The radiomic features included first-order features, shape features, gray-level run-length matrix, gray-level size zone matrix, gray-level co-occurrence matrix, gray-level dependence matrix, and neighboring gray-level difference matrix (see Supplementary File 1 for details, http://links.lww.com/JS9/D880).

To further select radiomic features, we first used the Mann–Whitney U test and Benjamini–Hochberg correction to identify features with significant differences between the two groups. Next, we standardized the selected features using z-score normalization. Subsequently, we constructed a LASSO regression model and adjusted the penalty parameter through five-fold cross-validation to identify the features with the most predictive power based on a nonzero coefficient. Finally, we examined the pairwise correlation among the selected features, and excluded those with correlation coefficients greater than 0.8. The model construction process was the same as that of the EHR model. The prediction scores generated by the radiomic model for the participants were defined as RadiomicScore.
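The Benjamini–Hochberg step in this pipeline can be written in a few lines. The sketch below is a plain-Python illustration with made-up P-values; the study reports using the “scipy” and “statsmodels” packages, so the function `benjamini_hochberg` here is a hypothetical stand-in, not the authors' code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw P-values for four candidate features (not study data).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
# Features passing the adjusted P-value < 0.05 criterion stated in the text.
significant = [i for i, p in enumerate(adj) if p < 0.05]
```

In this toy case only the first feature survives the correction, even though three raw P-values are below 0.05.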

DL model construction

We truncated the CT values to the range of −100 HU to 250 HU. Each CT image was then cropped to include only the liver region based on a 3D bounding box of the liver. We resized these liver CT images with varying dimensions to 32 × 128 × 128 pixels using linear interpolation. Subsequently, ResNet3D, comprising nine residual modules, was employed for model training using the resized images (Supplementary Fig. 1, http://links.lww.com/JS9/D841)^[24,25]. The model was optimized using the Adam optimizer, with a learning rate set to 10⁻⁷, β₁ of 0.9, and β₂ of 0.999. The prediction scores generated by the DL model for the participants were designated as DLScore.
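The first two preprocessing steps, intensity truncation and cropping to the liver bounding box, can be sketched without any imaging library. This is an illustration on a tiny made-up volume stored as nested lists; the function names are hypothetical, and the resizing to 32 × 128 × 128 by linear interpolation is omitted because it would normally be delegated to an array or imaging library.

```python
def clip_hu(volume, low=-100, high=250):
    """Truncate every voxel of a nested-list volume [z][y][x] to [low, high] HU."""
    return [[[min(max(v, low), high) for v in row] for row in plane]
            for plane in volume]


def bounding_box(mask):
    """Return ((z0, z1), (y0, y1), (x0, x1)), inclusive, around nonzero voxels."""
    coords = [(z, y, x)
              for z, plane in enumerate(mask)
              for y, row in enumerate(plane)
              for x, v in enumerate(row) if v]
    return tuple((min(c[axis] for c in coords), max(c[axis] for c in coords))
                 for axis in range(3))


def crop(volume, box):
    """Crop a nested-list volume to an inclusive bounding box."""
    (z0, z1), (y0, y1), (x0, x1) = box
    return [[row[x0:x1 + 1] for row in plane[y0:y1 + 1]]
            for plane in volume[z0:z1 + 1]]


# Tiny made-up 2 x 3 x 3 volume and liver mask (not study data).
ct = [[[-1000, -50, 40], [60, 300, 90], [-200, 120, 80]],
      [[-900, 10, 55], [70, 260, 95], [-150, 110, 1000]]]
liver = [[[0, 0, 0], [0, 1, 1], [0, 1, 1]],
         [[0, 0, 0], [0, 1, 1], [0, 0, 0]]]

box = bounding_box(liver)
liver_ct = crop(clip_hu(ct), box)
```

The bounding box is taken over the whole mask, so voxels outside the liver but inside the box (such as the clipped value 250 in the last row) remain in the cropped volume.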

MMF model construction

We integrated EHR features, BioScore, RadiomicScore, and DLScore to construct an all-in-one MMF model. We tuned the model parameters of the XGBoost and LightGBM algorithms, both of which can handle incomplete data, through five-fold cross-validation and selected the model with the optimal performance^[26,27]. For the small portion of the dataset with incomplete data, missing values were handled by these two algorithms.

Model evaluation and interpretability

The models were thoroughly evaluated using various metrics including accuracy, precision/positive predictive value, recall/sensitivity, specificity, negative predictive value, F1 score, area under the receiver operating characteristic (ROC) curve (AUC), and area under the precision-recall curve. Additionally, assessments included analysis of ROC, calibration, and decision curves. The DeLong test was employed to compare the AUC of different models. The SHapley Additive exPlanations (SHAP) method was used to analyze the feature importance of the best model for interpretation^[28]. Furthermore, we examined whether the all-in-one MMF model exhibited a significant improvement compared to other models within subgroups of patients with HCC, hemangioma, liver cysts, and liver nodules. Additionally, models for subgroups of HCCs with different sizes, as well as for AFP-negative and AFP-positive HCCs, were also studied.
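The threshold-based metrics listed above all derive from the four cells of a confusion matrix, and the AUC can be read as the probability that a randomly chosen HCC case scores higher than a randomly chosen benign case. The sketch below illustrates both. The confusion-matrix counts are not reported in the paper; they are our own back-calculation from the external validation row of the all-in-one model in Table 3 (sensitivity 0.924, specificity 0.731, 131 HCC and 78 benign participants) and should be read as an assumption. The scores passed to `auc_by_ranks` are made up.

```python
def classification_metrics(tp, fp, tn, fn):
    """Threshold-based metrics computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,        # positive predictive value
        "sensitivity": sensitivity,    # recall
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auc_by_ranks(scores_pos, scores_neg):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Assumed counts, back-calculated from reported rates (not given in the paper).
metrics = classification_metrics(tp=121, fp=21, tn=57, fn=10)

# Made-up prediction scores for three HCC and two benign participants.
toy_auc = auc_by_ranks([0.9, 0.8, 0.4], [0.5, 0.3])
```

With these assumed counts, the rounded accuracy, precision, negative predictive value, and F1 score agree with the values reported in the same row of Table 3, which suggests the published metrics are mutually consistent.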

Statistical analysis

All statistical analyses were performed using Python 3.6.13 and R 4.2.3 software. Continuous variables were represented as median (interquartile range) or mean ± standard error of the mean, and compared using the Kruskal–Wallis (H) test or Mann–Whitney U tests, as appropriate. Categorical variables were described by frequency and percentage, and compared using the chi-square test or Fisher’s exact probability method. The EHR feature selection was conducted using the Boruta package in Python with default algorithm parameters, and statistical differences were analyzed using the “tableone” package in Python. Significance tests for radiomic features between the HCC and benign liver disease groups were conducted using the “scipy” and “statsmodels” packages in Python. Adjusted P-values were calculated using the Benjamini–Hochberg correction, with the statistical significance criterion set at an adjusted P-value <0.05. All machine learning model constructions were performed using the “sklearn,” “glmnet,” “xgboost,” and “lightgbm” packages in Python. SHAP analysis was conducted using the “shap” package in Python. The ROC curves, calibration curves, and decision curves were plotted using the “pROC,” “CalibrationCurves,” and “dcurves” packages in R, respectively.

Results

Characteristics of participants

A total of 1092 participants from 16 centers were enrolled in this study, of whom 790 had HCC. They were divided into three cohorts: 706 participants in the training cohort, 177 in the internal validation cohort, and 209 in the external validation cohort (Table 1). The majority of participants were male and infected with HBV (69.2% and 61.1%, respectively). Notably, 80% of patients with HCC were at Barcelona Clinic Liver Cancer stage 0 or A.

Table 1.

Baseline and clinicopathological parameters of patients.

	Overall (n = 1092)	Training cohort (n = 706)	Internal validation cohort (n = 177)	External validation cohort (n = 209)	P-value
Age (year), n (%)					0.680
<20	4 (0.4%)	3 (0.4%)	0 (0)	1 (0.5%)
20–44	185 (16.9%)	125 (17.7%)	31 (17.5%)	29 (13.9%)
45–64	602 (55.1%)	379 (53.7%)	97 (54.8%)	126 (60.3%)
≥65	301 (27.6%)	199 (28.2%)	49 (27.7%)	53 (25.4%)
Sex, n (%)					0.187
Female	336 (30.8%)	211 (29.9%)	50 (28.2%)	75 (35.9%)
Male	756 (69.2%)	495 (70.1%)	127 (71.8%)	134 (64.1%)
Hepatitis B, n (%)					0.845
Yes	667 (61.6%)	430 (61.3%)	107 (61.1%)	130 (63.4%)
No	415 (38.4%)	272 (38.7%)	68 (38.9%)	75 (36.6%)
AFP, n (%)					0.015
<20 ng/mL	698 (63.9%)	454 (64.3%)	116 (65.5%)	128 (61.2%)
≥20 ng/mL	367 (33.6%)	240 (34.0%)	59 (33.3%)	68 (32.5%)
Unknown	27 (2.5%)	12 (1.7%)	2 (1.1%)	13 (6.2%)
ALT, U/L (range)	25 (3−747)	25 (4−77)	24 (3−294)	28 (9−406)	0.032
AST, U/L (range)	27 (8−1232)	25 (8−1232)	25 (11−363)	31.2 (15−278)	<0.001
TBIL, μmol/L (range)	13.6 (2.1–490.4)	13.7 (3.7–490.4)	13.0 (2.1–85.0)	14.0 (4.9–454.8)	0.465
ALB, g/L (range)	43.7 (21.6–65.0)	44.2 (21.6–52.8)	43.4 (23.6–52.5)	31.2 (24.5–65.0)	<0.001
Cirrhosis, n (%)					<0.001
Yes	548 (50.2%)	319 (45.2%)	83 (46.9%)	114 (54.5%)
No	516 (47.3%)	384 (54.4%)	92 (52.0%)	72 (34.4%)
Unknown	28 (2.6%)	3 (0.4%)	2 (1.1%)	23 (11%)
Diseases, n (%)					0.002
HCC	790 (72.3%)	527 (74.6%)	132 (74.6%)	131 (62.7%)
Benign	302 (27.7%)	179 (25.4%)	45 (25.4%)	78 (37.3%)
BCLC^a, n (%)					0.918
0	117 (14.8%)	76 (14.4%)	22 (16.7%)	19 (14.5%)
A	501 (63.4%)	336 (63.8%)	85 (64.4%)	80 (61.1%)
B	68 (8.6%)	46 (8.7%)	11 (8.3%)	11 (8.4%)
C/D	104 (13.2%)	69 (13.1%)	14 (10.6%)	21 (16.0%)

AFP, alpha-fetoprotein; ALT, alanine transaminase; AST, aspartate aminotransferase; TBIL, total bilirubin; ALB, albumin; HCC, hepatocellular carcinoma; BCLC, Barcelona Clinic Liver Cancer.

^aBCLC was analyzed among patients with HCC.

Performance of the EHR model

The Boruta algorithm selected three variables: age, sex, and HBV infection. Based on the performance of the models in the training cohort, we selected XGBoost as the algorithm for model construction (Supplementary Table 2, http://links.lww.com/JS9/D842). The EHR model achieved an AUC of 0.934 (95% confidence interval [CI]: 0.912–0.955) in the training cohort, 0.906 (95% CI: 0.850–0.949) in the internal validation cohort, and 0.833 (95% CI: 0.775–0.889) in the external validation cohort.

Performance of the BioScore model

We determined 15 immunological features (CD56, CD161, CD27, CD14, CD279, CD183, CD8a, CD20, CD85J, CD24, CD86, HLA-DR, CD38, and CD11c) for the construction of the BioScore model (Supplementary Table 3, http://links.lww.com/JS9/D842). Additionally, the AFP level was included in the model. The LightGBM algorithm was finally selected based on its best performance in the external validation cohort (Supplementary Table 4, http://links.lww.com/JS9/D842). Patients with HCC had a significantly higher BioScore than those with benign liver diseases (mean, 0.873 vs. 0.367, P < 0.001; Supplementary Table 5, http://links.lww.com/JS9/D842). In the training cohort, the BioScore model performed well, with a sensitivity of 0.977, specificity of 0.754, and AUC of 0.975 (95% CI: 0.964–0.985). The performance was validated in the internal and external cohorts, with AUC values of 0.865 (95% CI: 0.808–0.917) and 0.876 (95% CI: 0.827–0.919), respectively (Table 2).

Table 2.

Performance of models with single-modal features.

Cohort	Model	Accuracy	Precision	Sensitivity	Specificity	NPV	F1-score	AUC	AUPRC
Training cohort	EHR	0.901	0.921	0.949	0.760	0.834	0.935	0.934	0.967
	BioScore	0.921	0.921	0.977	0.754	0.918	0.948	0.975	0.991
	DLScore	0.971	0.977	0.985	0.923	0.950	0.981	0.987	0.995
	RadiomicScore	0.925	0.924	0.983	0.734	0.929	0.952	0.953	0.979
Internal validation cohort	EHR	0.847	0.878	0.924	0.622	0.737	0.900	0.906	0.959
	BioScore	0.802	0.830	0.924	0.444	0.667	0.875	0.865	0.952
	DLScore	0.966	0.966	0.991	0.882	0.968	0.979	0.985	0.995
	RadiomicScore	0.926	0.913	1.000	0.676	1.000	0.954	0.970	0.990
External validation cohort	EHR	0.746	0.729	0.947	0.410	0.821	0.824	0.833	0.883
	BioScore	0.785	0.779	0.916	0.564	0.800	0.842	0.876	0.934
	DLScore	0.810	0.833	0.932	0.450	0.692	0.880	0.819	0.929
	RadiomicScore	0.824	0.836	0.966	0.312	0.714	0.896	0.699	0.890

Performance of the RadiomicScore model

A total of 839 contrast-enhanced CT images were acquired from participants, with 607, 153, and 79 in the training, internal validation, and external validation cohorts, respectively. We extracted 851 radiomic features from the CT scans, 703 of which were significantly different between participants with and without HCC. A LASSO regression model with 24 standardized features was constructed (Fig. 2A and B). After removing two highly correlated features, 22 features were retained for subsequent training of the RadiomicScore model, including original_shape_Elongation, original_glcm_Imc2, original_glcm_MaximumProbability, wavelet-LLH_firstorder_10Percentile, wavelet-LLH_firstorder_Minimum, wavelet-LLH_glszm_SmallAreaHighGrayLevelEmphasis, wavelet-LLH_glszm_ZoneEntropy, wavelet-LLH_ngtdm_Strength, wavelet-LHL_glcm_Idn, wavelet-LHL_glcm_InverseVariance, wavelet-LHH_firstorder_TotalEnergy, wavelet-LHH_glszm_SmallAreaEmphasis, wavelet-LHH_glszm_SmallAreaHighGrayLevelEmphasis, wavelet-HLH_glszm_SmallAreaEmphasis, wavelet-HLH_gldm_DependenceEntropy, wavelet-HHL_firstorder_Median, wavelet-HHL_glcm_ClusterTendency, wavelet-HHL_glcm_MaximumProbability, wavelet-HHH_glszm_GrayLevelNonUniformityNormalized, wavelet-LLL_glcm_SumAverage, wavelet-LLL_glszm_GrayLevelVariance, and wavelet-LLL_glszm_SizeZoneNonUniformityNormalized (Fig. 2C; Supplementary Table 6, http://links.lww.com/JS9/D842).

Figure 2. — Selection of radiomic features in the training cohort. (A) Five-fold cross-validation in the least absolute shrinkage and selection operator (LASSO) model. For each λ value, a confidence interval for the target parameter was obtained around the mean indicated by the red dots. The two dashed lines mark two specific λ values: lambda.min, the λ value with the minimum cross-validation error, and lambda.1se, the largest λ value whose cross-validation error is within one standard error of that minimum. (B) LASSO coefficient profiles. (C) Pearson correlation coefficients of the 24 features selected by LASSO regression. LLH_S, wavelet.LLH_ngtdm_Strength; LLH_M, wavelet.LLH_firstorder_Minimum; LLH_ZE, wavelet.LLH_glszm_ZoneEntropy; LLH_SAE, wavelet.LHH_glszm_SmallAreaEmphasis; LLL_SA, wavelet.LLL_glcm_SumAverage; LHH_TE, wavelet.LHH_firstorder_TotalEnergy; HLH_SAE, wavelet.HLH_glszm_SmallAreaEmphasis; HLH_DE, wavelet.HLH_gldm_DependenceEntropy; HHL_MP, wavelet.HHL_glcm_MaximumProbability; ori_MP, original_glcm_MaximumProbability; ori_SVR, original_shape_SurfaceVolumeRatio; HLH_TE, wavelet.HLH_firstorder_TotalEnergy; HHL_M, wavelet.HHL_firstorder_Median; LLL_GLV, wavelet.LLL_glszm_GrayLevelVariance; HHH_GLV, wavelet.HHH_glszm_GrayLevelVariance; HHH_GLNU, wavelet.HHH_glszm_GrayLevelNonUniformityNormalized; LHL_Idn, wavelet.LHL_glcm_Idn; ori_E, original_shape_Elongation; LLL_SZNN, wavelet.LLL_glszm_SizeZoneNonUniformityNormalized; LHL_IV, wavelet.LHL_glcm_InverseVariance; LLH_SAHG, wavelet.LLH_glszm_SmallAreaHighGrayLevelEmphasis; HHL_CT, wavelet.HHL_glcm_ClusterTendency; LLH_10P, wavelet.LLH_firstorder_10Percentile; ori_Imc2, original_glcm_Imc2.

We selected the XGBoost algorithm for model construction (Supplementary Table 7, http://links.lww.com/JS9/D842). The RadiomicScore in participants with HCC was significantly higher than that in those without HCC (mean, 0.827 vs. 0.411, P < 0.001; Supplementary Table 5, http://links.lww.com/JS9/D842). The RadiomicScore model achieved AUCs of 0.953 (95% CI: 0.927–0.976) in the training cohort, 0.970 (95% CI: 0.937–0.995) in the internal validation cohort, and 0.699 (95% CI: 0.529–0.833) in the external validation cohort, respectively (Table 2).

Performance of the DLScore model

The DLScore demonstrated discriminative ability in malignant and benign liver diseases (mean, 0.911 vs. 0.230, P < 0.001; Supplementary Table 5, http://links.lww.com/JS9/D842), and the DLScore model reached an AUC of 0.987 (95% CI: 0.975–0.996) in the training cohort. Additionally, the AUC values were 0.985 (95% CI: 0.956–0.999) and 0.819 (95% CI: 0.696–0.917) in the internal and external validation cohorts, respectively (Table 2).

Performance of the MMF model

The XGBoost and LightGBM methods demonstrated similar performance in integrating the models (Supplementary Table 8, http://links.lww.com/JS9/D842). Because XGBoost is the more commonly used of the two^[29], we used the XGBoost algorithm for subsequent analysis.

The BioScore model outperformed the EHR, DLScore, and RadiomicScore models in terms of performance and robustness, indicating the significant influence of peripheral immunological characteristics in the early diagnosis of HCC. Additionally, the DLScore model performed well in both the training and internal validation cohorts, whereas the performance of RadiomicScore was marginally inferior to that of the others.

Intriguingly, we discovered that the models with multimodal information were more efficient than those with single-modal features. Particularly, the all-in-one MMF model integrated with EHR features, BioScore, DLScore, and RadiomicScore performed the best, with an AUC of 1.000 (95% CI: 1.000–1.000) in the training cohort, 0.985 (95% CI: 0.969–0.996) in the internal validation cohort, and 0.915 (95% CI: 0.874–0.949) in the external validation cohort (Fig. 3; Table 3; Supplementary Table 9, http://links.lww.com/JS9/D842). Additionally, its accuracy was 1.000, 0.938, and 0.852 in the three cohorts, respectively, the highest among all models in the training and external validation cohorts (Table 3). The performance of the all-in-one MMF model was validated by the lowest Brier score, Eavg, and estimated calibration index in the validation cohorts (internal, 0.037, 0.021, and 0.121; external, 0.132, 0.100, and 1.225; Supplementary Table 10, http://links.lww.com/JS9/D842). Furthermore, the calibration curves of the all-in-one MMF model revealed good agreement between the predicted probability of HCC and the observed outcomes (Fig. 4A–C). The decision curves also revealed that the all-in-one MMF model provided a good net benefit (Fig. 4D–F).
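For readers less familiar with the calibration and decision-curve quantities, the Brier score and the net benefit at a single threshold probability can each be computed in a few lines. The sketch below uses made-up predictions and the standard definitions of both quantities; it does not cover Eavg or the estimated calibration index, and it is not the R code (“CalibrationCurves,” “dcurves”) used in the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def net_benefit(probabilities, outcomes, threshold):
    """Decision-curve net benefit when everyone with p >= threshold is flagged."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


# Made-up predictions for five participants (1 = HCC); not study data.
probs = [0.9, 0.8, 0.6, 0.3, 0.1]
truth = [1, 1, 0, 1, 0]
bs = brier_score(probs, truth)
nb = net_benefit(probs, truth, threshold=0.5)
```

A lower Brier score indicates better-calibrated and sharper predictions; a decision curve is obtained by evaluating `net_benefit` over a range of thresholds.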

Figure 3. — Comparison of area under curve (AUC) values among different models. (A) Training cohort. (B) Internal validation cohort. (C) External validation cohort.

Table 3.

Performance of models with multimodal features.

Cohort	Model	Accuracy	Precision	Sensitivity	Specificity	NPV	F1-score	AUC	AUPRC
Training cohort	EHR	0.901	0.921	0.949	0.760	0.834	0.935	0.931	0.965
	BioScore	0.921	0.921	0.977	0.754	0.918	0.948	0.975	0.991
	RadiomicScore	0.925	0.924	0.983	0.734	0.929	0.952	0.953	0.979
	DLScore	0.971	0.977	0.985	0.923	0.950	0.981	0.987	0.995
	EHR + BioScore	0.969	0.974	0.985	0.922	0.954	0.979	0.990	0.996
	EHR + RadiomicScore	0.969	0.981	0.977	0.944	0.934	0.979	0.993	0.997
	EHR + DLScore	0.958	0.964	0.979	0.894	0.936	0.972	0.991	0.997
	BioScore + RadiomicScore	0.999	0.998	1.000	0.994	1.000	0.999	1.000	1.000
	BioScore + DLScore	0.986	0.981	1.000	0.944	1.000	0.991	0.999	1.000
	RadiomicScore + DLScore	0.931	0.919	0.994	0.743	0.978	0.955	0.977	0.988
	EHR + BioScore + RadiomicScore	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
	EHR + BioScore + DLScore	0.990	0.991	0.996	0.972	0.989	0.993	0.999	1.000
	EHR + RadiomicScore + DLScore	0.969	0.974	0.985	0.922	0.954	0.979	0.996	0.998
	BioScore + RadiomicScore + DLScore	0.986	0.983	0.998	0.950	0.994	0.991	0.999	1.000
	EHR + BioScore + RadiomicScore + DLScore	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
Internal validation cohort	EHR	0.847	0.878	0.924	0.622	0.737	0.900	0.909	0.960
	BioScore	0.802	0.830	0.924	0.444	0.667	0.875	0.865	0.952
	RadiomicScore	0.926	0.913	1.000	0.676	1.000	0.954	0.970	0.990
	DLScore	0.966	0.966	0.991	0.882	0.968	0.979	0.985	0.995
	EHR + BioScore	0.825	0.863	0.909	0.578	0.684	0.886	0.920	0.974
	EHR + RadiomicScore	0.938	0.929	0.992	0.778	0.972	0.960	0.972	0.990
	EHR + DLScore	0.938	0.929	0.992	0.778	0.972	0.960	0.982	0.993
	BioScore + RadiomicScore	0.910	0.926	0.955	0.778	0.854	0.940	0.935	0.976
	BioScore + DLScore	0.932	0.923	0.992	0.756	0.971	0.956	0.967	0.988
	RadiomicScore + DLScore	0.898	0.885	0.992	0.622	0.966	0.936	0.961	0.979
	EHR + BioScore + RadiomicScore	0.893	0.931	0.924	0.800	0.783	0.928	0.966	0.988
	EHR + BioScore + DLScore	0.960	0.950	1.000	0.844	1.000	0.974	0.978	0.992
	EHR + RadiomicScore + DLScore	0.944	0.930	1.000	0.778	1.000	0.964	0.989	0.996
	BioScore + RadiomicScore + DLScore	0.932	0.929	0.985	0.778	0.946	0.956	0.976	0.992
	EHR + BioScore + RadiomicScore + DLScore	0.938	0.935	0.985	0.800	0.947	0.959	0.985	0.995
External validation cohort	EHR	0.746	0.729	0.947	0.410	0.821	0.824	0.853	0.903
	BioScore	0.785	0.779	0.916	0.564	0.800	0.842	0.876	0.934
	RadiomicScore	0.824	0.836	0.966	0.312	0.714	0.896	0.699	0.890
	DLScore	0.810	0.833	0.932	0.450	0.692	0.880	0.819	0.929
	EHR + BioScore	0.818	0.812	0.924	0.641	0.833	0.864	0.893	0.919
	EHR + RadiomicScore	0.737	0.741	0.893	0.474	0.725	0.810	0.845	0.911
	EHR + DLScore	0.708	0.703	0.924	0.346	0.730	0.799	0.857	0.919
	BioScore + RadiomicScore	0.799	0.820	0.870	0.679	0.757	0.844	0.860	0.898
	BioScore + DLScore	0.799	0.799	0.908	0.615	0.800	0.850	0.853	0.895
	RadiomicScore + DLScore	0.641	0.640	0.977	0.077	0.667	0.773	0.671	0.760
	EHR + BioScore + RadiomicScore	0.847	0.846	0.924	0.718	0.848	0.883	0.912	0.944
	EHR + BioScore + DLScore	0.813	0.803	0.931	0.615	0.842	0.862	0.903	0.938
	EHR + RadiomicScore + DLScore	0.722	0.711	0.939	0.359	0.778	0.809	0.867	0.925
	BioScore + RadiomicScore + DLScore	0.785	0.783	0.908	0.577	0.789	0.841	0.868	0.911
	EHR + BioScore + RadiomicScore + DLScore	0.852	0.852	0.924	0.731	0.851	0.886	0.915	0.947

Figure 4. — Calibration curves (A–C) and decision curves (D–F) for different models. (A and D) Training cohort. (B and E) Internal validation cohort. (C, F) External validation cohort.
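
Decision curves conventionally plot net benefit against the threshold probability at which a patient would be classified as positive. As a minimal sketch of that calculation (not the authors' code), the Python snippet below applies the standard decision curve analysis formula to a model and to the "treat all" reference strategy. The labels, probabilities, and function names are hypothetical and serve only as illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of calling positive every patient with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold: how much one false positive costs relative to one true positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy in which every patient is called positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical labels (1 = HCC, 0 = non-HCC) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    print(f"threshold={pt}: model={model_nb:.3f}, treat-all={all_nb:.3f}")
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the "treat none" strategy).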

Model interpretability

We used SHAP analysis to assess the contribution of each feature to the MMF model^[30]. BioScore, DLScore, RadiomicScore, HBV, sex, and age were identified as the key features for interpreting the all-in-one MMF model (Fig. 5). In the quantitative assessment of feature weights, BioScore had the highest SHAP value for detecting HCC, followed by DLScore and RadiomicScore. The SHAP dependence plots depicted how SHAP values changed as feature values varied and how changes in an individual feature affected the model output (Fig. 6). SHAP values tended to increase as feature values rose. A SHAP value greater than zero indicated a higher risk of HCC. For instance, patients with a BioScore greater than 0.7 typically had SHAP values greater than zero, suggesting that they were more likely to have HCC. The waterfall plots displayed the measured feature values and their contributions for two patients without HCC (Fig. 7A and B) and two patients with HCC (Fig. 7C and D). Higher BioScore, DLScore, and RadiomicScore values shifted the model’s decision toward the “HCC” category.

Figure 5. — SHAP summary plot of the fusion model. Each point represents a sample, and its position on the horizontal axis indicates the SHAP value of the feature. The color reflects the relative magnitude of the feature, with red indicating high values (HCC in this study) and blue indicating low values. A higher feature value corresponds to SHAP values greater than 0, indicating a higher likelihood of the patient being an HCC patient.

Figure 6. — SHAP dependence plot of the fusion model. The SHAP dependence plot illustrates how SHAP values change with variations in specific features.

Figure 7. — Waterfall plots of each patient’s feature values and their contributions to model decision-making. (A, B) Two patients without HCC. (C, D) Two patients with HCC.
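
The per-patient contributions shown in waterfall plots rest on Shapley values. The sketch below computes them exactly by averaging each feature's marginal contribution over all feature orderings. It is a brute-force illustration in plain Python, not the SHAP library and not the fitted MMF model: `toy_model`, the baseline, and the patient values are hypothetical assumptions.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features that have not yet joined the coalition keep their baseline value.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = x[name]          # add this feature to the coalition
            value = predict(current)
            totals[name] += value - previous  # marginal contribution in this ordering
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(f):
    # Hypothetical scoring function with one interaction term (illustration only).
    return 0.5 * f["BioScore"] + 0.3 * f["DLScore"] + 0.2 * f["BioScore"] * f["RadiomicScore"]


baseline = {"BioScore": 0.0, "DLScore": 0.0, "RadiomicScore": 0.0}
patient = {"BioScore": 0.8, "DLScore": 0.5, "RadiomicScore": 1.0}

phi = shapley_values(toy_model, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.2f}")  # roughly BioScore 0.48, DLScore 0.15, RadiomicScore 0.08
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the property that lets a waterfall plot start from the baseline output and end at the model output. Exhaustive enumeration is feasible only for a handful of features; practical tools use faster algorithms.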

Performance of the models in subgroups

To determine how well the models distinguished HCC from other types of liver lesions, we assessed their predictive efficacy in lesion-specific subgroups (Supplementary Table 11, http://links.lww.com/JS9/D842). The all-in-one MMF model remained the best model for differentiating HCC from hepatic cysts (accuracy: 0.759) and from hepatic nodules (accuracy: 0.833). However, a model integrating EHR features, BioScore, and DLScore demonstrated the best performance in detecting HCC (accuracy: 0.986), and a model integrating EHR features, BioScore, and RadiomicScore reduced the probability of misdiagnosing hepatic hemangioma as HCC (accuracy: 0.920).

Additionally, the all-in-one MMF model improved detection accuracy for small HCCs, reaching 0.882 and 0.977 for HCCs smaller than 1 and 2 cm, respectively (Supplementary Table 12, http://links.lww.com/JS9/D842). Similarly, the all-in-one MMF model significantly outperformed the other models in both the AFP-positive (AUC, 0.991; 95% CI: 0.968–1.000; Supplementary Table 13, http://links.lww.com/JS9/D842) and AFP-negative subgroups (AUC, 0.974; 95% CI: 0.964–0.982). Thus, its improvement over the other models remained significant even for AFP-negative HCCs.
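
For readers who want to reproduce this kind of subgroup summary, the sketch below shows one common way to obtain an AUC with a 95% confidence interval: the rank-based (Mann-Whitney) AUC and a percentile bootstrap. This is an assumption for illustration; the text does not state how the reported intervals were computed, and the data and function names here are hypothetical.

```python
import random


def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    indices = range(len(y_true))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(indices) for _ in indices]  # resample patients with replacement
        ys = [y_true[i] for i in sample]
        if 0 < sum(ys) < len(ys):  # AUC needs both classes in the resample
            stats.append(auc(ys, [y_score[i] for i in sample]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical subgroup: labels (1 = HCC) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

point = auc(y_true, y_score)  # 0.9375: 15 of the 16 positive/negative pairs are ordered correctly
lower, upper = bootstrap_auc_ci(y_true, y_score)
print(f"AUC {point:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Applying the same two functions to each subgroup separately (for example, AFP-positive and AFP-negative patients) yields one estimate and interval per subgroup.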

Discussion

Accurate and early detection of HCC is critical for improving its prognosis. Because HCC is a systemic disease, integrating multimodal information is considered a valuable approach to promoting its early detection. Unfortunately, the lack of comprehensive data from large cohorts has hindered progress. Additionally, the integration methods available over the past few decades have been inadequate. In our previous study, we constructed a large cohort with comprehensive information by prospectively gathering samples for CyTOF analysis and acquiring the participants’ other data. Furthermore, the development of AI algorithms has made it easier to integrate data across modalities and to improve detection efficacy.

In this study, we established a large cohort of participants and gathered clinical data, peripheral immunological status, and CT imaging as multimodal features to construct an integrated model for improved HCC detection. As expected, the all-in-one MMF model outperformed all single-modal models in overall performance, supporting the concept that accumulating comprehensive data can improve cancer diagnosis. Importantly, most of the patients with HCC in this study were in the early stages, indicating that our MMF model performed well in the early detection of HCC. Furthermore, it continued to show superior performance in AFP-negative and small HCCs, which are considered challenging to detect.

Collecting multimodal data is costly and may lead to overdiagnosis. In addition to the standard data of clinical practice, we chose peripheral immune status because, as regular evaluation has indicated, it offers indirect signs of a tumor and may supplement direct signals^[31,32]. This rationale was preliminarily supported by our previous study^[11]. In this study, we further expanded the types of data and attempted to integrate them using AI techniques. We also applied strict data cleansing, integration, and reduction processes to enhance data quality. The most flexible and effective modeling method was selected from multiple machine learning algorithms. In addition, the XGBoost method used in this study performed satisfactorily in integrating the models and generating prediction values^[29].
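
The integration step described above is a form of late fusion: each modality first yields a score, and a second-stage classifier combines the scores into one prediction. The sketch below illustrates the idea only. The study used XGBoost for this stage; to stay runnable with the standard library alone, the example substitutes a plain logistic regression trained by gradient descent. All scores, labels, and function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_fusion(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights and bias by batch gradient descent."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def fuse(w, b, x):
    """Fused probability of HCC for one patient's modality scores."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Hypothetical per-patient inputs: [BioScore, RadiomicScore, DLScore]; label 1 = HCC.
scores = [
    [0.9, 0.8, 0.7], [0.8, 0.6, 0.9], [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.1], [0.1, 0.4, 0.2], [0.3, 0.2, 0.3],
]
labels = [1, 1, 1, 0, 0, 0]

w, b = fit_fusion(scores, labels)
for x, y in zip(scores, labels):
    print(y, round(fuse(w, b, x), 3))
```

A tree-based learner such as XGBoost can additionally capture nonlinear interactions between scores and clinical variables, which a linear stand-in like this one cannot.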

Our radiological data analyses had two highlights. First, we extracted features based on the whole-liver segmentation method, which has been established for evaluating chronic liver diseases and predicting liver metastases originating from other cancers^[33–36]. In addition to the features of liver lesions, whole-liver segmentation considers the “soil” associated with the onset and progression of HCC, which may offer more information for early detection of HCC^[21]. Second, we employed both DL and radiomics. In addition to classical image features extracted from traditional radiomics, DL offered new deep features. Thus, DL and radiomics can complement each other and eventually improve performance.

SHAP methods have previously been used to reveal the importance of variables in predicting HCC recurrence^[37,38]. We also used the SHAP method to interpret the MMF model, which offers both global and local explanations. Global explanations provide consistent and accurate attribution values for every feature in the model, demonstrating the associations between features and HCC. Local explanations show how a specific prediction is reached for an individual patient once that patient’s data are entered. The SHAP methods aid in visualizing the MMF model and contribute to a better understanding of its decision-making for HCC. Specifically, the risk of HCC was dynamically evaluated based on the Shapley score calculated by assigning quantitative weights, and the immunological status of patients proved highly valuable for early detection of HCC, which reinforces the findings of our previous study and emphasizes the need to evaluate immune status. Additionally, no authoritative statement or consensus had previously indicated that DL is fundamentally superior to radiomics. However, in our study, DL demonstrated comparable or even better performance, revealing additional information that cannot be extracted through radiomics. The information from each modality in this study contributed to improving early detection of HCC, and their combination achieved an additive effect.

This study had certain limitations. First, we recruited all participants from secondary and tertiary hospitals in China; thus, the proportion and nature of HCC do not accurately represent the general population. Second, the multimodal data were subject to selection bias and incompleteness. We chose CyTOF analysis based on our previous experience; nevertheless, there might be other suitable choices. Moreover, CyTOF has not yet been extensively applied in clinical practice and will increase the cost of tumor screening. Hence, multicenter prospective studies involving centers of various levels and regions must be conducted both to confirm the performance of our model in more general scenarios and to broaden the application of CyTOF. Third, the limited interpretability of the MMF model presents difficulties for clinical application. Although we adopted the SHAP method to visualize the black box, the underlying mechanisms and accurate interpretation of the model need further exploration.

In summary, we constructed an MMF model integrating clinical characteristics, radiologic features, and peripheral immunological status using machine learning methods. The MMF model outperformed all single-modal models. Multimodal feature fusion may enable more precise detection of HCC in clinical practice.

Footnotes

Y.W., S.C., and Y.T. contributed equally to this work.

Sponsorships or competing interests that may be relevant to content are disclosed at the end of this article.

Supplemental Digital Content is available for this article. Direct URL citations are provided in the HTML and PDF versions of this article on the journal’s website, www.lww.com/international-journal-of-surgery.

Contributor Information

Yangyang Wang, Email: 22118030@zju.edu.cn.

Shengqiang Chi, Email: sqchi1991@163.com.

Yu Tian, Email: tyler@zju.edu.cn.

Hang Zhang, Email: zhangh02@zhejianglab.com.

Chao Huang, Email: huangchao@zhejianglab.com.

Yiwei Gao, Email: gaoyw@zhejianglab.com.

Gaowei Jin, Email: jaywzz2018@163.com.

Wanyue Cao, Email: caowy423@163.com.

Cao Chen, Email: cc929cccc@163.com.

Haonan Ding, Email: 22318519@zju.edu.cn.

Yuquan Zhang, Email: 22318557@zju.edu.cn.

Yupeng Hong, Email: hypbdn@zju.edu.cn.

Xu Sun, Email: sx-phoenix@126.com.

Yuhua Zhang, Email: 22318557@zju.edu.cn.

Weiyun Yao, Email: changxingywy@163.com.

Runtian Liu, Email: 873879757@qq.com.

Yongfei Hua, Email: huayongfei@aliyun.com.

Haifeng Huang, Email: 18241425@qq.com.

Bo Zhang, Email: 358148353@qq.com.

Weifeng Tao, Email: 13857595014@163.com.

Tianxing Yang, Email: ytx126@126.com.

Yuming Gao, Email: gaoyw@zhejianglab.com.

Xiaoguang Wang, Email: xiaoguangwangs@163.com.

Cheng Lin, Email: lincheng@plttech.com.

Qi Zhang, Email: qi.zhang@zju.edu.cn.

Tingbo Liang, Email: liangtingbo@zju.edu.cn.

Ethical approval

This study protocol was reviewed and approved by the ethical committee of FAHZU (No. IIT202304908B).

Consent

Written informed consent was obtained from the patient for publication of this case report and accompanying images. A copy of the written consent is available for review by the Editor-in-Chief of this journal on request.

Sources of funding

This work was financially supported by the National Key Research & Development Program (No. 2020YFA0804300/2020YFA0804301), the National Natural Science Foundation of China (Nos. 82473461, 82188102, 82172069, 92359304 and 32321002), Zhejiang Provincial Natural Science Foundation (No. HDMD22H319373), Zhejiang Province Science and Technology Major Progress (No. WKJ-ZJ-2403), and Zhejiang Provincial Traditional Chinese Medicine Science and Technology Project (No. GZY-ZJ-KJ-23025). Dr. Qi Zhang also gratefully acknowledges the support of K.C.Wong Education Foundation.

Author contributions

T.L., Q.Z., and J.L. conceived and designed the study. T.L. and Q.Z. supervised the study. Y.W., S.C., Y.T., Q.F., W.C., C.C., H.D., Y.Z., Y.H., J.L., X.S., E.L., Y.Z., W.Y., R.L., Y.H., H.H., M.X., B.Z., W.T., T.Y., Y.G., and X.W. collected the data. Y.W., S.C., Y.T., X.L., H.Z., C.H., Y.G., and C.L. performed the statistical analysis. Y.W., S.C., and Y.T. drafted the manuscript. All the authors made critical revisions and approved the final version of the manuscript.

Conflicts of interest disclosure

The authors have no conflicts of interest to declare.

Research registration unique identifying number (UIN)

NCT06637059.

Guarantor

Tingbo Liang and Qi Zhang.

Provenance and peer review

Not commissioned, externally peer-reviewed.

Data availability statement

The data in this study are available from the corresponding author on reasonable request.

References

[1] Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021;71:209–49.
[2] Villanueva A. Hepatocellular carcinoma. N Engl J Med 2019;380:1450–62.
[3] Llovet JM, Zucman-Rossi J, Pikarsky E, et al. Hepatocellular carcinoma. Nat Rev Dis Primers 2016;2:16018.
[4] Singal AG, Llovet JM, Yarchoan M, et al. AASLD Practice Guidance on prevention, diagnosis, and treatment of hepatocellular carcinoma. Hepatology 2023;78:1922–65.
[5] Adeniji N, Dhanasekaran R. Current and emerging tools for hepatocellular carcinoma surveillance. Hepatol Commun 2021;5:1972–86.
[6] Galle PR, Foerster F, Kudo M, et al. Biology and significance of alpha-fetoprotein in hepatocellular carcinoma. Liver Int 2019;39:2214–29.
[7] Trevisani F, D’Intino PE, Morselli-Labate AM, et al. Serum α-fetoprotein for diagnosis of hepatocellular carcinoma in patients with chronic liver disease: influence of HBsAg and anti-HCV status. J Hepatol 2001;34:570–75.
[8] Zhang X, Wang Z, Tang W, et al. Ultrasensitive and affordable assay for early detection of primary liver cancer using plasma cell-free DNA fragmentomics. Hepatology 2022;76:317–29.
[9] Iyer A, Hamers AAJ, Pillai AB. CyTOF® for the Masses. Front Immunol 2022;13:815828.
[10] Shi J, Liu J, Tu X, et al. Single-cell immune signature for detecting early-stage HCC and early assessing anti-PD-1 immunotherapy efficacy. J Immunother Cancer 2022;10:e003133.
[11] Zhang Q, Ye M, Lin C, et al. Mass cytometry-based peripheral blood analysis as a novel tool for early detection of solid tumours: a multicentre study. Gut 2023;72:996–1006.
[12] Harding-Theobald E, Louissaint J, Maraj B, et al. Systematic review: radiomics for the diagnosis and prognosis of hepatocellular carcinoma. Aliment Pharmacol Ther 2021;54:890–901.
[13] Gu W, Chen Y, Zhu H, et al. Development and validation of CT-based radiomics deep learning signatures to predict lymph node metastasis in non-functional pancreatic neuroendocrine tumors: a multicohort study. EClinicalMedicine 2023;65:102269.
[14] Dong D, Fang MJ, Tang L, et al. Deep learning radiomic nomogram can predict the number of lymph node metastasis in locally advanced gastric cancer: an international multicenter study. Ann Oncol 2020;31:912–20.
[15] Zhong J, Shi J, Amundadottir LT. Artificial intelligence and improved early detection for pancreatic cancer. Innov (Camb) 2023;4:100457.
[16] Taylor-Phillips S, Seedat F, Kijauskaite G, et al. UK National Screening Committee’s approach to reviewing evidence on artificial intelligence in breast cancer screening. Lancet Digit Health 2022;4:e558–e65.
[17] He J, Wang B, Tao J, et al. Accurate classification of pulmonary nodules by a combined model of clinical, imaging, and cell-free DNA methylation biomarkers: a model development and external validation study. Lancet Digit Health 2023;5:e647–e56.
[18] Cai G, Huang F, Gao Y, et al. Artificial intelligence-based models enabling accurate diagnosis of ovarian cancer using laboratory tests in China: a multicentre, retrospective cohort study. Lancet Digit Health 2024;6:e176–e86.
[19] Zhou J, Sun HC, Wang Z, et al. Guidelines for diagnosis and treatment of primary liver cancer in China (2017 Edition). Liver Cancer 2018;7:235–60.
[20] Rashid R, Sohrabi C, Kerwan A, et al. The STROCSS 2024 guideline: strengthening the reporting of cohort, cross-sectional, and case-control studies in surgery. Int J Surg 2024;110:3151–65.
[21] Huang C, Hu P, Tian Y, et al. Mining whole-liver information with deep learning for preoperatively predicting HCC recurrence-free survival. Annu Int Conf IEEE Eng Med Biol Soc 2023;2023:1–4.
[22] Van Griethuysen JJ, Fedorov A, Parmar C, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res 2017;77:e104–e7.
[23] Zwanenburg A, Vallieres M, Abdalah MA, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 2020;295:328–38.
[24] Blu T, Thevenaz P, Unser M. Linear interpolation revitalized. IEEE Trans Image Process 2004;13:710–19.
[25] Xu W, Fu YL, Zhu D. ResNet and its application to medical image processing: research progress and challenges. Comput Methods Programs Biomed 2023;240:107660.
[26] Luo Y. Evaluating the state of the art in missing data imputation for clinical data. Brief Bioinform 2022;23:bbab489.
[27] Ke GL, Meng Q, Finley T, et al. LightGBM: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 2017;30:3146–54.
[28] Lundberg S, Lee SI. A unified approach to interpreting model predictions. arXiv. 2017.
[29] Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; San Francisco, California, USA: Association for Computing Machinery; 2016. p. 785–94.
[30] Lundberg S, Lee S. A unified approach to interpreting model predictions. Neural Inf Process Syst 2017;30:4768–77.
[31] Bendall SC, Simonds EF, Qiu P, et al. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 2011;332:687–96.
[32] Steele NG, Carpenter ES, Kemp SB, et al. Multimodal mapping of the tumor and peripheral blood immune landscape in human pancreatic cancer. Nat Cancer 2020;1:1097–112.
[33] Marti-Aguado D, Jimenez-Pastor A, Alberich-Bayarri A, et al. Automated whole-liver MRI segmentation to assess steatosis and iron quantification in chronic liver disease. Radiology 2022;302:345–54.
[34] Cui E, Long W, Wu J, et al. Predicting the stages of liver fibrosis with multiphase CT radiomics based on volumetric features. Abdom Radiol (NY) 2021;46:3866–76.
[35] Beckers RCJ, Lambregts DMJ, Schnerr RS, et al. Whole liver CT texture analysis to predict the development of colorectal liver metastases-a multicentre study. Eur J Radiol 2017;92:64–71.
[36] Gebauer L, Moltz JH, Muhlberg A, et al. Quantitative imaging biomarkers of the whole liver tumor burden improve survival prediction in metastatic pancreatic cancer. Cancers (Basel) 2021;13:5732.
[37] Altaf A, Mustafa A, Dar A, et al. Artificial intelligence-based model for the recurrence of hepatocellular carcinoma after liver transplantation. Surgery 2024;176:1500–06.
[38] Nam JY, Lee JH, Bae J, et al. Novel model to predict HCC recurrence after liver transplantation obtained using deep learning: a multicenter study. Cancers (Basel) 2020;12:2791.

