Journal of Imaging Informatics in Medicine
. 2024 Mar 4;37(4):1297–1311. doi: 10.1007/s10278-024-01058-1

Diagnostic Performance of Artificial Intelligence in Detection of Hepatocellular Carcinoma: A Meta-analysis

Mohammad Amin Salehi 1, Hamid Harandi 1,2, Soheil Mohammadi 1,, Mohammad Shahrabi Farahani 3, Shayan Shojaei 1, Ramy R Saleh 4,5
PMCID: PMC11300422  PMID: 38438694

Abstract

Due to the increasing interest in the use of artificial intelligence (AI) algorithms in hepatocellular carcinoma (HCC) detection, we performed a systematic review and meta-analysis to pool the data on diagnostic performance metrics of AI and to compare them with clinicians’ performance. A search in PubMed and Scopus was performed in January 2024 to find studies that evaluated and/or validated an AI algorithm for the detection of HCC. We performed a meta-analysis to pool the data on the metrics of diagnostic performance. Subgroup analysis based on imaging modality and meta-regression based on multiple parameters were performed to find potential sources of heterogeneity. Reporting quality and risk of bias were assessed using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) and Prediction Model Study Risk of Bias Assessment Tool (PROBAST) guidelines. Out of 3177 studies screened, 44 eligible studies were included. The pooled sensitivity and specificity for internally validated AI algorithms were 84% (95% CI: 81, 87) and 92% (95% CI: 90, 94), respectively. Externally validated AI algorithms had a pooled sensitivity of 85% (95% CI: 78, 89) and specificity of 84% (95% CI: 72, 91). When clinicians were internally validated, their pooled sensitivity was 70% (95% CI: 60, 78), while their pooled specificity was 85% (95% CI: 77, 90). This study implies that AI can serve as a diagnostic supplement for clinicians and radiologists by screening images and highlighting regions of interest, thus improving workflow.

Supplementary Information

The online version contains supplementary material available at 10.1007/s10278-024-01058-1.

Keywords: Hepatocellular carcinoma, HCC, Artificial intelligence, Meta-analysis

Introduction

Globally, primary liver cancers rank sixth in terms of incidence and third in terms of cancer-related mortality [1]. According to the GLOBOCAN 2018 database, liver cancer is detected in around 800,000 new cases per year, leading to more than 700,000 fatalities [2]. Hepatocellular carcinoma (HCC), also known as hepatoma, is the most prevalent primary malignant tumor of the liver, accounting for approximately 70 to 80% of all such malignancies [3]. Fortunately, individuals diagnosed with HCC today have a range of therapeutic modalities available to them, including chemotherapy, radiation therapy, surgical resection, percutaneous ablation, and liver transplantation [4].

As with most diseases of a similar nature, early diagnosis can assist in determining the most appropriate course of treatment and, as a consequence, improve the patient's prognosis with regard to mortality [5–7] and morbidity [7, 8]. Diagnostic radiology plays a crucial role in the early detection of HCC. In a meta-analysis of eighteen studies, Liang et al. reported the pooled sensitivity and specificity of LI-RADS ≥ 3 for HCC diagnosis as 0.86 (95% CI: 0.78–0.91) and 0.85 (95% CI: 0.78–0.90), respectively [9].

The concept of artificial intelligence (AI) refers to machines performing tasks that typically require human intelligence. Machine learning is an essential component of AI that enables devices to acquire knowledge by analyzing data, identifying patterns, and making decisions with minimal human intervention [10]. Deep learning (DL) is a subset of machine learning that uses three or more layers of neural networks, algorithms loosely modeled on the human neuron [11]; a DL model learns from labeled training data and its outcomes to produce accurate predictions for unseen test data [12]. Numerous radiological tasks, including the detection of abnormal lymph nodes, the assessment of skeletal maturity, and the identification of mammographic lesions, have shown encouraging outcomes following the implementation of AI in radiology [13]. These methods entail applying machine learning algorithms such as support vector machines, random forests, and decision trees to radiomics data, i.e., quantitative characteristics derived from several radiologic modalities. Furthermore, artificial neural networks (ANN) are employed for the analysis of numerical data, while convolutional neural networks (CNN) are utilized for the processing of image data [14].

The incorporation of AI as a supplementary tool for radiologists has undergone a significant transformation over the past few years, potentially representing a paradigm shift in the field [15–17]. Its influence is already being felt, extending to the domain of interventional radiology [18]. The detection and classification of liver lesions are no exception [19, 20]. At the time of this review, the US Food and Drug Administration (FDA) has approved several AI-enhanced software products for medical use [21]. Nonetheless, the practical applicability of this concept is still subject to debate [9, 22–25], and several recent studies have addressed the constraints and challenges associated with the implementation of AI in medicine [26–28].

Accordingly, we conducted a review to systematically assess the published studies that applied artificial intelligence to imaging for the diagnosis of hepatocellular carcinoma. Furthermore, we performed a meta-analysis to determine the accuracy of these systems in identifying HCC and distinguishing it from other forms of liver masses.

Materials and Methods

Protocol and Registration

This systematic review and meta-analysis was prepared in line with the Preferred Reporting Items for Systematic Reviews and Meta-analysis (PRISMA) statement [29]. The authors submitted this review to the International Prospective Register of Systematic Reviews (PROSPERO) website (CRD42023398040). M.A.S. and H.H. conducted primary and full-text screenings. M.S.F. and H.H. extracted data from the articles; these data were rechecked by S.M. Study quality assessment was performed by S.M. and M.A.S.

Search Strategy and Study Selection

To find studies that evaluated an AI algorithm for the detection of HCC, a search in PubMed and Scopus was conducted in September 2022 and updated in January 2024. We utilized the Medical Subject Headings (MeSH) terms “Hepatocellular Carcinoma” (HCC) and “Artificial Intelligence,” as well as their synonyms and narrower terms in the MeSH hierarchy, to search for relevant articles in the online databases (Table E1). We included all original research studies that developed or validated an AI algorithm for detecting HCC among patients with a hepatic mass (e.g., HCC, metastasis, hemangioma, and cysts) or those without any masses. We excluded non-original articles and studies in which true negatives (TN), false positives (FP), true positives (TP), and false negatives (FN) were neither reported nor calculable for a contingency table. No restrictions were applied regarding study language, time, or setting. Disagreements were resolved by consensus.

Data Extraction

From each study, the following variables were extracted: first author and publication year; country; number of participants for each trait; imaging modality; algorithm and architecture; imaging dimension; evaluation metrics; training size; type and size of validation; imaging view; numbers of true negatives, false positives, true positives, and false negatives; sensitivity and specificity; positive predictive value; area under the curve (AUC); and accuracy. A contingency table was constructed for each study using the nominal data reported therein. To ensure the accuracy of the data for analysis, two reviewers independently reviewed each table.
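As an illustration of how a 2 × 2 contingency table yields the extracted metrics, a minimal sketch follows; the counts are hypothetical and do not come from any included study.

```python
# Illustrative only: deriving the diagnostic metrics used in this review
# from a 2x2 contingency table. All counts below are hypothetical.

def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and accuracy from a 2x2 table."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    ppv = tp / (tp + fp)                    # positive predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, ppv, accuracy

sens, spec, ppv, acc = diagnostic_metrics(tp=84, fp=8, fn=16, tn=92)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")  # 0.84, 0.92
```

Conversely, when a study reports only sensitivity, specificity, and group sizes, the four cell counts can be reconstructed by inverting these formulas, which is how a contingency table can be "calculated" rather than read directly.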

Statistical Analysis

To evaluate the diagnostic performance of both AI algorithms and radiologists, we performed a meta-analysis of relevant studies. We employed a random-effects model to calculate the pooled sensitivity and specificity, along with their corresponding 95% confidence intervals (CI), if at least three eligible studies were identified. We also constructed summary receiver operating characteristic (ROC) curves to estimate the pooled sensitivities and specificities.
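The analysis itself was performed in Stata; purely as an illustration of the random-effects idea, the sketch below pools per-study sensitivities on the logit scale with a DerSimonian-Laird estimate of between-study variance. The per-study counts are hypothetical, and this univariate sketch is simpler than the bivariate models typically used for diagnostic meta-analysis.

```python
import math

def pool_random_effects(k_list, n_list):
    """DerSimonian-Laird random-effects pooling of proportions on the
    logit scale (illustrative; the review's analysis used Stata).
    k_list: events (e.g., true positives); n_list: totals (e.g., diseased)."""
    # logit transform; approximate variance of a logit proportion
    y = [math.log(k / (n - k)) for k, n in zip(k_list, n_list)]
    v = [1 / k + 1 / (n - k) for k, n in zip(k_list, n_list)]
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q and the DL between-study variance tau^2
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    # re-weight with tau^2 added to each within-study variance
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return 1 / (1 + math.exp(-y_re))  # back-transform to a proportion

# hypothetical per-study true positives / diseased counts
pooled_sens = pool_random_effects([80, 45, 120], [95, 60, 140])
print(round(pooled_sens, 2))
```

The between-study variance term is what distinguishes a random-effects from a fixed-effects pool: heterogeneous studies receive more nearly equal weights rather than weights proportional to sample size alone.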

In addition, we conducted a meta-regression analysis to explore potential sources of inter-study heterogeneity, examining data augmentation use and imaging modality as covariates. All statistical tests were two-sided, and P < 0.05 was considered statistically significant. All quantitative data analysis was performed using Stata version 17 software (Stata Corp, College Station, TX) [30, 31].

Quality Assessment

We assessed the extent to which studies adhered to reporting guidelines using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) checklist. The TRIPOD statement comprises 22 recommendations to promote transparent reporting in studies involving the development or validation of prediction models [32]. We used a modified version of TRIPOD (Table E2), since not every item on this checklist is applicable to AI studies; for instance, reporting follow-up time is irrelevant in diagnostic accuracy studies. In addition, the Prediction Model Study Risk of Bias Assessment Tool (PROBAST) checklist was employed to evaluate the presence of bias and assess the applicability of the included studies [33] (Table E3). This tool uses a series of questions pertaining to participants, predictors, outcomes, and analysis to provide a comprehensive and detailed evaluation.

Publication Bias

To mitigate the effects of publication bias, we conducted a search of the reference lists of the included studies. Additionally, we performed a formal assessment of publication bias using diagnostic log odds ratios and asymmetry testing via regression analysis [34].
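A common form of this asymmetry test (Deeks' test) regresses the log diagnostic odds ratio on the inverse square root of the effective sample size, weighted by that sample size; a nonzero slope suggests small-study effects. The sketch below, with hypothetical contingency tables, is only an illustration of the approach.

```python
import math

def asymmetry_slope(tables):
    """Illustrative regression asymmetry (Deeks') test: regress the log
    diagnostic odds ratio on 1/sqrt(effective sample size), weighted by
    ESS. `tables` is a list of (tp, fp, fn, tn); counts are hypothetical.
    Returns the weighted least-squares slope."""
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        # 0.5 continuity correction guards against zero cells
        ldor = math.log(((tp + .5) * (tn + .5)) / ((fp + .5) * (fn + .5)))
        n1, n2 = tp + fn, fp + tn                # diseased / non-diseased
        ess = 4 * n1 * n2 / (n1 + n2)            # effective sample size
        xs.append(1 / math.sqrt(ess))
        ys.append(ldor)
        ws.append(ess)
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    num = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    den = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return num / den

slope = asymmetry_slope([(80, 10, 20, 90), (40, 8, 10, 42), (150, 20, 30, 180)])
print(round(slope, 2))
```

A slope far from zero (tested against its standard error in the full procedure) indicates that smaller studies report systematically larger diagnostic odds ratios, the signature of publication bias in accuracy studies.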

Results

Study Selection and Characteristics

The primary database search found 3177 records. Removal of duplicates excluded 951 studies. The remaining studies underwent title and abstract screening, resulting in the exclusion of 1936 articles. Finally, 246 studies were excluded during full-text screening for the following reasons: unrelated to the topic (n = 175); non-original studies (n = 9); non-English (n = 1); no controls (n = 4); not using imaging as the diagnostic method (n = 44); insufficient data (n = 9); and full text not available (n = 4). Backward reference screening was then performed on the 44 included studies to identify potentially neglected eligible studies (Fig. 1). Any conflicts were resolved through discussion.

Fig. 1.

Fig. 1

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) diagram describing the process of study selection

Tables 1, 2, and 3 show each study’s radiologic view and reference standard, which was used to develop an image labeling reference. Sixteen studies used ultrasound (US) [35–50], fifteen used computed tomography (CT) [51–65], and eleven used magnetic resonance imaging (MRI) [66–76]. One study utilized both CT and MRI [77], while another used four different modalities [78]. Each study’s control group and radiologic view are indicated in Tables 1, 2, 3, and E4. Moreover, as shown in Tables E5 and E6, 43 studies validated their outcome internally [35–75, 77, 78], while four used external validation [43, 44, 58, 76].

Table 1.

Study characteristics, internal validation (with a comparison group)

First author Year Country Imaging modality View Comparison group No. of images per set Reference standard Model output
Training Tuning Testing
Zhang 2009 Japan MRI NR Expert clinicians 50 - 30 Expert consensus Classification
Jiang 2019 China MRI Horizontal Version 2018 EASL, LI-RADS criteria 137 - 92 Histopathology Binary classification
Raluca Brehar 2020 Romania US NR Comparison of multiple algorithms Training: image set 1: 10,834; image set 2: 6672; Tuning: -; Testing: image set 1: 3224; image set 2: 2068 Past study Binary classification
Wang 2021 China MRI NR Expert clinicians 444 - 113 Pathology or follow-up images Seven-way classification
Paula M. Oestmann 2021 Germany MRI NR Expert clinicians 140 - 10 Histopathology Classification
Hirotsugu Nakai 2021 Japan CT Axial Expert clinicians 493 - 62 Expert consensus, histopathology Binary classification
Meiyun Wang 2021 China CT Axial Expert clinicians 7512 - 95 Expert consensus Classification
Naoshi Nishida 2022 Japan US Axial Expert clinicians - - 55 Expert consensus Classification
Haiping Zhang 2022 China MRI NR Expert clinicians 72 - 30 Biopsy or surgical resection, expert consensus Binary classification
Chi-Tung Cheng 2022 China CT Axial Expert clinicians 771 - 370 Expert consensus, pathology Classification
Anisha 2023 India CT NR Comparison of multiple algorithms - - - NR Classification
Maryam Fotouhi 2023 Iran MRI NR Expert clinicians 274.4 - 78 Pathology or follow-up results Binary classification
Sutthirak Tangruangkiat 2023 Thailand Ultrasound NR Previously published studies 600 - 150 MRI/CT reports Classification
Marinela-Cristiana Urhut 2023 Romania Ultrasound NR Expert clinicians-previously published studies - - - CT or MRI or US Binary classification
Abhishek Midya 2023 USA CT Axial Expert clinicians-previously published studies 60 - - Expert consensus, histopathology Binary classification
Gou 2023 China Contrast enhanced CT NR Expert clinicians 239 - 103 Histopathology Binary classification
Xuepeng Zhang 2023 China Contrast enhanced CT Axial Comparison of multiple algorithms-previously published studies 222 - 95 Histopathology Binary classification

CT computed tomography, CECT contrast-enhanced computed tomography, CEUS contrast-enhanced ultrasonography using Sonazoid, HCC hepatocellular carcinoma, HEM cavernous hemangioma, HEP hepatic abscess, FNH focal nodular hyperplasia, ICC intrahepatic cholangiocarcinoma, MET hepatic metastasis, GdEOB-DTPA gadolinium ethoxybenzyl-diethylenetriamine penta-acetic acid, MRI magnetic resonance imaging, NR not reported, US ultrasound

Table 2.

Study characteristics, internal validation (without a comparison group)

First author Year Country Imaging modality View Comparison group No. of images per set Reference standard Model output
Training Tuning Testing
E-Liang Chen 1998 Taiwan CT NR NR NR - 30 Past study Binary classification
Junji Shiraishi 2008 Japan US NR NR - - 103 Biopsy or surgical specimens, expert consensus Classifying focal liver lesions
Katsutoshi Sugimoto 2009 Japan US NR NR NR - 137 Expert consensus Three-way classification based on physicians’ subjective pattern
S.S. Kumar 2011 India CT NR NR 45 - 45 Past study Binary classification
Mittal 2011 India US NR NR Training: image set 1: 250; image set 2: 250; image set 3: 250; Tuning: image set 1: 50; image set 2: 50; image set 3: 50; Testing: image set 1: 500; image set 2: 340; image set 3: 160 Expert radiologist Binary classification
Costin Teodor Streba 2012 Romania US NR NR - - - Other imagistic methods (CT and CE-MRI), liver biopsy Classification
Virmani 2014 India US NR NR 57 - 51 Expert clinicians, patient history Five-way classification
Ilias Gatos 2017 Greece MRI NR NR NR - 71 Expert consensus, biopsy Multiclass PNN classification
Amita Das 2018 India CT NR NR 163 - 62 Past study Classification
Mariëlle J. A. Jansen 2019 Netherlands MRI Axial NR NR - 213 Expert consensus Classification
Makoto Yamakawa 2019 Japan US NR NR NR - 324 Past study Four-way classification
Akash Nayak 2019 India CT Axial NR Training: NR; Tuning: -; Testing: image set 1: 726; image set 2: 15 Past study Binary classification
Oyama 2019 Japan MRI 3D NR N − 1 = 99 - 100 Different for each group Binary classification
Wenqi Shi 2020 China CT Axial, coronal, and sagittal NR 359 - 90 Expert consensus, histopathology Binary classification
S. Murugesan 2020 India X-ray, CT, MRI, and PET NR NR Training: image set 1: 137; image set 2: 73; Tuning: -; Testing: image set 1: 91; image set 2: 18 Past study Binary classification
Cao 2020 China CT Axial NR 410 - 107 Histopathology, patient history, and expert clinicians Classification
Naeem 2020 Pakistan CT, MRI Horizontal NR NR - 21,600 Expert clinicians, medical tests, biopsy report Classification
Qinghua Huang 2020 China US NR NR Training: image set 1: 336; image set 2: 387; Tuning: -; Testing: image set 1: 30; image set 2: 30 Biopsy or surgical resection, expert consensus Binary classification
Ryu 2021 North Korea US NR NR 3909 - 400 Different for each group Segmentation and classification
Takenaga 2021 Japan MRI Horizontal NR 670 474 402 Expert clinicians and archived reports Detection and classification
Shanshan Ren 2021 China US NR NR 149 - 38 Expert consensus Classification
Thodsawit Tiyarattanachai 2021 Thailand US NR NR 40,397 - 6191 Pathology and/or MRI/CT reports Classification
Mubasher Hussain 2022 Pakistan CT NR NR NR - 1000 Expert consensus Multiclass liver tumor classification
Wei-bin Zhang 2022 China US NR NR 305 - 102 Surgery Binary classification models
Fatemeh Azimi Nanvaee 2023 Iran US NR NR - - - Expert consensus Binary classification
Yating Ling 2023 China Contrast enhanced CT NR NR 481 - 120 Histopathology Binary classification

CT computed tomography, HCC hepatocellular carcinoma, ICC intrahepatic cholangiocarcinoma, FNH focal nodular hyperplasia, FLL focal liver lesions, LT liver transplantation, MWA microwave ablation, RFA radiofrequency ablation, TACE transcatheter arterial chemoembolization, DCE-CT dynamic contrast-enhanced computed tomography, MRI magnetic resonance imaging, NR not reported, N/A not applicable, PET positron emission tomography, US ultrasound

Table 3.

Study characteristics, external validation

First author Year Country Imaging modality View Comparison group No. of images per set Reference standard Model output
Training Tuning Testing
Shanshan Ren 2021 China US NR NR 149 - 38 Expert consensus Classification
Thodsawit Tiyarattanachai 2021 Thailand US NR NR 40,397 - 18,922 Pathology and/or MRI/CT reports Classification
Meiyun Wang 2021 China CT Axial Expert clinicians Training: model 1: 7512; model 2: 7512; Tuning: -; Testing: model 1: 556; model 2: 82 Expert consensus Classification
Seongkeun Park 2023 Korea MRI NR NR - 245 30 Pathology and/or MRI/CT reports Binary classification

ICC intrahepatic cholangiocarcinoma, HCC hepatocellular carcinomas, LT liver transplantation, MWA microwave ablation, RFA radiofrequency ablation, TACE transcatheter arterial chemoembolization, CT computed tomography, MRI magnetic resonance imaging, US ultrasound, NR not reported

Study Participants

The number of participants in each study differed substantially (median, 223.5; range, 30–23,756; interquartile range, 106–382; Table E4). In addition, the proportion of HCC subjects in each study varied widely (median, 43.2%; range, 8.61–80.22%; interquartile range, 21.95–60.54%; Table E4). Ten studies did not report the total number of subjects [40, 44, 48–50, 52, 62, 65, 76, 78], and 12 studies did not report the percentage of HCC subjects [39–41, 44, 45, 47, 49, 52, 53, 58, 59, 78].

Algorithm Advancement and Model Output

Tables 1, 2, and 3 show the wide range in the number of images per set: training (median, 289.7; interquartile range, 140–670), tuning (median, 50; interquartile range, 50–359.5), and testing (median, 100; interquartile range, 48–355). Ten studies did not report these data sets separately [35, 36, 40, 46, 51, 54, 59, 67, 68, 77]. Notably, only two studies used tuning sets [37, 72]. In addition, Table E4 shows data augmentation use, transfer learning utilization, and the validation method in each study.

Thirty-two studies evaluated their model output with sensitivity and specificity, 32 with accuracy, 20 with the area under the receiver operating characteristic curve, 9 with the F1 score, and 18 with positive and negative predictive values.

Quality Assessment

Study adherence to TRIPOD guidelines was variable (Fig. 2). Six items were insufficiently reported (< 50% adherence) across the 44 included studies: eligibility criteria for images (43%), sample size estimation (8%), model interpretability (41%), confidence interval reporting (43%), supplementary information availability (39%), and funding source reporting (49%).

Fig. 2.

Fig. 2

Adherence to Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD), and Prediction Model Study Risk of Bias Assessment Tool (PROBAST) reporting guidelines

As illustrated in Fig. 2, the PROBAST assessment revealed 27 (61%) studies with a high risk of bias and only 5 (11%) articles with high applicability concerns. The key factors in this evaluation were small internal validation sample sizes and the absence of external validation. Owing to the inclusion and exclusion criteria of the evaluated studies, patient selection bias was high in 19 (43%) studies; however, every study had a low concern for applicability in this domain. Twenty-four (55%) studies had a high risk of bias for analysis. Studies generally reported their outcomes with a low concern for bias (37; 84%) and applicability (39; 88%).

Meta-analysis

Forty-four studies provided sufficient data to extract 119 contingency tables used for binary HCC diagnosis (Table 4) [35–78]. We calculated 94 contingency tables from 43 studies evaluating algorithm performance at internal validation, seven tables from 4 articles on external validation, and 18 from 6 studies assessing internally validated clinician performance. The pooled sensitivity and specificity for each class of these studies are shown in Table 4, which also reports the pooled metrics for HCC detection by imaging modality. Studies evaluating externally validated clinician performance, as well as human performance with AI support, were not analyzed, as fewer than three such studies were available.

Table 4.

Pooled sensitivities, specificities, area under the curve, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio

Parameter Sensitivity Specificity AUC Positive likelihood ratio Negative likelihood ratio Diagnostic odds ratio No. of contingency tables
Algorithms, internal validation, all modalities 0.84 (0.81–0.87) 0.92 (0.9–0.94) 0.94 (0.92–0.96) 10.4 (8.2–13.3) 0.17 (0.14–0.21) 60 (41–87) 94
CT studies 0.86 (0.81–0.90) 0.94 (0.91–0.96) 0.96 (0.94–0.98) 14 (9.9–19.9) 0.15 (0.10–0.20) 97 (54–172) 25
MRI studies 0.77 (0.71–0.82) 0.85 (0.81–0.88) 0.88 (0.85–0.91) 5.1 (4–6.5) 0.27 (0.21–0.35) 19 (13–28) 35
US studies 0.88 (0.83–0.91) 0.95 (0.92–0.97) 0.97 (0.95–0.98) 16.5 (10.7–25.5) 0.13 (0.09–0.18) 127 (69–234) 29
Algorithms, external validation 0.85 (0.78–0.89) 0.84 (0.72–0.91) 0.91 (0.88–0.93) 5.3 (3–9.3) 0.18 (0.13–0.26) 29 (15–57) 7
Clinicians, internal validation 0.70 (0.60–0.78) 0.85 (0.77–0.90) 0.84 (0.81–0.87) 4.5 (2.8–7.3) 0.36 (0.26–0.50) 13 (6–26) 18

AUC area under the curve, CT computed tomography, MRI magnetic resonance imaging, US ultrasound
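The likelihood ratios and diagnostic odds ratio in Table 4 follow arithmetically from sensitivity and specificity: LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and DOR = LR+ / LR−. A quick check against the all-modality internal-validation row:

```python
# Point-arithmetic check against the all-modality internal-validation row.
sens, spec = 0.84, 0.92

lr_pos = sens / (1 - spec)   # positive likelihood ratio
lr_neg = (1 - sens) / spec   # negative likelihood ratio
dor = lr_pos / lr_neg        # diagnostic odds ratio

print(round(lr_pos, 1), round(lr_neg, 2), round(dor))  # 10.5 0.17 60
```

LR− (0.17) and DOR (60) match the table; LR+ comes out at 10.5 versus the tabulated 10.4, presumably because the tabulated values are pooled jointly rather than computed from the rounded point estimates.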

The main results for each group of studies are shown in Table 4. The pooled sensitivity and specificity for internally validated AI algorithms are 84% (95% CI: 81, 87) and 92% (95% CI: 90, 94), respectively. The subgroup analysis of AI algorithm performance at internal validation showed a pooled sensitivity and specificity of 86% (95% CI: 81, 90) and 94% (95% CI: 91, 96) for studies using CT (Fig. 3), 77% (95% CI: 71, 82) and 85% (95% CI: 81, 88) for studies using MRI (Fig. 4), and 88% (95% CI: 83, 91) and 95% (95% CI: 92, 97) for studies using US (Fig. 5). In addition, AI algorithms that were validated externally had a pooled sensitivity of 85% (95% CI: 78, 89) and specificity of 84% (95% CI: 72, 91). When clinicians were internally validated, their pooled sensitivity was 70% (95% CI: 60, 78), while their pooled specificity was 85% (95% CI: 77, 90).

Fig. 3.

Fig. 3

Forest plots of algorithms performance at internal validation for studies using computed tomography. The plot shows the effect size estimate (diamond symbol), confidence interval (horizontal line), and the individual study effect sizes (square symbols) with their corresponding weights

Fig. 4.

Fig. 4

Forest plots of algorithms performance at internal validation for studies using magnetic resonance imaging. The plot shows the effect size estimate (diamond symbol), confidence interval (horizontal line), and the individual study effect sizes (square symbols) with their corresponding weights

Fig. 5.

Fig. 5

Forest plots of algorithms performance at internal validation for studies using ultrasonography. The plot shows the effect size estimate (diamond symbol), confidence interval (horizontal line), and the individual study effect sizes (square symbols) with their corresponding weights

Meta-regression could be performed for only one parameter, data augmentation use, the only variable with a sufficient number of studies (Table E7). Our analysis revealed that studies using data augmentation had a sensitivity of 82% (95% CI: 76, 88; P = 0.00) and a specificity of 94% (95% CI: 91, 97; P = 0.00). Surprisingly, articles that did not augment data had a higher sensitivity (85%; 95% CI: 82, 88; P = 0.00) and a lower specificity (91%; 95% CI: 89, 93; P = 0.00).

Publication Bias

We performed publication bias analysis on studies assessing algorithm development at internal validation, external validation, and studies evaluating clinician performance on internal validation test sets. The slope coefficients were − 19.74 (95% CI: − 35.59, − 3.90; P = 0.015), − 3.51 (95% CI: − 19.61, 12.586; P = 0.59), and − 33.98 (95% CI: − 62.77, − 5.199; P = 0.024), respectively. Accordingly, there is a high risk of publication bias in studies evaluating internally validated algorithms (P = 0.015) and internally validated clinician performance (P = 0.024). The publication bias for each imaging modality in these studies can be seen in Table E8.

Discussion

AI has recently become a rapidly expanding field, and a growing number of studies are evaluating its potential in the detection, classification, prediction, and prognosis of HCC. Hence, we performed the first systematic review and meta-analysis to extensively pool the diagnostic performance measures of AI and clinicians in HCC. To summarize, our analysis pooled the internally and externally validated AI algorithm results of studies and compared them with clinician performance at internal validation. Our study revealed that AI algorithms exceeded clinicians at internal validation, with a pooled sensitivity of 84% (95% CI: 81, 87) compared to 70% (95% CI: 60, 78) and a pooled specificity of 92% (95% CI: 90, 94) against 85% (95% CI: 77, 90). However, these results should be interpreted with caution, as most studies likely underestimate clinician diagnostic performance because the included studies mainly used clinicians with a low level of experience [46, 57, 58, 73, 74, 79]. Therefore, this finding needs clarification from future studies by internally validating algorithms in clinical environments and fairly comparing them with a large number of experienced clinicians to decrease the risk of interrater variation. The subgroup analysis for AI algorithms at internal validation based on imaging modality revealed a sensitivity and specificity of 86% (95% CI: 81, 90) and 94% (95% CI: 91, 96) for CT studies, respectively. MRI studies had a sensitivity of 77% (95% CI: 71, 82) and a specificity of 85% (95% CI: 81, 88). US studies showed a surprisingly higher sensitivity and specificity of 88% (95% CI: 83, 91) and 95% (95% CI: 92, 97), respectively. The high publication bias of internally validated studies shown in Table E8 could have contributed to this counterintuitive finding. Furthermore, of the 16 studies that utilized US as their imaging modality, only Ren et al. and Tiyarattanachai et al. externally validated their algorithms [80, 81]. Notably, none of the included studies applied more than one modality to compare the diagnostic performance of AI algorithms between imaging techniques. This implies the need for further studies utilizing different modalities with externally validated algorithms and lower publication bias to shed more light on this finding. Analysis of AI test sets at external validation showed a pooled sensitivity of 85% (95% CI: 78, 89) and a pooled specificity of 84% (95% CI: 72, 91).

Two sets of studies did not reach the number required for meta-analysis (at least three). First, studies evaluating externally validated clinician performance [58, 79]: Kim et al. [79] reported a sensitivity of 98%, a specificity of 93%, and an accuracy of 94%, while Wang et al. reported diagnostic metrics for each radiologist separately [58]. Because of this insufficient number of studies, these results could not be compared with AI at external validation. External validation is an important step in validating diagnostic studies and can reveal the need for model updating or recalibration; internal validation, by contrast, can overstate algorithm performance, limiting the generalizability of model outputs [82]. Therefore, randomized clinical trials that externally validate clinicians are needed to better compare AI with clinicians in the clinical environment. Second, Wang et al. also reported HCC diagnostic assessments for three clinicians using AI assistance [58]. While both the Kim et al. and Wang et al. studies were associated with a low risk of bias on the PROBAST and TRIPOD checklists, further studies are needed to shed more light on the diagnostic performance of clinicians with AI assistance and to compare it with that of clinicians and AI algorithms separately [58, 79].

The limitations of this study should be considered when interpreting its results. First, not all studies evaluating the HCC diagnostic performance of AI algorithms reported sufficient data to prepare a contingency table and perform a meta-analysis, and such studies had to be excluded. We recommend that future studies report the numbers of TP, TN, FP, and FN cases so that they can be included in large meta-analyses. Second, while adherence to TRIPOD guidelines was mostly acceptable, multiple articles did not report essential study details such as training and testing set information. Third, more than half the studies had a high concern for bias, meaning they may have overestimated algorithm performance, which limits the conclusions of the meta-analysis. Last but not least, we faced a high level of heterogeneity between studies. We therefore performed meta-regression and subgroup analyses to control for the responsible factors, which was possible for imaging modality and data augmentation use. However, the control groups of the included studies were highly heterogeneous, and hepatic lesion classification studies mainly had different comparators, for example, hemangiomas [35–40, 44–46, 49, 51–53, 56, 58, 59, 65, 66, 68, 69, 71–73, 77], hepatic cysts [37, 39, 40, 44–46, 48, 49, 58, 59, 65, 66, 68, 71, 72, 77], metastatic carcinomas [35–40, 45, 46, 49, 53, 55, 56, 58, 61, 66–69, 72, 73], focal nodular hyperplasia [42, 47, 49, 56, 58, 71, 73, 76], and intrahepatic cholangiocarcinoma [43, 49, 57, 58, 64, 65, 71, 73, 74]. Moreover, most of these studies lacked the data necessary to build contingency tables for each comparator separately. Likewise, the number of HCC detection studies with a healthy control group that reported sufficient data for contingency table construction was inadequate for meta-analysis.
Consequently, we were unable to distinguish between HCC detection and hepatic lesion classification studies or to perform a subgroup analysis based on different comparators. However, this diversity of comparators may provide a reasonable estimate of real-world applicability for HCC diagnosis in the clinical setting, where patients present with a variety of hepatic anomalies. We therefore suggest that future studies use a mixed variety of control groups, to reflect real-life utilization of AI, and report the information required for contingency table construction separately for each comparator.
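For clarity on the reporting we request above: the TP, TN, FP, and FN counts of a 2×2 contingency table determine all of the pooled metrics discussed in this review. The following minimal sketch (illustrative only; the counts are hypothetical and not taken from any included study) shows how sensitivity, specificity, and accuracy are derived from such a table:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic metrics from a 2x2 contingency table."""
    sensitivity = tp / (tp + fn)   # true-positive rate among diseased cases
    specificity = tn / (tn + fp)   # true-negative rate among non-diseased cases
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy

# Hypothetical example: 84 true positives, 16 false negatives,
# 92 true negatives, and 8 false positives
sens, spec, acc = diagnostic_metrics(tp=84, fp=8, fn=16, tn=92)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, accuracy={acc:.2f}")
# sensitivity=0.84, specificity=0.92, accuracy=0.88
```

A study that reports only a rounded sensitivity and specificity cannot be pooled in a bivariate meta-analysis; reporting the four raw counts per comparator removes that barrier.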

In conclusion, this study cautiously implies that AI outperforms clinicians in HCC diagnosis at internal validation. The pooled sensitivity and specificity for internally validated AI algorithms were 84% (95% CI: 81, 87) and 92% (95% CI: 90, 94), respectively. Externally validated AI algorithms had a pooled sensitivity of 85% (95% CI: 78, 89) and a specificity of 84% (95% CI: 72, 91). When clinicians were internally validated, their pooled sensitivity was 70% (95% CI: 60, 78) and their pooled specificity was 85% (95% CI: 77, 90). However, as discussed above, future studies should externally validate clinicians with real-life clinical backgrounds and report results separately for each comparator. Most of the included studies were conducted on small-scale datasets from high-income countries, with very little data from lower-middle- and low-income countries, which limits the generalizability of their findings. Addressing these issues in future studies is a necessary step toward clinical application. Nevertheless, at the time of this review, AI can serve as a diagnostic supplement for clinicians and radiologists by screening images and highlighting regions of interest, thus improving workflow. It may also act as a second reader after radiologists, improving diagnostic certainty. AI-assisted diagnosis of HCC offers the potential for early, accurate, and consistent detection, leading to improved patient outcomes, reduced healthcare costs, and increased accessibility to quality healthcare services. In addition, the advancements in AI radiomics and the findings of this study support the potential of AI to enhance the diagnostic ability of imaging tools, improve patient management, and contribute to precision medicine in HCC. Realizing this potential, however, requires that clinicians thoroughly understand AI performance, which in turn requires transparent reporting of future study methods and results so that algorithm outputs can be interpreted precisely.

Supplementary Information

Below is the link to the electronic supplementary material.

Funding

None.

Data Availability

The data that support the findings of this study are available from the authors upon reasonable request.

Declarations

Ethics Approval

Not applicable.

Competing Interest

The authors declare no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71(3):209-49. [DOI] [PubMed] [Google Scholar]
  • 2.Ferlay J, Colombet M, Soerjomataram I, Mathers C, Parkin DM, Pineros M, et al. Estimating the global cancer incidence and mortality in 2018: GLOBOCAN sources and methods. Int J Cancer. 2019;144(8):1941-53. [DOI] [PubMed] [Google Scholar]
  • 3.McGlynn KA, Petrick JL, London WT. Global epidemiology of hepatocellular carcinoma: an emphasis on demographic and regional variability. Clin Liver Dis. 2015;19(2):223-38. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Vogel A, Meyer T, Sapisochin G, Salem R, Saborowski A. Hepatocellular carcinoma. Lancet. 2022;400(10360):1345-62. [DOI] [PubMed] [Google Scholar]
  • 5.Ding J, Wen Z. Survival improvement and prognosis for hepatocellular carcinoma: analysis of the SEER database. BMC Cancer. 2021;21(1):1157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Njei B, Rotman Y, Ditah I, Lim JK. Emerging trends in hepatocellular carcinoma incidence and mortality. Hepatology. 2015;61(1):191-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Mittal S, Kanwal F, Ying J, Chung R, Sada YH, Temple S, et al. Effectiveness of surveillance for hepatocellular carcinoma in clinical practice: A United States cohort. J Hepatol. 2016;65(6):1148-54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Collier J, Sherman M. Screening for hepatocellular carcinoma. Hepatology. 1998;27(1):273-8. [DOI] [PubMed] [Google Scholar]
  • 9.Liang Y, Xu F, Guo Y, Lai L, Jiang X, Wei X, et al. Diagnostic performance of LI-RADS for MRI and CT detection of HCC: A systematic review and diagnostic meta-analysis. Eur J Radiol. 2021;134:109404. [DOI] [PubMed] [Google Scholar]
  • 10.Kuo RYL, Harrison C, Curran TA, Jones B, Freethy A, Cussons D, et al. Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis. Radiology. 2022;304(1):50-62. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Han SH, Kim KW, Kim S, Youn YC. Artificial Neural Network: Understanding the Basic Concepts without Mathematics. Dement Neurocogn Disord. 2018;17(3):83-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.IBM. What is deep learning? [Available from: https://tinyurl.com/2p8drtfk].
  • 13.Richardson ML, Garwood ER, Lee Y, Li MD, Lo HS, Nagaraju A, et al. Noninterpretive Uses of Artificial Intelligence in Radiology. Acad Radiol. 2021;28(9):1225-35. [DOI] [PubMed] [Google Scholar]
  • 14.Goldberg JE, Rosenkrantz AB. Artificial Intelligence and Radiology: A Social Media Perspective. Curr Probl Diagn Radiol. 2019;48(4):308-11. [DOI] [PubMed] [Google Scholar]
  • 15.Baltzer PA, Dietzel M, Kaiser WA. A simple and robust classification tree for differentiation between benign and malignant lesions in MR-mammography. Eur Radiol. 2013;23(8):2051-60. [DOI] [PubMed] [Google Scholar]
  • 16.Zsoter N, Bandi P, Szabo G, Toth Z, Bundschuh RA, Dinges J, et al. PET-CT based automated lung nodule detection. Annu Int Conf IEEE Eng Med Biol Soc. 2012;2012:4974-7. [DOI] [PubMed] [Google Scholar]
  • 17.Huang W, Tan ZM, Lin Z, Huang GB, Zhou J, Chui CK, et al. A semi-automatic approach to the segmentation of liver parenchyma from 3D CT images with Extreme Learning Machine. Annu Int Conf IEEE Eng Med Biol Soc. 2012;2012:3752-5. [DOI] [PubMed] [Google Scholar]
  • 18.Mewes A, Hensen B, Wacker F, Hansen C. Touchless interaction with software in interventional radiology and surgery: a systematic literature review. Int J Comput Assist Radiol Surg. 2017;12(2):291-305. [DOI] [PubMed] [Google Scholar]
  • 19.Mitrea D, Badea R, Mitrea P, Brad S, Nedevschi S. Hepatocellular Carcinoma Automatic Diagnosis within CEUS and B-Mode Ultrasound Images Using Advanced Machine Learning Methods. Sensors (Basel). 2021;21(6). [DOI] [PMC free article] [PubMed]
  • 20.Ding Y, Ruan S, Wang Y, Shao J, Sun R, Tian W, et al. Novel deep learning radiomics model for preoperative evaluation of hepatocellular carcinoma differentiation based on computed tomography data. Clin Transl Med. 2021;11(11):e570. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.The US Food and Drug Administration. Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. fda.gov. 2022 [Available from: shorturl.at/lmX29].
  • 22.Sanmarchi F, Fanconi C, Golinelli D, Gori D, Hernandez-Boussard T, Capodici A. Predict, diagnose, and treat chronic kidney disease with machine learning: a systematic literature review. J Nephrol. 2023. [DOI] [PMC free article] [PubMed]
  • 23.Ghozy S, Azzam AY, Kallmes KM, Matsoukas S, Fifi JT, Luijten SPR, et al. The diagnostic performance of artificial intelligence algorithms for identifying M2 segment middle cerebral artery occlusions: A systematic review and meta-analysis. J Neuroradiol. 2023. [DOI] [PubMed]
  • 24.Widaatalla Y, Wolswijk T, Adan F, Hillen LM, Woodruff HC, Halilaj I, et al. The application of artificial intelligence in the detection of basal cell carcinoma: A systematic review. J Eur Acad Dermatol Venereol. 2023. [DOI] [PubMed]
  • 25.Almasan O, Leucuta DC, Hedesiu M, Muresanu S, Popa SL. Temporomandibular Joint Osteoarthritis Diagnosis Employing Artificial Intelligence: Systematic Review and Meta-Analysis. J Clin Med. 2023;12(3). [DOI] [PMC free article] [PubMed]
  • 26.Patil S, Albogami S, Hosmani J, Mujoo S, Kamil MA, Mansour MA, et al. Artificial Intelligence in the Diagnosis of Oral Diseases: Applications and Pitfalls. Diagnostics (Basel). 2022;12(5). [DOI] [PMC free article] [PubMed]
  • 27.Krittanawong C, Johnson KW, Rosenson RS, Wang Z, Aydar M, Baber U, et al. Deep learning for cardiovascular medicine: a practical primer. Eur Heart J. 2019;40(25):2058-73. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Thrall JH, Li X, Li Q, Cruz C, Do S, Dreyer K, et al. Artificial Intelligence and Machine Learning in Radiology: Opportunities, Challenges, Pitfalls, and Criteria for Success. J Am Coll Radiol. 2018;15(3 Pt B):504–8. [DOI] [PubMed]
  • 29.Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.3 (updated February 2022). Cochrane, 2022.
  • 30.Dwamena B. MIDAS: Stata module for meta-analytical integration of diagnostic test accuracy studies. 2009.
  • 31.Harbord RM, Whiting P. Metandi: meta-analysis of diagnostic accuracy using hierarchical logistic regression. The Stata Journal. 2009;9(2):211-29. [Google Scholar]
  • 32.Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med. 2015;13:1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Ann Intern Med. 2019;170(1):51-8. [DOI] [PubMed] [Google Scholar]
  • 34.Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol. 2005;58(9):882-93. [DOI] [PubMed] [Google Scholar]
  • 35.Shiraishi J, Sugimoto K, Moriyasu F, Kamiyama N, Doi K. Computer-aided diagnosis for the classification of focal liver lesions by use of contrast-enhanced ultrasonography. Medical Physics. 2008;35(5):1734–46. [DOI] [PMC free article] [PubMed]
  • 36.Sugimoto K, Shiraishi J, Moriyasu F, Doi K. Computer-aided Diagnosis of Focal Liver Lesions by Use of Physicians' Subjective Classification of Echogenic Patterns in Baseline and Contrast-enhanced Ultrasonography. Academic Radiology. 2009;16(4):401–11. [DOI] [PubMed]
  • 37.Mittal D, Kumar V, Saxena SC, Khandelwal N, Kalra N. Neural network based focal liver lesion diagnosis using ultrasound images. Computerized Medical Imaging and Graphics. 2011;35(4):315-23. [DOI] [PubMed] [Google Scholar]
  • 38.Streba CT, Ionescu M, Gheonea DI, Sandulescu L, Ciurea T, Saftoiu A, et al. Contrast-enhanced ultrasonography parameters in neural network diagnosis of liver tumors. World Journal of Gastroenterology. 2012;18(32):4427-34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Virmani J, Kumar V, Kalra N, Khandelwal N. Neural network ensemble based CAD system for focal liver lesions from B-mode ultrasound. Journal of Digital Imaging. 2014;27(4):520-37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Yamakawa M, Shiina T, Nishida N, Kudo M, editors. Computer aided diagnosis system developed for ultrasound diagnosis of liver lesions using deep learning. IEEE International Ultrasonics Symposium, IUS; 2019. [DOI] [PubMed]
  • 41.Brehar R, Mitrea DA, Vancea F, Marita T, Nedevschi S, Lupsor-Platon M, et al. Comparison of Deep-Learning and Conventional Machine-Learning Methods for the Automatic Recognition of the Hepatocellular Carcinoma Areas from Ultrasound Images. Sensors (Basel). 2020;20(11). [DOI] [PMC free article] [PubMed]
  • 42.Huang Q, Pan F, Li W, Yuan F, Hu H, Huang J, et al. Differential Diagnosis of Atypical Hepatocellular Carcinoma in Contrast-Enhanced Ultrasound Using Spatio-Temporal Diagnostic Semantics. IEEE Journal of Biomedical and Health Informatics. 2020;24(10):2860-9. [DOI] [PubMed] [Google Scholar]
  • 43.Ren S, Li Q, Liu S, Qi Q, Duan S, Mao B, et al. Clinical Value of Machine Learning-Based Ultrasomics in Preoperative Differentiation Between Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma: A Multicenter Study. Frontiers in Oncology. 2021;11. [DOI] [PMC free article] [PubMed]
  • 44.Tiyarattanachai T, Apiparakoon T, Marukatat S, Sukcharoen S, Geratikornsupuk N, Anukulkarnkusol N, et al. Development and validation of artificial intelligence to detect and diagnose liver lesions from ultrasound images. PLoS ONE. 2021;16(6 June). [DOI] [PMC free article] [PubMed]
  • 45.Ryu H, Shin SY, Lee JY, Lee KM, Kang HJ, Yi J. Joint segmentation and classification of hepatic lesions in ultrasound images using deep learning. European Radiology. 2021;31(11):8733-42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Nishida N, Yamakawa M, Shiina T, Mekada Y, Nishida M, Sakamoto N, et al. Artificial intelligence (AI) models for the ultrasonographic diagnosis of liver tumors and comparison of diagnostic accuracies between AI and human experts. Journal of Gastroenterology. 2022;57(4):309-21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Zhang WB, Hou SZ, Chen YL, Mao F, Dong Y, Chen JG, et al. Deep Learning for Approaching Hepatocellular Carcinoma Ultrasound Screening Dilemma: Identification of α-Fetoprotein-Negative Hepatocellular Carcinoma From Focal Liver Lesion Found in High-Risk Patients. Frontiers in Oncology. 2022;12. [DOI] [PMC free article] [PubMed]
  • 48.Tangruangkiat S, Chaiwongkot N, Pamarapa C, Rawangwong T, Khunnarong A, Chainarong C, et al. Diagnosis of focal liver lesions from ultrasound images using a pretrained residual neural network. J Appl Clin Med Phys. 2024;25(1):e14210. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Urhut MC, Sandulescu LD, Streba CT, Mamuleanu M, Ciocalteu A, Cazacu SM, et al. Diagnostic Performance of an Artificial Intelligence Model Based on Contrast-Enhanced Ultrasound in Patients with Liver Lesions: A Comparative Study with Clinicians. Diagnostics (Basel). 2023;13(21). [DOI] [PMC free article] [PubMed]
  • 50.Azimi Nanvaee F, Setayeshi S. Hepatocellular Carcinoma Diagnosis Based on Ultrasound Images Using Feature Selection Techniques and K-nearest Neighbor Classifier. Hepat Mon. 2023;23(1):e136213. [Google Scholar]
  • 51.Chen EL, Chung PC, Chen CL, Tsai HM, Chang CI. An automatic diagnostic system for CT liver image classification. IEEE Trans Biomed Eng. 1998;45(6):783-94. [DOI] [PubMed] [Google Scholar]
  • 52.Kumar SS, Moni RS. Diagnosis of liver tumour from CT images using contourlet transform. International Journal of Biomedical Engineering and Technology. 2011;7(3):276-90. [Google Scholar]
  • 53.Das A, Das P, Panda SS, Sabut S. Adaptive fuzzy clustering-based texture analysis for classifying liver cancer in abdominal CT images. International Journal of Computational Biology and Drug Design. 2018;11(3):192-208. [Google Scholar]
  • 54.Nayak A, Baidya Kayal E, Arya M, Culli J, Krishan S, Agarwal S, et al. Computer-aided diagnosis of cirrhosis and hepatocellular carcinoma using multi-phase abdomen CT. International Journal of Computer Assisted Radiology and Surgery. 2019;14(8):1341-52. [DOI] [PubMed] [Google Scholar]
  • 55.Cao SE, Zhang LQ, Kuang SC, Shi WQ, Hu B, Xie SD, et al. Multiphase convolutional dense network for the classification of focal liver lesions on dynamic contrast-enhanced computed tomography. World J Gastroenterol. 2020;26(25):3660-72. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Shi W, Kuang S, Cao S, Hu B, Xie S, Chen S, et al. Deep learning assisted differentiation of hepatocellular carcinoma from focal liver lesions: choice of four-phase and three-phase CT imaging protocol. Abdominal Radiology. 2020;45(9):2688-97. [DOI] [PubMed] [Google Scholar]
  • 57.Nakai H, Fujimoto K, Yamashita R, Sato T, Someya Y, Taura K, et al. Convolutional neural network for classifying primary liver cancer based on triple-phase CT and tumor marker information: a pilot study. Japanese Journal of Radiology. 2021;39(7):690-702. [DOI] [PubMed] [Google Scholar]
  • 58.Wang M, Fu F, Zheng B, Bai Y, Wu Q, Wu J, et al. Development of an AI system for accurately diagnose hepatocellular carcinoma from computed tomography imaging data. British Journal of Cancer. 2021;125(8):1111-21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Hussain M, Saher N, Qadri S. Computer Vision Approach for Liver Tumor Classification Using CT Dataset. Applied Artificial Intelligence. 2022;36(1).
  • 60.Cheng CT, Cai J, Teng W, Zheng Y, Huang YT, Wang YC, et al. A flexible three-dimensional heterophase computed tomography hepatocellular carcinoma detection algorithm for generalizable and practical screening. Hepatology Communications. 2022. [DOI] [PMC free article] [PubMed]
  • 61.Anisha A, Jiji G, Ajith Bosco Raj T. Deep feature fusion and optimized feature selection based ensemble classification of liver lesions. The Imaging Science Journal. 2023;71(6):518–36.
  • 62.Midya A, Chakraborty J, Srouji R, Narayan RR, Boerner T, Zheng J, et al. Computerized Diagnosis of Liver Tumors From CT Scans Using a Deep Neural Network Approach. IEEE J Biomed Health Inform. 2023;27(5):2456-64. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Guo L, Li X, Zhang C, Xu Y, Han L, Zhang L. Radiomics Based on Dynamic Contrast-Enhanced Magnetic Resonance Imaging in Preoperative Differentiation of Combined Hepatocellular-Cholangiocarcinoma from Hepatocellular Carcinoma: A Multi-Center Study. J Hepatocell Carcinoma. 2023;10:795-806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Zhang X, Jia N, Wang Y. Multi-input dense convolutional network for classification of hepatocellular carcinoma and intrahepatic cholangiocarcinoma. Biomedical Signal Processing and Control. 2023;80:104226. [Google Scholar]
  • 65.Ling Y, Ying S, Xu L, Peng Z, Mao X, Chen Z, et al. Automatic volumetric diagnosis of hepatocellular carcinoma based on four-phase CT scans with minimum extra information. Front Oncol. 2022;12:960178. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Zhang X, Kanematsu M, Fujita H, Zhou X, Hara T, Yokoyama R, et al. Application of an artificial neural network to the computer-aided differentiation of focal liver disease in MR imaging. Radiological Physics and Technology. 2009;2(2):175-82. [DOI] [PubMed] [Google Scholar]
  • 67.Gatos I, Tsantis S, Karamesini M, Spiliopoulos S, Karnabatidis D, Hazle JD, et al. Focal liver lesions segmentation and classification in nonenhanced T2-weighted MRI. Medical Physics. 2017;44(7):3695-705. [DOI] [PubMed] [Google Scholar]
  • 68.Jansen MJA, Kuijf HJ, Veldhuis WB, Wessels FJ, Viergever MA, Pluim JPW. Automatic classification of focal liver lesions based on MRI and risk factors. PLoS ONE. 2019;14(5). [DOI] [PMC free article] [PubMed]
  • 69.Oyama A, Hiraoka Y, Obayashi I, Saikawa Y, Furui S, Shiraishi K, et al. Hepatic tumor classification using texture and topology analysis of non-contrast-enhanced three-dimensional T1-weighted MR images with a radiomics approach. Scientific Reports. 2019;9(1). [DOI] [PMC free article] [PubMed]
  • 70.Jiang H, Liu X, Chen J, Wei Y, Lee JM, Cao L, et al. Man or machine? Prospective comparison of the version 2018 EASL, LI-RADS criteria and a radiomics model to diagnose hepatocellular carcinoma. Cancer Imaging. 2019;19(1). [DOI] [PMC free article] [PubMed]
  • 71.Oestmann PM, Wang CJ, Savic LJ, Hamm CA, Stark S, Schobert I, et al. Deep learning–assisted differentiation of pathologically proven atypical and typical hepatocellular carcinoma (HCC) versus non-HCC on contrast-enhanced MRI of the liver. European Radiology. 2021;31(7):4981-90. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Takenaga T, Hanaoka S, Nomura Y, Nakao T, Shibata H, Miki S, et al. Multichannel three-dimensional fully convolutional residual network-based focal liver lesion detection and classification in Gd-EOB-DTPA-enhanced MRI. International Journal of Computer Assisted Radiology and Surgery. 2021;16(9):1527-36. [DOI] [PubMed] [Google Scholar]
  • 73.Wang SH, Han XJ, Du J, Wang ZC, Yuan C, Chen Y, et al. Saliency-based 3D convolutional neural network for categorising common focal liver lesions on multisequence MRI. Insights into Imaging. 2021;12(1). [DOI] [PMC free article] [PubMed]
  • 74.Zhang H, Guo D, Liu H, He X, Qiao X, Liu X, et al. MRI-Based Radiomics Models to Discriminate Hepatocellular Carcinoma and Non-Hepatocellular Carcinoma in LR-M According to LI-RADS Version 2018. Diagnostics. 2022;12(5). [DOI] [PMC free article] [PubMed]
  • 75.Fotouhi M, Samadi Khoshe Mehr F, Delazar S, Shahidi R, Setayeshpour B, Toosi MN, et al. Assessment of LI-RADS efficacy in classification of hepatocellular carcinoma and benign liver nodules using DCE-MRI features and machine learning. Eur J Radiol Open. 2023;11:100535. [DOI] [PMC free article] [PubMed]
  • 76.Park S, Byun J, Hwang SM. Utilization of a Machine Learning Algorithm for the Application of Ancillary Features to LI-RADS Categories LR3 and LR4 on Gadoxetate Disodium-Enhanced MRI. Cancers (Basel). 2023;15(5). [DOI] [PMC free article] [PubMed]
  • 77.Naeem S, Ali A, Qadri S, Mashwani WK, Tairan N, Shah H, et al. Machine-learning based hybrid-feature analysis for liver cancer classification using fused (MR and CT) images. Applied Sciences (Switzerland). 2020;10(9).
  • 78.Murugesan S, Bhuvaneswaran RS, Khanna Nehemiah H, Keerthana Sankari S, Nancy Jane Y. Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner. Computational and Mathematical Methods in Medicine. 2021;2021. [DOI] [PMC free article] [PubMed]
  • 79.Kim J, Min JH, Kim SK, Shin SY, Lee MW. Detection of Hepatocellular Carcinoma in Contrast-Enhanced Magnetic Resonance Imaging Using Deep Learning Classifier: A Multi-Center Retrospective Study. Scientific Reports. 2020;10(1). [DOI] [PMC free article] [PubMed]
  • 80.Tiyarattanachai T, Apiparakoon T, Marukatat S, Sukcharoen S, Geratikornsupuk N, Anukulkarnkusol N, et al. Development and validation of artificial intelligence to detect and diagnose liver lesions from ultrasound images. PLoS One. 2021;16(6):e0252882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Ren S, Li Q, Liu S, Qi Q, Duan S, Mao B, et al. Clinical Value of Machine Learning-Based Ultrasomics in Preoperative Differentiation Between Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma: A Multicenter Study. Front Oncol. 2021;11:749137. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Bleeker SE, Moll HA, Steyerberg EW, Donders AR, Derksen-Lubsen G, Grobbee DE, et al. External validation is necessary in prediction research: a clinical example. J Clin Epidemiol. 2003;56(9):826-32. [DOI] [PubMed] [Google Scholar]

Articles from Journal of Imaging Informatics in Medicine are provided here courtesy of Springer