Quantitative Imaging in Medicine and Surgery. 2025 Aug 18;15(9):7922–7934. doi: 10.21037/qims-2025-238

Impact of a deep learning image reconstruction algorithm on the robustness of abdominal computed tomography radiomics features using standard and low radiation doses

Shuo Yang 1, Yifan Bie 1, Lei Zhao 1, Kun Luan 1, Xingchao Li 1, Yanheng Chi 1, Zhen Bian 1, Deqing Zhang 1, Guodong Pang 1, Hai Zhong 1
PMCID: PMC12397659  PMID: 40893527

Abstract

Background

Deep learning image reconstruction (DLIR) can enhance image quality and lower radiation dose, yet its impact on radiomics features (RFs) remains unclear. This study aimed to compare the effects of DLIR and conventional adaptive statistical iterative reconstruction-Veo (ASIR-V) algorithms on the robustness of RFs using standard and low-dose abdominal clinical computed tomography (CT) scans.

Methods

A total of 54 patients with hepatic masses who underwent abdominal contrast-enhanced CT scans were retrospectively analyzed. The raw data of standard dose in the venous phase and low dose in the delayed phase were reconstructed using five reconstruction settings, including ASIR-V at 30% (ASIR-V30%) and 70% (ASIR-V70%) levels, and DLIR at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) levels. The PyRadiomics platform was used for the extraction of RFs in 18 regions of interest (ROIs) in different organs or tissues. The consistency of RFs among different algorithms and different strength levels was tested by coefficient of variation (CV) and quartile coefficient of dispersion (QCD). The consistency of RFs among different strength levels of the same algorithm and clinically comparable levels across algorithms was evaluated by intraclass correlation coefficient (ICC). Robust features were identified by Kruskal-Wallis and Mann-Whitney U test.

Results

Among the five reconstruction methods, the mean CV and QCD in the standard-dose group were 0.364 and 0.213, respectively, and the corresponding values were 0.444 and 0.245 in the low-dose group. The mean ICC values between ASIR-V 30% and 70%, DLIR-L and M, DLIR-M and H, DLIR-L and H, ASIR-V30% and DLIR-M, and ASIR-V70% and DLIR-H were 0.672, 0.734, 0.756, 0.629, 0.724, and 0.651, respectively, in the standard-dose group, and the corresponding values were 0.500, 0.567, 0.700, 0.474, 0.499, and 0.650 in the low-dose group. The ICC values between DLIR-M and H under low-dose conditions were even higher than those of ASIR-V30% and -V70% under standard dose conditions. Among the five reconstruction settings, averages of 14.0% (117/837) and 10.3% (86/837) of RFs across 18 ROIs exhibited robustness under standard-dose and low-dose conditions, respectively. Some 23.1% (193/837) of RFs demonstrated robustness between the low-dose DLIR-M and H groups, which was higher than the 21.0% (176/837) observed in the standard-dose ASIR-V30% and -V70% groups.

Conclusions

Most of the RFs lacked reproducibility across algorithms and strength levels. However, DLIR at medium (M) and high (H) levels significantly improved RF consistency and robustness, even at reduced doses.

Keywords: Deep learning, computed tomography (CT), image reconstruction, texture analysis, liver

Introduction

Radiomics involves the computerized extraction and analysis of radiomics features (RFs) from medical images to explore potential connections between images and diseases (1,2). However, the reliability of RFs has posed significant challenges in early-stage research, potentially limiting the generalizability of subsequent models in multi-center clinical applications (3-5). RFs are influenced by various factors, such as computed tomography (CT) scanners, acquisition and reconstruction parameters, and image reconstruction algorithms (6-11). Therefore, evaluating and optimizing these influencing factors is crucial for enhancing the utility of cross-center radiomics models (12,13).

Different image reconstruction algorithms exhibit distinct characteristics in terms of kernel design, computational efficiency, accuracy, and stability (14-16). Unlike the adaptive statistical iterative reconstruction-Veo (ASIR-V) algorithm, the deep learning image reconstruction (DLIR) algorithm (TrueFidelity™) learns to distinguish noise from true anatomical structures through hierarchical feature extraction, preserving natural image textures more effectively (16). This characteristic imparts potential for improving image quality, reducing radiation dose, and optimizing lesion detection rates (17-20). However, different reconstruction algorithms and levels have varying effects on the robustness of RFs (21,22). As yet, the influence of DLIR on RFs has not been thoroughly evaluated, particularly in studies involving clinical models.

The primary objective of this research was to investigate the influence of different levels of the DLIR algorithm on the consistency of reconstructed images in patients with hepatic masses and explore the robust RFs, focusing on standard- and low-dose scanning.

Methods

Study population

The present study was approved by the Institutional Review Board of Shandong University (No. KYLL2025286), and informed consent was provided by all the patients. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. A total of 70 patients who underwent abdominal contrast-enhanced CT scans between August 2020 and April 2021 were retrospectively enrolled. The inclusion criteria were as follows: (I) patients with suspected hepatic masses according to the primary diagnosis in our hospital; (II) hepatic masses that had not been treated with surgery, interventional therapy, or radiotherapy before the CT examination; and (III) patients who agreed to undergo the scanning protocol of the study. The exclusion criteria were as follows: (I) patients without clear pathology results or clinical diagnosis (n=7); (II) masses undetected by CT that could not be delineated (n=4); and (III) patients with diffuse masses that could not be delineated (n=5). After exclusion of these 16 patients, 54 patients were included in the study.

CT scanners and image data

All patients in the study underwent a 256-slice multidetector contrast-enhanced liver CT (Revolution CT, GE Healthcare, Waukesha, WI, USA) examination with the parameters outlined in Table 1. The intravenous contrast agent was Ioversol (320 mgI/mL, Jiangsu Hengrui Pharmaceuticals Co., Ltd., China), given at a weight-based dose of 1.2 mL/kg and injected at a rate of 3.0 mL/s. The arterial phase was scanned 5 seconds after attenuation in the abdominal aorta reached the trigger threshold of 120 Hounsfield units (HU). The portal venous phase was scanned with a standard dose 30 seconds after the arterial phase, followed by the delayed phase, which was scanned with a low dose 90 seconds after the portal venous phase. The dose was adjusted by using different noise index settings. The raw data of the standard-dose portal venous phase and low-dose delayed phase were reconstructed using the following five methods: ASIR-V at 30% (ASIR-V30%) and 70% (ASIR-V70%) levels, and DLIR at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) levels. The reconstruction was carried out using a 1.25-mm slice thickness at a 1.25-mm interval in the axial plane.

Table 1. Scanning parameters for standard and low doses.

| Group | Tube voltage (kVp) | Tube current (mA) | Noise index | Rotation time (s) | Pitch | Slice thickness (mm) | CTDI (mGy), median (IQR) |
|---|---|---|---|---|---|---|---|
| Standard-dose | 120 | 50–500 | 16.0 | 0.50 | 0.992:1 | 1.25 | 11.33 (9.03, 12.87) |
| Low-dose | 120 | 50–500 | 8.0 | 0.50 | 0.992:1 | 1.25 | 2.45 (1.86, 2.54) |
| P value | | | | | | | <0.01 |

P value from the Mann-Whitney U test. CTDI, computed tomography dose index; IQR, interquartile range.

Segmentation and feature extraction

Radiologist A, with four years of experience in abdominal radiology, used ITK-SNAP (version 3.6.0; http://www.itksnap.org/pmwiki/pmwiki.php) to delineate regions of interest (ROIs). A total of 16 circular two-dimensional (2D) ROIs (ROIs 1 to 16) in different organs or tissues, one 2D full-layer ROI (ROI 17), and one three-dimensional (3D) hepatic mass ROI (ROI 18) were manually delineated on the standard-dose and low-dose DLIR-H images. Each ROI represented a specific organ or tissue: the right liver lobe (ROI 1), left liver lobe (ROI 2), neck of the pancreas (ROI 3), spleen (ROI 4), right kidney (ROI 5), left kidney (ROI 6), right adrenal gland (ROI 7), left adrenal gland (ROI 8), aortaventralis (ROI 9), portal trunk (ROI 10), postcava (ROI 11), visceral fat (ROI 12), subcutaneous fat (ROI 13), right erector spinae muscle (ROI 14), left erector spinae muscle (ROI 15), 12th thoracic vertebra (ROI 16), full layer at the porta hepatis level (ROI 17), and 3D hepatic mass (ROI 18). To ensure that the ROIs did not exceed the organ boundaries, the diameters of ROIs 1 to 16 were fixed as follows: 6 mm for ROI 7 (right adrenal gland) and ROI 8 (left adrenal gland), 10 mm for ROI 10 (portal trunk), and 15 mm for the remaining ROIs. ROI 17 was delineated as a 2D full-layer region along the abdominal wall edge at the porta hepatis level on axial images, encompassing all intra-abdominal tissues within that slice (e.g., liver, spleen, blood vessels, fat, muscles). ROI 18 was delineated layer by layer on the axial images along the edge of the main hepatic mass.

For each case at a given dose, the same segmentation file was applied to all five reconstructed image sets during feature extraction. Because the five reconstructions were derived from the same raw data, the ROIs were spatially identical across them. RFs were extracted from each of the five reconstructed image sets using the open-source software package PyRadiomics (Version 4.10.2; https://pyradiomics.readthedocs.io/en/latest/) within the 3D Slicer software (version 4.11; https://www.slicer.org). Prior to feature extraction, images underwent standardized preprocessing to ensure intra-study consistency, including resampling to a voxel size of 1.0 mm³, intensity normalization using Z-score transformation, and gray-level discretization with a fixed bin width of 25 HU. A total of 837 features were extracted for each ROI, encompassing 18 first-order features (histogram), 75 texture features [24 gray-level co-occurrence matrix (GLCM); 14 gray-level dependence matrix (GLDM); 16 gray-level run-length matrix (GLRLM); 16 gray-level size zone matrix (GLSZM); 5 neighborhood gray-tone difference matrix (NGTDM)] and 744 wavelet features. The study workflow is presented in Figure 1.
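The feature counts reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, the dictionary keys follow the PyRadiomics feature-class names, and the factor of eight wavelet sub-bands is an assumption (it is the number of sub-bands of a one-level 3D wavelet decomposition and is consistent with the reported 744 wavelet features), not something stated in the text.

```python
# Feature counts per class as reported in the text.
FIRST_ORDER = 18
TEXTURE = {"GLCM": 24, "GLDM": 14, "GLRLM": 16, "GLSZM": 16, "NGTDM": 5}


def tally_features(first_order, texture, n_wavelet_subbands=8):
    """Return per-group feature counts and the overall total.

    Assumption: every first-order and texture feature is recomputed on each
    wavelet sub-band (8 sub-bands for a one-level 3D decomposition).
    """
    texture_total = sum(texture.values())
    base = first_order + texture_total
    wavelet = base * n_wavelet_subbands
    return {
        "first_order": first_order,
        "texture": texture_total,
        "wavelet": wavelet,
        "total": base + wavelet,
    }


tally = tally_features(FIRST_ORDER, TEXTURE)
print(tally)
```

Under these assumptions the tally reproduces the 18, 75, 744, and 837 features given in the text.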

Figure 1.

Summary of the study workflow consisting of four steps: image reconstruction, image segmentation, radiomics extraction, and radiomics analysis. The raw data for a 61-year-old female patient with pathologically confirmed hepatic abscess imaged under standard-dose conditions were reconstructed using five reconstruction algorithms, including ASIR-V30% and -V70%, and DLIR-L, M, and H. ITK-SNAP was used to delineate 18 ROIs at DLIR-H reconstruction. PyRadiomics was employed to extract 837 RFs at five reconstruction settings respectively, including 18 first-order, 75 texture, and 744 wavelet RFs. ASIR-V, adaptive statistical iterative reconstruction-Veo; CV, coefficient of variation; DLIR, deep learning image reconstruction; H, high; ICC, intraclass correlation coefficient; L, low; M, medium; QCD, quartile coefficient of dispersion; RFs, radiomics features; ROIs, regions of interest.

Statistical analysis

The statistical analysis was performed using R version 3.6.3 (https://www.r-project.org/). Either the independent t-test or the Mann-Whitney U test was used to compare continuous variables between groups. The coefficient of variation [CV, CV = (σ/µ) × 100%] and the quartile coefficient of dispersion {QCD, QCD = [(Q3 − Q1)/(Q3 + Q1)] × 100%} were employed to analyze the consistency among the five reconstruction settings. The intraclass correlation coefficient (ICC) with a two-way mixed model was calculated to assess the consistency among different strength levels of the same algorithm (ASIR-V30% and -V70%, DLIR-L and M, DLIR-M and H, and DLIR-L and H) as well as clinically comparable levels across algorithms (ASIR-V30% and DLIR-M, ASIR-V70% and DLIR-H). The Kruskal-Wallis test was used to determine the RFs’ robustness across the five reconstruction settings. The Mann-Whitney U test was used to determine the RFs’ robustness between different strength levels of the same algorithm and clinically comparable levels across algorithms. A P value <0.05 was considered statistically significant.
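As a minimal sketch of the two dispersion measures, the Python functions below compute the CV and QCD for one feature measured under the five reconstruction settings. The study itself used R; the function names, the use of the sample standard deviation, the "inclusive" quartile method, and the example values are all assumptions made for illustration. Results are returned as fractions, matching how the values are reported in the tables.

```python
import statistics


def coefficient_of_variation(values):
    """CV = standard deviation / mean, returned as a fraction."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(values) / abs(mean)


def quartile_coefficient_of_dispersion(values):
    """QCD = (Q3 - Q1) / (Q3 + Q1), returned as a fraction."""
    # Assumption: quartiles by linear interpolation over the observed range.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    if q1 + q3 == 0:
        raise ValueError("QCD is undefined when Q1 + Q3 is zero")
    return (q3 - q1) / (q3 + q1)


# Hypothetical values of one feature under the five reconstruction settings
# (ASIR-V30%, ASIR-V70%, DLIR-L, DLIR-M, DLIR-H).
feature_values = [10.0, 12.0, 11.0, 13.0, 14.0]
cv = coefficient_of_variation(feature_values)
qcd = quartile_coefficient_of_dispersion(feature_values)
print(f"CV = {cv:.3f}, QCD = {qcd:.3f}")
```

Because the QCD depends only on the quartiles, it is less sensitive than the CV to a single extreme value among the five reconstructions.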

Results

Participants and characteristics

A total of 54 participants were included in this study, comprising 30 males and 24 females with an average age of 59±1 years and body mass index (BMI) of 25.73±0.45 kg/m². Pathological assessment confirmed 40 cases, comprising 33 cases of hepatocellular carcinoma, 3 of cholangiocarcinoma, 3 of secondary liver metastasis, and 1 of hepatic abscess. Clinical diagnosis confirmed 14 cases, including 6 cases of secondary liver metastasis, 4 cases of primary liver cancer, and 4 cases of hepatic hemangioma. Of the 54 patients, 42 had a single lesion and 12 had multiple lesions (Table 2).

Table 2. Demographic and basic data for patients.

| Parameter | Value |
|---|---|
| No. of patients | 54 |
| Age (years) | 59±1 |
| Sex: male | 30 |
| Sex: female | 24 |
| Body mass index (kg/m²) | 25.73±0.45 |
| Hepatic masses: no. of malignant/benign masses | 49/5 |
| Hepatic masses: no. confirmed by pathological/clinical diagnosis | 40/14 |

Unless otherwise specified, presented values are numbers. Age and body mass index are presented as mean ± standard deviation.

Consistency of RFs at different algorithms and strength levels

Under varying algorithms and strength levels, the overall radiomics consistency was low. Among the five reconstruction settings, the standard-dose images yielded a mean CV of 0.364 and a mean QCD of 0.213 across all 18 ROIs. In contrast, the low-dose images yielded a mean CV of 0.444 and a mean QCD of 0.245 across the 18 ROIs, indicating lower consistency than in the standard-dose group. Single-component ROIs (ROIs 1–16) showed a significantly higher average CV than complex-structured ROIs (ROIs 17 and 18) (all P<0.05; Tables 3,4). First-order features were more consistent than texture and wavelet features, with a mean CV of 0.246 and mean QCD of 0.101 in the standard-dose group (all P<0.05), and a mean CV of 0.363 and mean QCD of 0.171 in the low-dose group (Tables 3-5).

Table 3. RFs consistency among five reconstruction settings under standard-dose conditions.

| ROI | Organs | CV: All | CV: First-order | CV: Texture | CV: Wavelet | QCD: All | QCD: First-order | QCD: Texture | QCD: Wavelet |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Right liver lobe | 0.397 | 0.203 | 0.469 | 0.428 | 0.213 | 0.100 | 0.192 | 0.214 |
| 2 | Left liver lobe | 0.410 | 0.213 | 0.432 | 0.458 | 0.201 | 0.110 | 0.138 | 0.212 |
| 3 | Pancreas | 0.367 | 0.250 | 0.345 | 0.394 | 0.206 | 0.117 | 0.139 | 0.213 |
| 4 | Spleen | 0.419 | 0.972 | 0.471 | 0.412 | 0.204 | 0.126 | 0.200 | 0.207 |
| 5 | Right kidney | 0.357 | 0.147 | 0.273 | 0.420 | 0.232 | 0.079 | 0.155 | 0.242 |
| 6 | Left kidney | 0.364 | 0.129 | 0.257 | 0.380 | 0.195 | 0.071 | 0.138 | 0.202 |
| 7 | Right adrenal | 0.347 | 0.200 | 0.275 | 0.410 | 0.178 | 0.104 | 0.142 | 0.183 |
| 8 | Left adrenal | 0.361 | 0.314 | 0.269 | 0.660 | 0.239 | 0.131 | 0.120 | 0.249 |
| 9 | Aortaventralis | 0.419 | 0.283 | 0.472 | 0.488 | 0.241 | 0.138 | 0.199 | 0.246 |
| 10 | Portal | 0.398 | 0.188 | 0.347 | 0.445 | 0.219 | 0.100 | 0.162 | 0.218 |
| 11 | Postcava | 0.387 | 0.285 | 0.333 | 0.612 | 0.249 | 0.137 | 0.157 | 0.257 |
| 12 | Visceral fat | 0.350 | 0.165 | 0.371 | 0.434 | 0.182 | 0.080 | 0.138 | 0.180 |
| 13 | Subcutaneous fat | 0.351 | 0.196 | 0.392 | 0.464 | 0.197 | 0.097 | 0.186 | 0.204 |
| 14 | Thoracic vertebrae | 0.328 | 0.214 | 0.214 | 0.459 | 0.267 | 0.089 | 0.119 | 0.283 |
| 15 | Right erector spinae | 0.402 | 0.231 | 0.326 | 0.550 | 0.213 | 0.120 | 0.173 | 0.219 |
| 16 | Left erector spinae | 0.392 | 0.266 | 0.318 | 0.513 | 0.241 | 0.119 | 0.144 | 0.251 |
| 17 | Full-layer | 0.242 | 0.034 | 0.084 | 0.251 | 0.155 | 0.017 | 0.054 | 0.164 |
| 18 | Hepatic mass | 0.266 | 0.137 | 0.158 | 0.287 | 0.193 | 0.085 | 0.087 | 0.493 |
| 1–18 | Average | 0.364 | 0.246 | 0.323 | 0.448 | 0.213 | 0.101 | 0.147 | 0.235 |
| P | ROIs 1–16 vs. 17 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| P | ROIs 1–16 vs. 18 | 0.03 | 0.04 | <0.001 | <0.001 | 0.12 | 0.01 | <0.001 | 0.07 |

P values: independent t-tests were performed to compare the CV and QCD between single-organ/tissue ROIs (ROIs 1–16) and complex-structure ROIs (ROIs 17–18). CV, coefficient of variation; QCD, quartile coefficient of dispersion; RFs, radiomics features; ROIs, regions of interest.

Table 4. RFs consistency among five reconstruction settings under low-dose conditions.

| ROI | Organs | CV: All | CV: First-order | CV: Texture | CV: Wavelet | QCD: All | QCD: First-order | QCD: Texture | QCD: Wavelet |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Right liver lobe | 0.543 | 0.478 | 0.521 | 0.550 | 0.268 | 0.142 | 0.234 | 0.280 |
| 2 | Left liver lobe | 0.521 | 0.296 | 0.529 | 0.472 | 0.251 | 0.253 | 0.276 | 0.253 |
| 3 | Pancreas | 0.503 | 0.523 | 0.458 | 0.490 | 0.242 | 0.198 | 0.228 | 0.300 |
| 4 | Spleen | 0.505 | 0.316 | 0.546 | 0.556 | 0.274 | 0.161 | 0.277 | 0.276 |
| 5 | Right kidney | 0.445 | 0.197 | 0.390 | 0.486 | 0.253 | 0.111 | 0.196 | 0.260 |
| 6 | Left kidney | 0.422 | 0.402 | 0.368 | 0.464 | 0.231 | 0.110 | 0.189 | 0.240 |
| 7 | Right adrenal | 0.400 | 0.565 | 0.388 | 0.401 | 0.239 | 0.277 | 0.226 | 0.272 |
| 8 | Left adrenal | 0.419 | 0.455 | 0.368 | 0.537 | 0.234 | 0.273 | 0.183 | 0.282 |
| 9 | Aortaventralis | 0.473 | 0.325 | 0.481 | 0.496 | 0.266 | 0.142 | 0.269 | 0.271 |
| 10 | Portal | 0.455 | 0.341 | 0.515 | 0.462 | 0.250 | 0.139 | 0.238 | 0.255 |
| 11 | Postcava | 0.418 | 0.253 | 0.393 | 0.428 | 0.245 | 0.159 | 0.196 | 0.252 |
| 12 | Visceral fat | 0.447 | 0.266 | 0.462 | 0.449 | 0.235 | 0.186 | 0.239 | 0.251 |
| 13 | Subcutaneous fat | 0.463 | 0.317 | 0.503 | 0.502 | 0.252 | 0.107 | 0.206 | 0.292 |
| 14 | Thoracic vertebrae | 0.384 | 0.321 | 0.273 | 0.411 | 0.230 | 0.199 | 0.155 | 0.237 |
| 15 | Right erector spinae | 0.462 | 0.613 | 0.433 | 0.542 | 0.269 | 0.214 | 0.206 | 0.273 |
| 16 | Left erector spinae | 0.470 | 0.636 | 0.406 | 0.508 | 0.258 | 0.260 | 0.218 | 0.254 |
| 17 | Full-layer | 0.327 | 0.048 | 0.113 | 0.364 | 0.198 | 0.032 | 0.085 | 0.193 |
| 18 | Hepatic mass | 0.343 | 0.188 | 0.232 | 0.357 | 0.221 | 0.113 | 0.128 | 0.224 |
| 1–18 | Average | 0.444 | 0.363 | 0.410 | 0.471 | 0.245 | 0.171 | 0.208 | 0.259 |
| P | ROIs 1–16 vs. 17 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| P | ROIs 1–16 vs. 18 | <0.001 | <0.001 | <0.001 | <0.001 | 0.08 | 0.03 | <0.001 | 0.10 |

P values: independent t-tests were performed to compare the CV and QCD between single-organ/tissue ROIs (ROIs 1–16) and complex-structure ROIs (ROIs 17–18). CV, coefficient of variation; QCD, quartile coefficient of dispersion; RFs, radiomics features; ROIs, regions of interest.

Table 5. Comparison of CV and QCD among first-order, texture, and wavelet features.

| P value | Standard-dose group: CV | Standard-dose group: QCD | Low-dose group: CV | Low-dose group: QCD |
|---|---|---|---|---|
| First-order vs. texture vs. wavelet | <0.001 | <0.001 | 0.04 | <0.001 |
| First-order vs. texture | 0.01 | 0.02 | 0.86 | 0.61 |
| First-order vs. wavelet | <0.001 | <0.001 | 0.03 | <0.001 |
| Texture vs. wavelet | 0.01 | <0.001 | 0.41 | 0.01 |

The Kruskal-Wallis test was used to compare CV and QCD among five reconstruction settings, and the Bonferroni method was used for pairwise comparisons. CV, coefficient of variation; QCD, quartile coefficient of dispersion.

Consistency between different strength levels of the same algorithm and clinically comparable levels across algorithms

Between different strength levels of the same algorithm (ASIR-V30% and -V70%, DLIR-L and M, DLIR-M and H, and DLIR-L and H) and clinically comparable levels across algorithms (ASIR-V30% and DLIR-M, ASIR-V70% and DLIR-H), the mean ICC values for all features across 18 ROIs were 0.672, 0.734, 0.756, 0.629, 0.724, and 0.651, respectively, in the standard-dose group, and the corresponding values were 0.500, 0.567, 0.700, 0.474, 0.499, and 0.650 in the low-dose group. The DLIR-M and H pair exhibited the highest mean ICC in both dose groups. Specifically, the mean ICC of DLIR-M and H in the low-dose group (0.700) remained higher than that of ASIR-V30% and -V70% in the standard-dose group (0.672). Additionally, first-order features consistently showed higher ICC values than texture and wavelet features across all comparisons (Figure 2).
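To illustrate how a pairwise ICC of this kind can be obtained, the sketch below computes a two-way mixed, single-measure, consistency ICC, often written ICC(3,1), from a table with one row per patient and one column per reconstruction. The text specifies only a "two-way mixed model", so the single-measure consistency form is an assumption, as are the function name and the example data (the classic six-subject, four-rater example commonly used to illustrate ICC calculations).

```python
def icc_3_1(ratings):
    """Two-way mixed, single-measure, consistency ICC.

    ratings: list of rows, one row per subject, one column per
    reconstruction setting (or rater).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition of the total sum of squares.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Illustrative data only: six subjects measured four times.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_example = icc_3_1(example)

# A constant offset between two columns leaves the consistency ICC at 1.
icc_offset = icc_3_1([[1, 2], [2, 3], [3, 4], [4, 5]])
print(round(icc_example, 3), round(icc_offset, 3))
```

For a pairwise comparison such as DLIR-M versus DLIR-H, each row would hold one patient's value of a given feature under the two reconstructions, and the resulting ICCs would then be averaged over features and ROIs.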

Figure 2.

Mean ICC values between different strength levels of the same algorithm (ASIR-V30% and -V70%, DLIR-L and M, DLIR-M and H, DLIR-L and H) and between clinically comparable levels across algorithms (ASIR-V30% and DLIR-M, ASIR-V70% and DLIR-H). “+” indicates the mean ICC value, with specific values provided below the box plot. a, different strength levels of the same algorithm; b, clinically comparable levels across algorithms. ICC, intraclass correlation coefficient; ASIR-V, adaptive statistical iterative reconstruction-Veo; DLIR, deep learning image reconstruction; H, high; L, low; M, medium.

Robustness of RFs

Under standard-dose conditions, an average of 14.0% (117/837) of RFs across 18 ROIs exhibited robustness across the five reconstruction settings (median 117 features; range, 68–328), with only 4.7% (39/837) demonstrating robustness across all 18 ROIs. Under low-dose conditions, an average of 10.3% (86/837) of RFs were robust (median 86; range, 50–243), with only 3.0% (25/837) being robust across all ROIs. The mean, median, and ClusterShade features were robust across all algorithms, ROIs, and doses. First-order features exhibited higher robustness than texture and wavelet features in both dose groups (Table 6).
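The robustness criterion can be sketched as follows. The Statistical analysis section names the Kruskal-Wallis test and a significance threshold of P<0.05; treating a feature as robust when P ≥ 0.05 is an assumption about how that test was applied, and all function names and example numbers are hypothetical. The P value uses the chi-square approximation, which has a closed form for an even number of degrees of freedom (four, for five reconstruction settings).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form implemented for positive even df only")
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total


def kruskal_wallis(groups):
    """Return (H, P) with tie correction, using the chi-square approximation."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    acc, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)
    h = max(12.0 / (n * (n + 1)) * acc - 3 * (n + 1), 0.0)
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:  # every value identical: no evidence of a difference
        return 0.0, 1.0
    h /= correction
    return h, chi2_sf_even_df(h, len(groups) - 1)


def is_robust(groups, alpha=0.05):
    """Assumed criterion: robust if the settings do not differ significantly."""
    return kruskal_wallis(groups)[1] >= alpha


# Hypothetical values of one feature under five reconstruction settings.
shifted = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]
stable = [[1, 2, 3]] * 5
h_shifted, p_shifted = kruskal_wallis(shifted)
print(round(h_shifted, 2), round(p_shifted, 4), is_robust(stable))
```

In the study setting, each group would hold the values of one feature in one ROI for all patients under one reconstruction, and the count of robust features per ROI would be the number of features for which this check passes.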

Table 6. Robust RFs among five reconstructions under standard and low dose groups.

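| ROI | Location | Standard-dose: Overall (n=837) | Standard-dose: First-order (n=18) | Standard-dose: Texture (n=75) | Standard-dose: Wavelet (n=744) | Low-dose: Overall (n=837) | Low-dose: First-order (n=18) | Low-dose: Texture (n=75) | Low-dose: Wavelet (n=744) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Right liver lobe | 69 | 7 | 4 | 58 | 50 | 6 | 1 | 43 |
| 2 | Left liver lobe | 73 | 7 | 5 | 61 | 56 | 7 | 2 | 47 |
| 3 | Pancreas | 99 | 7 | 11 | 81 | 52 | 6 | 1 | 45 |
| 4 | Spleen | 68 | 7 | 4 | 57 | 51 | 6 | 1 | 44 |
| 5 | Right kidney | 88 | 9 | 7 | 72 | 70 | 8 | 3 | 59 |
| 6 | Left kidney | 88 | 9 | 4 | 75 | 70 | 8 | 4 | 58 |
| 7 | Right adrenal | 199 | 9 | 27 | 163 | 108 | 4 | 4 | 99 |
| 8 | Left adrenal | 188 | 9 | 22 | 157 | 106 | 7 | 7 | 92 |
| 9 | Aortaventralis | 72 | 7 | 5 | 60 | 59 | 7 | 3 | 49 |
| 10 | Portal | 97 | 8 | 12 | 77 | 63 | 7 | 2 | 54 |
| 11 | Postcava | 114 | 9 | 15 | 90 | 73 | 7 | 5 | 61 |
| 12 | Visceral fat | 103 | 9 | 10 | 84 | 73 | 6 | 3 | 64 |
| 13 | Subcutaneous fat | 126 | 8 | 15 | 103 | 85 | 7 | 7 | 72 |
| 14 | Thoracic vertebrae | 89 | 11 | 10 | 68 | 82 | 9 | 5 | 68 |
| 15 | Right erector spinae | 74 | 6 | 5 | 63 | 60 | 3 | 4 | 53 |
| 16 | Left erector spinae | 84 | 6 | 7 | 71 | 59 | 3 | 5 | 51 |
| 17 | Full-layer | 150 | 15 | 24 | 111 | 186 | 14 | 18 | 154 |
| 18 | Hepatic mass | 328 | 18 | 50 | 260 | 243 | 8 | 30 | 205 |
| 1–18 | Average | 117 | 9 | 13 | 95 | 86 | 7 | 6 | 73 |
| 1–18 | Overall | 39 | 6 | 1 | 32 | 25 | 2 | 1 | 22 |

Data are presented as numbers. RFs, radiomics features; ROI, region of interest.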

Under standard-dose conditions, 45.0% (377/837) of RFs were robust between DLIR-M and DLIR-H (median 377; range, 221–719), significantly higher than the 21.0% (176/837) observed between ASIR-V30% and -V70% (median 176; range, 113–362). Notably, 23.1% (193/837; median 193; range, 124–327) of RFs were robust between DLIR-M and DLIR-H under low-dose conditions, which exceeded the proportion for the standard-dose ASIR-V pair (Figure 3).
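The reported proportions follow directly from the feature counts and the total of 837 features per ROI. The short check below recomputes them; the dictionary labels and function name are illustrative, and only the counts come from the text.

```python
TOTAL_FEATURES = 837

# Mean numbers of robust features reported for each pairwise comparison.
reported_counts = {
    "standard dose, DLIR-M vs DLIR-H": 377,
    "standard dose, ASIR-V30% vs ASIR-V70%": 176,
    "low dose, DLIR-M vs DLIR-H": 193,
}


def percent_robust(count, total=TOTAL_FEATURES):
    """Share of robust features as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


percentages = {name: percent_robust(n) for name, n in reported_counts.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The low-dose DLIR-M and H pair exceeds the standard-dose ASIR-V pair by roughly two percentage points, a much smaller gap than the one between the two standard-dose pairs.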

Figure 3.

Heatmap of robust RFs identified between different strength levels of the same algorithm and between clinically comparable levels across algorithms. Presented values are numbers. The number of robust RFs under low-dose conditions was significantly lower than that under standard-dose conditions across each ROI. Regardless of whether the dose was standard or low, the number of robust RFs in the DLIR-M and H groups was consistently higher than that in the ASIR-V30% and -V70% groups across each ROI. The number of robust RFs in the low-dose DLIR-M and H groups was higher than that in the standard-dose ASIR-V30% and -V70% groups. a, different strength levels of the same algorithm; b, clinically comparable levels across algorithms. ASIR-V, adaptive statistical iterative reconstruction-Veo; DLIR, deep learning image reconstruction; H, high; L, low; M, medium; RFs, radiomics features; ROI, region of interest.

Discussion

In this study, abdominal CT scans at standard-dose venous and low-dose delayed phases were reconstructed using ASIR-V (30% and 70%) and DLIR (L, M, and H) to assess RF robustness. The key findings revealed that DLIR-M and H significantly enhanced the RFs’ robustness across dose levels, achieving ICCs of 0.756 in the standard-dose group and 0.700 in the low-dose group. Notably, the ICC for low-dose DLIR-M and H surpassed that of standard-dose ASIR-V30% and -V70% (0.672). Correspondingly, the proportion of robust RFs in the DLIR-M and H group reached 23.1% under low-dose conditions, which was higher than the 21.0% observed in the standard-dose ASIR-V30% and -V70% group.

RF robustness is influenced by various factors, such as CT scanners, scanning and reconstruction parameters, image reconstruction algorithms, and material variables (9,10). These factors affect HU values, noise levels, resolution, contrast, interpixel relationships, and other image parameters that are closely related to the definition of RFs (10,23). DLIR is based on a convolutional neural network (CNN) architecture and, through deep training on massive high-quality CT datasets, uses multi-layer nonlinear transformations to automatically identify differences between noise and true anatomical structures. This mechanism overcomes the “smoothing effects” of traditional iterative reconstruction (IR) algorithms, improving image quality while accurately preserving fine texture structures (24). Although previous phantom studies have shown that the feature reproducibility of DLIR differed from that of traditional IR algorithms (25,26), our study systematically evaluated the consistency of RFs between different strength levels of the same algorithm (ASIR-V30% and -V70%, DLIR-L and M, DLIR-M and H, and DLIR-L and H) and clinically equivalent levels across algorithms (ASIR-V30% and DLIR-M, ASIR-V70% and DLIR-H). Based on the principle of noise power spectrum (NPS) matching, ASIR-V30% was paired with DLIR-M, and ASIR-V70% was paired with DLIR-H as clinically equivalent reconstruction parameter combinations (27). The results showed that the ICCs of the DLIR-M and H groups were higher than those of the ASIR-V30% and -V70% groups and other DLIR strength levels. The low-dose DLIR-M and H groups (ICC =0.700) demonstrated better feature consistency than the standard-dose ASIR-V30% and -V70% groups (0.672). In addition to quantitative metrics, statistical tests further confirmed that the number of robust features in the low-dose DLIR-M and H groups was higher than that of the ASIR-V30% and -V70% groups and other DLIR strength levels.

The observed advantages of DLIR likely stemmed from its cross-dose generalization capability: by learning from diverse clinical data, it effectively reduced feature variability caused by reconstruction algorithms (24). Furthermore, the stability demonstrated by this algorithm in low-contrast tissues such as fat and bone indicates that its noise-texture preservation mechanism can mitigate the impact of dose variations on feature consistency independently of contrast agent concentration. DLIR-M and H not only improve image quality (24,28,29), but also foster better consistency in radiomic features. Although RF robustness alone does not guarantee diagnostic superiority, it serves as a prerequisite for reliable radiomic modeling.

In the analysis of feature types, first-order features were more stable than other feature types. Some 14.0% (117/837) of RFs across 18 ROIs exhibited robustness across five reconstructions under the standard-dose condition, whereas only 39 features were robust across all 18 ROIs simultaneously, comprising 6 first-order features (energy, kurtosis, mean, median, RootMeanSquared, and TotalEnergy), 1 texture feature (ClusterShade), and 32 wavelet features. Under low-dose conditions, only 10.3% (86/837) of RFs demonstrated robustness, with 3.0% (25/837) yielding robust results across all 18 ROIs, comprising 2 (2/18, 11.1%) first-order (mean, median), 1 (1/75, 1.3%) texture (ClusterShade), and 22 (22/744, 3.0%) wavelet features. The greater stability of first-order features is consistent with previous data obtained under standard-dose conditions (30,31). This may be because first-order features rely on global HU distribution metrics, which are less affected by the noise texture alterations introduced by reconstruction algorithms, alterations that are particularly pronounced in low-dose images. In contrast, texture and wavelet features depend on inter-pixel relationships that are sensitive to noise suppression effects. The study showed that the mean, median, and ClusterShade features remained invariant to both reconstruction algorithms and dose variations, underscoring their utility as robust radiomic biomarkers.

Furthermore, RF robustness was influenced by the ROI structure: the mean CV and QCD values for ROI 17 (the full slice, containing mixed densities) and ROI 18 (hepatic mass) were significantly lower than those for single-organ or tissue ROIs (ROIs 1–16), and a higher number of robust RFs was observed in the former. Berenguer et al. (3) found that the reproducibility of RFs showed large material differences in which the densest wood had the highest reproducibility of 85.3% (151/177). Chen et al. (10) showed that RFs were more robust in phantoms mixed with different clinically relevant densities than in phantoms with a single clinically relevant density. It is important to note that most of these studies used phantoms as subjects, which may not have accurately represented complex textures in human tissues (32,33). In the present study, different ROIs showed different results, whether in terms of different algorithms or different strength levels. RFs may be more robust in regions of the human body with a more complex structure. It is possible that a more complex tissue structure carries more morphological information, so that reconstruction-related changes have less impact on the overall HU distribution, which helps to improve feature stability. This finding also underscores the need to evaluate radiomics robustness using clinical cases where ethically feasible, as phantoms might introduce biases compared to in vivo human samples. This statistical comparison highlights the critical role of ROI composition in radiomics analysis, providing a rationale for prioritizing complex-structured regions during feature selection.

The present study has a few limitations. First, the impact of different doses on feature robustness was not compared when using the same algorithm. Standard and low doses were analyzed separately because patient movement and contrast concentration can affect the accuracy of ROI delineation. Second, the feature extraction method using PyRadiomics in this study may not fully adhere to Image Biomarker Standardisation Initiative (IBSI) standards, potentially affecting cross-center generalizability. Future research should integrate IBSI-compliant tools to optimize the workflow. Third, the study utilized a single dataset with a relatively small sample size, a single type of equipment, and a uniform scanning protocol. These factors limited its statistical power and potentially impeded the comprehensive capture of clinical variability, which is essential for reaching generalizable conclusions.

Conclusions

Most of the RFs were not reproducible across different algorithms and strength levels, and their consistency decreased further as the dose was reduced, with only a few features remaining unaffected by the algorithm, structure, or dose. However, DLIR significantly enhanced consistency in comparison to the ASIR-V algorithm: the consistency between DLIR-M and H under low-dose conditions was even higher than that between ASIR-V at 30% and 70% under standard-dose conditions. The DLIR algorithm not only ensured image quality when the dose was reduced, but also stabilized RFs, making it a valuable consideration for retrospective data collection and future protocol implementations in radiomics.

Acknowledgments

None.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Institutional Review Board of Shandong University (No. KYLL2025286), and informed consent was provided by all the patients.

Footnotes

Funding: None.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-2025-238/coif). The authors have no conflicts of interest to declare.

Data Sharing Statement

Available at https://qims.amegroups.com/article/view/10.21037/qims-2025-238/dss

qims-15-09-7922-dss.pdf (70.2KB, pdf)
DOI: 10.21037/qims-2025-238

References

  • 1. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, van Stiphout RG, Granton P, Zegers CM, Gillies R, Boellard R, Dekker A, Aerts HJ. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 2012;48:441-6. doi: 10.1016/j.ejca.2011.11.036
  • 2. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology 2016;278:563-77. doi: 10.1148/radiol.2015151169
  • 3. Berenguer R, Pastor-Juan MDR, Canales-Vázquez J, Castro-García M, Villas MV, Mansilla Legorburo F, Sabater S. Radiomics of CT Features May Be Nonreproducible and Redundant: Influence of CT Acquisition Parameters. Radiology 2018;288:407-15. doi: 10.1148/radiol.2018172361
  • 4. Zhong J, Wu Z, Wang L, Chen Y, Xia Y, Wang L, Li J, Lu W, Shi X, Feng J, Dong H, Zhang H, Yao W. Impacts of Adaptive Statistical Iterative Reconstruction-V and Deep Learning Image Reconstruction Algorithms on Robustness of CT Radiomics Features: Opportunity for Minimizing Radiomics Variability Among Scans of Different Dose Levels. J Imaging Inform Med 2024;37:123-33. doi: 10.1007/s10278-023-00901-1
  • 5. Pfaehler E, Zhovannik I, Wei L, Boellaard R, Dekker A, Monshouwer R, El Naqa I, Bussink J, Gillies R, Wee L, Traverso A. A systematic review and quality of reporting checklist for repeatability and reproducibility of radiomic features. Phys Imaging Radiat Oncol 2021;20:69-75. doi: 10.1016/j.phro.2021.10.007
  • 6. Cui Y, Yin FF. Impact of image quality on radiomics applications. Phys Med Biol 2022. doi: 10.1088/1361-6560/ac7fd7
  • 7. Thomas HMT, Wang HYC, Varghese AJ, Donovan EM, South CP, Saxby H, Nisbet A, Prakash V, Sasidharan BK, Pavamani SP, Devadhas D, Mathew M, Isiah RG, Evans PM. Reproducibility in Radiomics: A Comparison of Feature Extraction Methods and Two Independent Datasets. Appl Sci (Basel) 2024;13:7291. doi: 10.3390/app13127291
  • 8. Espinasse M, Pitre-Champagnat S, Charmettant B, Bidault F, Volk A, Balleyguier C, Lassau N, Caramella C. CT Texture Analysis Challenges: Influence of Acquisition and Reconstruction Parameters: A Comprehensive Review. Diagnostics (Basel) 2020;10:258. doi: 10.3390/diagnostics10050258
  • 9. Park JE, Park SY, Kim HJ, Kim HS. Reproducibility and Generalizability in Radiomics Modeling: Possible Strategies in Radiologic and Statistical Perspectives. Korean J Radiol 2019;20:1124-37. doi: 10.3348/kjr.2018.0070
  • 10. Chen Y, Zhong J, Wang L, Shi X, Lu W, Li J, Feng J, Xia Y, Chang R, Fan J, Chen L, Zhu Y, Yan F, Yao W, Zhang H. Robustness of CT radiomics features: consistency within and between single-energy CT and dual-energy CT. Eur Radiol 2022;32:5480-90. doi: 10.1007/s00330-022-08628-3
  • 11. Zhang H, Lu T, Wang L, Xing Y, Hu Y, Xu Z, Lu J, Yang J, Chu J, Zhang B, Zhong J. Robustness of radiomics within photon-counting detector CT: impact of acquisition and reconstruction factors. Eur Radiol 2025;35:4661-73. doi: 10.1007/s00330-025-11374-x
  • 12. Zwanenburg A, Vallières M, Abdalah MA, Aerts HJWL, Andrearczyk V, Apte A, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology 2020;295:328-38. doi: 10.1148/radiol.2020191145
  • 13. Adelsmayr G, Janisch M, Kaufmann-Bühler AK, Holter M, Talakic E, Janek E, Holzinger A, Fuchsjäger M, Schöllnast H. CT texture analysis reliability in pulmonary lesions: the influence of 3D vs. 2D lesion segmentation and volume definition by a Hounsfield-unit threshold. Eur Radiol 2023;33:3064-71. doi: 10.1007/s00330-023-09500-8
  • 14. Willemink MJ, de Jong PA, Leiner T, de Heer LM, Nievelstein RA, Budde RP, Schilham AM. Iterative reconstruction techniques for computed tomography Part 1: technical principles. Eur Radiol 2013;23:1623-31.
  • 15. Willemink MJ, Noël PB. The evolution of image reconstruction for CT-from filtered back projection to artificial intelligence. Eur Radiol 2019;29:2185-95. doi: 10.1007/s00330-018-5810-7
  • 16. Koetzier LR, Mastrodicasa D, Szczykutowicz TP, van der Werf NR, Wang AS, Sandfort V, van der Molen AJ, Fleischmann D, Willemink MJ. Deep Learning Image Reconstruction for CT: Technical Principles and Clinical Prospects. Radiology 2023;306:e221257. doi: 10.1148/radiol.221257
  • 17. Lyu P, Liu N, Harrawood B, Solomon J, Wang H, Chen Y, Rigiroli F, Ding Y, Schwartz FR, Jiang H, Lowry C, Wang L, Samei E, Gao J, Marin D. Is it possible to use low-dose deep learning reconstruction for the detection of liver metastases on CT routinely? Eur Radiol 2023;33:1629-40. doi: 10.1007/s00330-022-09206-3
  • 18. Park J, Shin J, Min IK, Bae H, Kim YE, Chung YE. Image Quality and Lesion Detectability of Lower-Dose Abdominopelvic CT Obtained Using Deep Learning Image Reconstruction. Korean J Radiol 2022;23:402-12. doi: 10.3348/kjr.2021.0683
  • 19. Yoshida K, Nagayama Y, Funama Y, Ishiuchi S, Motohara T, Masuda T, Nakaura T, Ishiko T, Hirai T, Beppu T. Low tube voltage and deep-learning reconstruction for reducing radiation and contrast medium doses in thin-slice abdominal CT: a prospective clinical trial. Eur Radiol 2024;34:7386-96.
  • 20. Toia GV, Zamora DA, Singleton M, Liu A, Tan E, Leng S, Shuman WP, Kanal KM, Mileto A. Detectability of Small Low-Attenuation Lesions With Deep Learning CT Image Reconstruction: A 24-Reader Phantom Study. AJR Am J Roentgenol 2023;220:283-95. doi: 10.2214/AJR.22.28407
  • 21. Zhong J, Xia Y, Chen Y, Li J, Lu W, Shi X, Feng J, Yan F, Yao W, Zhang H. Deep learning image reconstruction algorithm reduces image noise while alters radiomics features in dual-energy CT in comparison with conventional iterative reconstruction algorithms: a phantom study. Eur Radiol 2023;33:812-24. doi: 10.1007/s00330-022-09119-1
  • 22. Midya A, Chakraborty J, Gönen M, Do RKG, Simpson AL. Influence of CT acquisition and reconstruction parameters on radiomic feature reproducibility. J Med Imaging (Bellingham) 2018;5:011020. doi: 10.1117/1.JMI.5.1.011020
  • 23. Zhong J, Pan Z, Chen Y, Wang L, Xia Y, Wang L, Li J, Lu W, Shi X, Feng J, Yan F, Zhang H, Yao W. Robustness of radiomics features of virtual unenhanced and virtual monoenergetic images in dual-energy CT among different imaging platforms and potential role of CT number variability. Insights Imaging 2023;14:79. doi: 10.1186/s13244-023-01426-5
  • 24. Chu B, Gan L, Shen Y, Song J, Liu L, Li J, Liu B. A Deep Learning Image Reconstruction Algorithm for Improving Image Quality and Hepatic Lesion Detectability in Abdominal Dual-Energy Computed Tomography: Preliminary Results. J Digit Imaging 2023;36:2347-55. doi: 10.1007/s10278-023-00893-y
  • 25. Xue G, Liu H, Cai X, Zhang Z, Zhang S, Liu L, Hu B, Wang G. Impact of deep learning image reconstruction algorithms on CT radiomic features in patients with liver tumors. Front Oncol 2023;13:1167745. doi: 10.3389/fonc.2023.1167745
  • 26. Michallek F, Genske U, Niehues SM, Hamm B, Jahnke P. Deep learning reconstruction improves radiomics feature stability and discriminative power in abdominal CT imaging: a phantom study. Eur Radiol 2022;32:4587-95. doi: 10.1007/s00330-022-08592-y
  • 27. Greffier J, Hamard A, Pereira F, Barrau C, Pasquier H, Beregi JP, Frandon J. Image quality and dose reduction opportunity of deep learning image reconstruction algorithm for CT: a phantom study. Eur Radiol 2020;30:3951-9. doi: 10.1007/s00330-020-06724-w
  • 28. Caruso D, De Santis D, Del Gaudio A, Guido G, Zerunian M, Polici M, Valanzuolo D, Pugliese D, Persechino R, Cremona A, Barbato L, Caloisi A, Iannicelli E, Laghi A. Low-dose liver CT: image quality and diagnostic accuracy of deep learning image reconstruction algorithm. Eur Radiol 2024;34:2384-93.
  • 29. Lyu P, Li Z, Chen Y, Wang H, Liu N, Liu J, Zhan P, Liu X, Shang B, Wang L, Gao J. Deep learning reconstruction CT for liver metastases: low-dose dual-energy vs standard-dose single-energy. Eur Radiol 2024;34:28-38. doi: 10.1007/s00330-023-10033-3
  • 30. Choe J, Lee SM, Do KH, Lee G, Lee JG, Lee SM, Seo JB. Deep Learning-based Image Conversion of CT Reconstruction Kernels Improves Radiomics Reproducibility for Pulmonary Nodules or Masses. Radiology 2019;292:365-73. doi: 10.1148/radiol.2019181960
  • 31. Zhovannik I, Bussink J, Traverso A, Shi Z, Kalendralis P, Wee L, Dekker A, Fijten R, Monshouwer R. Learning from scanners: Bias reduction and feature correction in radiomics. Clin Transl Radiat Oncol 2019;19:33-8. doi: 10.1016/j.ctro.2019.07.003
  • 32. Li Y, Reyhan M, Zhang Y, Wang X, Zhou J, Zhang Y, Yue NJ, Nie K. The impact of phantom design and material-dependence on repeatability and reproducibility of CT-based radiomics features. Med Phys 2022;49:1648-59. doi: 10.1002/mp.15491
  • 33. Mahmood U, Apte A, Kanan C, Bates DDB, Corrias G, Manneli L, Oh JH, Erdi YE, Nguyen J, O'Deasy J, Shukla-Dave A. Quality control of radiomic features using 3D-printed CT phantoms. J Med Imaging (Bellingham) 2021;8:033505. doi: 10.1117/1.JMI.8.3.033505

Articles from Quantitative Imaging in Medicine and Surgery are provided here courtesy of AME Publications
