Abstract
Accurate segmentation of liver and tumor regions in medical imaging is crucial for the diagnosis, treatment, and monitoring of hepatocellular carcinoma (HCC) patients. However, manual segmentation is time-consuming and subject to inter- and intra-rater variability. Automated methods are therefore needed, but they require rigorous validation against high-quality segmentations based on a consensus of raters. To address the need for reliable and comprehensive data in this domain, we present LiverHccSeg, a dataset that provides liver and tumor segmentations on multiphasic contrast-enhanced magnetic resonance imaging from two board-certified abdominal radiologists, along with an analysis of inter-rater agreement.
LiverHccSeg provides a curated resource for liver and HCC tumor segmentation tasks. The dataset includes a scientific reading and co-registered contrast-enhanced multiphasic magnetic resonance imaging (MRI) scans with corresponding manual segmentations by two board-certified abdominal radiologists and relevant metadata, and it offers researchers a comprehensive foundation for external validation and benchmarking of liver and tumor segmentation algorithms. The dataset also provides an analysis of the agreement between the two sets of liver and tumor segmentations. Through the calculation of appropriate segmentation metrics, we provide insights into the consistency and variability of liver and tumor segmentations between the radiologists. A total of 17 cases were included for liver segmentation and 14 cases for HCC tumor segmentation. Liver segmentations demonstrated high agreement (mean Dice, 0.95 ± 0.01 [standard deviation]), whereas HCC tumor segmentations showed higher variation (mean Dice, 0.85 ± 0.16 [standard deviation]).
The applications of LiverHccSeg can be manifold, ranging from testing machine learning algorithms on public external data to radiomic feature analyses. Leveraging the inter-rater agreement analysis within the dataset, researchers can investigate the impact of variability on segmentation performance and explore methods to enhance the accuracy and robustness of liver and tumor segmentation algorithms in HCC patients. By making this dataset publicly available, LiverHccSeg aims to foster collaborations, facilitate innovative solutions, and ultimately improve patient outcomes in the diagnosis and treatment of HCC.
Keywords: Liver segmentation, Tumor segmentation, Hepatocellular carcinoma, Inter-rater agreement, Inter-rater variability, Multiphasic contrast-enhanced magnetic resonance imaging, Benchmarking, Imaging biomarkers
Specifications Table
Subject | Medical Imaging |
Specific subject area | Whole liver and HCC tumor segmentation. |
Type of data | Table; Medical imaging; Scientific reading; Segmentation files |
How the data were acquired | The data was retrieved from The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection (TCGA-LIHC) (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=6885436) |
Data format | Raw; Analyzed; Annotated |
Description of data collection | One multiphasic contrast-enhanced MRI study per patient from the TCGA-LIHC database [1] was included in the dataset and in all analyses. Liver and tumor segmentation was conducted using the software 3D Slicer. |
Data source location | The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection (TCGA-LIHC) [1] (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=6885436) |
Data accessibility | All data is available at the Zenodo repository: https://doi.org/10.5281/zenodo.7957515 |
1. Value of the Data
- The LiverHccSeg dataset provides a publicly available resource for liver and tumor segmentation in patients with hepatocellular carcinoma (HCC). The dataset provides manual whole-liver (n = 17) and tumor (n = 14) segmentations, enabling the evaluation of artificial intelligence algorithms for accurate and reliable liver and tumor detection and segmentation.
- The inclusion of two sets of liver and tumor segmentations from two board-certified abdominal radiologists in the LiverHccSeg dataset adds significant value. Researchers can leverage the inter-rater agreement analysis to gain insights into the variability in liver and tumor segmentations, leading to a better understanding of the challenges and uncertainties associated with HCC segmentation, and facilitating the development of improved segmentation techniques.
- LiverHccSeg provides both a consistent data structure and measures of reproducibility for our manual segmentations.
- By providing consistently labeled NIfTI images and segmentation masks, we aim to support researchers in seamlessly integrating this dataset into their evaluation workflows, ultimately fostering more efficient and reliable machine learning algorithm evaluation processes while ensuring compatibility and interoperability with various software tools and libraries commonly used in scientific analyses.
- A dedicated scientific reading of the images was conducted to minimize the potential biases and inconsistencies that may arise from relying solely on clinical reports. Moreover, our tumor segmentations show high inter-rater agreement, which supports the reproducibility of our segmentation masks.
2. Objective
Liver cancer is a leading cause of cancer-related mortality worldwide [2], with increasing incidence and mortality rates [3,4]. Hepatocellular carcinoma (HCC) accounts for the majority of liver cancer cases [5]. Magnetic resonance imaging (MRI) has proven effective in detecting and diagnosing HCC without invasive biopsies [6,7]. Accurate liver segmentation is crucial for volumetry assessment and serves as a preprocessing step for tumor detection algorithms. Accurate HCC tumor segmentation, in turn, is essential for the extraction of quantitative imaging biomarkers such as radiomics [8].
Publicly available datasets allow for fair and objective comparisons between different algorithms or approaches. The LiverHccSeg dataset addresses the lack of publicly available, annotated multiphasic MRI datasets and offers researchers and developers a resource for evaluating algorithms and analyzing imaging biomarkers on external data. In addition to providing a benchmark with this dataset, we also assess the inter-rater variability between two different sets of tumor segmentations, which serves as a measure of reproducibility for human segmentations. This is essential in assessing the reliability of manual annotations and establishing a baseline for algorithm performance comparison. LiverHccSeg promotes fair comparisons, facilitates advancements in HCC research, and supports the development of more accurate and robust segmentation algorithms.
3. Data Description
The data described in this article include:
1. dicoms.zip: This zip file contains all the raw MR images from The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection (TCGA-LIHC) [1] in the Digital Imaging and Communications in Medicine (DICOM) format used for the curation of this dataset. The data is structured as Patient-ID/DATE/SEQUENCE where Patient-ID is the unique de-identified patient ID, DATE is the date of the image acquisition, and SEQUENCE is the name of the MR sequence.
2. LiverHccSeg_MetaData.xlsx: This spreadsheet contains all the metadata from the DICOM headers along with the data from the scientific image readings.
3. nifti_and_segms.zip: This zip file contains all MR images along with the liver and tumor segmentations in the Neuroimaging Informatics Technology Initiative (NIfTI) format.
The data is structured as Patient-ID/DATE/SEQUENCE where Patient-ID is the unique anonymized patient identifier, DATE is the date of the image acquisition, and SEQUENCE is the name of the MRI sequence or segmentation image.
The NIfTI files are named as follows:
pre.nii.gz: Pre-contrast T1-weighted MRI
art.nii.gz: Arterial-phase T1-weighted MRI
pv.nii.gz: Portal-venous-phase T1-weighted MRI
del.nii.gz: Delayed-phase T1-weighted MRI
art_pre.nii.gz: Pre-contrast T1-weighted MRI registered to the corresponding arterial-phase T1-weighted image
art_pv.nii.gz: Portal-venous-phase T1-weighted MRI registered to the corresponding arterial-phase T1-weighted MRI
art_del.nii.gz: Delayed-phase T1-weighted MRI registered to the corresponding arterial-phase T1-weighted MRI
The corresponding manual segmentations are named after the rater and the type of segmentation, following the format ‘RATER_ROI.nii.gz’, where RATER denotes the human rater and ROI denotes the segmented region of interest, for example, ‘rater1_liver.nii.gz’ and ‘rater2_liver.nii.gz’. For tumor segmentations, a trailing integer gives the tumor identification number that distinguishes different tumor ROIs, for example, ‘rater1_tumor1.nii.gz’ and ‘rater2_tumor1.nii.gz’. The segmentations can be used with the arterial-phase NIfTI file (art.nii.gz) as well as the corresponding co-registered pre-contrast (art_pre.nii.gz), portal-venous (art_pv.nii.gz), and delayed-phase (art_del.nii.gz) images.
4. segm_metrics.xlsx: This spreadsheet summarizes the segmentation agreement between the two sets of liver and tumor segmentations by the two board-certified abdominal radiologists.
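The naming scheme described above lends itself to simple programmatic indexing. The following is a minimal sketch, assuming that each NIfTI file sits directly under Patient-ID/DATE/ inside the extracted nifti_and_segms.zip archive; the function name `classify`, the patient ID, and the date in the usage example are hypothetical and serve only for illustration.

```python
import re
from pathlib import PurePosixPath

# Image names listed in the data description (without the .nii.gz suffix).
IMAGE_NAMES = {"pre", "art", "pv", "del", "art_pre", "art_pv", "art_del"}

# RATER_ROI.nii.gz, e.g. rater1_liver.nii.gz or rater2_tumor1.nii.gz
SEGM_PATTERN = re.compile(r"^(rater\d+)_(liver|tumor(\d+))\.nii\.gz$")


def classify(relative_path):
    """Split 'Patient-ID/DATE/FILE' and describe what FILE contains.

    Assumes the last three path components are patient, date, and file name.
    """
    patient_id, date, filename = PurePosixPath(relative_path).parts[-3:]
    info = {"patient": patient_id, "date": date, "file": filename}

    match = SEGM_PATTERN.match(filename)
    if match:
        rater, roi, tumor_id = match.groups()
        info.update(
            kind="segmentation",
            rater=rater,
            roi="liver" if roi == "liver" else "tumor",
            # Liver masks carry no tumor identification number.
            tumor_id=int(tumor_id) if tumor_id else None,
        )
        return info

    stem = filename[: -len(".nii.gz")] if filename.endswith(".nii.gz") else filename
    if stem in IMAGE_NAMES:
        info.update(kind="image", phase=stem)
    else:
        info.update(kind="unknown")
    return info


if __name__ == "__main__":
    # Hypothetical patient ID and date, used only to show the output shape.
    print(classify("TCGA-XX-0000/1999-01-01/rater2_tumor1.nii.gz"))
    print(classify("TCGA-XX-0000/1999-01-01/art_pv.nii.gz"))
```

Grouping the resulting records by patient and tumor identification number gives the rater1/rater2 pairs needed for an agreement analysis.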
4. Experimental Design, Materials and Methods
4.1. Inclusion of patients
All available scans from The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection (TCGA-LIHC) were downloaded [1]. One multiphasic MRI study (pre-contrast and triphasic post-contrast including arterial, portal venous, and delayed phases) per patient was included. Patients who did not exhibit a tumor or residual tumor were excluded from the tumor segmentation dataset; however, they were included in the liver segmentation dataset. Fig. 1 summarizes the inclusion and exclusion process, and patient characteristics are reported in Table 1. In cases where a subject had multiple scans, the inclusion process prioritized the inclusion of pre-treatment images and among the pre-treatment images, preference was given to scans with the highest image quality based on visual qualitative assessment.
Table 1.
Parameter | Liver Segmentation Cohort (n = 17) | Tumor Segmentation Cohort (n = 14) |
---|---|---|
Demographics | ||
Male, n (%) | 11 (65) | 9 (64) |
Age, mean (SD) | 61 (10.77) | 60 (11.01) |
Etiology | ||
Alcohol | 1 | 1 |
HCV | 3 | 3 |
HBV | 2 | 2 |
not available | 12 | 9 |
Radiological Data | ||
Number of Lesions, n (%) | ||
1 | - | 13 (93) |
3 | - | 1 (7) |
Maximum Tumor Diameter (cm), median [IQR] | - | 6.34 [6.285] |
Cumulative Tumor Diameter (cm), median [IQR] | - | 6.34 [6.285] |
Portal Vein Thrombosis, n (%) | ||
Absent | 14 (82) | 12 (86) |
Present | 3 (18) | 2 (14) |
Ascites on Imaging, n (%) | ||
Absent | 17 (100) | 14 (100) |
Present | 0 (0) | 0 (0) |
Portal Hypertension on Imaging, n (%) | ||
Absent | 12 (71) | 11 (79) |
Present | 5 (29) | 3 (21) |
Liver Volume (ccm), median [IQR] | 1968.94 [581.23] | 2091.52 [456.84] |
Total Tumor Volume (ccm), median [IQR] | - | 107.08 [336.55] |
4.2. MR imaging data
All imaging data were converted to the Neuroimaging Informatics Technology Initiative (NIfTI) format with the dcm2nii (v2.1.53) package [9], and available header information was extracted using the pydicom (v2.1.2) package [9]. Multiparametric MRI sequences were labeled with a consistent syntax (‘pre’, ‘art’, ‘pv’, ‘del’, for the pre-contrast, arterial, portal-venous, and delayed contrast phases, respectively). All images had already been de-identified by The Cancer Imaging Archive (TCIA). Images were acquired between the years 1993 and 2007 on Philips and Siemens scanners with field strengths of 1.5 or 3 Tesla. Full details of the imaging parameters can be found in Table 2. Briefly, the median repetition time (TR) and median echo time (TE) were 365.8 ms and 26.4 ms, respectively. The median slice thickness was 9.5 mm, and the median bandwidth was 536.9 Hz.
Table 2.
Parameter | Liver Segmentation Cohort | Tumor Segmentation Cohort |
---|---|---|
n | n = 17 | n = 14 |
Manufacturer, n (%) | ||
Siemens | 16 (94.1) | 14 (100.0) |
Philips Healthcare | 1 (5.9) | 0 |
Model Name, n (%) | ||
Aera | 6 (35.3) | 6 (42.9) |
Avanto | 6 (35.3) | 5 (35.7) |
Sonata | 1 (5.9) | 1 (7.1) |
Symphony | 2 (11.8) | 2 (14.3) |
Espree | 1 (5.9) | 0 |
Ingenia | 1 (5.9) | 0 |
Magnetic Field Strength, n (%) | ||
1.5 T | 16 (94.1) | 14 (100.0) |
3.0 T | 1 (5.9) | 0 |
Contrast Agent, n (%) | ||
Gadovist | 4 (23.5) | 4 (28.6) |
Magnevist | 3 (17.6) | 2 (14.3) |
Multihance | 5 (29.4) | 4 (28.6) |
Omniscan | 1 (5.9) | 1 (7.1) |
not available | 4 (23.5) | 3 (21.4) |
Flip Angle, mean (SD) | 65.4 (59.1) | 77.3 (58.7) |
Percent Phase Field of View, mean (SD) | 88.3 (24.5) | 88.4 (26.8) |
Echo Time, mean (SD) | 26.4 (37.7) | 31.7 (39.7) |
Repetition Time, mean (SD) | 365.8 (556.0) | 443.4 (586.3) |
Imaging Frequency, mean (SD) | 67.4 (15.5) | 63.7 (0.0) |
Pixel Bandwidth (Hz), mean (SD) | 536.9 (346.1) | 487.3 (201.1) |
Spacing Between Slices, mean (SD) | 9.5 (3.9) | 10.3 (3.2) |
4.3. Scientific MRI readings
After conversion, all images were read in a scientific reading by two board-certified abdominal radiologists (S.A. and S.H., with 9 and 10 years of experience, respectively). Any disagreement between the two raters was discussed in a consensus meeting. All HCC lesions were classified according to LI-RADS criteria [6]. Table 3 summarizes the imaging features of the scientific readings.
Table 3.
Tumor Imaging Features | |
---|---|
Number of lesions | 16 (14 Patients) |
Diameter (cm), median [IQR] | 6.00 [5.88] |
Arterial Phase Hyperenhancement, n (%) | 14 (87.50) |
Washout, n (%) | 9 (56.25) |
Capsule, n (%) | 11 (68.75) |
Tumor Volume (ccm), median [IQR] | 87.60 [327.79] |
4.4. MR image co-registration
The co-registration of pre-contrast, portal-venous, and delayed-phase images with arterial phase images was performed using the software BioImage Suite (v3.5) [10]. A non-rigid intensity-based registration approach was applied, employing a free-form deformation (FFD) parameterized with 3D B-splines [11]. The FFD transformation was estimated by maximizing the normalized mutual information similarity metric [12] through gradient descent optimization. To enhance the optimization process, a multi-resolution image pyramid with three levels was utilized. The final B-spline control point spacing was set to 80 mm. The estimated transformation was then employed to warp the moving images (pre-contrast, portal-venous, and delayed-phase) into the reference image space, specifically the arterial phase image. All registrations were manually verified by visual inspection.
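The similarity metric named above can be illustrated with a short sketch. It computes the normalized mutual information in the overlap-invariant form of Studholme et al. [12], NMI(A, B) = (H(A) + H(B)) / H(A, B), from two equally long sequences of already discretized intensities. This is an illustration only and not the BioImage Suite implementation: the intensity binning, interpolation, and optimizer used there are not reproduced, and the function names are our own.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in bits) of a histogram given as a Counter."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_mutual_information(fixed, moving):
    """Return (H(A) + H(B)) / H(A, B) for two discretized intensity sequences.

    `fixed` and `moving` hold one intensity bin per voxel, sampled at the
    same voxel locations (for example, flattened image arrays).
    """
    if len(fixed) != len(moving) or not fixed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(fixed)
    h_a = entropy(Counter(fixed), n)                # marginal entropy H(A)
    h_b = entropy(Counter(moving), n)               # marginal entropy H(B)
    h_ab = entropy(Counter(zip(fixed, moving)), n)  # joint entropy H(A, B)
    if h_ab == 0.0:
        raise ValueError("joint entropy is zero: both images are constant")
    return (h_a + h_b) / h_ab


if __name__ == "__main__":
    image = [0, 0, 1, 1, 2, 2, 3, 3]
    # Perfectly aligned (identical) images reach the maximum value of 2.
    print(normalized_mutual_information(image, image))
    # Statistically independent intensities give the minimum value of 1.
    print(normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

The value ranges from 1 for statistically independent images to 2 for images that predict each other perfectly, which is why registration maximizes it.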
4.5. Liver and tumor segmentation and statistical analysis
All livers and tumors were manually segmented under the supervision of two board-certified abdominal radiologists using the software 3D Slicer (v4.10.2) [13] utilizing the paint (2D and 3D brush), draw, and erase (2D and 3D brush) tools. To compare the segmentation agreement between the two sets of liver and tumor segmentations, we calculated segmentation metrics using the Python package seg-metrics (v1.0.0) [14]. All segmentation metrics and statistics were calculated in Python (v3.7). A representative example case with corresponding liver and tumor segmentations from both raters is shown in Fig. 2. Liver and tumor segmentation metrics are summarized in Table 4 and Table 5, respectively.
Table 4.
Segmentation Metric | mean | SD | 25 % | median | 75 % |
---|---|---|---|---|---|
dice | 0.95 | 0.01 | 0.9 | 0.95 | 0.96 |
jaccard | 0.91 | 0.02 | 0.9 | 0.91 | 0.92 |
precision | 0.96 | 0.02 | 0.9 | 0.96 | 0.97 |
recall | 0.95 | 0.03 | 0.9 | 0.95 | 0.97 |
fpr | 0 | 0 | 0 | 0 | 0 |
fnr | 0.05 | 0.03 | 0 | 0.05 | 0.06 |
vs | 0.01 | 0.04 | 0 | 0.02 | 0.03 |
hd | 15.77 | 8.05 | 11.6 | 13 | 15.55 |
msd | 1.3 | 0.41 | 1.2 | 1.18 | 1.64 |
mdsd | 0.71 | 0.57 | 0 | 0.78 | 1.19 |
stdsd | 1.82 | 0.72 | 1.4 | 1.45 | 2 |
hd95 | 4.53 | 2.04 | 3.3 | 3.8 | 5.5 |
Table 5.
Segmentation Metric | mean | SD | 25 % | median | 75 % |
---|---|---|---|---|---|
dice | 0.85 | 0.16 | 0.84 | 0.89 | 0.92 |
jaccard | 0.76 | 0.18 | 0.73 | 0.81 | 0.86 |
precision | 0.85 | 0.19 | 0.88 | 0.89 | 0.92 |
recall | 0.89 | 0.09 | 0.84 | 0.92 | 0.97 |
fpr | 0.01 | 0.02 | 0 | 0 | 0 |
fnr | 0.11 | 0.09 | 0.03 | 0.08 | 0.16 |
vs | −0.08 | 0.38 | −0.05 | 0 | 0.08 |
hd | 15.59 | 25.58 | 5.53 | 7.83 | 12.84 |
msd | 3.53 | 9.58 | 0.78 | 1.13 | 1.44 |
mdsd | 2.89 | 9.78 | 0 | 0 | 1.13 |
stdsd | 3.36 | 7.35 | 1.07 | 1.32 | 1.97 |
hd95 | 9.73 | 22.09 | 2.74 | 3.33 | 5.6 |
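
The overlap-based metrics in Table 4 and Table 5 (dice, jaccard, precision, recall, fpr, fnr) all derive from the voxel-wise confusion counts between two binary masks. The sketch below shows these standard definitions on flattened masks in plain Python. It is not the seg-metrics implementation used for the reported values, it omits the distance-based metrics (hd, hd95, msd, mdsd, stdsd) and vs, and treating the second mask as the reference is an assumption made here for illustration.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def overlap_metrics(candidate, reference):
    """Voxel-overlap metrics for two flattened binary masks of equal length."""
    if len(candidate) != len(reference):
        raise ValueError("masks must contain the same number of voxels")
    pairs = [(bool(c), bool(r)) for c, r in zip(candidate, reference)]
    tp = sum(c and r for c, r in pairs)       # voxels in both masks
    fp = sum(c and not r for c, r in pairs)   # only in the candidate mask
    fn = sum(r and not c for c, r in pairs)   # only in the reference mask
    tn = len(pairs) - tp - fp - fn            # background in both masks
    return {
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": _ratio(tp, tp + fp + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "fpr": _ratio(fp, fp + tn),
        "fnr": _ratio(fn, fn + tp),
    }


if __name__ == "__main__":
    # Two toy masks of eight voxels that overlap in two voxels.
    rater_a = [1, 1, 1, 0, 0, 0, 0, 0]
    rater_b = [0, 1, 1, 1, 0, 0, 0, 0]
    for name, value in overlap_metrics(rater_a, rater_b).items():
        print(f"{name}: {value:.3f}")
```

Dice and Jaccard are symmetric in the two masks, whereas precision and recall (and fpr and fnr) swap roles when the reference rater changes. For a single pair of masks the two overlap scores are related by J = D / (2 − D).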
Ethics Statements
The data provided by this dataset constitutes secondary use of an existing, publicly available dataset. The analysis of de-identified, publicly available data does not constitute human subjects research and does not require Institutional Review Board (IRB) review.
CRediT authorship contribution statement
Moritz Gross: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration. Sandeep Arora: Methodology, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Supervision. Steffen Huber: Methodology, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Supervision. Ahmet S. Kücükkaya: Methodology, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing. John A. Onofrey: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing, Visualization, Project administration, Resources, Supervision, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
Research reported in this publication was supported by the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health under Award Number P30 DK034989. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. MG was supported by a travel stipend from the Rolf W. Günther Foundation for Radiological Sciences for his travel to Yale University.
References
- 1. Erickson BJ, Kirk S, Lee Y, et al. Radiology data from The Cancer Genome Atlas Liver Hepatocellular Carcinoma [TCGA-LIHC] collection. Cancer Imaging Arch. 2016. doi: 10.7937/K9/TCIA.2016.IMMQW8UQ.
- 2. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021;71:209–249. doi: 10.3322/caac.21660.
- 3. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2019. CA Cancer J. Clin. 2019;69:7–34. doi: 10.3322/caac.21551.
- 4. White DL, Thrift AP, Kanwal F, Davila J, El-Serag HB. Incidence of hepatocellular carcinoma in all 50 United States, from 2000 through 2012. Gastroenterology. 2017;152:812-820.e815. doi: 10.1053/j.gastro.2016.11.020.
- 5. Perz JF, Armstrong GL, Farrington LA, Hutin YJ, Bell BP. The contributions of hepatitis B virus and hepatitis C virus infections to cirrhosis and primary liver cancer worldwide. J. Hepatol. 2006;45:529–538. doi: 10.1016/j.jhep.2006.05.013.
- 6. Chernyak V, Fowler KJ, Kamaya A, et al. Liver Imaging Reporting and Data System (LI-RADS) version 2018: imaging of hepatocellular carcinoma in at-risk patients. Radiology. 2018;289:816–830. doi: 10.1148/radiol.2018181494.
- 7. Hamer OW, Schlottmann K, Sirlin CB, Feuerbach S. Technology insight: advances in liver imaging. Nat. Clin. Pract. Gastroenterol. Hepatol. 2007;4:215–228. doi: 10.1038/ncpgasthep0766.
- 8. Gross M, Spektor M, Jaffe A, et al. Improved performance and consistency of deep learning 3D liver segmentation with heterogeneous cancer stages in magnetic resonance imaging. PLoS One. 2021;16. doi: 10.1371/journal.pone.0260630. dcm2nii DICOM to NIfTI converter. https://github.com/rordenlab/dcm2niix (Accessed 7 December 2021)
- 9. Mason D, scaramallion, rhaxton, et al. pydicom/pydicom: pydicom 2.1.2, v2.1.2. Zenodo. 2020.
- 10. Papademetris MJ X., Rajeevan N., Okuda H., Constable R.T., Staib L.H. BioImage Suite: An Integrated Medical Image Analysis Suite. Section of Bioimaging Sciences, Dept. of Diagnostic Radiology, Yale School of Medicine; 2023. http://www.bioimagesuite.org
- 11. Rueckert D, Sonoda LI, Hayes C, Hill DLG, Leach MO, Hawkes DJ. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Trans. Med. Imaging. 1999;18:712–721. doi: 10.1109/42.796284.
- 12. Studholme C, Hill DL, Hawkes DJ. An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognit. 1999;32:71–86.
- 13. Fedorov A, Beichel R, Kalpathy-Cramer J, et al. 3D Slicer as an image computing platform for the quantitative imaging network. Magn. Reson. Imaging. 2012;30:1323–1341. doi: 10.1016/j.mri.2012.05.001.
- 14. Ordgod. Ordgod/segmentation_metrics: seg-metrics. v1.0.0. Zenodo. 2020.