Abstract
Spiculations, which are spikes on the surface of pulmonary nodules, are important predictors of lung cancer malignancy. In this study, we propose an interpretable and parameter-free technique to quantify spiculation using the area distortion metric obtained by conformal (angle-preserving) spherical parameterization. We exploit the insight that, for an angle-preserving spherical mapping of a given nodule, the corresponding negative area distortion precisely characterizes the spiculations on that nodule. We introduce novel spiculation scores based on the area distortion metric and spiculation measures. We also semi-automatically segment the lung nodule (for reproducibility) as well as vessel and wall attachments to differentiate real spiculations from lobulation and attachment. A simple pathological malignancy prediction model is also introduced. We used the pathologists’ (strong-label) and radiologists’ (weak-label) ratings in the publicly available LIDC-IDRI dataset to train and test radiomics models containing this feature, and then externally validated the models. We achieved AUC=0.80 and 0.76, respectively, with the models trained on the 811 weakly-labeled LIDC datasets and tested on the 72 strongly-labeled LIDC and 73 LUNGx datasets; the previous best model for LUNGx had AUC=0.68. The number-of-spiculations feature was found to be highly correlated (Spearman’s rank correlation coefficient ρ = 0.44) with the radiologists’ spiculation score. We developed a reproducible, interpretable, and parameter-free technique for quantifying spiculations on nodules. The spiculation quantification measures were then applied to the radiomics framework for pathological malignancy prediction with reproducible semi-automatic segmentation of the nodule. Using our interpretable features (size, attachment, spiculation, lobulation), we were able to achieve higher performance than previous models. In the future, we will exhaustively test our model for lung cancer screening in the clinic.
Keywords: Conformal Mapping, Spiculation, Lung Cancer Screening
1. Introduction
Lung cancer is the most common cause of cancer-related death in the United States [1]. Lung cancer screening with low-dose computed tomography (CT) for current and former smokers has been shown to provide a clear survival benefit by the National Lung Cancer Screening Trial [2]. Recently, radiomics studies have been proposed for various clinical applications [3, 4, 5]. These studies extract a vast number of quantitative image features and then perform data mining to predict tumor responses and patient outcomes, with the aim of more reliable and accurate prediction of local control and overall survival. Refer to [6] for an exhaustive review of radiomics and radiogenomics studies to predict clinical outcomes in lung cancer.
Radiomics analysis has also been studied for lung cancer screening. Hawkins et al. [5] proposed a random forest classifier using 23 stable radiomic features. Buty et al. [3] introduced a random forest classifier using a pre-trained deep neural network feature extractor (4096 appearance features) and a spherical harmonics feature extractor (400 shape features). Spherical harmonics form a frequency-space basis for representing functions defined over the sphere and are suitable for describing the overall shape of an object. However, they cannot provide local features for a given region on a shape (e.g., spiculation). Kumar et al. [7] proposed a deep neural network model, which used 5000 features. Liu et al. [8] introduced a linear classifier based on 24 image traits visually scored by physicians. Choi et al. [4] proposed a model for predicting malignancy in pulmonary nodules using a support vector machine classifier coupled with a least absolute shrinkage and selection operator (SVM-LASSO), which uses only two CT radiomic features (size and texture). While these radiomics studies have improved the accuracy of the predictions, the clinical/biological interpretation of the features remains limited.
Radiographic edge characteristics of a nodule, specifically spiculation (spikes on the surface of nodules), influence the probability of malignancy [10]. Typically, malignant nodules have blurred and irregular boundaries, while benign nodules have well-defined and smooth boundaries. The American College of Radiology (ACR) developed the Lung Imaging Reporting and Data System (Lung-RADS) to standardize lung cancer screening on CT images using size, appearance type (spiculation, lobulation, vessel/wall attachment) and calcification [11]. Lung-RADS suggests spiculation as an additional image finding that increases the suspicion of malignancy to improve prediction accuracy. The McWilliams model [12] computes the probability of lung cancer from nine variables: age, sex, emphysema, family history of lung cancer, the number of nodules, nodule size, nodule type, nodule location, and spiculation. Nodule size and spiculation were the significant malignancy predictors in both models.
Spiculation quantification of pulmonary nodules has been studied previously, but not for the prediction of malignancy. Niehaus et al. [13] introduced a computer-aided diagnosis (CAD) system, which used the size dependence of shape features to quantify spiculations. Ciompi et al. [14] proposed a frequency-based shape descriptor specifically tailored to assess the presence of spiculation in detected solid nodules for lung cancer screening. Dhara et al. [15] quantified spiculation peaks on a surface mesh extracted from the binary mask of the segmented nodule. They used mean curvature and geodesic distance transformation to detect the apex of a spiculation, and the baseline was then detected by tracing the sudden change of the surface. The method was highly sensitive to local variation of the surface, and hence it was challenging to accurately detect the baseline of a noisy spiculation peak.
In this work, we present a comprehensive pipeline to quantify spiculations, lobulations, and vessel/wall attachments, and evaluate their importance in malignancy prediction. This work extends our ShapeMI workshop paper [16]. The contributions of this paper are as follows:
A novel interpretable spiculation feature is presented, computed using the area distortion metric from conformal (angle-preserving) spherical parameterization. To the best of our knowledge, we are the first ones to exploit the insight that for an angle-preserved (conformal) spherical mapping of a given nodule (e.g., using a Ricci flow algorithm [9]), the corresponding negative area distortion accurately characterizes the spiculations/spikes on that nodule. Moreover, a simple one-step feature and prediction model is introduced, which only uses our interpretable features (size, spiculation, lobulation, vessel/wall attachment) and has the added advantage of using weak-labeled training data.
A semi-automatic segmentation algorithm is also introduced for more accurate and reproducible lung nodule segmentation as well as vessel/wall attachment segmentation. The segmentation method leads to more accurate spiculation quantification because the attachments can be excluded from spikes on the lung nodule surface (triangular mesh) data. Using just our interpretable features (size, attachment, spiculation, lobulation), we were able to achieve AUC=0.82 on LIDC and AUC=0.76 on LUNGx (the previous LUNGx best being AUC=0.68).
State-of-the-art correlation is achieved between our spiculation score (the number of spiculations Ns) and the radiologist’s spiculation score (ρ = 0.44).
The paper is organized as follows: first, we introduce the spherical parameterization technique for spiculation quantification and scoring based on semi-automatic segmentation of lung nodule surface (triangular mesh) data (Sections 2.1–2.4). The new spiculation measures are then applied to pathological malignancy prediction (Section 2.5), followed by comprehensive validation of our spiculation quantification on phantom FDA datasets, from which we identify the new solid angle threshold (TΩ) to differentiate lobulation and spiculation in real datasets (Section 3.1). The correlations between our spiculation measures and the radiologist’s spiculation scores (RS) are then provided along with the performance of malignancy prediction using the spiculation measures (Sections 3.2 and 3.3). Finally, we discuss the limitations of our work (Section 4), followed by the conclusion (Section 4.1).
2. Method
When mapping any given compact surface (e.g., nodule) to a sphere, there is a trade-off between angle distortion and area distortion (e.g., lowering the angle distortion during the mapping increases the corresponding area distortion). Given this trade-off, the following insight can be exploited for accurately quantifying spiculations on a given nodule. For an angle-preserved (conformal) spherical mapping of a nodule (e.g., [17, 9]), the negative area distortion precisely characterizes the spiculations/spikes on that nodule (Fig. 1).
2.1. Conformal mappings and area distortion
First, we provide a theoretical overview of the area distortion in conformally mapping a genus zero Riemannian surface S to the unit sphere 𝕊² to motivate the spiculation quantification pipeline (see [18] and [19] for the relevant mathematical background). By the Theorema Egregium of Gauss, one cannot find a diffeomorphism from S with non-constant Gaussian curvature to 𝕊² which preserves both area and angles. Furthermore, by a general result in complex analysis (uniformization), S and 𝕊² are conformally equivalent. That is, there exists a diffeomorphism ϕ: S → 𝕊² that preserves angles. Moreover, ϕ is unique up to Möbius transformation on 𝕊². This is the spherical parameterization of a compact genus 0 surface for which we want to measure area distortion.
One may directly use the mapping ϕ to compute the area distortion as in [17]. For working on a triangular mesh, we have chosen the approach based on [9]. Let g0 be the Riemannian metric on S with corresponding Gaussian curvature K0. Let Ku be the curvature on the conformally equivalent surface with metric gu = e^{2u} g0. Then it is well-known ([20]) that
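$$\Delta_{g_0} u = K_0 - K_u\, e^{2u} \tag{1}$$
Here Δ_{g0} denotes the Laplace-Beltrami operator with respect to g0.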
This equation provides a specific measure of the area distortion in any spherical parameterization procedure. [9] proposed a dynamic version of Eq. 1 which is essentially the 2D Ricci Flow. Indeed, for the unit sphere Ku = 1, and thus u satisfies the Poisson equation
$$\Delta_{g_0} u = K_0 - e^{2u} \tag{2}$$
u is called the conformal distortion factor, and e^{2u} measures the area distortion between the surface S and the sphere 𝕊². If one examines the latter Poisson equation, one qualitatively sees that the more K0(x) varies, the greater the variation in u, and from the maximum principle, spikes/spiculations may be identified by the greatest negative variation in area distortion.
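As a sanity check on the sign conventions, consider the simplest case of a round sphere of radius r mapped to the unit sphere. This worked case is an illustration added here, and it assumes the standard curvature relation for conformal metrics, Δu = K0 − Ku e^{2u}, with Δ the Laplace-Beltrami operator of g0:

```latex
\begin{aligned}
K_0 &= \frac{1}{r^2}, \qquad g_u = e^{2u} g_0 \ \text{ with } \ e^{2u} = \frac{1}{r^2}
      \ \Longrightarrow\ u = -\log r \ \text{(constant)}, \\
\Delta_{g_0} u &= 0, \qquad K_0 - K_u\, e^{2u} = \frac{1}{r^2} - 1 \cdot \frac{1}{r^2} = 0 .
\end{aligned}
```

Both sides vanish, so the relation holds with a uniform area scaling e^{2u} = 1/r². A nodule with spikes departs from this uniform case: K0 is no longer constant, and u must vary across the surface to compensate.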
2.2. Angle-preserving spherical parameterization
We will outline in this section the methodology proposed in [9] for conformally mapping a compact genus 0 surface to a sphere, which we will use for spiculation detection/quantification.
Let S = (V, E, F) denote a triangular mesh, where V denotes the vertices, E the edges, and F the faces. We assume that S represents the triangulation of a genus 0 compact surface, i.e., a topological sphere. As shown in Fig. 2, the idea is to divide S into two topological discs S1 and S2 with boundary curve given by γ. S1, S2, and γ may be found via the zeroth level set of the eigenfunction corresponding to the smallest positive eigenvalue of the (discrete) Laplacian (the so-called Fiedler vector). A discretization of the 2D Ricci (Yamabe) flow is then used to conformally flatten S1 and S2 to discs, which are then conformally welded together and stereographically projected to get a conformal mapping to the Riemann sphere. Specific details can be found in [9].
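The Fiedler-vector split described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the implementation of [9]: it uses the uniform-weight graph Laplacian instead of a cotangent discretization, replaces a sparse eigensolver with shifted power iteration, and runs on a small synthetic tube mesh. The names `tube_mesh` and `fiedler_split` are hypothetical.

```python
def tube_mesh(n=6, m=8):
    """Closed triangulated tube (genus 0): m rings of n vertices plus two apex vertices."""
    def ring(k, i):
        return k * n + (i % n)
    bottom, top = m * n, m * n + 1
    faces = []
    for i in range(n):
        faces.append((bottom, ring(0, i + 1), ring(0, i)))        # bottom cap
        faces.append((top, ring(m - 1, i), ring(m - 1, i + 1)))   # top cap
        for k in range(m - 1):                                    # side quads, split in two
            faces.append((ring(k, i), ring(k, i + 1), ring(k + 1, i + 1)))
            faces.append((ring(k, i), ring(k + 1, i + 1), ring(k + 1, i)))
    return m * n + 2, faces, bottom, top


def fiedler_split(num_vertices, faces, iters=2000):
    """Split vertices by the sign of the Fiedler vector of the graph Laplacian L."""
    nbrs = [set() for _ in range(num_vertices)]
    for a, b, c in faces:
        nbrs[a] |= {b, c}
        nbrs[b] |= {a, c}
        nbrs[c] |= {a, b}
    # 2 * max degree bounds the largest eigenvalue of L, so (shift*I - L) is
    # positive semi-definite and its dominant non-constant eigenvector is the Fiedler vector.
    shift = 2.0 * max(len(s) for s in nbrs)
    v = [float(i) for i in range(num_vertices)]  # deterministic start vector
    for _ in range(iters):
        mean = sum(v) / num_vertices
        v = [x - mean for x in v]                # remove the constant eigenvector
        w = [(shift - len(nbrs[i])) * v[i] + sum(v[j] for j in nbrs[i])
             for i in range(num_vertices)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    part1 = [i for i in range(num_vertices) if v[i] < 0]
    part2 = [i for i in range(num_vertices) if v[i] >= 0]
    return v, part1, part2


num_vertices, faces, bottom, top = tube_mesh()
fiedler, part1, part2 = fiedler_split(num_vertices, faces)
print(len(part1), len(part2))
```

On this elongated tube the zero level set of the Fiedler vector cuts across the long axis, so the two apex vertices end up in different discs. On a real nodule mesh the cut curve γ would be traced along the edges where the eigenfunction changes sign.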
2.3. Spiculation detection and quantification pipeline
We now formulate the pipeline derived from overall program discussed in Sections 2.1–2.2 in a discrete setting with respect to a triangulated surface S = (V, E, F). Here, one may measure the area distortion on each triangle. The spiculation quantification pipeline using this discrete version of spherical parameterization is as follows (with height and width detection; see Fig. 1 and Alg. 1):
Compute conformal (angle-preserving) spherical parameterization [9]: The first non-trivial eigenfunction of the Laplace-Beltrami operator is computed for a given mesh (Fig. 1a). The mesh is divided into two topological disks by the zeroth-level set (red curve in Fig. 1a) of this eigenfunction. The disks are conformally flattened, welded, and stereographically projected to a sphere (Fig. 1b).
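Compute the normalized area distortion. For each vertex vi, the area distortion is defined as
$$\epsilon_i := \log \frac{\sum_{j,k} A([\phi(v_i), \phi(v_j), \phi(v_k)])}{\sum_{j,k} A([v_i, v_j, v_k])}$$
where [vi, vj, vk] is the triangle formed by vi, vj, vk, ϕ is the conformal spherical mapping, and A(.) represents the area of a triangle. Find all the baselines B where the area distortion is zero (Line 12 in Algorithm 1).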
Recursively search closed curves from the baseline (zero area distortion) to the apex (the smallest area distortion) using the level-set method. During the search, the closed curves can break into multiple closed curves and move towards different apexes. Each pair of the apex (a terminal node) and its corresponding closed curves define a peak and are assigned unique IDs to track their progression and for height and width computations in the next step. (Line 1–10 in Algorithm 1)
Compute the sum of the distances between the successive centroids of the closed curves to obtain the peak height. The peak width is also computed on the area distortion map of the peak using a full width half maximum concept.
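The per-vertex area distortion used in the pipeline above can be sketched as follows. The sketch assumes that the distortion at a vertex is the log ratio of the summed areas of its incident triangles after and before the mapping, with both meshes normalized to the same total area. A regular tetrahedron stands in for the spherical image purely for illustration, and the helper names are hypothetical.

```python
import math


def tri_area(p, q, r):
    """Area of the 3D triangle (p, q, r) via the cross product."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def vertex_areas(points, faces):
    """Sum of the areas of the triangles incident to each vertex."""
    areas = [0.0] * len(points)
    for i, j, k in faces:
        a = tri_area(points[i], points[j], points[k])
        areas[i] += a
        areas[j] += a
        areas[k] += a
    return areas


def area_distortion(points, mapped_points, faces):
    """Normalized log area ratio per vertex (negative where area shrinks)."""
    before = vertex_areas(points, faces)
    after = vertex_areas(mapped_points, faces)
    total_before, total_after = sum(before), sum(after)
    return [math.log((a / total_after) / (b / total_before))
            for b, a in zip(before, after)]


# Toy example: a tetrahedron with one vertex pulled out into a "spike",
# mapped onto a regular tetrahedron (a stand-in for the sphere).
regular = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
faces = [(0, 1, 2), (0, 3, 1), (0, 2, 3), (1, 3, 2)]
spiky = [(5, 5, 5)] + regular[1:]
eps = area_distortion(spiky, regular, faces)
print([round(e, 3) for e in eps])
```

The spike vertex receives a negative value and the remaining vertices positive values, which is the pattern the pipeline exploits: baselines sit where the distortion crosses zero and apexes where it is most negative.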
Malignant nodules are more likely to have irregular, lobulated or spiculated margins due to the spread of malignant cells within the pulmonary interstitium. The peak detection is able to capture both lobules and spicules. Among them the spiculated nodule is more likely malignant than others. The classification of the detected peaks into the spiculation (sharp peak) and lobulation (curved peak) will provide more descriptive feature information for the malignancy prediction. To exclude lobulation from spiculation, we applied thresholding for the height (Th ≥ 3mm) and solid angle (TΩ ≤ 0.65sr). The solid angle threshold was suggested in [15], and we confirmed it with the phantom FDA dataset (see Results section). We also applied a full width half maximum concept for more robust width measures of a peak, using the peak surface and its area distortion. The peak width was measured on an iso-contour at half minimum area distortions (all negative values).
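The thresholding rule above can be written as a small filter. This is a minimal sketch: the threshold values come from the text, while the `Peak` record, the label strings, and the cone helper are assumptions made for illustration. The text does not spell out how rejected peaks are split between lobulations and small peaks, so the sketch returns a single combined label for them.

```python
import math
from dataclasses import dataclass

HEIGHT_THRESHOLD_MM = 3.0        # Th: minimum peak height
SOLID_ANGLE_THRESHOLD_SR = 0.65  # TΩ: maximum solid angle at the apex


@dataclass
class Peak:
    height_mm: float
    solid_angle_sr: float


def classify_peak(peak):
    """Label a detected peak as a spiculation if it is both tall and sharp."""
    if (peak.height_mm >= HEIGHT_THRESHOLD_MM
            and peak.solid_angle_sr <= SOLID_ANGLE_THRESHOLD_SR):
        return "spiculation"
    return "lobulation or small peak"


def cone_solid_angle(half_angle_rad):
    """Solid angle of a right circular cone, to give the threshold some intuition."""
    return 2.0 * math.pi * (1.0 - math.cos(half_angle_rad))


peaks = [Peak(4.2, 0.40), Peak(1.5, 0.30), Peak(5.0, 1.20)]
print([classify_peak(p) for p in peaks])
```

For intuition, a cone with a half-angle of about 26 degrees subtends roughly 0.65 sr, so the solid angle threshold keeps only fairly narrow peaks.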
Spiculation measures:
Here, the number of all peaks Np, the number of spiculations Ns, the number of lobulations Nl, the number of attachments Na, and the surface area ratio of attached regions ra = A(Sa)/A(Snodule) are available as spiculation, lobulation, and attachment measures. We also define novel spiculation scores s1 and s2. The score s1 summarizes the sharpness of each spiculation using the mean (mean(ϵp(i))) of the area distortion over all the vertices in a detected spike,
$$s_1 = \frac{\sum_i \operatorname{mean}(\epsilon_{p(i)})\, h_{p(i)}}{\sum_i h_{p(i)}}$$
where p(i) is spike i and hp(i) is the height of spike p(i). The score s2 summarizes the irregularity of spiculation using the variance (var(ϵp(i))) of the area distortion,
$$s_2 = \frac{\sum_i \operatorname{var}(\epsilon_{p(i)})\, h_{p(i)}}{\sum_i h_{p(i)}}$$
Fig. 3 illustrates a few nodule examples and the corresponding spiculation quantification measures and radiologist’s spiculation scores.
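The two scores s1 and s2 described above can be sketched as height-weighted averages over the detected spikes. Both the weighting by spike height and the use of the population variance are assumptions of this sketch, as are the function name and the input layout.

```python
from statistics import mean, pvariance


def spiculation_scores(spikes):
    """Return (s1, s2) for a list of (height_mm, area_distortion_values) spikes.

    s1: height-weighted mean of the per-spike mean area distortion (sharpness).
    s2: height-weighted mean of the per-spike variance (irregularity).
    """
    total_height = sum(height for height, _ in spikes)
    if total_height == 0:
        return 0.0, 0.0  # no spikes detected
    s1 = sum(mean(eps) * height for height, eps in spikes) / total_height
    s2 = sum(pvariance(eps) * height for height, eps in spikes) / total_height
    return s1, s2


# Two hypothetical spikes: a tall, sharp one and a short, flat one.
spikes = [(4.0, [-1.0, -3.0]), (2.0, [-0.5, -0.5])]
s1, s2 = spiculation_scores(spikes)
print(s1, s2)
```

Because the area distortion inside a spike is negative, a more negative s1 indicates sharper spiculation, which matches the negative correlation of s1 with the radiologists' score reported in Table 4.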
We compared our spiculation measures with the spiculation scores sa and sb proposed in [15]. Both are computed from ωp(i), the solid angle subtended at the apex of spike p(i), and the spike height hp(i), and thus summarize the sharpness and height of spiculations, as shown in Figure 3.
2.4. Semi-automatic segmentation
Spiculations are thin sharp spikes around the core of a nodule. We used semi-automatic segmentation to precisely segment/quantify spiculations as well as extract radiomic features. A consensus contour was generated for each nodule with two or more manual contours by using the simultaneous truth and performance level estimation (STAPLE) [21, 22] as ground truth. Many automatic segmentation methods have been proposed for nodule segmentation, but are mainly focused on segmenting core nodule regions. Furthermore, the implementation of these segmentation methods is not straightforward.
For reproducible semi-automatic segmentation, we combined two well-known and easy-to-implement methods, the GrowCut [25] and chest imaging platform (CIP) [26] segmentation algorithms. GrowCut segmentation is a cellular automata-based region growing algorithm that needs two sets of seed points, for foreground and background, which compete to grow their regions until convergence. GrowCut segmentation can leak into surrounding structures, such as the chest wall, airway walls, and vessel-like structures. CIP segmentation is a level set-based algorithm that uses a front propagation approach from a seed point placed within the nodule. The propagation (or segmentation) is constrained by feature maps of the structures to prevent leakage into surrounding structures. However, CIP might ignore some tumor regions because of inaccurate vessel and wall feature maps. Therefore, we combined these two methods to compensate for their limitations and to take advantage of both for attachment detection. Both methods are publicly available in 3D Slicer. Fig. 3(a) shows segmentations of different nodules. The corresponding attachments, shown in Fig. 3(a) and (b), are computed using the morphological intersection of the GrowCut and CIP segmentations.
2.5. Malignancy Prediction
We evaluated our spiculation quantification measures and radiomic features for classifying pathologically malignant and benign nodules by adding spiculation features to our model [4]. The conventional spiculation scores (sa and sb) were also evaluated. A total of 103 radiomic features were extracted from each nodule to quantify its intensity, shape, and texture [4]. Intensity features are first-order statistical measures that quantify the level and distribution of CT attenuations in a nodule (e.g., Minimum, Mean, Median, Maximum, Standard deviation (SD), Skewness, and Kurtosis). Shape features describe geometric characteristics (e.g., volume, diameter, elongation, roundness, and flatness) for voxels. Texture features quantify tissue density patterns. We used Gray-level co-occurrence matrix (GLCM) features: Energy, Entropy, Correlation, Inertia, Cluster prominence (CP), Cluster shade (CS), Haralick’s correlation (HC), Inverse difference moment (IDM); and Gray-level run-length matrix (GLRM) features: Run-length non-uniformity (RNU), Gray-level non-uniformity (GNU), Long-run emphasis (LRE), Short-run emphasis (SRE), High gray-level run emphasis (HGRE), Low gray-level run emphasis (LGRE), Long-run high gray-level emphasis (LRHGE), Long-run low gray-level emphasis (LRLGE), Short-run high gray-level emphasis (SRHGE), Short-run low gray-level emphasis (SRLGE). The mean (average) and SD values of each texture feature were computed over 13 directions to obtain rotationally invariant features.
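The 13 directions mentioned above correspond to the 26 neighbors of a voxel with opposite offsets counted once. The following sketch enumerates them and aggregates a per-direction texture value into its mean and SD. The function names are hypothetical, and the use of the population SD is an assumption.

```python
from itertools import product
from statistics import mean, pstdev


def unique_directions_3d():
    """Enumerate the 13 unique offsets of the 26-neighborhood (d and -d count once)."""
    directions = []
    for d in product((-1, 0, 1), repeat=3):
        if d == (0, 0, 0):
            continue
        opposite = tuple(-c for c in d)
        if opposite not in directions:
            directions.append(d)
    return directions


def rotation_invariant(values_by_direction):
    """Collapse one texture feature evaluated per direction into (mean, SD)."""
    return mean(values_by_direction), pstdev(values_by_direction)


directions = unique_directions_3d()
print(len(directions))

# Hypothetical per-direction values of one texture feature (e.g., IDM).
idm_values = [0.50 + 0.01 * i for i in range(len(directions))]
print(rotation_invariant(idm_values))
```

A feature name such as SD_IDM then denotes the SD of the inverse difference moment across these directions.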
Moreover, we extracted features from the triangular mesh model, such as shape features (size - volume, average of longest and its perpendicular diameters, equivalent volume sphere’s diameter, and roundness) and statistical features (median, mean, minimum, maximum, variance, skewness, and kurtosis) of the area distortion metric ϵ. We performed univariate analysis to evaluate the significance of each feature to classify spiculation using the area under the receiver operating characteristic curve (AUC), Wilcoxon rank-sum test, and Spearman’s correlation coefficient ρ. Bonferroni correction was applied to the original p-values to counteract the problem of multiple comparisons since the multiple features were tested for a single outcome.
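The univariate statistics named above are simple enough to sketch in pure Python. This is an illustrative sketch, not the analysis code used in the study: AUC is computed from pairwise comparisons, Spearman's ρ as the Pearson correlation of average ranks, and the Bonferroni adjustment as multiplication by the number of tests, capped at 1. The Wilcoxon rank-sum test itself is omitted.

```python
def auc(pos_scores, neg_scores):
    """AUC as the probability that a positive case scores above a negative one."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values for a family of tests on one outcome."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


print(auc([0.9, 0.8, 0.3], [0.4, 0.2]))
print(spearman_rho([1, 2, 3, 4], [10, 20, 30, 40]))
print(bonferroni([0.01, 0.04, 0.5]))
```

A feature is then kept as significant when its adjusted p-value stays below 0.05.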
We applied the SVM-LASSO model [4] to predict the malignancy of nodules. The model uses size (BB_AP) and texture (SD_IDM) features. We compared the original SVM-LASSO model with other combinations of these two features and either the new spiculation features or the radiologist’s spiculation score (RS). We used the same data set and evaluation method as in [4] to evaluate the new radiomics model with spiculation. Moreover, we also evaluated another model building process using weak-labeled data (radiological malignancy score, RM) to predict pathological malignancy (PM), which allowed more data to be used despite missing pathological malignancy.
3. Results
We evaluated the proposed spiculation quantification method by comparing it with the radiologists’ scores, and applied the interpretable spiculation features to nodule malignancy prediction, which is the main task in lung cancer screening.
3.1. Data Preparation
The Lung Image Database Consortium image collection (LIDC-IDRI) [27, 28] and LUNGx datasets [29, 30] were used to evaluate the proposed method, and the data flow is shown in Fig. 5. LIDC contains 1018 cases with low-dose screening thoracic CT scans and marked-up annotated lesions. Four experienced thoracic radiologists annotated nodules, including delineation, malignancy (RM), spiculation (RS), margin, texture, and lobulation. Eight hundred eighty-three cases in the dataset have nodules with contours. For the biggest nodule in each case, we applied semi-auto segmentation for more reproducible spiculation quantification and also calculated a consensus segmentation using STAPLE to combine multiple contours by the radiologists. The accuracy of our semi-auto segmentation compared to the consensus contour was 0.71±0.13 in terms of the Dice coefficient. LUNGx consists of 10 cases for the calibration set (10 nodules) and 60 cases for the test set (73 nodules). We applied the same semi-auto segmentation to nodules in the LUNGx dataset.
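For reference, the Dice coefficient used above to compare a semi-automatic segmentation with the consensus contour can be sketched on sets of voxel indices. The function name and the toy masks are illustrative only.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two sets of voxel indices."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))


# Toy masks: three voxels each, two of them shared.
semi_auto = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
consensus = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
print(dice_coefficient(semi_auto, consensus))
```

A value of 1 means identical masks and 0 means no overlap.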
For more rigorous data analysis, we divided the LIDC dataset into two subsets depending on whether pathological malignancy (LIDC_PM, N=72) or radiological malignancy (LIDC_RM, N=811) was available. The radiological malignancy scores are 1 - highly unlikely, 2 - moderately unlikely, 3 - indeterminate likelihood, 4 - moderately suspicious, and 5 - highly suspicious for cancer. RM>3 (moderately suspicious to highly suspicious) was considered radiological malignancy. RS in the dataset ranged between 1 (non-spiculated) and 5 (highly spiculated). Since there are up to four annotations for each nodule, we aggregated the scores by voting. When two scores were tied as the most frequent, we chose the higher score. We binarized the RS using three different cutoffs (1, 2, and 3) because the current clinical standard uses binary classification, non-spiculated (NS) and spiculated (S), as shown in Table 2 (the cutoff at four is shown for reference).
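The score aggregation and binarization rules above can be sketched as follows. The function names are hypothetical. The tie-break toward the higher score and the rule RS > Ts for "spiculated" follow the text and the Ts > 1, 2, 3 notation used in the tables.

```python
from collections import Counter


def aggregate_score(ratings):
    """Majority vote over up to four radiologist ratings; ties go to the higher score."""
    counts = Counter(ratings)
    top_count = max(counts.values())
    return max(score for score, count in counts.items() if count == top_count)


def binarize_spiculation(score, cutoff):
    """Spiculated (S) if the aggregated score exceeds the cutoff Ts, else non-spiculated (NS)."""
    return "S" if score > cutoff else "NS"


print(aggregate_score([1, 3, 3, 5]))   # clear majority
print(aggregate_score([2, 2, 4, 4]))   # tie, higher score wins
print([binarize_spiculation(2, ts) for ts in (1, 2, 3)])
```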
Table 2:
Ts | LIDC_PM: NS | LIDC_PM: S | LIDC_RM: NS | LIDC_RM: S |
---|---|---|---|---|
1 | 35 | 37 | 474 | 337 |
2 | 55 | 17 | 704 | 107 |
3 | 58 | 14 | 747 | 64 |
4 | 67 | 5 | 790 | 21 |
To optimize spiculation height and solid angle thresholds, Th and TΩ, for filtering out false positives such as small peaks and lobulations, we used Phantom FDA layout #4 as shown in Fig. 6 [31, 32, 33]. We tuned the thresholds to clearly differentiate the spiculations from lobulations (annotations available in Phantom FDA) and to detect as many spiculations as possible without false positives. The final selected thresholds were Th ≥ 3mm and TΩ ≤ 0.65sr. Fig. 7 shows the results of spiculation quantification for each nodule in the phantom data. The optimal thresholds excluded lobulations and elliptical shape corners from final spiculations.
3.2. Spiculation Quantification
The proposed method was implemented in Matlab 2017b, and experiments were performed on a workstation with Intel(R) Xeon(R) CPU E5–1620 v2 @ 3.70GHz and 32 GB RAM running macOS 10.15.5. Table 3 shows the average run-time of the spiculation quantification and its major components.
Table 3:
Component | Average run-time (s) |
---|---|
Algorithm 1 | 0.29 |
Curvature Computation | 0.65 |
Other Computations | 0.42 |
Total | 1.36 |
Since we evaluated the malignancy prediction model on LIDC_PM, we performed univariate analysis on LIDC_RM to avoid selection bias in building the malignancy prediction model. In the univariate analysis, 84 features were identified as significant features (adjusted p-value<0.05) for spiculation quantification. Among these, 56 features were highly correlated with size features (ρ > 0.75); size is one of the main criteria for diagnosing malignancy. Thus, we removed all the size-related features, including sa and Np, to provide complementary information. After the size-related feature removal, 28 significant features remained, and we picked the 20 features most highly correlated with RS. Half of these were texture or intensity statistics features, which are not interpretable. Almost all of our spiculation measures were significant and ranked in the top 20. Table 4 shows the univariate analysis results of the top 20 features using semi-auto segmentation. Neither of Dhara’s spiculation scores (sa and sb) was selected in the top 20 features even though they were significant features. sa was excluded because of its high correlation with size (ρ = 0.87), and sb was not ranked among the top 20.
Table 4:
Rank | Feature name | AUC (Ts > 1) | AUC (Ts > 2) | AUC (Ts > 3) | AUC (Average) | Corr (ρ) |
---|---|---|---|---|---|---|
1 | Ns | 0.73 | 0.79 | 0.81 | 0.78 | 0.44 |
2 | Roundness(mesh) | 0.73 | 0.83 | 0.82 | 0.79 | −0.44 |
3 | SD LRE | 0.74 | 0.80 | 0.76 | 0.77 | −0.44 |
4 | Mean ϵ | 0.72 | 0.80 | 0.81 | 0.77 | −0.41 |
5 | Minimum ϵ | 0.72 | 0.79 | 0.79 | 0.77 | −0.41 |
6 | Mean LRLGE | 0.72 | 0.76 | 0.75 | 0.74 | −0.40 |
7 | SD LRLGE | 0.71 | 0.77 | 0.77 | 0.75 | −0.39 |
8 | Weighted Principal Moment 1 | 0.71 | 0.76 | 0.78 | 0.75 | 0.39 |
9 | Median ϵ | 0.71 | 0.79 | 0.79 | 0.76 | −0.39 |
10 | Variance ϵ | 0.70 | 0.78 | 0.79 | 0.76 | 0.39 |
11 | Maximum ϵ | 0.71 | 0.77 | 0.78 | 0.75 | 0.38 |
12 | SD CS | 0.71 | 0.73 | 0.71 | 0.72 | −0.38 |
13 | 2D Weighted Principal Moment 1 | 0.71 | 0.74 | 0.76 | 0.73 | 0.37 |
14 | s1 | 0.70 | 0.77 | 0.79 | 0.75 | −0.37 |
15 | SD LGRE | 0.70 | 0.73 | 0.73 | 0.72 | −0.36 |
16 | 2D Roundness(voxel) | 0.68 | 0.78 | 0.77 | 0.74 | −0.34 |
17 | s2 | 0.68 | 0.75 | 0.77 | 0.73 | 0.34 |
18 | SD SRLGE | 0.68 | 0.72 | 0.71 | 0.70 | −0.33 |
19 | Mean Energy | 0.69 | 0.69 | 0.66 | 0.68 | −0.33 |
20 | 2D Sum | 0.67 | 0.72 | 0.73 | 0.71 | −0.32 |
We also performed multivariate analysis to classify pulmonary nodules (PNs) as spiculated or non-spiculated at different thresholds (Ts). The multivariate classification models were evaluated by 10 times 10-fold cross validation (CV). The classification performance is shown in Table 5. Highly spiculated PNs (Ts > 2 and Ts > 3) were accurately classified in both subsets (LIDC_PM: accuracy=83.9%, 79.7% and LIDC_RM: accuracy=78.8%, 79.1%), but the classification accuracy for all spiculated PNs (Ts > 1) was not stable across subsets (LIDC_PM: accuracy=79.7% and LIDC_RM: accuracy=69.7%).
Table 5:
Threshold | Sensitivity | Specificity | Accuracy | AUC |
---|---|---|---|---|
10x10-fold CV on LIDC_PM | ||||
Ts > 1 | 69.9±2.1% | 87.5±1.8% | 79.7±1.7% | 0.84±0.01 |
Ts > 2 | 83.8±3.9% | 83.9±2.7% | 83.9±2.4% | 0.90±0.01 |
Ts > 3 | 73.0±2.9% | 81.7±1.2% | 79.7±0.9% | 0.87±0.01 |
10x10-fold CV on LIDC_RM | ||||
Ts > 1 | 47.9±0.4% | 85.8±0.3% | 69.7±0.3% | 0.75±0.01 |
Ts > 2 | 73.2±0.4% | 79.7±0.4% | 78.8±0.3% | 0.81±0.01 |
Ts > 3 | 75.2±2.0% | 79.4±0.3% | 79.1±0.4% | 0.83±0.01 |
3.3. Malignancy Prediction
We built models using feature combinations of the previously selected features (Size: BB_AP and Texture: SD_IDM) from [4], and the interpretable spiculation features (Ns, Na, Nl, Np, ra, s1, s2). As shown in Fig. 5, the model trained on LIDC was then externally validated on the LUNGx dataset, which was collected for a lung cancer screening competition (LUNGx Challenge) and provides a calibration set (ten size-matched nodules, five benign and five malignant) and a test set (73 nodules, 37 benign and 36 malignant) [29, 30, 33]. For the external validation, we followed the model evaluation process of the LUNGx Challenge [30]. The model was calibrated on the calibration set of LUNGx (Model’) and finally evaluated on the test set (73 cases) of LUNGx. We used a zero value in place of the missing variable RS in the external validation because LUNGx does not provide it. Since pathological malignancy (PM) was only available for the 72 cases, we used weak-labeled data (LIDC_RM, N=811) based on the radiological malignancy score (RM). We divided the weak-labeled data into two groups (training 80% and validation 20%) for training and optimizing the model. Then, the best model was evaluated on strong-labeled data (LIDC_PM, N=72). We repeated the analysis 100 times to measure the statistical variance of the models. Table 6 shows the classification results of each model and their external validation. In the 10×10-fold CV on LIDC_PM, which did not use weak-labeled data, the model using Size and our spiculation features (Size+Spiculations, accuracy=82.6% and AUC=0.85) outperformed the previous model (Size+Texture, accuracy=74.9% and AUC=0.83). However, their external validation on LUNGx was not as good as the CV results (Size+Texture: accuracy=66.4% and AUC=0.63, Size+Spiculations: accuracy=67.5% and AUC=0.65).
The Size+Spiculations model trained by weak-labeled data showed comparable performance (accuracy=75.2% and AUC=0.80) to the Size+Texture model (accuracy=73.7% and AUC=0.82) in the validation on LIDC_PM, but the performance of Size+Spiculations was much higher (accuracy=71.8% and AUC=0.76) than Size+Texture (accuracy=57.8% and AUC=0.61) in the external validation.
Table 6:
Features | Sensitivity | Specificity | Accuracy | AUC |
---|---|---|---|---|
10x10-fold CV on LIDC_PM | ||||
Size+Texture | 77.1±2.2% | 71.9±2.4% | 74.9±1.4% | 0.83±0.01 |
Size+Spiculation | 84.4±1.0% | 80.2±2.7% | 82.6±1.1% | 0.85±0.01 |
Validation on LIDC_PM | ||||
Size+Texture | 73.4±0.7% | 74.2±0.1% | 73.7±0.4% | 0.82±0.01 |
Size+Spiculations | 76.0±0.9% | 74.2±0.1% | 75.2±0.5% | 0.80±0.01 |
External validation on LUNGx | ||||
Size+Texture | 75.3±4.5% | 40.8±4.2% | 57.8±2.7% | 0.61±0.03 |
Size+Spiculations | 77.6±5.8% | 66.2±6.7% | 71.8±2.4% | 0.76±0.01 |
Table 7 shows the comparisons with the top 3 participants and 6 radiologists in the LUNGx Challenge [30]. The model trained using the weak-labeled data showed an AUC of 0.76, which was better than the best model and two radiologists in the LUNGx Challenge. The model trained on the strong-labeled data achieved an AUC of 0.69. Our previous radiomics model (Size+Texture) showed comparable performance (strong-labeled: AUC=0.67 and weak-labeled: AUC=0.68) with the best model in the LUNGx Challenge (AUC=0.68). Weak-labeled data training generated more robust and flexible models due to the larger volume of data available (about ten times larger than the strong-labeled data); even though the outcomes were not pathologically proven, they were correlated with the real outcome.
Table 7:
Method | AUC | Segmentation | Classifier | Training data |
---|---|---|---|---|
Models | | | | |
9 | 0.61 | Auto | SVM | NLST |
10 | 0.66 | | Discriminant function | LUNGx |
Choi et al. [4] | 0.67 | Manual | SVM - Size+Texture | LIDC_PM |
11 | 0.68 | Semi-auto | Support vector regressor | In-house |
Choi et al. [4] | 0.68 | Manual | SVM - Size+Texture | LIDC_RM |
Proposed Method | 0.69 | Semi-auto | SVM - Size+Spiculation | LIDC_PM |
Radiologists | | | | |
1 | 0.70 | | | |
2 | 0.75 | | | |
Proposed Method | 0.76 | Semi-auto | SVM - Size+Spiculations | LIDC_RM |
3 | 0.78 | | | |
4 | 0.82 | | | |
5 | 0.83 | | | |
6 | 0.85 | | | |
4. Discussion
Our reproducible and interpretable number-of-spiculations feature (Ns) achieved higher correlation with RS than other features, as shown in Table 4. Many texture features ranked in the top 20, but because their correlation with spiculation is hard to interpret, they can be excluded to keep the final models interpretable. Moreover, by using only our interpretable features, we can avoid the image/feature harmonization otherwise required as a pre-processing step for any new dataset (repository). Roundness features showed good correlations with spiculation as well as good accuracy in the malignancy prediction because they quantify the irregularity of the target shape. However, they cannot separate lobulation or attachments from spiculations.
The radiomics models using semi-auto segmentation showed relatively lower performance than those using manual segmentation. The models using size (BB_AP) and texture (SD_IDM) differed substantially between manual segmentation (79.2% accuracy) and semi-auto segmentation (73.7% accuracy). The texture feature is difficult to normalize; thus the models using SD_IDM were less stable, and their performance degraded significantly in weak-labeled data training and external validation.
Adding the radiologists’ spiculation score (RS) to our previous radiomics model using size and texture (Size+Texture, 74.9% accuracy) [4] improved the performance (Size+Texture+RS, 77.4% accuracy). Similarly, combining Size and RS without Texture (Size+RS, 76.5% accuracy) showed better performance, and a model combining our spiculation features and Size without Texture (Size+Spiculations, 75.2% accuracy) was slightly better than Size+Texture. In essence, the texture feature SD_IDM could be replaced by our interpretable spiculation features.
We employed weak-labeled data (LIDC_RM) to train the malignancy prediction model because of the lack of pathological malignancy data. These models showed comparable performance to the model trained by strong-labeled data (LIDC_PM). In the case of strong-labeled data training, it was difficult to avoid bias and over-fitting due to the lack of training data, while building accurate prediction models for malignant nodules. Hence, these models were more susceptible to failure because of the lack of adaptability to out-of-training unlabeled data. In contrast, weak-labeled data training can help build models that mimic conventional lung cancer screening by radiologists in the clinic using correlation with pathological malignancy. Moreover, a large amount of weak-labeled training data is usually accessible, thus allowing the creation of a more robust model and better performance than the strong-labeled data in external validation.
Therefore, we provide guidelines for a radiomics workflow that uses weak-labeled data together with interpretable and reproducible features to overcome the limitations of conventional radiomics studies. Specifically, if the number of strong-labeled datasets is insufficient to build a good model, weak-labeled data that would otherwise be discarded can be utilized in further analysis, as in the current study. Leveraging weak-labeled data in the clinic enables continuous tuning of radiomics models: training using the diagnosis (weak-labeled) followed by evaluation using the clinical outcomes (strong-labeled). A possible pipeline for this new radiomics workflow is as follows:
1. Univariate analysis or unsupervised learning of strong-labeled data
2. Build multivariate models based on results from step 1 and cross-validation using the data
3. Univariate analysis or unsupervised learning of weak-labeled data
4. Enhance the model from step 2 based on the results from step 3
5. External validation
6. Repeat steps 3–5 to tune the model
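The core of this workflow, training on weak labels and then evaluating against strong labels, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: a one-feature threshold classifier (decision stump) stands in for the SVM, and the function names (`fit_stump`, `evaluate`) and the toy feature values are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# (feature index, threshold): a nodule is predicted malignant if value > threshold.
Stump = Tuple[int, float]


def fit_stump(X: Sequence[Sequence[float]], y: Sequence[int]) -> Stump:
    """Pick the single feature and threshold with the best training accuracy.

    Stand-in for the multivariate model; ties keep the first candidate found.
    """
    best: Stump = (0, 0.0)
    best_acc = -1.0
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            hits = sum((row[j] > t) == bool(label) for row, label in zip(X, y))
            acc = hits / len(y)
            if acc > best_acc:
                best, best_acc = (j, t), acc
    return best


def evaluate(stump: Stump, X: Sequence[Sequence[float]], y: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy of a stump on labeled data."""
    j, t = stump
    tp = sum(1 for row, label in zip(X, y) if row[j] > t and label)
    tn = sum(1 for row, label in zip(X, y) if row[j] <= t and not label)
    n_pos = sum(1 for label in y if label)
    n_neg = len(y) - n_pos
    return {
        "sensitivity": tp / n_pos if n_pos else float("nan"),
        "specificity": tn / n_neg if n_neg else float("nan"),
        "accuracy": (tp + tn) / len(y),
    }


# Hypothetical toy data: each row is [size in mm, number of spiculations].
X_weak = [[4, 0], [5, 1], [6, 0], [12, 3], [15, 4], [9, 2], [7, 0], [20, 5]]
y_weak = [0, 0, 0, 1, 1, 1, 0, 1]      # radiologists' ratings (weak labels)
X_strong = [[5, 0], [10, 3], [7, 2], [18, 4]]
y_strong = [0, 1, 1, 1]                # pathology-proven outcomes (strong labels)

stump = fit_stump(X_weak, y_weak)            # train on weak labels
metrics = evaluate(stump, X_strong, y_strong)  # evaluate on strong labels
print(stump, metrics)
```

In practice the evaluation on strong labels would feed back into feature selection and model refinement, which is the loop that the last step of the pipeline describes.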
4.1. Conclusion and Future Work
We developed a reproducible and interpretable, parameter-free technique for quantifying spiculations on nodules using the area distortion metric from the conformal (angle-preserving) spherical parameterization. To the best of our knowledge, this paper is the first to exploit the insight that, for an angle-preserved (conformal) spherical mapping of a given nodule, the negative area distortion precisely characterizes the spiculations/spikes on that nodule. The spiculation quantification measures and radiomics features based on reproducible semi-automatic segmentation of the nodule were then applied to the radiomics framework for pathological malignancy prediction. The number-of-spiculations feature was found to be highly correlated (Spearman’s rank correlation coefficient ρ = 0.44) with the radiologists’ spiculation score. Using just our interpretable features (size, attachment, spiculation, lobulation) in the radiomics framework, we were able to achieve AUC=0.80 on LIDC and AUC=0.76 on LUNGx (the previous LUNGx best being AUC=0.68).
In the future, we will exhaustively test our reproducible and interpretable model for lung cancer screening in the clinic. We plan to apply the recently developed deep learning models to segment nodules and expect further improvement in the spiculation quantification as well as prediction accuracy. The proposed interpretable features will be further investigated in other cancers, e.g., breast cancer BI-RADS [34].
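The central idea summarized in this section, that negative area distortion marks spikes, can be illustrated with a small sketch. It assumes that the per-vertex area distortion ϵ is the logarithm of the ratio between the total area of the triangles incident to a vertex on the sphere and on the original mesh; that definition, the function names, and the toy mesh are assumptions made for illustration. The spherical mapping is taken as given, and the toy example uses a simple radial projection as a stand-in for a true conformal map, with no global area normalization.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Tri = Tuple[int, int, int]


def triangle_area(a: Vec3, b: Vec3, c: Vec3) -> float:
    """Area of a 3D triangle from the cross product of two edge vectors."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def area_distortion(original: Sequence[Vec3], mapped: Sequence[Vec3],
                    faces: Sequence[Tri]) -> List[float]:
    """Assumed definition: log(incident area on sphere / incident area on mesh)."""
    area_orig = [0.0] * len(original)
    area_map = [0.0] * len(original)
    for i, j, k in faces:
        a0 = triangle_area(original[i], original[j], original[k])
        a1 = triangle_area(mapped[i], mapped[j], mapped[k])
        for v in (i, j, k):
            area_orig[v] += a0
            area_map[v] += a1
    return [math.log(area_map[v] / area_orig[v]) for v in range(len(original))]


def to_unit_sphere(p: Vec3) -> Vec3:
    """Radial projection; only a stand-in for the conformal spherical map."""
    n = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
    return (p[0] / n, p[1] / n, p[2] / n)


# Toy "nodule": an octahedron whose top vertex (index 4) is pulled out into a spike.
verts: List[Vec3] = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, 3), (0, 0, -1)]
faces: List[Tri] = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4),
                    (1, 0, 5), (2, 1, 5), (3, 2, 5), (0, 3, 5)]
sphere = [to_unit_sphere(v) for v in verts]

eps = area_distortion(verts, sphere, faces)
spike_vertex = min(range(len(eps)), key=lambda v: eps[v])
print(spike_vertex, [round(e, 3) for e in eps])
```

The spike tip has the most negative ϵ because its surrounding triangles shrink the most when mapped to the sphere, while the undeformed bottom vertex has ϵ = 0.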
Table 1:
Symbol | Definition |
---|---|
BB_AP | Bounding box length of anterior-posterior direction |
SD_IDM | Standard deviation of inverse difference moment |
ϵ | Area distortion metric |
Np | The number of peaks |
Ns | The number of spiculations |
Nl | The number of lobulations |
Na | The number of attached peaks |
s1 & s2 | The proposed spiculation scores (sharpness and irregularity) |
sa & sb | Dhara’s spiculation scores [15] |
Ts | Threshold to binarize RS |
Th | Minimum height of spiculation |
TΩ | Maximum solid angle of spiculation |
sr | Steradian, the SI unit of solid angle |
Highlights:
A novel interpretable spiculation feature is presented, computed using the area distortion metric from spherical conformal (angle-preserving) parameterization.
A simple one-step feature and prediction model is introduced which only uses our interpretable features (size, spiculation, lobulation, vessel/wall attachment) and has the added advantage of using weak-labeled training data.
A semi-automatic segmentation algorithm is also introduced for more accurate and reproducible lung nodule as well as vessel/wall attachment segmentation. This leads to more accurate spiculation quantification because the attachments can be excluded from spikes on the lung nodule surface (triangular mesh) data.
Using just our interpretable features (size, attachment, spiculation, lobulation), we were able to achieve AUC=0.80 on the public LIDC dataset and AUC=0.76 on the public LUNGx dataset (the previous LUNGx best being AUC=0.68).
State-of-the-art correlation is achieved between our spiculation score (the number of spiculations, Ns) and the radiologists’ spiculation score (ρ = 0.44).
Acknowledgements
This project was supported in part by NIH/NCI Grant R01CA172638, AFOSR Grants FA9550-17-1-0435, and FA9550-20-1-0029, National Institute of Aging Grant R01-AG048769, MSK Cancer Center Support Grant/Core Grant (P30 CA008748), and a grant from Breast Cancer Research Foundation BCRF-17-193. None of the authors have any competing financial interests. All the datasets used in this study are publicly available.
Appendix A. Results using manual segmentation
We also evaluated our method using manual segmentation. The consensus manual contour was generated by using STAPLE to combine multiple contours (up to 4). Table A.8 shows the univariate analysis results of the top 20 features using manual segmentation. Neither of Dhara’s spiculation scores (sa and sb) was selected in the top 20 features even though they were significant features. Table A.9 shows the malignancy classification results of each model, and their external validation. The 10×10-fold CV of the model using Size and our spiculation features (Size+Spiculations) on LIDC_PM (accuracy=85.1% and AUC=0.85), which did not use weak-labeled data, showed comparable performance to the previous model (Size+Texture, accuracy=84.9% and AUC=0.89). The Size+Spiculations model trained on weak-labeled data showed comparable performance (accuracy=79.3% and AUC=0.86) to the Size+Texture model (accuracy=79.2% and AUC=0.83) in the validation on LIDC_PM, but the performance of Size+Spiculations was much higher (accuracy=71.5% and AUC=0.74) than that of Size+Texture (accuracy=60.7% and AUC=0.68) in the external validation.
Table A.8:
Rank | Feature name | AUC (Ts > 1) | AUC (Ts > 2) | AUC (Ts > 3) | AUC (Average) | Corr (ρ) |
---|---|---|---|---|---|---|
1 | Roundness(mesh) | 0.73 | 0.84 | 0.82 | 0.80 | −0.44 |
2 | SD LRE | 0.74 | 0.80 | 0.78 | 0.77 | 0.44 |
3 | Ns | 0.72 | 0.82 | 0.83 | 0.79 | 0.44 |
4 | Mean ϵ | 0.71 | 0.80 | 0.79 | 0.77 | −0.40 |
5 | Minimum ϵ | 0.71 | 0.80 | 0.77 | 0.76 | −0.40 |
6 | WPM1 | 0.72 | 0.75 | 0.77 | 0.74 | 0.40 |
7 | Median ϵ | 0.71 | 0.78 | 0.79 | 0.76 | −0.39 |
8 | 2D Roundness(voxel) | 0.69 | 0.82 | 0.79 | 0.77 | −0.38 |
9 | 2D WPM1 | 0.70 | 0.74 | 0.75 | 0.73 | 0.37 |
10 | SD LRLGE | 0.69 | 0.74 | 0.74 | 0.72 | −0.35 |
11 | Mean LRLGE | 0.69 | 0.74 | 0.72 | 0.72 | −0.35 |
12 | 2D Sum | 0.69 | 0.73 | 0.72 | 0.71 | −0.34 |
13 | SD CS | 0.69 | 0.70 | 0.68 | 0.69 | −0.34 |
14 | s1 | 0.67 | 0.77 | 0.75 | 0.73 | −0.33 |
15 | SD LGRE | 0.67 | 0.71 | 0.71 | 0.70 | −0.31 |
16 | SD SRLGE | 0.66 | 0.70 | 0.70 | 0.69 | −0.30 |
17 | Nl | 0.65 | 0.69 | 0.71 | 0.68 | 0.29 |
18 | Mean Energy | 0.67 | 0.64 | 0.63 | 0.65 | −0.29 |
19 | s2 | 0.65 | 0.74 | 0.72 | 0.70 | 0.28 |
20 | SD SRE | 0.64 | 0.71 | 0.70 | 0.69 | −0.27 |
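The two kinds of columns in Table A.8 correspond to standard statistics. The sketch below, offered as an illustration rather than the code used in the study, computes the AUC of a single feature against RS binarized at a threshold Ts, and Spearman's ρ between the feature and RS. The helper names and the toy values are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via the Mann-Whitney U statistic (tied scores count as half)."""
    ranks = average_ranks(scores)
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, label in zip(ranks, labels) if label)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


import math

# Hypothetical toy values: number of spiculations (Ns) and radiologists' score (RS, 1-5).
ns = [0, 1, 0, 3, 2, 5, 1, 4]
rs = [1, 1, 2, 4, 2, 5, 3, 5]

ts = 2                                   # threshold Ts used to binarize RS
labels = [int(r > ts) for r in rs]       # "spiculated" if RS > Ts
print(auc(ns, labels), spearman_rho(ns, rs))
```

Repeating the AUC computation for Ts = 1, 2, 3 and averaging gives the layout of the AUC columns in the table.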
Table A.9:
Features | Sensitivity | Specificity | Accuracy | AUC |
---|---|---|---|---|
10×10-fold CV on LIDC_PM | ||||
Size+Texture | 86.9±1.0% | 81.2±3.6% | 84.4±1.7% | 0.89±0.01 |
Size+Spiculations | 87.9±1.0% | 81.5±1.6% | 85.1±1.0% | 0.88±0.01 |
Validation on LIDC_PM | ||||
Size+Texture | 78.1±0.3% | 80.7±0.3% | 79.2±0.2% | 0.86±0.01 |
Size+Spiculations | 78.3±0.7% | 80.6±0.0% | 79.3±0.4% | 0.83±0.01 |
External validation on LUNGx | ||||
Size+Texture | 67.5±5.7% | 54.0±5.5% | 60.7±3.5% | 0.68±0.04 |
Size+Spiculations | 81.9±1.9% | 61.3±3.7% | 71.5±2.1% | 0.74±0.02 |
Footnotes
Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
- [1] Siegel RL, Miller KD, Jemal A, Cancer statistics, 2016, CA: A Cancer Journal for Clinicians 66 (1) (2016) 7–30.
- [2] Aberle DR, DeMello S, Berg CD, Black WC, et al., Results of the two incidence screenings in the National Lung Screening Trial, New England Journal of Medicine 369 (10) (2013) 920–931. doi: 10.1056/NEJMoa1208962.
- [3] Buty M, Xu Z, Gao M, Bagci U, Wu A, Mollura DJ, Characterization of lung nodule malignancy using hybrid shape and appearance features, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2016, pp. 662–670.
- [4] Choi W, Oh JH, Riyahi S, Liu C-J, Jiang F, Chen W, White C, Rimner A, Mechalakos JG, Deasy JO, Lu W, Radiomics analysis of pulmonary nodules in low-dose CT for early detection of lung cancer, Medical Physics. doi: 10.1002/mp.12820.
- [5] Hawkins S, Wang H, Liu Y, Garcia A, Stringfield O, Krewer H, Li Q, Cherezov D, Gatenby RA, Balagurunathan Y, et al., Predicting malignant nodules from screening CT scans, Journal of Thoracic Oncology 11 (12) (2016) 2120–2128.
- [6] Thawani R, McLane M, Beig N, Ghose S, Prasanna P, Velcheti V, Madabhushi A, Radiomics and radiogenomics in lung cancer: A review for the clinician, Lung Cancer 115 (2018) 34–41. doi: 10.1016/j.lungcan.2017.10.015.
- [7] Kumar D, Shafiee MJ, Chung AG, Khalvati F, Haider MA, Wong A, Discovery radiomics for computed tomography cancer detection, arXiv preprint arXiv:1509.00117.
- [8] Liu Y, Balagurunathan Y, Atwater T, Antic S, Li Q, Walker RC, Smith G, Massion PP, Schabath MB, Gillies RJ, Radiological image traits predictive of cancer status in pulmonary nodules, Clinical Cancer Research (2016) clincanres–3102.
- [9] Nadeem S, Su Z, Zeng W, Kaufman A, Gu X, Spherical parameterization balancing angle and area distortions, IEEE Transactions on Visualization and Computer Graphics 23 (6) (2017) 1663–1676.
- [10] Swensen SJ, Silverstein MD, Ilstrup DM, Schleck CD, Edell ES, The probability of malignancy in solitary pulmonary nodules: Application to small radiologically indeterminate nodules, Archives of Internal Medicine 157 (8) (1997) 849–855. doi: 10.1001/archinte.1997.00440290031002.
- [11] McKee BJ, Regis SM, McKee AB, Flacke S, Wald C, Performance of ACR Lung-RADS in a clinical CT lung screening program, Journal of the American College of Radiology 12 (3) (2015) 273–276. doi: 10.1016/j.jacr.2014.08.004.
- [12] McWilliams A, Tammemagi MC, Mayo JR, Roberts H, et al., Probability of cancer in pulmonary nodules detected on first screening CT, New England Journal of Medicine 369 (10) (2013) 910–919. doi: 10.1056/NEJMoa1214726.
- [13] Niehaus R, Raicu DS, Furst J, Armato S, Toward understanding the size dependence of shape features for predicting spiculation in lung nodules for computer-aided diagnosis, Journal of Digital Imaging 28 (6) (2015) 704–717.
- [14] Ciompi F, Jacobs C, Scholten ET, van Riel SJ, Wille MMW, M.D. MP, van Ginneken B, Automatic detection of spiculation of pulmonary nodules in computed tomography images, in: Hadjiiski LM, Tourassi GD (Eds.), Medical Imaging 2015: Computer-Aided Diagnosis, Vol. 9414, International Society for Optics and Photonics, SPIE, 2015, pp. 58–63. doi: 10.1117/12.2081426.
- [15] Dhara AK, Mukhopadhyay S, Saha P, Garg M, Khandelwal N, Differential geometry-based techniques for characterization of boundary roughness of pulmonary nodules in CT images, International Journal of Computer Assisted Radiology and Surgery 11 (3) (2016) 337–349.
- [16] Choi W, Nadeem S, Riyahi S, Deasy JO, Tannenbaum A, Lu W, Interpretable spiculation quantification for lung cancer screening, in: Reuter M, Wachinger C, Lombaert H, Paniagua B, Lüthi M, Egger B (Eds.), Shape in Medical Imaging, Lecture Notes in Computer Science, Springer International Publishing, Cham, 2018, pp. 38–48. doi: 10.1007/978-3-030-04747-4_4.
- [17] Angenent S, Haker S, Tannenbaum A, Kikinis R, On the Laplace-Beltrami operator and brain surface flattening, IEEE Transactions on Medical Imaging 18 (8) (1999) 700–711.
- [18] Do Carmo M, Riemannian Geometry, Birkhäuser, 1992.
- [19] Tu L, Differential Geometry, Springer, 2017.
- [20] Kazdan J, Warner F, Curvature functions for compact 2D manifolds, Annals of Mathematics 99 (1) (1974) 14–47.
- [21] Warfield SK, Zou KH, Wells WM, Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation, IEEE Transactions on Medical Imaging 23 (7) (2004) 903–921. doi: 10.1109/TMI.2004.828354.
- [22] Choi W, Xue M, Lane BF, Kang MK, Patel K, Regine WF, Klahr P, Wang J, Chen S, D'Souza W, Lu W, Individually optimized contrast-enhanced 4D-CT for radiotherapy simulation in pancreatic ductal adenocarcinoma, Medical Physics 43 (10) (2016) 5659–5666. doi: 10.1118/1.4963213.
- [23] Jiang J, Hu Y-C, Liu C-J, Halpenny D, Hellmann MD, Deasy JO, Mageras G, Veeraraghavan H, Multiple resolution residually connected feature streams for automatic lung tumor segmentation from CT images, IEEE Transactions on Medical Imaging 38 (1) (2018) 134–144.
- [24] Jiang J, Hu YC, Tyagi N, Rimner A, Lee N, Deasy JO, Berry S, Veeraraghavan H, PSIGAN: Joint probabilistic segmentation and image distribution matching for unpaired cross-modality adaptation based MRI segmentation, IEEE Transactions on Medical Imaging.
- [25] Vezhnevets V, Konouchine V, GrowCut: Interactive multi-label ND image segmentation by cellular automata, in: Proc. of Graphicon, Vol. 1, Citeseer, 2005, pp. 150–156.
- [26] Yip SSF, Parmar C, Blezek D, Estepar RSJ, Pieper S, Kim J, Aerts HJWL, Application of the 3D Slicer chest imaging platform segmentation algorithm for large lung nodule delineation, PLOS ONE 12 (6) (2017) 1–17. doi: 10.1371/journal.pone.0178944.
- [27] Armato SG, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, et al., Data from LIDC-IDRI. The Cancer Imaging Archive. (2015). doi: 10.7937/K9/TCIA.2015.LO9QL9SX.
- [28] Armato SG, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, et al., The lung image database consortium (LIDC) and image database resource initiative (IDRI): A completed reference database of lung nodules on CT scans, Medical Physics 38 (2) (2011) 915–931. doi: 10.1118/1.3528204.
- [29] Armato SG, Hadjiiski L, Tourassi GD, Drukker K, Giger ML, Li F, Redmond G, Farahani K, Kirby JS, Clarke LP, SPIE-AAPM-NCI lung nodule classification challenge dataset. The Cancer Imaging Archive (2015).
- [30] Armato SG, Drukker K, Li F, Hadjiiski L, Tourassi GD, Kirby JS, Clarke LP, Engelmann RM, Giger ML, Redmond G, et al., LUNGx Challenge for computerized lung nodule classification, Journal of Medical Imaging 3 (4) (2016) 044506.
- [31] Gavrielides MA, Kinnard LM, Myers KJ, Peregoy J, Pritchard WF, Zeng R, Esparza J, Karanian J, Petrick N, Data from Phantom_FDA. The Cancer Imaging Archive. (2015). doi: 10.7937/K9/TCIA.2015.ORBJKMUX.
- [32] Gavrielides MA, Kinnard LM, Myers KJ, Peregoy J, Pritchard WF, Zeng R, Esparza J, Karanian J, Petrick N, A resource for the assessment of lung nodule size estimation methods: database of thoracic CT scans of an anthropomorphic phantom, Opt. Express 18 (14) (2010) 15244–15255. doi: 10.1364/OE.18.015244.
- [33] Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F, The cancer imaging archive (TCIA): Maintaining and operating a public information repository, Journal of Digital Imaging 26 (6) (2013) 1045–1057. doi: 10.1007/s10278-013-9622-7.
- [34] Kerlikowske K, Scott CG, Mahmoudzadeh AP, Ma L, Winham S, Jensen MR, Wu FF, Malkov S, Pankratz VS, Cummings SR, Shepherd JA, Brandt KR, Miglioretti DL, Vachon CM, Automated and clinical breast imaging reporting and data system density measures predict risk for screen-detected and interval cancers: A case-control study, Annals of Internal Medicine 168 (11) (2018) 757–765. doi: 10.7326/M17-3008.