Quantitative image features from radiomic biopsy differentiate oncocytoma from chromophobe renal cell carcinoma

Akshay Jaggi; Domenico Mastrodicasa; Gregory W Charville; R Brooke Jeffrey, Jr; Sandy Napel; Bhavik Patel

doi:10.1117/1.JMI.8.5.054501

2021 Sep 7;8(5):054501.

PMCID: PMC8423237 PMID: 34514033

Abstract.

Purpose: To differentiate oncocytoma and chromophobe renal cell carcinoma (RCC) using radiomics features computed from spherical samples of image regions of interest, “radiomic biopsies” (RBs).

Approach: In a retrospective cohort study of 102 CT cases [68 males (67%), 34 females (33%); mean age ± SD, $63 \pm 12$ years], we pathology-confirmed 42 oncocytomas (41%) and 60 chromophobes (59%). A board-certified radiologist performed two RB rounds. From each RB round, we computed radiomics features and compared the performance of random forest and AdaBoost binary classifiers trained on the features. To control for overfitting, we performed 10 rounds of 70%/30% train-test splits with feature selection, cross-validation, and hyperparameter optimization on each split. We evaluated performance with test ROC AUC. We also tested each model on data from the other RB round and compared the result with same-round testing using the DeLong test. We clustered important features for each round and measured agreement with a bootstrapped adjusted Rand index.

Results: Our best classifiers achieved an average AUC of $0.71 \pm 0.024$. We found no evidence of an effect for RB round ($p = 1$). We also found no evidence for a decrease in model performance when tested on the other RB round ($p = 0.85$). Feature clustering produced seven clusters in each RB round with high agreement (Rand index $= 0.981 \pm 0.002$, $p < 0.00001$).

Conclusions: A consistent radiomic signature can be derived from RBs and could help distinguish oncocytoma and chromophobe RCC.

Keywords: radiomics, machine learning, chromophobe renal cell carcinoma, oncocytoma, kidney, computed tomography

1. Introduction

Over the past few decades, renal cancer diagnosis has seen a growing paradigm shift with increased utilization of percutaneous biopsy in the management of indeterminate renal masses.¹ The American Urologic Guidelines now incorporate the use of renal biopsy in specific circumstances.²^,³ The rationale behind this trend is based on several factors. First, improving capabilities of advanced imaging techniques, such as multidetector computed tomography, magnetic resonance imaging (MRI), and ultrasound (US), and their increased utilization, has led to increased detection of incidental renal masses, particularly those $< 4$ cm.²^,⁴^–⁷ Along with an increased incidence of small renal cell carcinomas (RCCs), the incidence of benign renal lesions has also concomitantly increased.¹^,⁸ Second, while most solid renal masses in adults are RCCs, a significant fraction are benign, particularly if they are small in size. Benign lesions can comprise up to 30% of masses $< 2$ cm in size and more than 44% of those $< 1$ cm.⁹

In addition, the use of surgical and ablation treatments for renal masses has been rising over the past two decades, which would suggest higher rates of earlier detection, staging, and improved outcomes.⁵^,¹⁰ However, mortality rates have not significantly improved with such aggressive treatment approaches.⁵^,¹¹ Moreover, RCCs do not all behave similarly and can have different degrees of biologic aggressiveness based on histologic subtype, nuclear grade, and size.¹^,¹⁰ This variable clinical behavior, in turn, can impact management and treatment strategies, as it pertains not only to the type of surgical resection (i.e., nephron sparing versus complete surgical resection) but also to a growing number of patients with comorbidities who may be considered for active surveillance.¹² Finally, image-guided percutaneous renal biopsies are considered relatively safe with current techniques.¹^,¹³

While image-guided percutaneous biopsy may differentiate benign from malignant solid renal masses and provide histologic subtyping of RCCs, reliable confirmation of oncocytomas at biopsy can be challenging. This is largely due to the fact that oncocytic cells can be present in oncocytomas as well as other “oncocytic renal neoplasms,” including chromophobe RCC, the eosinophilic variant of papillary RCC, and hybrid tumors that contain benign and malignant components.¹^,¹⁴ Of these oncocytic renal neoplasms, chromophobe RCC is most difficult to distinguish from oncocytoma in a biopsy specimen. Because chromophobe RCCs can contain oncocytic elements and biopsies are subject to sampling error, reliable exclusion of RCCs or confirmation of a benign oncocytoma may not always be possible. In these instances, at biopsy, a pathologist may render a designation of “oncocytic renal neoplasm” with the added statement that, if the biopsy is representative of the entire lesion, then oncocytoma is likely; otherwise, chromophobe RCC cannot be excluded.¹⁴ Though some reports suggest that special immunohistologic stains may reliably make the diagnosis of oncocytoma, an invasive procedure to obtain tissue (i.e., biopsy) would still be required and not all centers perform these ancillary immunohistochemical studies.¹² Given the challenges of distinguishing oncocytoma and chromophobe RCC at biopsy, imaging-based diagnostic tools could prevent the need for such invasive, often inconclusive procedures. Though some studies have reported imaging features on CT that could help distinguish oncocytomas from RCC, there may be overlap of the presence of such features and reliable distinction is not always possible. Furthermore, studies have also suggested this distinction is not reliable on MRI.¹⁵^–¹⁸

Differentiation of oncocytoma and chromophobe RCC by imaging has been difficult due to their similar appearances as enhancing renal masses.¹⁹ In an early study, size, shape, and interface with surrounding tissues could not distinguish chromophobe RCC and oncocytoma.¹⁹ However, oncocytoma did show higher internal density and stellate scarring. Later, Wu et al. established that “combined evaluation of stellate scar, spoke-wheel-like enhancement, and segmental enhancement inversion features” could uniquely distinguish oncocytoma from chromophobe RCC on CT.²⁰ While these studies suggest that diagnosis of oncocytoma on CT could potentially be possible using the aforementioned imaging features, reliable differentiation from RCC is often difficult, particularly for small lesions.¹⁵^,¹⁷^,²¹^,²²

More detailed quantitative image analysis will likely serve a vital role in differentiating these subtypes. Quantitative image analysis, also known as radiomics, seeks to characterize image features of various types (e.g., intensity distribution, shape, margin sharpness, and size) with numerical values.²³^–²⁶ Computation of radiomics features from radiological images has traditionally relied on segmentation of the object of interest, which can be labor-intensive, time-consuming, and highly variable.²⁷ However, some clinical prediction tasks may only require intensity and texture features, estimates of which may not require a full or accurate segmentation of the object.²⁸^–³¹

To address this, we propose the radiomic biopsy (RB) as an alternative when shape, size, and margin features are not necessary for the classification problem.²⁹^,³⁰ We define the RB as a spherical sample, or a cluster of connected spherical samples, of a volume of interest (VOI). Past work in a hepatocellular carcinoma cohort showed that a similar approach can produce a large decrease in required segmentation time.³⁰ Due to their simple shapes, creation of RBs with an appropriate tool could potentially save radiologists’ time when size, shape, or margin features are not relevant to a clinical prediction task.²⁹^,³⁰

Differentiating oncocytoma and chromophobe RCC is a clinical application potentially suitable for RBs since the two subtypes do not significantly differ in shape or size.¹³^,¹⁴ One specific radiomics analysis, CT texture analysis (CTTA), computes many intensity distribution radiomics features that assess tumor spatial heterogeneity by statistical analysis of the distribution of densities in neighboring pixels. A few studies have used CTTA to distinguish cysts or fat-poor angiomyolipomas from RCC, or oncocytomas from RCCs, but very few of the RCCs in those studies were of the chromophobe subtype ($n = 3$).³²^,³³ We hypothesized that oncocytomas and chromophobe RCCs can be distinguished with a machine learning-based predictive algorithm derived from CTTA features of RBs of CT scans. Thus, the purpose of this study was to develop a machine-learning classifier for this diagnostic task and provide a proof-of-concept that the RB technique can produce discriminative classifiers from CTTA features alone.

2. Materials and Methods

This retrospective, single-center, Health Insurance Portability and Accountability Act-compliant study was approved by the Institutional Review Board of (blinded to reviewers), and a waiver of informed consent was obtained.

2.1. TRIPOD and RQS

This study was designed and reported considering guidelines set by the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) and the more-recently proposed radiomics quality score (RQS).³⁴^,³⁵ These two guidelines address low quality reporting of multivariate predictive models and radiomics studies. While TRIPOD provides a checklist for reporting, RQS offers a numeric score for evaluating radiomics study design. The RQS of this paper was optimized given the available data.

After all methods were designed and applied, the RQS of the paper was assessed (Table 2). The study scored 16 of 36 possible points (44.44%), with notable points lost for its retrospective design and the lack of external validation. Some additional lost points were irrelevant to this use case (e.g., use of cutoff analysis for risk groups). This score surpasses the average RQS (22.34%) reported by a recent radiomics survey.³⁶

Table 2.

Radiomics quality score evaluation checklist.

	Criteria	Possible points	Our points
1	Image protocol quality: well-documented image protocols (for example, contrast, slice thickness, and energy) and/or usage of public image protocols allow reproducibility/replicability	+ 1 (if protocols are well-documented) + 1 (if public protocol is used)	2
2	Multiple segmentations: possible actions are segmentation by different physicians/algorithms/software, perturbing segmentations by (random) noise, segmentation at different breathing cycles. Analyze feature robustness to segmentation variabilities	1	1
3	Phantom study on all scanners: detect interscanner differences and vendor-dependent features. Analyze feature robustness to these sources of variability	1	0
4	Imaging at multiple time points: collect images of individuals at additional time points. Analyze feature robustness to temporal variabilities (for example, organ movement, organ expansion/shrinkage)	1	0
5	Feature reduction or adjustment for multiple testing: decreases the risk of overfitting. Overfitting is inevitable if the number of features exceeds the number of samples. Consider feature robustness when selecting features	−3 (if neither measure is implemented)	3
5		+3 (if either measure is implemented)	3
6	Multivariable analysis with nonradiomics features (for example, EGFR mutation): is expected to provide a more holistic model and permits correlating/inferencing between radiomics and nonradiomics features	1	1
7	Detect and discuss biological correlates: demonstration of phenotypic differences (possibly associated with underlying gene–protein expression patterns) deepens understanding of radiomics and biology	1	1
8	Cut-off analyses: determine risk groups by either the median, a previously published cut-off, or report a continuous risk variable. Reduces the risk of reporting overly optimistic results	1	0
9	Discrimination statistics: report discrimination statistics (for example, C-statistic, ROC curve, AUC) and their statistical significance (for example, p-values, confidence intervals). One can also apply resampling method (for example, bootstrapping, cross-validation)	+ 1 (if a discrimination statistic and its statistical significance are reported)	1
9		+ 1 (if a resampling method technique is also applied)	1
10	Calibration statistics: report calibration statistics (for example, calibration-in-the-large/slope, calibration plots) and their statistical significance (for example, P-values, confidence intervals). One can also apply resampling method (for example, bootstrapping, cross-validation)	+ 1 (if a calibration statistic and its statistical significance are reported)	0
10		+ 1 (if a resampling method technique is also applied)	0
11	Prospective study registered in a trial database: provides the highest level of evidence supporting the clinical validity and usefulness of the radiomics biomarker	+ 7 (for prospective validation of a radiomics signature in an appropriate trial)	0
12	Validation: the validation is performed without retraining and without adaptation of the cut-off value, provides crucial information with regard to credible clinical performance	−5 (if validation is missing)	2
		+ 2 (if validation is based on a dataset from the same institute)
		+ 3 (if validation is based on a dataset from another institute)
		+ 4 (if validation is based on two datasets from two distinct institutes)
		+ 4 (if the study validates a previously published signature)
		+ 5 (if validation is based on three or more datasets from distinct institutes)
13	Comparison to “gold standard”: assess the extent to which the model agrees with/is superior to the current “gold standard” method (for example, TNM-staging for survival prediction). This comparison shows the added value of radiomics	2	2
14	Potential clinical utility: report on the current and potential application of the model in a clinical setting (for example, decision curve analysis).	2	2
15	Cost-effectiveness analysis: report on the cost-effectiveness of the clinical application (for example, QALYs generated)	1	0
16	Open science and data: make code and data publicly available. Open science facilitates knowledge transfer and reproducibility of the study	+ 1 (if scans are open source)	1
		+ 1 (if region of interest segmentations are open source)
		+ 1 (if code is open source)
		+ 1 (if radiomics features are calculated on a set of representative ROIs and the calculated features and representative ROIs are open source)
	Total		16/36 = 44.44%


2.2. Patient Population

We retrospectively searched the pathology laboratory information system at our single tertiary care academic center (blinded to reviewers) for consecutive patients who underwent a partial or radical nephrectomy for either a pathology confirmed oncocytoma or chromophobe RCC between December 2005 and December 2017 at our institution. Patients were eligible for inclusion in the study if they underwent a clinically indicated contrast-enhanced CT of the kidney during the nephrographic phase of contrast.²²^,³⁷^,³⁸

The search yielded an initial target population of 256 consecutive patients who were considered eligible for inclusion in the study (Fig. 1). Subjects were excluded from the study if they were not scanned in the nephrographic phase of contrast. In total, 154 of the 256 initially eligible patients were excluded. The final study population comprised 102 patients [68 males (67%), 34 females (33%); mean age ± standard deviation, $63 \pm 12$ years]. A total of 42 oncocytomas (41%) and 60 chromophobes (59%) were identified in these 102 patients. Each patient had a single exam.

2.3. Multidetector CT Technique

Patients underwent anteroposterior and lateral digital scout radiographs after which axial CT image acquisition commenced. All patients were positioned supine with feet first on the CT table. Patients were scanned during the nephrographic phase (90 s after the onset of intravenous contrast injection). All patients received 150 mL of intravenous contrast material (Isovue 370; Bracco Diagnostics, Monroe Township, New Jersey) injected at a rate of 4 mL/s. Table 3 shows the different scanner models and manufacturers on which patients were scanned. Distribution of scanner models does not differ between oncocytoma and chromophobes ($p = 0.2544$, Monte Carlo simulation of Pearson’s chi-squared test).³⁹ Axial images were reconstructed at a slice thickness of 5 mm. All patients had a single lesion greater than 1 cm.

Table 3.

Summary of scanner types.

Manufacturer	Model	$n$	%
Siemens	Definition AS+	9	8.82
	Definition	14	13.73
	Definition edge	2	1.96
	Sensation 64	26	25.49
	Sensation 16	1	0.98
	Perspective	5	4.90
	Emotion 6	1	0.98
	Emotion 16	2	1.96
	Force	2	1.96
GE	Discovery LS	1	0.98
	Discovery 690	1	0.98
	Discovery CT750 HD	6	5.88
	LightSpeed 16	3	2.94
	LightSpeed Pro 16	1	0.98
	LightSpeed Ultra	2	1.96
	LightSpeed VCT	14	13.73
	BrightSpeed	1	0.98
Philips	Brilliance 64	1	0.98
	Brilliance 16	1	0.98
	iCT 128	1	0.98
	iCT 256	2	1.96
	Gemini TN TOF 64	1	0.98
Canon	Aquilion	5	4.90
Total		102	100


2.4. Radiomic Biopsy Process

For this study, we hosted and organized all files using ePAD, a web-based platform for DICOM image management and analysis.⁴⁰^,⁴¹ To perform RBs on the cohort, we migrated the cohort from ePAD to a web server running a custom-built version of Fovia’s FAST interactive segmentation platform (Fovia Inc., Palo Alto, California). As discussed in Sec. 1, we defined an RB as a spherical sample, or a cluster of connected spherical samples, of a VOI. To collect these samples quickly, we built a click-and-drag RB tool within FAST that allows users to click the center of a VOI and then drag the mouse to set the radius of each sphere (Fig. 2) while observing the growth in multiple planes. The size of each spherical RB (in $\mathrm{mm}^{3}$) was thus set interactively, while the display showed its extent superimposed on the patient volume in three planes.

Fig. 2 — RBs simplify the radiologist interface. This tool offers four views: axial, coronal, sagittal, and a 3D view (clockwise from top right). The coronal and sagittal views are scaled vertically to maintain isotropic pixels. The user clicks in any one of the three image panels, then drags until a sphere of desired size is grown. The highlighted red region is an RB being collected in a sample kidney.

Given the importance of quantifying tumor heterogeneity,²⁵^,⁴² the goal of an RB is to capture a large representative portion of the interior of the tumor volume without concern for capturing tissue at or near the lesion edge, generally the most time-intensive step and source of inter-reader variability.²⁹^,³⁰^,⁴³ Therefore, a fellowship-trained abdominal radiologist was blinded to the pathological identity of the tumors (oncocytoma versus chromophobe), instructed to find a lesion, provided no clinical information other than the presence of a lesion, and instructed to create RBs by placing overlapping spheres covering much of the interior of each lesion without focusing on or having to capture tissue near its edge. This single reader conducted two rounds of RB on the kidney cohort with a month between rounds to avoid recall bias. The entire kidney cohort received one RB per patient per round ($102 \times 2 = 204$ total RBs).
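The geometry of an RB, a union of overlapping spheres inside a volume, can be illustrated with a minimal Python sketch. This is not the FAST tool's implementation; the function name `radiomic_biopsy_mask`, the voxel-index convention, and the example spacing are assumptions made only for illustration.

```python
from itertools import product

def radiomic_biopsy_mask(shape, spacing, spheres):
    """Return the set of voxel indices covered by a union of spheres.

    shape   -- (nz, ny, nx) volume size in voxels
    spacing -- (dz, dy, dx) voxel size in mm
    spheres -- list of ((cz, cy, cx), radius_mm); centers are voxel indices
    All names and conventions here are hypothetical.
    """
    mask = set()
    for center, radius in spheres:
        # Bounding box of the sphere in voxel units, clipped to the volume.
        ranges = []
        for c, s, n in zip(center, spacing, shape):
            r_vox = radius / s
            lo = max(0, int(c - r_vox))
            hi = min(n - 1, int(c + r_vox) + 1)
            ranges.append(range(lo, hi + 1))
        for voxel in product(*ranges):
            # Squared physical distance (mm^2) from the sphere center.
            d2 = sum(((v - c) * s) ** 2
                     for v, c, s in zip(voxel, center, spacing))
            if d2 <= radius ** 2:
                mask.add(voxel)
    return mask

def mask_volume_mm3(mask, spacing):
    """Volume of the sampled region in cubic millimeters."""
    dz, dy, dx = spacing
    return len(mask) * dz * dy * dx

# Two overlapping spheres in a small isotropic test volume.
example = radiomic_biopsy_mask(
    shape=(5, 5, 5), spacing=(1.0, 1.0, 1.0),
    spheres=[((2, 2, 2), 1.0), ((2, 2, 3), 1.0)],
)
```

Because the mask is a set, voxels shared by overlapping spheres are counted once, which is the behavior one would want before computing intensity and texture features over the sampled region.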

2.5. Feature Computation

We conducted quantitative image processing of the images using the quantitative image feature engine (QIFE), a MATLAB-based image processing pipeline that calculates radiomics features from VOIs in image data.⁴⁴ We ran the QIFE with all default parameters⁴⁴ (see Appendix for configuration information). As discussed in Sec. 1, since we are using RBs that do not adhere to the true shape or size of the ROI, we focused only on using intensity and texture radiomics features (CTTA). We selected only intensity and texture features from the QIFE, resulting in 6206 features for further analysis (see Appendix for general feature descriptions).

2.6. Machine Learning Workflow

Here, we define our general modeling workflow with individual components described in more detail (Fig. 3). We repeated the entire workflow for each RB round. Since no true holdout validation set was available, to simulate validation, we repeated a random 70%/30% development-holdout split of the data 10 times. Within the development split, we performed 10-fold cross validation to optimize feature selection, model choice, and model hyperparameters. On each training fold, we performed two rounds of feature selection: intraclass correlation coefficient (ICC) filtering and minimum redundancy maximum relevance (mRMR) selection. Then, we performed a model hyperparameter sweep for both a random forest and an AdaBoost classifier. We chose the ICC cutoff, mRMR count, model, and hyperparameters that had the highest median performance on the 10 testing folds. Finally, we trained a single model on all the development data and tested on the 30% held-out data. We used the R software package caret for model workflow management.⁴⁵
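The resampling structure of this workflow can be sketched in a few lines of Python. The sketch below is illustrative only and is not the caret code used in the study; stratifying the split by class, the random seed, and the helper names `stratified_split` and `k_folds` are assumptions.

```python
import random

def stratified_split(labels, holdout_frac, rng):
    """Split sample indices into development and holdout sets.

    Stratification by class is an assumption of this sketch.
    """
    dev, hold = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_hold = round(len(idx) * holdout_frac)
        hold.extend(idx[:n_hold])
        dev.extend(idx[n_hold:])
    return sorted(dev), sorted(hold)

def k_folds(indices, k, rng):
    """Partition indices into k roughly equal cross-validation folds."""
    idx = list(indices)
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Cohort composition taken from the text: 42 oncocytomas, 60 chromophobes.
labels = ["oncocytoma"] * 42 + ["chromophobe"] * 60
rng = random.Random(0)  # arbitrary seed

# Ten repeated 70%/30% development-holdout splits.
splits = [stratified_split(labels, 0.30, rng) for _ in range(10)]

for dev, hold in splits:
    for test_fold in k_folds(dev, 10, rng):
        held = set(test_fold)
        train_fold = [i for i in dev if i not in held]
        # Feature selection and tuning would use train_fold only,
        # with test_fold reserved for scoring each configuration.
```

Keeping feature selection inside the training folds, as the text describes, is what prevents information from the testing folds or the holdout set from leaking into model choice.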

2.7. Feature Selection

We performed two rounds of feature selection within each training fold of the cross validation: first filtering by ICC values then filtering using mRMR selection. Since two rounds of RBs were collected for the RCC cohort and radiomics features are subject to low repeatability,³⁵ we filtered out features with low ICC values to remove features not stable to repeat measurement. We filtered features below successive cutoffs from 0.7 to 0.95 with increments of 0.05, producing six total cutoffs. We computed ICC values as defined by Bartko using the irr R software package.⁴⁶^,⁴⁷ For ICC parameters, we considered only subjects as random effects (oneway) where radiomic values were compared as single units of analysis.⁴⁸^,⁴⁹
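For readers unfamiliar with the one-way, single-unit ICC, the following Python sketch shows the standard analysis-of-variance form of the statistic and how a cutoff sweep could be applied. It is a sketch under stated assumptions, not the irr package code; the function names are hypothetical, and keeping features at or above the cutoff is an assumed convention.

```python
def icc_oneway_single(ratings):
    """One-way random-effects, single-measure ICC.

    ratings -- one list per subject holding k repeated measurements
               (here k = 2, one value per RB round).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    means = [sum(r) / k for r in ratings]
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2
              for r, m in zip(ratings, means) for x in r) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

def stable_features(feature_ratings, cutoff):
    """Names of features whose ICC is at or above the cutoff (assumed rule)."""
    return sorted(name for name, ratings in feature_ratings.items()
                  if icc_oneway_single(ratings) >= cutoff)

# The six cutoffs described in the text: 0.70 to 0.95 in steps of 0.05.
ICC_CUTOFFS = [round(0.70 + 0.05 * i, 2) for i in range(6)]
```

A feature measured identically in both rounds has zero within-subject variance and an ICC of 1, while a feature whose round-to-round differences are as large as its between-patient differences falls toward zero or below.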

After ICC filtering, we implemented mRMR, set to keep a specified number of features (to be called the mRMR count). We chose mRMR since radiomics features are highly correlated and mRMR optimizes for maximally uncorrelated features that are associated with the outcome.⁵⁰^–⁵² Since previous work has shown that rules of thumb for number of features (e.g., 10 examples per feature) are classifier- and data-dependent,⁵³^,⁵⁴ we did not use a rule of thumb to set our feature number. Instead, we tested mRMR counts from 5 to 30 with increments of five features,⁵⁰^,⁵⁵ creating a total of six tested mRMR counts. We implemented mRMR using the mRMRe R software package.⁵⁶ We tested all combinations of ICC cutoff and mRMR count for a total of 36 different feature selection combinations.
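The idea behind mRMR, trading relevance to the outcome against redundancy with features already chosen, can be shown with a simplified greedy sketch in Python. This is not the mRMRe algorithm: it uses absolute Pearson correlation for both relevance and redundancy, which is an assumption made to keep the example short, and all names are hypothetical.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0 if constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5

def mrmr_select(features, target, count):
    """Greedy relevance-minus-redundancy selection (simplified variant).

    features -- dict mapping feature name to a list of values
    target   -- list of 0/1 class labels
    count    -- number of features to keep (the "mRMR count")
    """
    relevance = {name: abs(pearson(vals, target))
                 for name, vals in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < count:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = mean(abs(pearson(features[name], features[s]))
                              for s in selected)
            return relevance[name] - redundancy
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# The six mRMR counts described in the text.
MRMR_COUNTS = list(range(5, 31, 5))
```

In this toy form, a feature that is nearly a copy of an already selected feature is penalized even when it is individually predictive, which is the property that motivates mRMR for highly correlated radiomics features.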

2.8. Predictive Modeling

There is a wide range of available and effective classification algorithms used with radiomics data;⁵⁵ we chose to compare performance between two tree-based algorithms: random forest and AdaBoost. Tree-based approaches have been successful when compared with other approaches on radiomics data,⁵⁵ and boosting (implemented by AdaBoost) is a common approach to maximizing tree classifier performance while minimizing overfitting.⁵⁷^,⁵⁸

We trained all classifiers using caret and optimized over the default hyperparameter grid that caret exposes for each model.⁴² For random forests, we used the randomForest R software package and optimized “mtry”: the number of variables considered for each split (min: 1; max: number of features; num: 3).⁵⁹ For AdaBoost, we used the ada R software package and optimized “iter”: the number of boosting iterations (min: 50; max: 150; num: 3).⁶⁰ All other parameters were set to the package defaults. For all models, we optimized for the area under the receiver operating characteristic curve (ROC AUC), commonly used for evaluating binary classification performance.⁶¹
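As a reminder of what the ROC AUC measures, it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes it directly from that definition; the function name and the 0/1 label coding are assumptions for illustration.

```python
def roc_auc(scores, labels):
    """ROC AUC from its pairwise-ranking definition.

    scores -- predicted scores, higher meaning more likely positive
    labels -- 1 for the positive class, 0 for the negative class
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive-negative pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.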

2.9. Model Evaluation

Within each development dataset, we produced (six ICC cutoffs) × (six mRMR counts) × (two model types) × (three model hyperparameter values) = 216 models. We chose the model with the highest median test performance across all 10 folds of the cross validation. We then trained this best model on all of the development data and tested on the held-out data. We repeated this 10 times for the 10 development-holdout splits. We repeated this entire procedure separately for both biopsy rounds.
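The size of this search space and the selection rule can be checked with a short Python sketch. The grid values for ICC cutoff and mRMR count come from the text; the evenly spaced middle values for the hyperparameters and the names `hyperparameter_value` and `best_config` are assumptions made for illustration.

```python
from itertools import product
from statistics import median

ICC_CUTOFFS = [0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
MRMR_COUNTS = [5, 10, 15, 20, 25, 30]
MODELS = ["random_forest", "adaboost"]
GRID_POSITIONS = [0, 1, 2]  # index into each model's three-value grid

def hyperparameter_value(model, position, n_features):
    """Map a grid position to a concrete value (midpoints are assumed)."""
    if model == "random_forest":
        # mtry ranges from 1 to the number of selected features.
        return [1, max(1, (1 + n_features) // 2), n_features][position]
    # Boosting iterations range from 50 to 150.
    return [50, 100, 150][position]

# Every combination evaluated within one development dataset.
CONFIGS = list(product(ICC_CUTOFFS, MRMR_COUNTS, MODELS, GRID_POSITIONS))

def best_config(cv_auc):
    """Pick the configuration with the highest median AUC across folds.

    cv_auc -- dict mapping a configuration tuple to its per-fold AUCs
    """
    return max(cv_auc, key=lambda cfg: median(cv_auc[cfg]))
```

Using the median rather than the mean makes the choice less sensitive to a single unusually good or bad fold, which matters when each testing fold holds only a handful of patients.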

To test if model performance is the same between biopsy rounds, we needed to perform an ROC AUC comparison test. Within a single development-holdout split, the models are tested on the same patients. Therefore, the appropriate ROC AUC comparison is DeLong’s test for the AUC of correlated ROCs.⁶² Since we repeat our development-holdout splits 10 times, we perform 10 such DeLong tests and correct for multiple comparisons with a Benjamini–Hochberg correction.⁶³ We are interested in how often the model performances are significantly different across the splits.
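The Benjamini–Hochberg adjustment applied to the 10 DeLong p-values can be sketched as follows. This is a minimal Python version of the standard step-up procedure, written for illustration; the function name is an assumption, and the DeLong test itself is not reproduced here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Because of the running minimum, several tests can end up sharing the same adjusted p-value, so identical adjusted values across splits do not by themselves imply identical test statistics.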

To test if the model performance generalizes to data from the other biopsy round, we take the optimal model chosen in the development set for RB1 and then test it on data from the holdout set for RB2. Therefore, we have two scores: (1) trained on RB1 and tested on RB1 and (2) trained on RB1 and tested on RB2. We then compare these performances using our repeated DeLong test as before. We repeat this analysis but train on RB2 instead.

2.10. Feature Importance and Model Interpretation

To test for the relationship between the most important features across splits and to identify related clusters of features, we performed consensus clustering of all features selected by mRMR at least once.⁵¹ Consensus clustering quantifies the consensus across 10,000 resampled clustering iterations.⁶⁴ We performed hierarchical clustering with agglomerative Ward linkage and distance between features measured with Pearson correlation ($1 - r$). We computed cluster consensus, the average consensus between all pairs of features belonging to the same cluster. Cluster consensus indicates the stability of each cluster across all resampling iterations.
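The two consensus quantities described here can be made concrete with a small Python sketch. It assumes the resampled clusterings are already available as dictionaries and does not reproduce the hierarchical clustering or the ConsensusClusterPlus implementation; the function names and the handling of single-member clusters are assumptions.

```python
from itertools import combinations

def consensus_matrix(items, runs):
    """Pairwise consensus across resampled clustering runs.

    items -- iterable of feature names
    runs  -- list of dicts, each mapping the features sampled in that
             run to the cluster label they received
    Returns a dict keyed by sorted (i, j) pairs.
    """
    pairs = list(combinations(sorted(items), 2))
    together = {p: 0 for p in pairs}
    sampled = {p: 0 for p in pairs}
    for run in runs:
        for i, j in pairs:
            if i in run and j in run:
                sampled[(i, j)] += 1
                together[(i, j)] += run[i] == run[j]
    # Fraction of co-sampled runs in which the pair shared a cluster.
    return {p: (together[p] / sampled[p] if sampled[p] else 0.0)
            for p in pairs}

def cluster_consensus(consensus, assignment, cluster):
    """Average pairwise consensus among members of one final cluster."""
    members = sorted(k for k, c in assignment.items() if c == cluster)
    pairs = list(combinations(members, 2))
    if not pairs:
        return float("nan")  # undefined for a single-member cluster (assumed)
    return sum(consensus[p] for p in pairs) / len(pairs)
```

A cluster consensus near 1 means the members were almost always grouped together whenever they were sampled together, which is how the stability of each cluster in Table 1 can be read.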

We repeated this consensus clustering for the number of clusters from 1 to 20. From this range, we visualized the optimal number of clusters with a delta area plot, which plots the relative change in the overall consensus between number of clusters. We identified the number of clusters at the elbow of this plot or when adding more clusters does not improve the overall consensus within the clusters. All clustering was performed with ConsensusClusterPlus.⁶⁴

We repeated clustering in both the first and second round of biopsies. To determine if similar clusters of related features were important in models trained on the first and second rounds of biopsies, we computed 1000 bootstrapped estimations of the adjusted Rand index for all clustered features that appeared in both rounds.⁵¹^,⁶⁵ The Rand index computes the agreement between two separate clusterings: in this case, RB round 1 and RB round 2. We bootstrapped by sampling cluster assignments with replacement from the clustered features. For the adjusted Rand index, there is a known value of zero for random clustering. We compared the bootstrapped adjusted Rand index to zero with a one-sample $t$-test.
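The adjusted Rand index and its bootstrap can be sketched in Python using the standard pair-counting formula. This is an illustration rather than the code used in the study; the function names, the random seed, and the convention of returning 1 when the index is undefined are assumptions.

```python
import random
from collections import Counter
from math import comb

def adjusted_rand_index(a, b):
    """Adjusted Rand index between two cluster labelings of the same items."""
    n = len(a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, treated as agreement (assumed)
    return (sum_ij - expected) / (max_index - expected)

def bootstrap_ari(a, b, n_boot, rng):
    """Resample features with replacement and recompute the index."""
    n = len(a)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(adjusted_rand_index([a[i] for i in idx],
                                          [b[i] for i in idx]))
    return values

# Toy labelings: identical partitions with swapped label names.
round1 = [0, 0, 1, 1, 2, 2]
round2 = [2, 2, 0, 0, 1, 1]
boot = bootstrap_ari(round1, round2, 1000, random.Random(0))
```

The index depends only on which features are grouped together, not on the cluster numbers, so cluster 3 in one round need not correspond to cluster 3 in the other for agreement to be high.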

To compare the identified cluster and interpret model performance, we computed three metrics for each feature. The first metric is “stability,” which is the number of times (out of 10) that the feature surpassed the ICC cutoff. The second metric is “number of votes,” which is the number of times (out of 10) that mRMR selected the feature. The third metric is the importance of the feature to the trained model. This importance corresponds to the relative increase in tree accuracy provided by the addition of this feature to the tree. Importance values are normalized to a score out of 100. We then found the average and standard deviation of each metric for each cluster. We repeated this entire analysis on models generated from both rounds of RBs separately.

To identify significant clusters, we performed a two-way ANOVA of each cluster metric with two independent variables (RB round and cluster number) and one dependent variable (the chosen metric, either stability, votes, or importance). We followed this ANOVA with a post-hoc Tukey test to identify specific clusters of interest. We performed this analysis in R and generated all plots in ggplot.⁶⁶^,⁶⁷ For all analyses in Sec. 2 Methods, alpha is set to 0.05.

3. Results

3.1. CTTA from Radiomic Biopsies Distinguish Oncocytoma and Chromophobe RCC

We compared the holdout performance of the top classifier from each of the 10 development datasets for the two RB rounds. The top performing models achieved a mean ROC AUC of $0.700 \pm 0.039$ and $0.740 \pm 0.034$ in the two RB rounds, respectively. We found no evidence for a difference in the aggregate model performances between the two RB rounds (Fig. 4 and Table 4). Since it could be argued that the models are overfit to their specific RB round, we also computed the test ROC AUC of the optimal classifier on test data from the other RB round (Sec. 2). We found no evidence for a difference in model performance in this case (Fig. 4 and Table 4). These experiments suggest that the performance of models built using CTTA from RBs is insensitive to variations between RB rounds and that the models perform above chance.

Fig. 4 — Random forest classifier from RB CTTA distinguishes oncocytoma and chromophobe RCC. Box plots of 10 measures of ROC AUC. RB1: classifier trained on radiomics features from the first RB round; RB2: second RB round. $X$ axis labels explain how the testing was performed for each set of models. Differences between these models were assessed with repeated DeLong testing, reported in Table 4.

Table 4.

Repeated DeLong testing for difference in ROC between models for same holdout split.

First ROC	Trained on RB1, tested on RB1		Trained on RB1, tested on RB1		Trained on RB2, tested on RB2
Second ROC	Trained on RB2, tested on RB2		Trained on RB1, tested on RB2		Trained on RB2, tested on RB1
	Statistic	Adjusted p-value	Statistic	Adjusted p-value	Statistic	Adjusted p-value
1	0.2677772	1	0.39310916	0.850690577	0.4508265	0.909250217
2	−0.8395503	1	−1.83967693	0.658156829	0.2743063	0.909250217
3	0.3288441	1	−1.02898504	0.850690577	0.7418381	0.909250217
4	0.0525216	1	−0.78331747	0.850690577	0.4038627	0.909250217
5	0.2171598	1	0.4147415	0.850690577	0.7957154	0.909250217
6	−0.1618376	1	0.41694538	0.850690577	−0.4684048	0.909250217
7	−0.8429964	1	−0.53367379	0.850690577	0.1139843	0.909250217
8	0	1	1.03717982	0.850690577	0.1957541	0.909250217
9	0.2936751	1	0.29810697	0.850690577	−0.7879317	0.909250217
10	−0.1059234	1	0.05620401	0.955179288	−0.5974284	0.909250217


3.2. Separate Rounds of Radiomic Biopsy Select Similar Clusters of Discriminative Features

It could be argued that our optimal models trained on the separate rounds of biopsies have by chance identified two unique and distinct radiomic signatures. To demonstrate that both rounds of biopsies generate similar radiomic signatures, we needed to compare the features selected during selection and their importance in the trained models.

Since feature selection was repeated for each train-test split during model training, our final models were trained on different features in each split. Therefore, the important features in the model trained on the first split could differ from the important features in the model trained on the last. Since there is high correlation among radiomics features,⁵¹ we hypothesized that the most important features across splits might be correlated. Therefore, if the two rounds of biopsies have identified similar radiomic signatures, these signatures would appear as similar clusters of selected features rather than similar singularly important features.

To identify clusters and test the similarity of the two models, we performed consensus clustering of all features ever selected by mRMR. Because we focus on features selected by mRMR, all features analyzed during clustering are known to be relevant to the classification task. We identified seven clusters of important radiomics features in each RB round (Figs. 5 and 6). All clusters show good consensus except the two lowest-scoring ones, RB2 clusters 4 and 7 (cluster consensus 0.595 and 0.677, respectively; Table 1).
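
To make the "cluster consensus" column of Table 1 concrete, the following minimal sketch shows the bookkeeping behind consensus clustering: the consensus between two features is the fraction of resamples in which they land in the same cluster, and a cluster's consensus is the mean of that value over its member pairs. The clustering step itself (resampled hierarchical clustering with Ward linkage, run with ConsensusClusterPlus in the study) is deliberately left out. The function names and toy labels are hypothetical and not taken from the study.

```python
from itertools import combinations

def consensus_matrix(runs, items):
    """Fraction of resamples in which each pair of items shares a cluster.

    runs  -- list of dicts, one per resample, mapping each sampled item to its label
    items -- all items that could have been sampled
    """
    matrix = {}
    for a, b in combinations(items, 2):
        both = [labels for labels in runs if a in labels and b in labels]
        same = sum(1 for labels in both if labels[a] == labels[b])
        # Pairs never sampled together get 0.0 by convention in this sketch.
        matrix[frozenset((a, b))] = same / len(both) if both else 0.0
    return matrix

def cluster_consensus(matrix, final_labels):
    """Mean pairwise consensus among the members of each final cluster."""
    result = {}
    for cluster in set(final_labels.values()):
        members = [item for item, label in final_labels.items() if label == cluster]
        pairs = [frozenset(p) for p in combinations(members, 2)]
        if pairs:  # singleton clusters have no pairs to average
            result[cluster] = sum(matrix[p] for p in pairs) / len(pairs)
    return result

# Toy data: three hypothetical resamples of four hypothetical features.
runs = [
    {"f1": 0, "f2": 0, "f3": 1, "f4": 1},
    {"f1": 0, "f2": 0, "f3": 1},           # f4 left out of this resample
    {"f1": 0, "f2": 1, "f3": 1, "f4": 1},
]
matrix = consensus_matrix(runs, ["f1", "f2", "f3", "f4"])
per_cluster = cluster_consensus(matrix, {"f1": "A", "f2": "A", "f3": "B", "f4": "B"})
print(per_cluster)
```

A cluster consensus of 1.000 therefore means that its members were grouped together in every resample in which they were jointly sampled, while lower values indicate members that sometimes split apart.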

Fig. 5. Consensus clustering of important texture features identifies seven clusters for each RB round. Heat maps show the Pearson correlation of every radiomics feature chosen by mRMR at least once with all other such features; color corresponds to correlation according to the key. Consensus clusters were generated through resampled hierarchical clustering with Ward linkage. The left color bar denotes clusters. Top: clustering of features from the first RB round; bottom: clustering of features from the second RB round.

Fig. 6. Clustering of selected radiomics features shows high similarity between RB rounds. (a) and (b) Delta area plots produced by ConsensusClusterPlus. The $x$ axis is the number of clusters and the $y$ axis is the change in area under the consensus CDF. The elbow in each plot marks the point beyond which additional clusters no longer improve the overall consensus; it occurs at around seven clusters. (c) Rand index histogram for the cluster comparison, with an accompanying simple $t$-test between the distribution and zero ($p < 0.00001$).

Table 1.

Clusters of discriminative features and metrics for importance.

	Cluster	Cluster consensus	Importance mean	Importance SD	Stability mean	Stability SD	Vote mean	Vote SD
RB1	1	0.774	48.784	30.331	1.750	3.059	1.500	1.235
	2	1.000	33.007	26.390	1.000	2.490	1.455	0.934
	3	0.933	80.752	19.002	6.500	3.629	1.200	0.632
	4	0.731	35.941	20.367	3.320	3.838	1.480	0.963
	5	1.000	45.641	27.648	4.531	2.782	1.500	1.414
	6	0.827	32.454	21.932	0.550	1.605	1.550	1.276
	7	1.000	55.610	29.345	0.100	0.316	2.100	1.729
RB2	1	1.000	91.664	12.681	6.909	3.885	1.182	0.603
	2	0.990	37.475	27.716	0.941	1.919	1.235	0.664
	3	1.000	25.770	28.552	3.667	3.331	1.333	0.816
	4	0.595	47.321	28.631	1.296	2.447	1.222	0.424
	5	1.000	51.604	25.308	2.125	2.357	2.250	2.375
	6	0.914	59.556	29.669	4.091	2.743	1.545	0.938
	7	0.677	52.390	28.725	0.500	0.964	1.091	0.294


To confirm that similar clusters of important features were selected in the two rounds of biopsies, we computed the adjusted Rand index, which measures how consistently important features were grouped into the same clusters in both RB rounds (Fig. 6). Clustering of important features was significantly more repeatable than the random baseline of zero: the overall bootstrapped adjusted Rand index between the two final clusterings was $0.981 \pm 0.002$ ($p < 0.00001$, Fig. 6). Therefore, clustering analysis reveals that both rounds of biopsies produce models using similar clusters of related features.
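
As an illustration of the agreement measure used here, the sketch below computes the adjusted Rand index between two cluster assignments of the same features and bootstraps it by resampling features with replacement. It assumes each feature carries one label per RB round; the resampling scheme, the number of bootstrap draws, the function names, and the toy labels are all assumptions for this example rather than the study's settings.

```python
import random
from collections import Counter
from math import comb
from statistics import mean, stdev

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_a)
    if n < 2:
        return 1.0
    pair_sum = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    a_sum = sum(comb(c, 2) for c in Counter(labels_a).values())
    b_sum = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = a_sum * b_sum / comb(n, 2)   # agreement expected by chance
    max_index = (a_sum + b_sum) / 2
    if max_index == expected:               # both partitions are trivial and identical
        return 1.0
    return (pair_sum - expected) / (max_index - expected)

def bootstrap_ari(labels_a, labels_b, n_boot=1000, seed=0):
    """Mean and SD of the adjusted Rand index over feature resamples."""
    rng = random.Random(seed)
    n = len(labels_a)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample features with replacement
        scores.append(adjusted_rand_index([labels_a[i] for i in idx],
                                          [labels_b[i] for i in idx]))
    return mean(scores), stdev(scores)

# Toy labels: the same grouping under different cluster numbers in each round.
round1 = [1, 1, 2, 2, 3, 3]
round2 = [5, 5, 6, 6, 7, 7]
print(adjusted_rand_index(round1, round2))
print(bootstrap_ari(round1, round2, n_boot=200))
```

The index is invariant to how clusters are numbered, which matters here because cluster 3 in RB round 1 and cluster 1 in RB round 2 can describe the same group of features. A value near zero corresponds to chance-level agreement, which is why zero serves as the baseline in the $t$-test.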

3.3. Cluster Comparison Analysis Identifies a Cluster of Important Texture Features

To compare the identified clusters and determine whether any cluster was uniquely important in the trained models, we computed three metrics for every feature and averaged them within each cluster: stability, votes, and importance (see Sec. 2). While there were no significant differences in votes between clusters (Tables 1 and 7), RB round 1 cluster 3 and RB round 2 cluster 1 had significantly greater average stability and importance than nearly all other clusters (Tables 8 and 9). RB round 1 cluster 3 contains a number of texture features, including gray-level co-occurrence matrix (GLCM) sum mean (Table 5). RB round 2 cluster 1 contains a number of texture features as well as the median of the intensity histogram (Table 6).
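
The cluster-level summaries can be checked directly against the per-feature values. As a worked example, the sketch below takes the ten RB round 1 cluster 3 features from Table 5 and computes the mean and sample standard deviation of each metric. Using the sample (n − 1) standard deviation is an assumption for this example, chosen because it agrees with the values reported in Table 1; the helper name is hypothetical.

```python
from statistics import mean, stdev

# (stability, votes, importance mean) for the ten RB round 1 cluster 3 features,
# copied from Table 5 in row order.
rb1_cluster3 = [
    (10, 3, 94.580), (9, 1, 70.177), (1, 1, 59.857), (9, 1, 69.870), (10, 1, 100.000),
    (0, 1, 75.872), (8, 1, 100.000), (8, 1, 46.673), (5, 1, 100.000), (5, 1, 90.489),
]

def summarize(rows):
    """Mean and sample SD of each metric across the features of one cluster."""
    names = ("stability", "votes", "importance")
    columns = zip(*rows)  # transpose rows into one sequence per metric
    return {name: (mean(col), stdev(col)) for name, col in zip(names, columns)}

summary = summarize(rb1_cluster3)
for name, (avg, sd) in summary.items():
    print(f"{name:10s} mean = {avg:7.3f}  SD = {sd:6.3f}")
```

These values match the RB1 cluster 3 row of Table 1 (stability 6.500 ± 3.629, votes 1.200 ± 0.632, importance 80.752 ± 19.002).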

Table 7.

Two-way ANOVA results with no Tukey’s post-hoc test for effect of RB round and cluster assignment on feature votes.

Source of variation	$F$ value	$P$ value	Significant
Cluster assignment	0.591	0.737	n
RB round	1.090	0.298	n
Interaction	1.688	0.124	n


Table 8.

Two-way ANOVA results with Tukey’s post-hoc test for effect of RB round and cluster assignment on feature stability.

Source of variation		F value	p-value		Significant?
Cluster assignment		10.71	<0.0001		y
RB round		1.38	0.241		n
Interaction		10.72	<0.0001		y
Pair	Difference	Lower bound	Upper bound	Adjusted p-value	Significant
7:RB2-1:RB2	−6.409	−9.806	−3.012	1.00E-07	y
1:RB2-6:RB1	6.359	2.906	9.812	2.00E-07	y
7:RB2-3:RB1	−6.000	−9.508	−2.492	1.90E-06	y
4:RB2-1:RB2	−5.613	−8.903	−2.322	2.00E-06	y
1:RB2-7:RB1	6.809	2.790	10.828	2.50E-06	y
2:RB2-1:RB2	−5.968	−9.528	−2.408	3.40E-06	y
6:RB1-3:RB1	−5.950	−9.513	−2.387	3.80E-06	y
7:RB2-5:RB1	−4.031	−6.579	−1.484	1.68E-05	y
7:RB1-3:RB1	−6.400	−10.514	−2.286	2.61E-05	y
4:RB2-3:RB1	−5.204	−8.609	−1.798	4.10E-05	y
6:RB1-5:RB1	−3.981	−6.603	−1.359	4.80E-05	y
2:RB2-3:RB1	−5.559	−9.225	−1.893	4.96E-05	y
1:RB2-2:RB1	5.909	1.987	9.832	5.80E-05	y
1:RB2-1:RB1	5.159	1.706	8.612	7.08E-05	y
7:RB2-6:RB2	−3.591	−6.123	−1.059	2.30E-04	y
3:RB1-2:RB1	5.500	1.481	9.519	4.83E-04	y
6:RB2-6:RB1	3.541	0.934	6.148	0.001	y
4:RB2-5:RB1	−3.235	−5.639	−0.831	0.001	y
3:RB1-1:RB1	4.750	1.187	8.313	0.001	y
7:RB1-5:RB1	−4.431	−7.764	−1.099	0.001	y
2:RB2-5:RB1	−3.590	−6.351	−0.829	0.001	y
6:RB2-7:RB1	3.991	0.670	7.312	0.005	y
6:RB2-4:RB2	2.795	0.407	5.182	0.007	y
6:RB2-2:RB2	3.150	0.403	5.896	0.010	y
5:RB2-1:RB2	−4.784	−9.059	−0.510	0.013	y
5:RB1-2:RB1	3.531	0.316	6.746	0.017	y
1:RB2-4:RB1	3.589	0.261	6.917	0.021	y
5:RB1-1:RB1	2.781	0.159	5.403	0.026	y
7:RB2-4:RB1	−2.820	−5.509	−0.131	0.030	y
7:RB2-3:RB2	−3.167	−6.247	−0.086	0.037	y
6:RB1-4:RB1	−2.770	−5.530	−0.010	0.048	y
5:RB2-3:RB1	−4.375	−8.738	−0.012	0.049	y
3:RB2-6:RB1	3.117	−0.025	6.259	0.054	n
6:RB2-2:RB1	3.091	−0.112	6.294	0.071	n
3:RB2-7:RB1	3.567	−0.189	7.322	0.082	n
7:RB1-4:RB1	−3.220	−6.662	0.222	0.094	n
4:RB1-3:RB1	−3.180	−6.622	0.262	0.104	n
6:RB2-1:RB1	2.341	−0.266	4.948	0.131	n
3:RB2-1:RB2	−3.242	−6.894	0.409	0.143	n
6:RB2-1:RB2	−2.818	−6.021	0.385	0.152	n
3:RB2-2:RB2	2.725	−0.533	5.984	0.215	n
2:RB2-4:RB1	−2.379	−5.271	0.513	0.238	n
4:RB2-3:RB2	−2.370	−5.333	0.592	0.279	n
4:RB2-4:RB1	−2.024	−4.577	0.530	0.294	n
3:RB2-3:RB1	−2.833	−6.589	0.922	0.375	n
1:RB2-5:RB1	2.378	−0.837	5.593	0.409	n
3:RB2-2:RB1	2.667	−0.985	6.318	0.431	n
6:RB2-3:RB1	−2.409	−5.730	0.912	0.442	n
4:RB1-2:RB1	2.320	−1.008	5.648	0.512	n
5:RB2-5:RB1	−2.406	−6.042	1.230	0.600	n
3:RB2-1:RB1	1.917	−1.225	5.059	0.723	n
5:RB1-3:RB1	−1.969	−5.301	1.364	0.766	n
4:RB1-1:RB1	1.570	−1.190	4.330	0.810	n
6:RB2-5:RB2	1.966	−1.659	5.591	0.857	n
5:RB1-4:RB1	1.211	−1.244	3.667	0.924	n
5:RB2-7:RB1	2.025	−2.338	6.388	0.952	n
7:RB1-1:RB1	−1.650	−5.213	1.913	0.952	n
7:RB2-1:RB1	−1.250	−4.092	1.592	0.968	n
7:RB2-5:RB2	−1.625	−5.423	2.173	0.975	n
6:RB1-1:RB1	−1.200	−4.109	1.709	0.982	n
5:RB2-6:RB1	1.575	−2.273	5.423	0.983	n
5:RB2-3:RB2	−1.542	−5.569	2.486	0.990	n
4:RB2-7:RB1	1.196	−2.209	4.602	0.996	n
5:RB2-4:RB1	−1.195	−4.932	2.542	0.998	n
6:RB2-4:RB1	0.771	−1.668	3.210	0.999	n
7:RB2-4:RB2	−0.796	−3.438	1.846	0.999	n
3:RB2-5:RB1	−0.865	−3.743	2.014	0.999	n
5:RB2-2:RB2	1.184	−2.760	5.128	0.999	n
4:RB2-6:RB1	0.746	−1.968	3.460	1.000	n
2:RB2-1:RB1	−0.809	−3.843	2.226	1.000	n
5:RB2-2:RB1	1.125	−3.149	5.399	1.000	n
2:RB2-7:RB1	0.841	−2.825	4.507	1.000	n
7:RB1-2:RB1	−0.900	−4.919	3.119	1.000	n
5:RB2-4:RB2	0.829	−2.874	4.532	1.000	n
2:RB1-1:RB1	−0.750	−4.203	2.703	1.000	n
6:RB2-5:RB1	−0.440	−2.723	1.842	1.000	n
4:RB2-1:RB1	−0.454	−3.168	2.260	1.000	n
7:RB2-2:RB1	−0.500	−3.897	2.897	1.000	n
7:RB2-2:RB2	−0.441	−3.412	2.529	1.000	n
6:RB2-3:RB2	0.424	−2.440	3.289	1.000	n
5:RB2-1:RB1	0.375	−3.473	4.223	1.000	n
6:RB1-2:RB1	−0.450	−3.903	3.003	1.000	n
2:RB2-2:RB1	−0.059	−3.618	3.501	1.000	n
4:RB2-2:RB1	0.296	−2.994	3.587	1.000	n
1:RB2-3:RB1	0.409	−3.610	4.428	1.000	n
3:RB2-4:RB1	0.347	−2.658	3.351	1.000	n
7:RB1-6:RB1	−0.450	−4.013	3.113	1.000	n
2:RB2-6:RB1	0.391	−2.643	3.426	1.000	n
7:RB2-6:RB1	−0.050	−2.892	2.792	1.000	n
7:RB2-7:RB1	0.400	−3.108	3.908	1.000	n
4:RB2-2:RB2	0.355	−2.493	3.203	1.000	n


Table 9.

Two-way ANOVA results with Tukey’s post-hoc test for effect of RB round and cluster assignment on feature importance.

Source of variation		F value	p-value		Significant?
Cluster assignment		3.580	0.002		y
RB round		6.173	0.01365		y
Interaction		8.966	<0.0001		y
Pair	Difference	Lower bound	Upper bound	Adjusted p-value	Significant
3:RB2-1:RB2	−65.895	−101.671	−30.118	2.00E-07	y
1:RB2-6:RB1	59.210	25.378	93.041	9.00E-07	y
1:RB2-4:RB1	55.723	23.114	88.332	1.90E-06	y
2:RB2-1:RB2	−54.189	−89.064	−19.314	2.70E-05	y
1:RB2-2:RB1	58.657	20.227	97.087	4.22E-05	y
3:RB2-3:RB1	−54.982	−91.776	−18.188	7.04E-05	y
1:RB2-5:RB1	46.023	14.523	77.524	1.19E-04	y
6:RB1-3:RB1	−48.297	−83.203	−13.391	3.86E-04	y
4:RB2-1:RB2	−44.343	−76.581	−12.105	4.35E-04	y
4:RB1-3:RB1	−44.811	−78.533	−11.089	0.001	y
1:RB2-1:RB1	42.880	9.048	76.711	0.002	y
3:RB1-2:RB1	47.745	8.366	87.124	0.004	y
2:RB2-3:RB1	−43.277	−79.195	−7.359	0.005	y
6:RB2-3:RB2	33.786	5.721	61.852	0.005	y
7:RB2-1:RB2	−39.274	−72.556	−5.993	0.006	y
5:RB1-3:RB1	−35.111	−67.762	−2.459	0.022	y
6:RB2-6:RB1	27.102	1.562	52.642	0.026	y
6:RB2-1:RB2	−32.108	−63.486	−0.730	0.039	y
4:RB2-3:RB1	−33.430	−66.794	−0.067	0.049	y
6:RB2-4:RB1	23.615	−0.282	47.512	0.056	n
5:RB2-1:RB2	−40.060	−81.939	1.818	0.077	n
3:RB1-1:RB1	31.967	−2.939	66.873	0.112	n
1:RB2-7:RB1	36.055	−3.325	75.434	0.112	n
7:RB2-3:RB2	26.620	−3.558	56.799	0.150	n
6:RB2-2:RB1	26.549	−4.829	57.927	0.200	n
7:RB2-3:RB1	−28.362	−62.735	6.011	0.234	n
6:RB2-2:RB2	22.081	−4.826	48.987	0.242	n
3:RB2-7:RB1	−29.840	−66.634	6.954	0.259	n
3:RB2-1:RB1	−23.015	−53.799	7.769	0.390	n
4:RB2-3:RB2	21.552	−7.472	50.575	0.402	n
7:RB2-6:RB1	19.936	−7.910	47.781	0.465	n
3:RB2-5:RB1	−19.872	−48.074	8.331	0.493	n
5:RB2-3:RB1	−29.148	−71.899	13.603	0.549	n
7:RB1-6:RB1	23.155	−11.751	58.061	0.595	n
5:RB2-3:RB2	25.834	−13.623	65.292	0.617	n
6:RB2-3:RB1	−21.196	−53.729	11.338	0.625	n
7:RB2-4:RB1	16.449	−9.897	42.795	0.690	n
7:RB1-3:RB1	−25.142	−65.448	15.164	0.692	n
6:RB2-5:RB1	13.915	−8.445	36.275	0.695	n
7:RB1-4:RB1	19.669	−14.054	53.391	0.781	n
7:RB2-2:RB1	19.383	−13.899	52.664	0.783	n
7:RB1-2:RB1	22.602	−16.777	61.982	0.800	n
6:RB1-1:RB1	−16.330	−44.831	12.171	0.802	n
4:RB2-6:RB1	14.867	−11.722	41.456	0.828	n
6:RB2-4:RB2	12.235	−11.153	35.622	0.886	n
6:RB1-5:RB1	−13.187	−38.877	12.503	0.900	n
7:RB2-2:RB2	14.915	−14.189	44.019	0.901	n
5:RB2-6:RB1	19.150	−18.553	56.852	0.907	n
2:RB2-7:RB1	−18.134	−54.052	17.784	0.911	n
4:RB1-1:RB1	−12.843	−39.881	14.195	0.942	n
2:RB1-1:RB1	−15.777	−49.609	18.054	0.950	n
4:RB2-4:RB1	11.380	−13.635	36.396	0.959	n
5:RB2-2:RB1	18.597	−23.281	60.475	0.966	n
4:RB2-2:RB1	14.314	−17.924	46.552	0.966	n
5:RB2-4:RB1	15.663	−20.947	52.273	0.975	n
6:RB2-1:RB1	10.772	−14.768	36.312	0.978	n
5:RB1-4:RB1	9.700	−14.357	33.757	0.985	n
5:RB1-2:RB1	12.634	−18.866	44.134	0.986	n
2:RB2-1:RB1	−11.309	−41.041	18.422	0.991	n
3:RB2-2:RB2	−11.706	−43.633	20.222	0.994	n
5:RB2-2:RB2	14.129	−24.513	52.770	0.994	n
4:RB2-2:RB2	9.846	−18.058	37.751	0.996	n
3:RB2-4:RB1	−10.171	−39.607	19.264	0.996	n
7:RB1-5:RB1	9.968	−22.683	42.620	0.999	n
2:RB2-5:RB1	−8.166	−35.215	18.883	0.999	n
7:RB2-6:RB2	−7.166	−31.973	17.641	0.999	n
1:RB2-3:RB1	10.912	−28.467	50.292	1.000	n
7:RB2-5:RB1	6.749	−18.212	31.710	1.000	n
4:RB2-7:RB1	−8.288	−41.652	25.076	1.000	n
6:RB2-5:RB2	7.952	−27.566	43.470	1.000	n
3:RB2-6:RB1	−6.685	−37.469	24.099	1.000	n
3:RB2-2:RB1	−7.237	−43.014	28.539	1.000	n
7:RB2-4:RB2	5.069	−20.817	30.954	1.000	n
7:RB1-1:RB1	6.825	−28.081	41.731	1.000	n
2:RB2-6:RB1	5.021	−24.711	34.752	1.000	n
5:RB2-5:RB1	5.963	−29.663	41.589	1.000	n
5:RB1-1:RB1	−3.143	−28.833	22.547	1.000	n
4:RB2-1:RB1	−1.463	−28.052	25.126	1.000	n
5:RB2-1:RB1	2.820	−34.883	40.522	1.000	n
7:RB2-1:RB1	3.606	−24.240	31.451	1.000	n
4:RB1-2:RB1	2.934	−29.675	35.543	1.000	n
6:RB1-2:RB1	−0.553	−34.384	33.279	1.000	n
2:RB2-2:RB1	4.468	−30.407	39.343	1.000	n
6:RB1-4:RB1	−3.487	−30.525	23.551	1.000	n
2:RB2-4:RB1	1.534	−26.798	29.867	1.000	n
4:RB2-5:RB1	1.680	−21.871	25.232	1.000	n
5:RB2-7:RB1	−4.006	−46.756	38.745	1.000	n
6:RB2-7:RB1	3.946	−28.587	36.480	1.000	n
7:RB2-7:RB1	−3.220	−37.593	31.153	1.000	n
5:RB2-4:RB2	4.283	−31.997	40.562	1.000	n
7:RB2-5:RB2	0.786	−36.424	37.996	1.000	n


Table 5.

Cluster membership and importance metrics for all selected features for RB round 1.

Feature	Stability	Votes	Importance mean	Importance SD (if vote >1)	Cluster
texture.glcm.distance.1mm.correlation.skewness	1	1	100.000	NA	1
texture.glcm.distance.1mm.correlation.variance	0	4	38.156	12.030	1
texture.glcm.distance.1mm.energy.skewness	10	1	55.378	NA	1
texture.glcm.distance.2mm.correlation.kurtosis	3	1	87.691	NA	1
texture.glcm.distance.2mm.correlation.variance	0	5	28.978	14.550	1
texture.glcm.distance.2mm.energy.variance	2	1	72.147	NA	1
texture.glcm.distance.2mm.maxProbability.variance	6	1	74.128	NA	1
texture.laws.resolution.1.5mm.L5S5L5.min	8	1	11.666	NA	1
texture.laws.resolution.1mm.aggregated.R5R5W5.trimmedMean.90.	0	1	18.478	NA	1
texture.laws.resolution.1mm.E5W5E5.skewness	0	1	65.762	NA	1
texture.laws.resolution.2mm.E5R5E5.skewness	0	1	68.895	NA	1
texture.laws.resolution.2mm.L5S5S5.median	0	4	73.082	19.105	1
texture.laws.resolution.2mm.R5E5R5.mean	5	1	51.754	NA	1
texture.laws.resolution.2mm.S5E5S5.skewness	0	1	38.803	NA	1
texture.laws.resolution.2mm.S5S5S5.mean	0	1	100.000	NA	1
texture.laws.resolution.2mm.W5R5E5.skewness	0	1	25.272	NA	1
texture.laws.resolution.2mm.W5R5L5.median	0	1	0.000	NA	1
texture.laws.resolution.2mm.W5R5R5.mean	0	1	29.334	NA	1
texture.laws.resolution.2mm.W5S5L5.mean	0	1	11.047	NA	1
texture.laws.resolution.2mm.W5S5R5.mean	0	1	25.116	NA	1
texture.glcm.distance.1mm.contrast.interquartileRange	8	1	6.395	NA	2
texture.glcm.distance.1mm.contrast.variance	3	2	2.853	4.035	2
texture.laws.resolution.1.5mm.aggregated.S5R5R5.mean	0	4	72.273	7.870	2
texture.laws.resolution.1.5mm.E5L5R5.trimmedMean.90	0	1	53.441	NA	2
texture.laws.resolution.1.5mm.E5R5R5.trimmedMean.90.	0	1	5.669	NA	2
texture.laws.resolution.1.5mm.E5S5W5.mean	0	2	11.790	14.175	2
texture.laws.resolution.1.5mm.E5S5W5.trimmedMean.90	0	1	42.691	NA	2
texture.laws.resolution.1.5mm.L5E5E5.mean	0	1	55.182	NA	2
texture.laws.resolution.1mm.R5L5S5.median	0	1	66.475	NA	2
texture.laws.resolution.1mm.R5S5S5.mean	0	1	11.109	NA	2
texture.laws.resolution.1mm.S5R5S5.mean	0	1	35.201	NA	2
texture.glcm.distance.1mm.sumMean.max	10	3	94.580	9.387	3
texture.laws.resolution.1.5mm.aggregated.L5W5W5.trimmedMean.90.	9	1	70.177	NA	3
texture.laws.resolution.1.5mm.L5L5W5.min	1	1	59.857	NA	3
texture.laws.resolution.1.5mm.L5L5W5.trimmedMean.90.	9	1	69.870	NA	3
texture.laws.resolution.1mm.L5L5L5.median	10	1	100.000	NA	3
texture.laws.resolution.1mm.L5L5W5.min	0	1	75.872	NA	3
texture.laws.resolution.1mm.W5W5W5.mean	8	1	100.000	NA	3
texture.laws.resolution.1mm.W5W5W5.trimmedMean.90.	8	1	46.673	NA	3
texture.laws.resolution.2mm.aggregated.L5W5W5.median	5	1	100.000	NA	3
texture.laws.resolution.2mm.L5W5L5.mean	5	1	90.489	NA	3
texture.glcm.distance.1mm.clusterShade.variance	2	1	6.105	NA	4
texture.glcm.distance.1mm.clusterTendency.variance	1	1	45.928	NA	4
texture.laws.resolution.1.5mm.aggregated.E5E5R5.variance	10	1	0.000	NA	4
texture.laws.resolution.1.5mm.aggregated.L5L5E5.meanAbsoluteDeviation	5	2	64.353	14.138	4
texture.laws.resolution.1.5mm.L5W5W5.variance	1	1	27.210	NA	4
texture.laws.resolution.1.5mm.S5E5E5.variance	10	2	10.352	6.112	4
texture.laws.resolution.1.5mm.S5S5W5.variance	3	1	43.637	NA	4
texture.laws.resolution.1.5mm.S5W5E5.variance	7	1	32.329	NA	4
texture.laws.resolution.1mm.E5R5R5.meanAbsoluteDeviation	10	1	32.580	NA	4
texture.laws.resolution.1mm.E5R5R5.variance	10	1	17.442	NA	4
texture.laws.resolution.1mm.L5L5E5.interquartileRange	3	1	58.085	NA	4
texture.laws.resolution.1mm.R5W5E5.median	1	5	22.064	13.938	4
texture.laws.resolution.1mm.S5E5S5.trimmedMean.90.	0	2	79.784	2.689	4
texture.laws.resolution.1mm.S5W5E5.variance	10	1	15.132	NA	4
texture.laws.resolution.2mm.aggregated.E5E5W5.median	0	1	30.523	NA	4
texture.laws.resolution.2mm.E5E5W5.median	1	1	43.448	NA	4
texture.laws.resolution.2mm.E5L5L5.standardDeviation	3	1	59.656	NA	4
texture.laws.resolution.2mm.E5L5L5.variance	4	2	64.484	0.351	4
texture.laws.resolution.2mm.L5E5L5.median	0	1	22.011	NA	4
texture.laws.resolution.2mm.L5E5R5.mean	0	1	38.410	NA	4
texture.laws.resolution.2mm.L5E5R5.median	0	3	30.585	31.898	4
texture.laws.resolution.2mm.L5L5L5.interquartileRange	2	1	54.070	NA	4
texture.laws.resolution.2mm.S5L5S5.meanAbsoluteDeviation	0	1	12.993	NA	4
texture.laws.resolution.2mm.S5R5L5.mean	0	3	45.245	9.147	4
texture.laws.resolution.2mm.W5S5E5.trimmedMean.90.	0	1	42.101	NA	4
Dimension2.features2D.largestSlice.Proportion.of.pixels.with.intensity.larger.than.618	1	7	0.423	1.119	5
Dimension2.features2D.middleSlice.Proportion.of.pixels.with.intensity.larger.than.618	0	6	2.839	3.158	5
texture.glcm.distance.1mm.clusterTendency.min	2	1	16.101	NA	5
texture.glcm.distance.1mm.clusterTendency.trimmedMean.90.	2	1	23.038	NA	5
texture.laws.resolution.1.5mm.aggregated.L5L5L5.kurtosis	6	1	23.718	NA	5
texture.laws.resolution.1.5mm.L5L5L5.kurtosis	6	1	11.720	NA	5
texture.laws.resolution.1.5mm.L5R5W5.skewness	5	1	34.539	NA	5
texture.laws.resolution.1.5mm.R5L5E5.kurtosis	10	2	63.758	0.626	5
texture.laws.resolution.1.5mm.R5R5R5.skewness	6	1	61.458	NA	5
texture.laws.resolution.1.5mm.S5L5L5.skewness	0	1	100.000	NA	5
texture.laws.resolution.1.5mm.S5L5S5.skewness	4	1	44.820	NA	5
texture.laws.resolution.1.5mm.S5W5R5.kurtosis	9	1	38.043	NA	5
texture.laws.resolution.1.5mm.W5L5R5.kurtosis	4	1	2.823	NA	5
texture.laws.resolution.1.5mm.W5R5W5.skewness	1	1	86.716	NA	5
texture.laws.resolution.1mm.aggregated.L5L5R5.kurtosis	4	1	57.690	NA	5
texture.laws.resolution.1mm.aggregated.L5R5W5.kurtosis	5	1	66.706	NA	5
texture.laws.resolution.1mm.aggregated.L5S5S5.kurtosis	4	1	50.000	NA	5
texture.laws.resolution.1mm.aggregated.S5R5W5.kurtosis	6	1	56.303	NA	5
texture.laws.resolution.1mm.L5S5W5.kurtosis	5	1	34.593	NA	5
texture.laws.resolution.1mm.L5W5E5.kurtosis	0	1	27.149	NA	5
texture.laws.resolution.1mm.L5W5S5.max	4	1	23.998	NA	5
texture.laws.resolution.1mm.R5R5S5.skewness	4	1	98.684	NA	5
texture.laws.resolution.1mm.S5E5L5.kurtosis	9	1	17.733	NA	5
texture.laws.resolution.1mm.S5S5R5.kurtosis	4	1	33.103	NA	5
texture.laws.resolution.2mm.aggregated.L5L5W5.kurtosis	6	1	89.531	NA	5
texture.laws.resolution.2mm.E5L5E5.kurtosis	4	1	54.651	NA	5
texture.laws.resolution.2mm.L5L5E5.kurtosis	5	1	61.852	NA	5
texture.laws.resolution.2mm.L5L5L5.kurtosis	4	3	82.003	8.375	5
texture.laws.resolution.2mm.R5R5S5.skewness	4	1	63.587	NA	5
texture.laws.resolution.2mm.S5L5R5.kurtosis	10	3	31.523	5.594	5
texture.laws.resolution.2mm.S5R5E5.kurtosis	9	1	50.872	NA	5
texture.laws.resolution.2mm.S5R5W5.skewness	2	1	50.543	NA	5
texture.glcm.distance.3mm.entropy.variance	7	6	22.596	18.858	6
texture.laws.resolution.1.5mm.aggregated.L5E5R5.mean	0	1	16.398	NA	6
texture.laws.resolution.1.5mm.E5E5E5.median	0	1	29.934	NA	6
texture.laws.resolution.1.5mm.E5W5E5.mean	0	2	34.891	23.046	6
texture.laws.resolution.1mm.aggregated.E5E5S5.mean	0	1	56.345	NA	6
texture.laws.resolution.1mm.aggregated.L5E5E5.mean	2	1	1.974	NA	6
texture.laws.resolution.1mm.E5S5L5.median	0	1	8.140	NA	6
texture.laws.resolution.1mm.E5S5R5.mean	0	1	52.750	NA	6
texture.laws.resolution.1mm.L5W5S5.trimmedMean.90.	1	2	70.868	7.239	6
texture.laws.resolution.1mm.R5L5R5.median	1	2	70.600	4.153	6
texture.laws.resolution.1mm.W5L5R5.median	0	1	49.351	NA	6
texture.laws.resolution.2mm.E5L5E5.mean	0	1	24.709	NA	6
texture.laws.resolution.2mm.E5L5L5.mean	0	1	36.905	NA	6
texture.laws.resolution.2mm.E5L5L5.trimmedMean.90.	0	4	41.187	18.517	6
texture.laws.resolution.2mm.E5S5S5.mean	0	1	0.000	NA	6
texture.laws.resolution.2mm.E5W5L5.mean	0	1	15.625	NA	6
texture.laws.resolution.2mm.E5W5W5.median	0	1	46.046	NA	6
texture.laws.resolution.2mm.R5E5E5.median	0	1	4.348	NA	6
texture.laws.resolution.2mm.R5W5S5.trimmedMean.90.	0	1	50.303	NA	6
texture.laws.resolution.2mm.S5E5L5.median	0	1	16.118	NA	6
texture.laws.resolution.1.5mm.R5S5S5.median	0	1	16.844	NA	7
texture.laws.resolution.1.5mm.S5L5W5.median	0	1	2.471	NA	7
texture.laws.resolution.1mm.aggregated.L5L5R5.mean	0	1	100.000	NA	7
texture.laws.resolution.1mm.aggregated.L5S5W5.mean	1	1	74.185	NA	7
texture.laws.resolution.1mm.L5S5R5.median	0	2	35.390	26.970	7
texture.laws.resolution.1mm.L5S5W5.median	0	1	61.767	NA	7
texture.laws.resolution.1mm.S5L5R5.median	0	1	55.435	NA	7
texture.laws.resolution.1mm.S5S5R5.mean	0	4	75.109	9.066	7
texture.laws.resolution.1mm.S5S5R5.median	0	3	66.901	5.974	7
texture.laws.resolution.1mm.S5S5R5.trimmedMean.90.	0	6	67.994	11.107	7


Table 6.

Cluster membership and importance metrics for all selected features for RB round 2.

Feature	Stability	Votes	Importance Mean	Importance SD (if vote >1)	Cluster
intensity.intensity.histogram.median	10	1	73.294	NA	1
texture.glcm.distance.1mm.sumMean.min	10	1	90.476	NA	1
texture.glcm.distance.2mm.sumMean.min	10	3	92.237	3.365	1
texture.glcm.distance.3mm.sumMean.max	10	1	100.000	NA	1
texture.laws.resolution.1mm.L5L5W5.mean	9	1	100.000	NA	1
texture.laws.resolution.1mm.L5W5W5.min	0	1	93.393	NA	1
texture.laws.resolution.1mm.W5L5W5.mean	7	1	100.000	NA	1
texture.laws.resolution.1mm.W5L5W5.median	9	1	100.000	NA	1
texture.laws.resolution.1mm.W5L5W5.min	0	1	61.843	NA	1
texture.laws.resolution.1mm.W5W5W5.mean	7	1	100.000	NA	1
texture.laws.resolution.2mm.W5L5W5.median	4	1	97.064	NA	1
texture.glcm.distance.1mm.contrast.meanAbsoluteDeviation	6	3	21.506	23.399	2
texture.glcm.distance.1mm.contrast.variance	2	3	9.184	7.748	2
texture.glcm.distance.3mm.contrast.meanAbsoluteDeviation	5	1	58.263	NA	2
texture.laws.resolution.1.5mm.aggregated.E5S5S5.median	0	1	25.291	NA	2
texture.laws.resolution.1.5mm.aggregated.R5R5R5.mean	0	1	19.872	NA	2
texture.laws.resolution.1.5mm.E5L5R5.trimmedMean.90.	0	1	25.397	NA	2
texture.laws.resolution.1.5mm.E5L5S5.trimmedMean.90.	0	1	26.299	NA	2
texture.laws.resolution.1.5mm.E5S5L5.mean	0	1	19.897	NA	2
texture.laws.resolution.1.5mm.E5S5R5.trimmedMean.90.	0	1	65.950	NA	2
texture.laws.resolution.1.5mm.E5S5W5.trimmedMean.90.	0	1	44.678	NA	2
texture.laws.resolution.1.5mm.R5L5R5.trimmedMean.90.	0	1	41.278	NA	2
texture.laws.resolution.1.5mm.R5R5R5.mean	0	1	18.357	NA	2
texture.laws.resolution.1mm.aggregated.E5S5S5.median	0	1	12.345	NA	2
texture.laws.resolution.1mm.L5L5R5.meanAbsoluteDeviation	3	1	58.398	NA	2
texture.laws.resolution.1mm.L5S5E5.trimmedMean.90.	0	1	86.563	NA	2
texture.laws.resolution.1mm.R5S5S5.mean	0	1	3.800	NA	2
texture.laws.resolution.1mm.S5S5E5.mean	0	1	100.000	NA	2
texture.glcm.distance.1mm.clusterShade.standardDeviation	6	1	0.000	NA	3
texture.glcm.distance.1mm.clusterShade.variance	3	1	3.154	NA	3
texture.glcm.distance.1mm.clusterTendency.variance	1	1	25.448	NA	3
texture.laws.resolution.1.5mm.aggregated.L5L5R5.variance	2	1	3.584	NA	3
texture.laws.resolution.1.5mm.L5L5E5.interquartileRange	1	2	85.130	12.236	3
texture.laws.resolution.1.5mm.L5L5E5.variance	2	1	8.824	NA	3
texture.laws.resolution.1.5mm.L5L5R5.meanAbsoluteDeviation	2	1	45.403	NA	3
texture.laws.resolution.1.5mm.L5L5R5.variance	2	2	19.122	7.932	3
texture.laws.resolution.1.5mm.R5R5W5.variance	2	1	45.043	NA	3
texture.laws.resolution.1.5mm.W5R5E5.variance	9	4	5.325	7.501	3
texture.laws.resolution.1mm.E5E5E5.variance	10	1	6.202	NA	3
texture.laws.resolution.1mm.L5L5E5.interquartileRange	2	1	83.721	NA	3
texture.laws.resolution.1mm.R5E5E5.variance	10	1	40.069	NA	3
texture.laws.resolution.1mm.W5L5E5.variance	2	1	15.520	NA	3
texture.laws.resolution.2mm.R5R5E5.variance	1	1	0.000	NA	3
texture.glcm.distance.2mm.contrast.kurtosis	5	2	35.545	12.160	4
texture.glcm.distance.2mm.energy.variance	2	1	79.328	NA	4
texture.glcm.distance.2mm.inverseVariance.kurtosis	7	1	55.023	NA	4
texture.glcm.distance.2mm.maxProbability.interquartileRange	3	1	84.629	NA	4
texture.laws.percentageCovered	10	1	71.429	NA	4
texture.laws.resolution.1.5mm.aggregated.E5E5R5.trimmedMean.90.	0	2	65.267	37.091	4
texture.laws.resolution.1.5mm.E5L5L5.interquartileRange	3	1	70.270	NA	4
texture.laws.resolution.1.5mm.L5L5E5.skewness	0	1	20.072	NA	4
texture.laws.resolution.1.5mm.L5S5R5.trimmedMean.90.	0	1	53.022	NA	4
texture.laws.resolution.1.5mm.S5E5L5.trimmedMean.90.	0	1	42.717	NA	4
texture.laws.resolution.1.5mm.W5S5R5.mean	0	1	62.612	NA	4
texture.laws.resolution.1mm.L5S5R5.mean	0	1	52.998	NA	4
texture.laws.resolution.1mm.L5W5R5.mean	1	1	18.605	NA	4
texture.laws.resolution.1mm.L5W5R5.trimmedMean.90.	1	1	19.897	NA	4
texture.laws.resolution.1mm.R5L5E5.skewness	0	1	12.711	NA	4
texture.laws.resolution.1mm.S5E5S5.median	0	1	70.866	NA	4
texture.laws.resolution.2mm.aggregated.S5S5W5.median	0	2	47.037	30.030	4
texture.laws.resolution.2mm.E5E5L5.interquartileRange	1	2	79.951	13.122	4
texture.laws.resolution.2mm.E5R5L5.skewness	0	1	66.149	NA	4
texture.laws.resolution.2mm.E5W5W5.mean	0	1	6.563	NA	4
texture.laws.resolution.2mm.R5R5E5.skewness	0	1	67.006	NA	4
texture.laws.resolution.2mm.R5S5E5.skewness	0	1	0.000	NA	4
texture.laws.resolution.2mm.R5W5S5.mean	1	2	12.829	0.420	4
texture.laws.resolution.2mm.R5W5S5.trimmedMean.90.	1	1	9.173	NA	4
texture.laws.resolution.2mm.S5E5L5.trimmedMean.90.	0	2	89.636	14.657	4
texture.laws.resolution.2mm.S5E5R5.skewness	0	1	5.556	NA	4
texture.laws.resolution.2mm.S5L5W5.trimmedMean.90.	0	1	78.789	NA	4
texture.glcm.distance.2mm.variance.variance	0	1	58.722	NA	5
texture.glcm.distance.3mm.entropy.variance	5	8	23.473	15.938	5
texture.glcm.distance.3mm.inverseVariance.variance	6	1	23.240	NA	5
texture.glcm.distance.3mm.sumMean.standardDeviation	3	1	92.248	NA	5
texture.glcm.distance.3mm.sumMean.variance	2	2	81.418	9.246	5
texture.laws.resolution.1mm.E5S5L5.median	0	2	51.204	7.675	5
texture.laws.resolution.1mm.R5E5E5.trimmedMean.90.	0	2	34.860	12.458	5
texture.laws.resolution.1mm.W5L5S5.mean	1	1	47.668	NA	5
Dimension2.features2D.largestSlice.Proportion.of.pixels.with.intensity.larger.than.618	1	4	0.000	0.000	6
Dimension2.features2D.middleSlice.Proportion.of.pixels.with.intensity.larger.than.618	0	2	1.938	2.741	6
texture.glcm.distance.3mm.clusterTendency.median	2	1	5.579	NA	6
texture.laws.resolution.1.5mm.aggregated.L5L5R5.kurtosis	10	2	86.951	2.740	6
texture.laws.resolution.1.5mm.L5L5E5.kurtosis	4	3	64.031	34.667	6
texture.laws.resolution.1.5mm.L5L5L5.kurtosis	5	1	25.797	NA	6
texture.laws.resolution.1.5mm.L5L5S5.kurtosis	5	2	36.160	7.803	6
texture.laws.resolution.1.5mm.W5L5W5.max	7	1	32.985	NA	6
texture.laws.resolution.1.5mm.W5W5W5.kurtosis	8	2	51.753	5.057	6
texture.laws.resolution.1mm.aggregated.L5E5S5.kurtosis	4	1	50.535	NA	6
texture.laws.resolution.1mm.aggregated.L5R5R5.kurtosis	7	1	55.556	NA	6
texture.laws.resolution.1mm.aggregated.L5R5W5.kurtosis	5	1	99.827	NA	6
texture.laws.resolution.1mm.aggregated.L5S5W5.kurtosis	8	1	84.222	NA	6
texture.laws.resolution.1mm.aggregated.L5W5W5.skewness	3	1	27.118	NA	6
texture.laws.resolution.1mm.E5E5E5.range	6	1	45.878	NA	6
texture.laws.resolution.1mm.L5L5E5.kurtosis	0	1	97.309	NA	6
texture.laws.resolution.1mm.L5L5S5.kurtosis	1	1	93.955	NA	6
texture.laws.resolution.1mm.L5L5W5.kurtosis	3	2	70.132	31.248	6
texture.laws.resolution.1mm.L5S5W5.skewness	1	1	82.913	NA	6
texture.laws.resolution.1mm.L5W5E5.kurtosis	0	1	74.419	NA	6
texture.laws.resolution.1mm.R5R5L5.kurtosis	6	1	59.690	NA	6
texture.laws.resolution.1mm.R5R5R5.kurtosis	5	1	60.215	NA	6
texture.laws.resolution.1mm.R5S5W5.skewness	0	1	90.691	NA	6
texture.laws.resolution.1mm.W5L5S5.kurtosis	5	1	59.173	NA	6
texture.laws.resolution.2mm.aggregated.L5L5E5.kurtosis	5	2	79.882	10.621	6
texture.laws.resolution.2mm.aggregated.S5R5R5.skewness	5	2	97.598	3.398	6
texture.laws.resolution.2mm.E5L5E5.kurtosis	4	1	55.742	NA	6
texture.laws.resolution.2mm.E5L5R5.kurtosis	9	2	33.348	12.692	6
texture.laws.resolution.2mm.L5E5W5.variance	4	2	62.156	19.384	6
texture.laws.resolution.2mm.L5L5E5.kurtosis	3	5	78.804	12.509	6
texture.laws.resolution.2mm.L5L5E5.variance	0	1	15.851	NA	6
texture.laws.resolution.2mm.R5R5S5.skewness	5	1	85.142	NA	6
texture.laws.resolution.2mm.R5S5R5.skewness	4	1	100.000	NA	6
texture.laws.resolution.1.5mm.aggregated.L5E5W5.trimmedMean.90.	0	1	4.134	NA	7
texture.laws.resolution.1.5mm.L5E5L5.trimmedMean.90.	1	1	47.670	NA	7
texture.laws.resolution.1.5mm.L5E5W5.trimmedMean.90.	0	1	66.527	NA	7
texture.laws.resolution.1.5mm.W5E5W5.median	0	1	96.670	NA	7
texture.laws.resolution.1mm.L5E5L5.mean	2	1	29.880	NA	7
texture.laws.resolution.1mm.L5E5W5.trimmedMean.90.	1	1	76.381	NA	7
texture.laws.resolution.1mm.R5E5W5.skewness	0	1	30.353	NA	7
texture.laws.resolution.1mm.S5E5L5.median	1	1	0.000	NA	7
texture.laws.resolution.2mm.aggregated.E5S5S5.mean	0	1	20.801	NA	7
texture.laws.resolution.2mm.aggregated.L5L5R5.mean	0	1	67.229	NA	7
texture.laws.resolution.2mm.aggregated.L5L5R5.trimmedMean.90.	1	1	15.339	NA	7
texture.laws.resolution.2mm.E5W5S5.trimmedMean.90.	0	1	41.577	NA	7
texture.laws.resolution.2mm.R5E5L5.median	0	1	73.118	NA	7
texture.laws.resolution.2mm.R5E5R5.trimmedMean.90.	4	1	52.326	NA	7
texture.laws.resolution.2mm.R5E5W5.median	0	1	100.000	NA	7
texture.laws.resolution.2mm.R5R5L5.median	0	2	57.628	25.014	7
texture.laws.resolution.2mm.S5E5W5.mean	0	1	52.570	NA	7
texture.laws.resolution.2mm.S5E5W5.trimmedMean.90.	0	1	46.377	NA	7
texture.laws.resolution.2mm.S5L5E5.mean	0	1	58.510	NA	7
texture.laws.resolution.2mm.S5L5E5.median	0	2	36.927	3.101	7
texture.laws.resolution.2mm.S5S5L5.mean	0	1	84.953	NA	7
texture.laws.resolution.2mm.S5S5R5.median	1	1	93.610	NA	7


4. Discussion

In our study, we report a machine learning classifier that can distinguish between oncocytomas and chromophobe RCCs with an average AUC across both RBs of $0.71 \pm 0.024$ using features computed from RBs.

Because imaging alone cannot reliably differentiate benign from malignant solid renal masses, patients undergo either biopsy or surgical resection. Historically, surgically fit patients underwent resection for solid renal masses, with a reported 12.8% proving benign (i.e., oncocytoma) on pathology.⁹ However, renal mass management has shifted toward increasing use of image-guided percutaneous biopsy.¹ The benefits of biopsy include risk stratification by histologic subtype of RCC in patients who may not be ideal surgical candidates and identification of benign solid renal masses such as fat-poor angiomyolipomas and oncocytomas.⁶⁸ However, biopsy can present a diagnostic conundrum when an oncocytoma cannot be reliably distinguished histologically from RCC, most commonly the chromophobe subtype.⁶⁹⁻⁷¹ Our model could be used in such scenarios to potentially avoid a physical biopsy while directing appropriate patient management. Alternatively, RB could be used in cases where a biopsy has already been performed and returned “oncocytic renal neoplasm” as the histologic diagnosis. In this scenario, patients are likely to undergo resection because of the possibility of chromophobe RCC.⁷²,⁷³ Here, our model could potentially spare a patient unnecessary surgery by ruling in oncocytoma.

Little work has been done to apply quantitative imaging techniques to differentiating oncocytoma and chromophobe RCC. One study employed quantitative imaging to distinguish many different subtypes of kidney cancer.³² Although that study used only a small number of samples (20) across five subtypes and did not include chromophobe RCC, texture and intensity features alone could distinguish oncocytoma from the other subtypes studied.³² That work suggested that texture differences between kidney cancer subtypes are sufficient for discrimination. Focusing on oncocytoma and chromophobe RCC, the present work shows that, with just intensity and texture features, the RBs produce a predictive model that discriminates oncocytoma from chromophobe RCC (Fig. 4). Importantly, the radiomic signature identified was repeatable across two rounds of biopsies (Fig. 6). Two clusters of features, one from each round, showed higher stability and importance than the rest. These clusters contained features that likely quantify aspects of texture brightness (Tables 1, 5, 6, and 9). Future work could relate these specific features to known histological and imaging differences between these subtypes.

Our work supports the results presented by Li et al.,¹⁷ who reported AUCs between 0.85 and 0.95. Although they reduced overfitting by using a leave-one-out paradigm during machine learning, their feature reduction step (LASSO) was based on all 61 cases, which could have inflated their results compared with ours. Our approach used RB, a method that avoids painstaking lesion segmentation and computation of morphological features. Finally, Li et al. used a proprietary radiomics package, whereas we used QIFE,⁴⁴ an open-source radiomics package with IBSI-compliant features that could facilitate validation of our results.²⁶,³⁵ Nonetheless, taken together, our studies support the notion that radiomics approaches can aid in the differentiation of these two lesion types.

Since the two subtypes of interest were known to show little difference in shape or size,¹³,¹⁴ this task provided a demonstration of the utility of the RB technique. When intensity and texture are known to be important for a classification task, the RB tool offers a fast, simple interface for collecting information about the interior of regions of interest. While Echegaray et al. demonstrated that considering the interior subsets of two-dimensional (2D) and three-dimensional (3D) segmentations of tumors results in many segmentation-invariant texture features,²⁹,³⁰ until the beginning of this study, there had been no practical way to obtain them. Our work offers a proof of concept for similar studies to reduce radiologist time in prospective radiomics-based predictive modeling tasks. Since this study’s initiation, several common segmentation tools have added features for similar “biopsy-like” segmentations.⁷⁴

This study has several limitations. First, beyond its retrospective design, it was performed in a single center and in a relatively small patient population. No large data set of similar images is publicly available, so collection and dissemination of imaging cohorts containing oncocytomas, chromophobe RCCs, and other renal cancer subtypes will be vital for external validation of future models. Multicenter studies with a larger patient population are warranted to confirm the generalizability of our findings. Second, the images were obtained from multiple CT scanners, which could affect some radiomics features. Nonetheless, the presence of multivendor CT scanners is common in similar large academic centers. Therefore, heterogeneous scanner type provides a realistic training scenario. Third, we performed our RBs on only a single phase (i.e., nephrographic), as corticomedullary and excretory phases were not reliably available for all patients. Fourth, CT slice thickness was 5 mm, and larger slice thicknesses have been shown to decrease radiomics model performance in some cases.⁷⁵ While this warrants future studies with smaller slice thickness, 5 mm slices are still commonly used across institutions and thus provide a reasonable benchmark study. Fifth, a single reader annotated this cohort. While we recognize this limitation, prior work has shown that most inter- and intrareader segmentation variability occurs at lesion boundaries and that texture features are relatively insensitive to these variations.²⁵,²⁶ Since RBs focus only on the interior of a lesion, we expect a study with more readers will produce more similar results than would be expected from a study conducted on full segmentations. However, because high inter-reader reliability is crucial for broad deployment, future work should certainly validate this with multiple readers across multiple institutions. Sixth, since correlation between the two rounds of RBs was used to filter out features, this step likely inflated the model performance when testing on the alternate RB round (Fig. 4) and similarity between the feature clusters (Fig. 6). Future work could add a third, unseen round of RBs to confirm that this approach truly identifies a repeatable, robust radiomic signature. Seventh and finally, since this model was only trained on a binary classification task, the identified radiomic signature can only be used when all renal carcinomas but chromophobe RCC and oncocytoma have been ruled out.

In conclusion, we developed a machine learning classifier that effectively uses quantitative features extracted from RBs to differentiate oncocytomas from chromophobe RCCs on contrast-enhanced CT. This study both provides a classification scheme for two often indistinguishable renal masses and points to future work using RBs to simplify and shorten the radiologist workflow for some automated diagnostic tasks.

5. Selected QIFE Feature Descriptions

5.1. Intensity Features

Intensity features characterize the global distribution of intensity (voxel) values. The QIFE computes summary statistics (mean, standard deviation, minimum value, kurtosis, etc.) of the intensity value distribution contained within the segmentation. All computed values are available in Ref. 40.
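As a minimal illustration of these summary statistics, the following Python sketch computes a few of them for a list of voxel values. It is not the QIFE implementation: the function name `intensity_features`, the use of population (divide-by-n) moments, the non-excess kurtosis convention, and the example HU values are all assumptions made for illustration.

```python
import math


def intensity_features(values):
    """Summary statistics of the voxel intensities inside a sampled region."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two voxels")
    mean = sum(values) / n
    # Population (divide-by-n) central moments; this convention is an assumption.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "min": min(values),
        "max": max(values),
        # Non-excess kurtosis (3 for a normal distribution); undefined if all values are equal.
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else float("nan"),
    }


# Hypothetical HU values, for illustration only.
example = intensity_features([90, 110, 100, 120, 80])
```

Other packages may use the sample (n - 1) variance or report excess kurtosis, so values should be compared across tools only after checking these conventions.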

5.2. Texture Features

The QIFE computes a variety of texture features to capture the local variation in intensity values within the VOI. The QIFE generates gray-level co-occurrence matrices (GLCMs) to explore the relationship between nearby voxels. The QIFE then computes various Haralick texture features from the GLCMs; Echegaray et al. reported the full list of computed values. These co-occurrence matrices can be computed in 2D or 3D. In addition, Laws defined a variety of local masks that filter for five types of texture features (level, edges, spots, ripples, waves).⁵¹ The QIFE can also compute these Laws features, which substantially expands the number of available texture features.
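To make the co-occurrence idea concrete, the sketch below builds a symmetric, normalized GLCM for a small 2D image and derives one Haralick feature (contrast) from it. This is a simplified illustration, not the QIFE code: the function names, the pixel-based offset, the symmetric counting, and the 4 × 4 example image are assumptions, and the QIFE operates on 3D volumes with distances given in millimeters.

```python
def glcm(image, offset, levels):
    """Symmetric, normalized gray-level co-occurrence matrix of a 2D image.

    image  : list of rows of integer gray levels in [0, levels)
    offset : (d_row, d_col) displacement between the two pixels of a pair
    """
    d_row, d_col = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions so the matrix is symmetric
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Haralick contrast: sum over (i, j) of p[i][j] * (i - j)^2."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Small illustrative image with four gray levels (0 to 3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, offset=(0, 1), levels=4)  # horizontally adjacent pixels
c = contrast(p)                           # 14/24, i.e., about 0.583
```

Contrast grows when neighboring pixels differ strongly in gray level, so a homogeneous region yields a value near zero.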

5.3. Relevant QIFE Default Parameter Values

For GLCM features, images were first binned to 256 gray levels with a minimum intensity of $-1000$ HU and a maximum intensity of 3096 HU. GLCM features were computed at distances of 1, 2, and 3 mm from the voxel of interest. For Laws texture feature extraction, each feature was computed from five sample points and at resolutions 1, 1.5, and 2. All configuration parameters used in the experiments are available in the “default_config.ini” file within the GitHub repository: https://github.com/riipl/rcc_ctta_code.
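With these limits, the intensity range spans 3096 − (−1000) = 4096 HU, so 256 equal-width gray levels correspond to 16 HU per level. The Python sketch below illustrates such a binning step. Equal-width bins and clamping of out-of-range values are assumptions made here for illustration, and the function name `bin_hu` is hypothetical; the QIFE's exact discretization may differ.

```python
def bin_hu(hu, lo=-1000.0, hi=3096.0, levels=256):
    """Map a HU value to an integer gray level in [0, levels - 1]."""
    width = (hi - lo) / levels      # (3096 + 1000) / 256 = 16 HU per level
    clipped = min(max(hu, lo), hi)  # clamp out-of-range values (assumption)
    # The upper limit itself would land in level 256, so cap it at the last level.
    return min(int((clipped - lo) // width), levels - 1)


# Illustrative values: air, water, soft tissue, and the upper limit.
gray_levels = [bin_hu(v) for v in (-1000, 0, 100, 3096)]
```

Under these assumptions, two voxels that differ by less than 16 HU can fall into the same gray level, which sets the intensity resolution of the resulting GLCM.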

6. Appendix

The appendix provides configuration information and general feature descriptions. Figure 6 shows clustering of selected radiomics features and compares clusters. The checklist for radiomics quality score evaluation is given in Table 2. Table 3 shows the different scanner models and manufacturers on which patients were scanned. Differences between the models, assessed using repeated DeLong testing, are reported in Table 4. Tables 5 and 6 detail the cluster membership and importance metrics for all selected features for RB round 1 and RB round 2, respectively. Table 7 provides the two-way ANOVA results with no Tukey’s post-hoc test for effect of RB round and cluster assignment on feature votes. Table 8 gives the two-way ANOVA results with Tukey’s post-hoc test for effect of RB round and cluster assignment on feature stability. Table 9 gives the two-way ANOVA results with Tukey’s post-hoc test for effect of RB round and cluster assignment on feature importance.

Acknowledgments

S. N. and A. J. were supported, in part, through funding from NIH/NCI (R01 CA160251 and U01 CA187947). A. J. was additionally funded by the Bio-X Undergraduate Summer Research Program and an Undergraduate Major Grant. The authors would like to thank Jarrett Rosenberg for statistics advising, Dev Gude and Emel Alkim for essential technical support, and Elizabeth Colvin for administrative support.

Biographies

Akshay Jaggi is an incoming MD-PhD student at Harvard Medical School where he applies machine vision and machine learning to study mouse behavior. He received his BS degree in biology from Stanford with an honors thesis in computational biology with Dr. Sandy Napel in 2019. He then pursued a Fulbright predoctoral research scholarship at the Universitat de Barcelona with Dr. Karim Lekadir from 2019 to 2020.

Sandy Napel received his BSES degree from SUNY Stony Brook in 1974 and his MSEE and PhD degrees in EE from Stanford University in 1976 and 1981, respectively. He was formerly VP of engineering at Imatron Inc. and is currently professor of radiology and, by courtesy, of electrical engineering and medicine (Biomedical Informatics Research) at Stanford University. He coleads the Stanford Radiology 3D and Quantitative Imaging Lab and leads the Radiology Department’s Division of Integrative Biomedical Imaging Informatics, where he is developing techniques for linkage of image features to molecular properties of disease.

Bhavik Patel received his medical degree from the University of Alabama School of Medicine in 2007. He completed his internship at Harvard’s Brigham and Women’s Hospital before returning to UAB to complete his residency in diagnostic radiology. He became board certified at the end of residency in 2012. He completed an abdominal imaging fellowship at Stanford. He served as the associate director for AI Evaluation & Implementation, director of Clinical Trials, and director of Body CT at Stanford. He now serves as the director of artificial intelligence at the Department of Radiology in Mayo Clinic Arizona.

Biographies of the other authors are not available.

Disclosures

B. P. receives research support from GE Healthcare as part of an institutional grant. S. N. is on the scientific advisory boards of EchoPixel Inc., Fovia Inc., and RadLogics, Inc. D. M. is a shareholder of and consultant for Segmed, Inc. All other authors are not employees of or consultants for industry, nor did they have influence over the inclusion of any data or information that might present a conflict of interest. There was no industry support specifically for this study.

Contributor Information

Akshay Jaggi, Email: akshay.x.jaggi@gmail.com.

Domenico Mastrodicasa, Email: mastro@stanford.edu.

Gregory W. Charville, Email: gwc@stanford.edu.

R. Brooke Jeffrey, Jr., Email: bjeffrey@stanford.edu.

Sandy Napel, Email: snapel@stanford.edu.

Bhavik Patel, Email: patel.bhavik@mayo.edu.

Data, Materials, and Code Availability

Code available at: https://github.com/riipl/rcc_ctta_code.

References

1. Silverman S. G., et al., “Renal masses in the adult patient: the role of percutaneous biopsy,” Radiology 240(1), 6–22 (2006). doi:10.1148/radiol.2401050061
2. Steven C., et al., “Renal mass and localized renal cancer: AUA guideline,” J. Urol. 198(3), 520–529 (2017). doi:10.1016/j.juro.2017.04.100
3. Kang S. K., Bjurlin M. A., Huang W. C., “Management of small kidney tumors in 2019,” JAMA 321(16), 1622–1623 (2019). doi:10.1001/jama.2019.1672
4. Hara A. K., et al., “Incidental extracolonic findings at CT colonography,” Radiology 215(2), 353–357 (2000). doi:10.1148/radiology.215.2.r00ap33353
5. Hollingsworth J. M., et al., “Rising incidence of small renal masses: a need to reassess treatment effect,” J. Natl. Cancer Inst. 98(18), 1331–1334 (2006). doi:10.1093/jnci/djj362
6. Jayson M., Sanders H., “Increased incidence of serendipitously discovered renal cell carcinoma,” Urology 51(2), 203–205 (1998). doi:10.1016/S0090-4295(97)00506-2
7. Patel B. N., et al., “Characterization of small (<4 cm) focal renal lesions: diagnostic accuracy of spectral analysis using single-phase contrast-enhanced dual-energy CT,” Am. J. Roentgenol. 209(4), 815–825 (2017). doi:10.2214/AJR.17.17824
8. Ozen H., Colowick A., Freiha F. S., “Incidentally discovered solid renal masses: what are they?” Br. J. Urol. 72(3), 274–276 (1993). doi:10.1111/j.1464-410x.1993.tb00716.x
9. Frank I., et al., “Solid renal tumors: an analysis of pathological features related to tumor size,” J. Urol. 170(6 Pt. 1), 2217–2220 (2003). doi:10.1097/01.ju.0000095475.12515.5e
10. Caoili E. M., Davenport M. S., “Role of percutaneous needle biopsy for renal masses,” Semin. Intervent. Radiol. 31(1), 20–26 (2014). doi:10.1055/s-0033-1363839
11. Cho E., Adami H.-O., Lindblad P., “Epidemiology of renal cell cancer,” Hematol. Oncol. Clin. North Am. 25(4), 651–665 (2011). doi:10.1016/j.hoc.2011.04.002
12. Silverman S. G., Israel G. M., Trinh Q.-D., “Incompletely characterized incidental renal masses: emerging data support conservative management,” Radiology 275(1), 28–42 (2015). doi:10.1148/radiol.14141144
13. Uppot R. N., Harisinghani M. G., Gervais D. A., “Imaging-guided percutaneous renal biopsy: rationale and approach,” Am. J. Roentgenol. 194(6), 1443–1449 (2010). doi:10.2214/AJR.10.4427
14. Kryvenko O. N., et al., “Diagnostic approach to eosinophilic renal neoplasms,” Arch. Pathol. Lab. Med. 138(11), 1531–1541 (2014). doi:10.5858/arpa.2013-0653-RA
15. Rosenkrantz A. B., et al., “MRI features of renal oncocytoma and chromophobe renal cell carcinoma,” Am. J. Roentgenol. 195(6), W421–W427 (2010). doi:10.2214/AJR.10.4718
16. Cochand-Priollet B., et al., “Renal chromophobe cell carcinoma and oncocytoma. A comparative morphologic, histochemical, and immunohistochemical study of 124 cases,” Arch. Pathol. Lab. Med. 121(10), 1081–1086 (1997).
17. Li Y., et al., “Value of radiomics in differential diagnosis of chromophobe renal cell carcinoma and renal oncocytoma,” Abdom. Radiol. 45(10), 3193–3201 (2020). doi:10.1007/s00261-019-02269-9
18. Ishigami K., et al., “Imaging spectrum of renal oncocytomas: a pictorial review with pathologic correlation,” Insights Imaging 6(1), 53–64 (2015). doi:10.1007/s13244-014-0373-x
19. Choi J. H., et al., “Comparison of computed tomography findings between renal oncocytomas and chromophobe renal cell carcinomas,” Korean J. Urol. 56(10), 695–702 (2015). doi:10.4111/kju.2015.56.10.695
20. Wu J., et al., “Comparative study of CT appearances in renal oncocytoma and chromophobe renal cell carcinoma,” Acta Radiol. 57(4), 500–506 (2016). doi:10.1177/0284185115585035
21. Marin D., et al., “Characterization of small focal renal lesions: diagnostic accuracy with single-phase contrast-enhanced dual-energy CT with material attenuation analysis compared with conventional attenuation measurements,” Radiology 284(3), 737–747 (2017). doi:10.1148/radiol.2017161872
22. Cohan R. H., et al., “Renal masses: assessment of corticomedullary-phase and nephrographic-phase CT scans,” Radiology 196(2), 445–451 (1995). doi:10.1148/radiology.196.2.7617859
23. Napel S., et al., “Quantitative imaging of cancer in the postgenomic era: radio(geno)mics, deep learning, and habitats,” Cancer 124(24), 4633–4649 (2018). doi:10.1002/cncr.31630
24. Gillies R., Kinahan P. E., Hricak H., “Radiomics: images are more than pictures, they are data,” Radiology 278(2), 563–577 (2016). doi:10.1148/radiol.2015151169
25. Lambin P., et al., “Radiomics: extracting more information from medical images using advanced feature analysis,” Eur. J. Cancer 48(4), 441–446 (2012). doi:10.1016/j.ejca.2011.11.036
26. Zwanenburg A., et al., “The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping,” Radiology 295(2), 328–338 (2020). doi:10.1148/radiol.2020191145
27. Stammberger T., et al., “Interobserver reproducibility of quantitative cartilage measurements: comparison of B-spline snakes and manual segmentation,” J. Magn. Reson. Imaging 17(7), 1033–1042 (1999). doi:10.1016/S0730-725X(99)00040-5
28. Parmar C., et al., “Robust radiomics feature quantification using semiautomatic volumetric segmentation,” PLoS One 9(7), e102107 (2014). doi:10.1371/journal.pone.0102107
29. Echegaray S., et al., “Core samples for radiomics features that are insensitive to tumor segmentation: method and pilot study using CT images of hepatocellular carcinoma,” J. Med. Imaging 2(4), 041011 (2015). doi:10.1117/1.JMI.2.4.041011
30. Echegaray S., et al., “A rapid segmentation-insensitive ‘digital biopsy’ method for radiomic feature extraction: method and pilot study using CT images of non-small cell lung cancer,” Tomography 2(4), 283–294 (2016). doi:10.18383/j.tom.2016.00163
31. Bakr S., et al., “Noninvasive radiomics signature based on quantitative analysis of computed tomography images as a surrogate for microvascular invasion in hepatocellular carcinoma: a pilot study,” J. Med. Imaging 4(4), 041303 (2017). doi:10.1117/1.JMI.4.4.041303
32. Raman S. P., et al., “CT texture analysis of renal masses: pilot study using random forest classification for prediction of pathology,” Academic Radiol. 21(12), 1587–1596 (2014). doi:10.1016/j.acra.2014.07.023
33. Sasaguri K., et al., “Small (<4 cm) renal mass: differentiation of oncocytoma from renal cell carcinoma on biphasic contrast-enhanced CT,” Am. J. Roentgenol. 205(5), 999–1007 (2015). doi:10.2214/AJR.14.13966
34. Collins G. S., et al., “Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement,” Ann. Intern. Med. 162(1), 55 (2015). doi:10.7326/M14-0697
35. Lambin P., et al., “Radiomics: the bridge between medical imaging and personalized medicine,” Nat. Rev. Clin. Oncol. 14(12), 749–762 (2017). doi:10.1038/nrclinonc.2017.141
36. Sanduleanu S., et al., “Tracking tumor biology with radiomics: a systematic review utilizing a radiomics quality score,” Radiother. Oncol. 127(3), 349–360 (2018). doi:10.1016/j.radonc.2018.03.033
37. Ng C. S., et al., “Renal cell carcinoma: diagnosis, staging, and surveillance,” Am. J. Roentgenol. 191(4), 1220–1232 (2008). doi:10.2214/AJR.07.3568
38. Birnbaum B. A., Jacobs J. E., Ramchandani P., “Multiphasic renal CT: comparison of renal mass enhancement during the corticomedullary and nephrographic phases,” Radiology 200(3), 753–758 (1996). doi:10.1148/radiology.200.3.8756927
39. McHugh M. L., “The chi-square test of independence,” Biochem. Med. 23(2), 143–149 (2013). doi:10.11613/BM.2013.018
40. Rubin D. L., et al., “Automated tracking of quantitative assessments of tumor burden in clinical trials,” Transl. Oncol. 7(1), 23–35 (2014). doi:10.1593/tlo.13796
41. Rubin D. L., et al., “ePAD: an image annotation and analysis platform for quantitative imaging,” Tomography 5(1), 170–183 (2019). doi:10.18383/j.tom.2018.00055
42. Yip S. S. F., Aerts H. J. W. L., “Applications and limitations of radiomics,” Phys. Med. Biol. 61(13), R150–R166 (2016). doi:10.1088/0031-9155/61/13/R150
43. Kalpathy-Cramer J., et al., “A comparison of lung nodule segmentation algorithms: methods and results from a multi-institutional study,” J. Digital Imaging 29(4), 476–487 (2016). doi:10.1007/s10278-016-9859-z
44. Echegaray S., et al., “Quantitative image feature engine (QIFE): an open-source, modular engine for 3D quantitative feature extraction from volumetric medical images,” J. Digital Imaging 31(4), 403–414 (2018). doi:10.1007/s10278-017-0019-x
45. Kuhn M., “Building predictive models in R using the caret package,” J. Stat. Software 28(5), 1–26 (2008). doi:10.18637/jss.v028.i05
46. Bartko J. J., “The intraclass correlation coefficient as a measure of reliability,” Psychol. Rep. 19(1), 3–11 (1966). doi:10.2466/pr0.1966.19.1.3
47. Gamer M., et al., irr: Various Coefficients of Interrater Reliability and Agreement (2019).
48. Koo T. K., Li M. Y., “A guideline of selecting and reporting intraclass correlation coefficients for reliability research,” J. Chiropractic Med. 15(2), 155–163 (2016). doi:10.1016/j.jcm.2016.02.012
49. McGraw K. O., Wong S. P., “Forming inferences about some intraclass correlation coefficients,” Psychological Methods 1(1), 30–46 (1996). doi:10.1037/1082-989X.1.1.30
50. Wu W., et al., “Exploratory study to identify radiomics classifiers for lung cancer histology,” Front. Oncol. 6, 71 (2016). doi:10.3389/fonc.2016.00071
51. Parmar C., et al., “Radiomic feature clusters and prognostic signatures specific for lung and head & neck cancer,” Sci. Rep. 5(1), 11044 (2015). doi:10.1038/srep11044
52. Ding C., Peng H., “Minimum redundancy feature selection from microarray gene expression data,” J. Bioinf. Comput. Biol. 3(2), 185–205 (2005). doi:10.1142/S0219720005001004
53. Hua J., et al., “Optimal number of features as a function of sample size for various classification rules,” Bioinformatics 21(8), 1509–1515 (2005). doi:10.1093/bioinformatics/bti171
54. van Smeden M., et al., “No rationale for 1 variable per 10 events criterion for binary logistic regression analysis,” BMC Med. Res. Methodol. 16, 163 (2016). doi:10.1186/s12874-016-0267-3
55. Parmar C., et al., “Machine learning methods for quantitative radiomic biomarkers,” Sci. Rep. 5(1), 13087 (2015). doi:10.1038/srep13087
56. De Jay N., et al., “mRMRe: an R package for parallelized mRMR ensemble feature selection,” Bioinformatics 29(18), 2365–2368 (2013). doi:10.1093/bioinformatics/btt383
57. Freund Y., Schapire R. E., “A decision-theoretic generalization of on-line learning and an application to boosting,” J. Comput. Syst. Sci. 55(1), 119–139 (1997). doi:10.1006/jcss.1997.1504
58. Schapire R. E., “The boosting approach to machine learning: an overview,” in Nonlinear Estimation and Classification, Denison D. D., et al., Eds., pp. 149–171, Springer, New York (2003).
59. Liaw A., Wiener M., “Classification and regression by randomForest,” R News 2(3), 18–22 (2002).
60. Culp M., Johnson K., Michailides G., “ada: an R package for stochastic boosting,” J. Stat. Software 17(1), 1–27 (2006). doi:10.18637/jss.v017.i02
61. Bradley A. P., “The use of the area under the ROC curve in the evaluation of machine learning algorithms,” Pattern Recognit. 30(7), 1145–1159 (1997). doi:10.1016/S0031-3203(96)00142-2
62. DeLong E. R., DeLong D. M., Clarke-Pearson D. L., “Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach,” Biometrics 44(3), 837–845 (1988). doi:10.2307/2531595
63. Benjamini Y., Hochberg Y., “Controlling the false discovery rate: a practical and powerful approach to multiple testing,” J. R. Stat. Soc. Ser. B 57(1), 289–300 (1995). doi:10.1111/j.2517-6161.1995.tb02031.x
64. Wilkerson M. D., Hayes D. N., “ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking,” Bioinformatics 26(12), 1572–1573 (2010). doi:10.1093/bioinformatics/btq170
65. Theodoridis S., Koutroumbas K., Pattern Recognition, Elsevier, London (2009).
66. Wickham H., ggplot2: Elegant Graphics for Data Analysis, Springer-Verlag, New York (2016).
67. Warnes G. R., et al., “gplots: various R programming tools for plotting data,” R package version 3.1.1 (2019).
68. Steven C., et al., “Guideline for management of the clinical T1 renal mass,” J. Urol. 182(4), 1271–1279 (2009). doi:10.1016/j.juro.2009.07.004
69. Patel H. D., et al., “Surgical histopathology for suspected oncocytoma on renal mass biopsy: a systematic review and meta-analysis,” BJU Int. 119(5), 661–666 (2017). doi:10.1111/bju.13763
70. Wobker S. E., Williamson S. R., “Modern pathologic diagnosis of renal oncocytoma,” J. Kidney Cancer VHL 4(4), 1–12 (2017). doi:10.15586/jkcvhl.2017.96
71. Yusenko M., “Molecular pathology of chromophobe renal cell carcinoma: a review,” Int. J. Urol. 17(7), 592–600 (2010). doi:10.1111/j.1442-2042.2010.02558.x
72. Neves J. B., et al., “Contemporary surgical management of renal oncocytoma: a nation’s outcome,” BJU Int. 121(6), 893–899 (2018). doi:10.1111/bju.14159
73. Romis L., et al., “Frequency, clinical presentation and evolution of renal oncocytomas: multicentric experience from a European database,” Eur. Urol. 45(1), 53–57 (2004). doi:10.1016/j.eururo.2003.08.008
74. Yushkevich P. A., et al., “User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability,” NeuroImage 31(3), 1116–1128 (2006). doi:10.1016/j.neuroimage.2006.01.015
75. He L., et al., “Effects of contrast-enhancement, reconstruction slice thickness and convolution kernel on the diagnostic performance of radiomics signature in solitary pulmonary nodule,” Sci. Rep. 6(1), 34921 (2016). doi:10.1038/srep34921

[r1] 1.Silverman S. G., et al. , “Renal masses in the adult patient: the role of percutaneous biopsy,” Radiology 240(1), 6–22 (2006). 10.1148/radiol.2401050061 [DOI] [PubMed] [Google Scholar]

[r2] 2.Steven C., et al. , “Renal mass and localized renal cancer: AUA guideline,” J. Urol. 198(3), 520–529 (2017). 10.1016/j.juro.2017.04.100 [DOI] [PubMed] [Google Scholar]

[r3] 3.Kang S. K., Bjurlin M. A., Huang W. C., “Management of small kidney tumors in 2019,” JAMA 321(16), 1622–1623 (2019). 10.1001/jama.2019.1672 [DOI] [PubMed] [Google Scholar]

[r4] 4.Hara A. K., et al. , “Incidental extracolonic findings at CT colonography,” Radiology 215(2), 353–357 (2000). 10.1148/radiology.215.2.r00ap33353 [DOI] [PubMed] [Google Scholar]

[r5] 5.Hollingsworth J. M., et al. , “Rising incidence of small renal masses: a need to reassess treatment effect,” J. Natl. Cancer Inst. 98(18), 1331–1334 (2006). 10.1093/jnci/djj362 [DOI] [PubMed] [Google Scholar]

[r6] 6.Jayson M., Sanders H., “Increased incidence of serendipitously discovered renal cell carcinoma,” Urology 51(2), 203–205 (1998). 10.1016/S0090-4295(97)00506-2 [DOI] [PubMed] [Google Scholar]

[r7] 7.Patel B. N., et al. , “Characterization of small (<4 cm) focal renal lesions: diagnostic accuracy of spectral analysis using single-phase contrast-enhanced dual-energy CT,” Am. J. Roentgenol. 209(4), 815–825 (2017). 10.2214/AJR.17.17824 [DOI] [PubMed] [Google Scholar]

[r8] 8.Ozen H., Colowick A., Freiha F. S., “Incidentally discovered solid renal masses: what are they?” Br. J. Urol. 72(3), 274–276 (1993). 10.1111/j.1464-410x.1993.tb00716.x [DOI] [PubMed] [Google Scholar]

[r9] 9.Frank I., et al. , “Solid renal tumors: an analysis of pathological features related to tumor size,” J. Urol. 170(6 Pt. 1), 2217–2220 (2003). 10.1097/01.ju.0000095475.12515.5e [DOI] [PubMed] [Google Scholar]

[r10] 10.Caoili E. M., Davenport M. S., “Role of percutaneous needle biopsy for renal masses,” Semin. Intervent. Radiol. 31(1), 20–26 (2014). 10.1055/s-0033-1363839 [DOI] [PMC free article] [PubMed] [Google Scholar]

[r11] 11.Cho E., Adami H.-O., Lindblad P., “Epidemiology of renal cell cancer,” Hematol. Oncol. Clin. North Am. 25(4), 651–665 (2011). 10.1016/j.hoc.2011.04.002 [DOI] [PubMed] [Google Scholar]

[r12] 12.Silverman S. G., Israel G. M., Trinh Q.-D., “Incompletely characterized incidental renal masses: emerging data support conservative management,” Radiology 275(1), 28–42 (2015). 10.1148/radiol.14141144 [DOI] [PubMed] [Google Scholar]

[r13] 13.Uppot R. N., Harisinghani M. G., Gervais D. A., “Imaging-guided percutaneous renal biopsy: rationale and approach,” Am. J. Roentgenol. 194(6), 1443–1449 (2010). 10.2214/AJR.10.4427 [DOI] [PubMed] [Google Scholar]

[r14] 14.Kryvenko O. N., et al. , “Diagnostic approach to eosinophilic renal neoplasms,” Arch. Pathol. Lab. Med. 138(11), 1531–1541 (2014). 10.5858/arpa.2013-0653-RA [DOI] [PMC free article] [PubMed] [Google Scholar]

[r15] 15.Rosenkrantz A. B., et al. , “MRI features of renal oncocytoma and chromophobe renal cell carcinoma,” Am. J. Roentgenol. 195(6), W421–W427 (2010). 10.2214/AJR.10.4718 [DOI] [PubMed] [Google Scholar]

[r16] 16.Cochand-Priollet B., et al. , “Renal chromophobe cell carcinoma and oncocytoma. A comparative morphologic, histochemical, and immunohistochemical study of 124 cases,” Arch. Pathol. Lab. Med. 121(10), 1081–1086 (1997). [PubMed] [Google Scholar]

[r17] 17.Li Y., et al. , “Value of radiomics in differential diagnosis of chromophobe renal cell carcinoma and renal oncocytoma,” Abdom. Radiol. 45(10), 3193–3201 (2020). 10.1007/s00261-019-02269-9 [DOI] [PMC free article] [PubMed] [Google Scholar]

[r18] 18.Ishigami K., et al. , “Imaging spectrum of renal oncocytomas: a pictorial review with pathologic correlation,” Insights Imaging 6(1), 53–64 (2015). 10.1007/s13244-014-0373-x [DOI] [PMC free article] [PubMed] [Google Scholar]

[r19] 19.Choi J. H., et al. , “Comparison of computed tomography findings between renal oncocytomas and chromophobe renal cell carcinomas,” Korean J. Urol. 56(10), 695–702 (2015). 10.4111/kju.2015.56.10.695 [DOI] [PMC free article] [PubMed] [Google Scholar]

[r20] 20.Wu J., et al. , “Comparative study of CT appearances in renal oncocytoma and chromophobe renal cell carcinoma,” Acta Radiol. 57(4), 500–506 (2016). 10.1177/0284185115585035 [DOI] [PubMed] [Google Scholar]

[r21] 21.Marin D., et al. , “Characterization of small focal renal lesions: diagnostic accuracy with single-phase contrast-enhanced dual-energy CT with material attenuation analysis compared with conventional attenuation measurements,” Radiology 284(3), 737–747 (2017). 10.1148/radiol.2017161872 [DOI] [PubMed] [Google Scholar]

[r22] 22.Cohan R. H., et al. , “Renal masses: assessment of corticomedullary-phase and nephrographic-phase CT scans,” Radiology 196(2), 445–451 (1995). 10.1148/radiology.196.2.7617859 [DOI] [PubMed] [Google Scholar]

[r23] 23.Napel S., et al. , “Quantitative imaging of cancer in the postgenomic era: radio(geno)mics, deep learning, and habitats,” Cancer 124(24), 4633–4649 (2018). 10.1002/cncr.31630 [DOI] [PMC free article] [PubMed] [Google Scholar]

[r24] 24.Gillies R., Kinahan P. E., Hricak H., “Radiomics: images are more than pictures, they are data,” Radiology 278(2), 563–577 (2016). 10.1148/radiol.2015151169 [DOI] [PMC free article] [PubMed] [Google Scholar]

[r25] 25.Lambin P., et al. , “Radiomics: extracting more information from medical images using advanced feature analysis,” Eur. J. Cancer 48(4), 441–446 (2012). 10.1016/j.ejca.2011.11.036 [DOI] [PMC free article] [PubMed] [Google Scholar]

[r26] 26.Zwanenburg A., et al. , “The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping,” Radiology 295(2), 328–338 (2020). 10.1148/radiol.2020191145 [DOI] [PMC free article] [PubMed] [Google Scholar]

[r27] 27. Stammberger T., et al., “Interobserver reproducibility of quantitative cartilage measurements: comparison of B-spline snakes and manual segmentation,” Magn. Reson. Imaging 17(7), 1033–1042 (1999). doi:10.1016/S0730-725X(99)00040-5

[r28] 28. Parmar C., et al., “Robust radiomics feature quantification using semiautomatic volumetric segmentation,” PLoS One 9(7), e102107 (2014). doi:10.1371/journal.pone.0102107

[r29] 29. Echegaray S., et al., “Core samples for radiomics features that are insensitive to tumor segmentation: method and pilot study using CT images of hepatocellular carcinoma,” J. Med. Imaging 2(4), 041011 (2015). doi:10.1117/1.JMI.2.4.041011

[r30] 30. Echegaray S., et al., “A rapid segmentation-insensitive ‘digital biopsy’ method for radiomic feature extraction: method and pilot study using CT images of non-small cell lung cancer,” Tomography 2(4), 283–294 (2016). doi:10.18383/j.tom.2016.00163

[r31] 31. Bakr S., et al., “Noninvasive radiomics signature based on quantitative analysis of computed tomography images as a surrogate for microvascular invasion in hepatocellular carcinoma: a pilot study,” J. Med. Imaging 4(4), 041303 (2017). doi:10.1117/1.JMI.4.4.041303

[r32] 32. Raman S. P., et al., “CT texture analysis of renal masses: pilot study using random forest classification for prediction of pathology,” Academic Radiol. 21(12), 1587–1596 (2014). doi:10.1016/j.acra.2014.07.023

[r33] 33. Sasaguri K., et al., “Small (<4 cm) renal mass: differentiation of oncocytoma from renal cell carcinoma on biphasic contrast-enhanced CT,” Am. J. Roentgenol. 205(5), 999–1007 (2015). doi:10.2214/AJR.14.13966

[r34] 34. Collins G. S., et al., “Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement,” Ann. Intern. Med. 162(1), 55 (2015). doi:10.7326/M14-0697

[r35] 35. Lambin P., et al., “Radiomics: the bridge between medical imaging and personalized medicine,” Nat. Rev. Clin. Oncol. 14(12), 749–762 (2017). doi:10.1038/nrclinonc.2017.141

[r36] 36. Sanduleanu S., et al., “Tracking tumor biology with radiomics: a systematic review utilizing a radiomics quality score,” Radiother. Oncol. 127(3), 349–360 (2018). doi:10.1016/j.radonc.2018.03.033

[r37] 37. Ng C. S., et al., “Renal cell carcinoma: diagnosis, staging, and surveillance,” Am. J. Roentgenol. 191(4), 1220–1232 (2008). doi:10.2214/AJR.07.3568

[r38] 38. Birnbaum B. A., Jacobs J. E., Ramchandani P., “Multiphasic renal CT: comparison of renal mass enhancement during the corticomedullary and nephrographic phases,” Radiology 200(3), 753–758 (1996). doi:10.1148/radiology.200.3.8756927

[r39] 39. McHugh M. L., “The chi-square test of independence,” Biochem. Med. 23(2), 143–149 (2013). doi:10.11613/BM.2013.018

[r40] 40. Rubin D. L., et al., “Automated tracking of quantitative assessments of tumor burden in clinical trials,” Transl. Oncol. 7(1), 23–35 (2014). doi:10.1593/tlo.13796

[r41] 41. Rubin D. L., et al., “ePAD: an image annotation and analysis platform for quantitative imaging,” Tomography 5(1), 170–183 (2019). doi:10.18383/j.tom.2018.00055

[r42] 42. Yip S. S. F., Aerts H. J. W. L., “Applications and limitations of radiomics,” Phys. Med. Biol. 61(13), R150–R166 (2016). doi:10.1088/0031-9155/61/13/R150

[r43] 43. Kalpathy-Cramer J., et al., “A comparison of lung nodule segmentation algorithms: methods and results from a multi-institutional study,” J. Digital Imaging 29(4), 476–487 (2016). doi:10.1007/s10278-016-9859-z

[r44] 44. Echegaray S., et al., “Quantitative image feature engine (QIFE): an open-source, modular engine for 3D quantitative feature extraction from volumetric medical images,” J. Digital Imaging 31(4), 403–414 (2018). doi:10.1007/s10278-017-0019-x

[r45] 45. Kuhn M., “Building predictive models in R using the caret package,” J. Stat. Software 28(5), 1–26 (2008). doi:10.18637/jss.v028.i05

[r46] 46. Bartko J. J., “The intraclass correlation coefficient as a measure of reliability,” Psychol. Rep. 19(1), 3–11 (1966). doi:10.2466/pr0.1966.19.1.3

[r47] 47. Gamer M., et al., irr: Various Coefficients of Interrater Reliability and Agreement (2019).

[r48] 48. Koo T. K., Li M. Y., “A guideline of selecting and reporting intraclass correlation coefficients for reliability research,” J. Chiropractic Med. 15(2), 155–163 (2016). doi:10.1016/j.jcm.2016.02.012

[r49] 49. McGraw K. O., Wong S. P., “Forming inferences about some intraclass correlation coefficients,” Psychological Methods 1(1), 30–46 (1996). doi:10.1037/1082-989X.1.1.30

[r50] 50. Wu W., et al., “Exploratory study to identify radiomics classifiers for lung cancer histology,” Front. Oncol. 6, 71 (2016). doi:10.3389/fonc.2016.00071

[r51] 51. Parmar C., et al., “Radiomic feature clusters and prognostic signatures specific for lung and head & neck cancer,” Sci. Rep. 5(1), 11044 (2015). doi:10.1038/srep11044

[r52] 52. Ding C., Peng H., “Minimum redundancy feature selection from microarray gene expression data,” J. Bioinf. Comput. Biol. 3(2), 185–205 (2005). doi:10.1142/S0219720005001004

[r53] 53. Hua J., et al., “Optimal number of features as a function of sample size for various classification rules,” Bioinformatics 21(8), 1509–1515 (2005). doi:10.1093/bioinformatics/bti171

[r54] 54. van Smeden M., et al., “No rationale for 1 variable per 10 events criterion for binary logistic regression analysis,” BMC Med. Res. Methodol. 16, 163 (2016). doi:10.1186/s12874-016-0267-3

[r55] 55. Parmar C., et al., “Machine learning methods for quantitative radiomic biomarkers,” Sci. Rep. 5(1), 13087 (2015). doi:10.1038/srep13087

[r56] 56. De Jay N., et al., “mRMRe: an R package for parallelized mRMR ensemble feature selection,” Bioinformatics 29(18), 2365–2368 (2013). doi:10.1093/bioinformatics/btt383

[r57] 57. Freund Y., Schapire R. E., “A decision-theoretic generalization of on-line learning and an application to boosting,” J. Comput. Syst. Sci. 55(1), 119–139 (1997). doi:10.1006/jcss.1997.1504

[r58] 58. Schapire R. E., “The boosting approach to machine learning: an overview,” in Nonlinear Estimation and Classification, Denison D. D., et al., Eds., pp. 149–171, Springer, New York (2003).

[r59] 59. Liaw A., Wiener M., “Classification and regression by randomForest,” R News 2(3), 18–22 (2002).

[r60] 60. Culp M., Johnson K., Michailides G., “ada: an R package for stochastic boosting,” J. Stat. Software 17(1), 1–27 (2006). doi:10.18637/jss.v017.i02

[r61] 61. Bradley A. P., “The use of the area under the ROC curve in the evaluation of machine learning algorithms,” Pattern Recognit. 30(7), 1145–1159 (1997). doi:10.1016/S0031-3203(96)00142-2

[r62] 62. DeLong E. R., DeLong D. M., Clarke-Pearson D. L., “Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach,” Biometrics 44(3), 837–845 (1988). doi:10.2307/2531595

[r63] 63. Benjamini Y., Hochberg Y., “Controlling the false discovery rate: a practical and powerful approach to multiple testing,” J. R. Stat. Soc. Ser. B 57(1), 289–300 (1995). doi:10.1111/j.2517-6161.1995.tb02031.x

[r64] 64. Wilkerson M. D., Hayes D. N., “ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking,” Bioinformatics 26(12), 1572–1573 (2010). doi:10.1093/bioinformatics/btq170

[r65] 65. Theodoridis S., Koutroumbas K., Pattern Recognition, Elsevier, London (2009).

[r66] 66. Wickham H., ggplot2: Elegant Graphics for Data Analysis, Springer-Verlag, New York (2016).

[r67] 67. Warnes G. R., et al., “gplots: various R programming tools for plotting data,” R package version 3.1.1 (2019).

[r68] 68. Campbell S. C., et al., “Guideline for management of the clinical T1 renal mass,” J. Urol. 182(4), 1271–1279 (2009). doi:10.1016/j.juro.2009.07.004

[r69] 69. Patel H. D., et al., “Surgical histopathology for suspected oncocytoma on renal mass biopsy: a systematic review and meta-analysis,” BJU Int. 119(5), 661–666 (2017). doi:10.1111/bju.13763

[r70] 70. Wobker S. E., Williamson S. R., “Modern pathologic diagnosis of renal oncocytoma,” J. Kidney Cancer VHL 4(4), 1–12 (2017). doi:10.15586/jkcvhl.2017.96

[r71] 71. Yusenko M., “Molecular pathology of chromophobe renal cell carcinoma: a review,” Int. J. Urol. 17(7), 592–600 (2010). doi:10.1111/j.1442-2042.2010.02558.x

[r72] 72. Neves J. B., et al., “Contemporary surgical management of renal oncocytoma: a nation’s outcome,” BJU Int. 121(6), 893–899 (2018). doi:10.1111/bju.14159

[r73] 73. Romis L., et al., “Frequency, clinical presentation and evolution of renal oncocytomas: multicentric experience from a European database,” Eur. Urol. 45(1), 53–57 (2004). doi:10.1016/j.eururo.2003.08.008

[r74] 74. Yushkevich P. A., et al., “User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability,” NeuroImage 31(3), 1116–1128 (2006). doi:10.1016/j.neuroimage.2006.01.015

[r75] 75. He L., et al., “Effects of contrast-enhancement, reconstruction slice thickness and convolution kernel on the diagnostic performance of radiomics signature in solitary pulmonary nodule,” Sci. Rep. 6(1), 34921 (2016). doi:10.1038/srep34921
