Abstract.
Purpose
Medical imaging-based machine learning (ML) for computer-aided diagnosis of in vivo lesions consists of two basic modules: (i) feature extraction from non-invasively acquired medical images and (ii) feature classification for predicting the malignancy of lesions detected or localized in those images. This study investigates the individual performance of each module for the diagnosis of low-dose computed tomography (CT) screening-detected lesions, namely pulmonary nodules and colorectal polyps.
Approach
Three feature extraction methods were investigated. The first uses the gray-level co-occurrence matrix, a mathematical descriptor of image texture, to extract Haralick image texture features (HFs). The second uses a convolutional neural network (CNN) architecture to extract deep learning (DL) image abstractive features (DFs). The third uses the interactions between lesion tissues and the X-ray energy of CT to extract tissue-energy specific characteristic features (TFs). All three categories of extracted features were classified by a random forest (RF) classifier, with comparison to the DL-CNN method, which reads the images, extracts the DFs, and classifies them in an end-to-end manner. ML diagnosis of lesions, i.e., prediction of lesion malignancy, was measured by the area under the receiver operating characteristic curve (AUC). Three lesion image datasets were used, with the lesions’ tissue pathological reports serving as the learning labels.
Results
Experiments on the three datasets produced AUC values of 0.724 to 0.878 for the HFs, 0.652 to 0.965 for the DFs, and 0.985 to 0.996 for the TFs, compared with 0.694 to 0.964 for the DL-CNN. These outcomes indicate that the RF classifier performed comparably to the DL-CNN classification module and that extraction of tissue-energy specific characteristic features dramatically improved the AUC values.
Conclusions
The feature extraction module is more important than the feature classification module. Extraction of tissue-energy specific characteristic features is more important than extraction of image abstractive and characteristic features.
Keywords: low-dose computed tomographic screening, diagnosis of in vivo lesions, early cancer detection, machine learning, computer-aided diagnosis
1. Introduction
Machine learning (ML)-based pipelines for computer-aided diagnosis (CADx) have witnessed major developments in recent years. One example is the diagnosis of lesions detected by low-dose computed tomography (CT) screening, e.g., pulmonary nodules1–4 and colorectal polyps.5–8 While modern detection technology is very sensitive, most detected lesions are benign/non-neoplastic, i.e., false positives (FPs).9 A significant portion of the detected lesions cannot currently be resolved by human experts or ML pipelines;10–12 these are called indeterminate lesions. Identifying malignant lesions among the indeterminate ones is technically challenging, yet it directly determines the specific treatment, as malignant lesions are the precursors of cancer. They must therefore be identified among all screening-detected lesions and treated immediately to prevent their development into lung and colon cancers.13,14
A typical ML-CADx pipeline inputs the lesions’ CT images along with their pathological labels (e.g., malignant/neoplastic or benign/non-neoplastic) and outputs its diagnostic performance on the lesions using a figure of merit, such as the area under the receiver operating characteristic (ROC) curve (AUC). While many prominent ML methods have distinguished themselves by high performance within their pipelines,7,12 including the recent advancement of deep learning (DL) with convolutional neural network (CNN) architectures,3,8 the basic pipeline/architecture design of these methods consists of two major modules: (1) feature extraction from the input lesion CT images and (2) feature learning/classification using the extracted features and the lesions’ pathological labels.
This study investigates the effectiveness of each module using low-dose CT screening image data of nodules and polyps, with their tissue pathological reports as ML labels; the pathological reports were obtained after either a portion or the whole of each lesion was biopsied or resected. For the first module, three different feature extraction methods are described in Sec. 2. In the same section, two feature learning and classification methods are reported for the second module. In Sec. 3, the experimental design and outcomes are reported. Finally, discussion and conclusions are given in Sec. 4.
2. Methods
2.1. Basic Components of Machine Learning
Machine learning or ML plays a vital role in modern-day computer-aided diagnosis or CADx due to the large amount of data in the form of series of volumetric medical images. Traditionally, an ML pipeline consists of four steps: data preprocessing for inputs, feature extraction, feature classification, and output of predictions of lesion malignancy, as shown in Fig. 1.
Fig. 1.
Diagram of a typical ML pipeline or architecture design.
Deep learning (DL), a recent advancement in ML, can extract its own image features (a step previously performed manually as a separate module in ML models) in addition to classifying those features with a classification algorithm. In a convolutional neural network (CNN) architecture, the two modules of feature extraction and classification are combined end to end, so the whole model is often represented as a black box.
This study compares the performance of different ML model architectures tasked with predicting the malignancy of pulmonary nodules and colorectal polyps, focusing on the feature extraction and feature classification steps (the two major modules of the ML pipeline shown in Fig. 1). Model accuracy was measured by the AUC of the ROC plot.
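Since the AUC is the sole figure of merit used below, it may help to see how it can be computed. The following is a minimal sketch, assuming scores and binary labels as plain arrays; the function name and interface are ours, not the paper's pipeline.

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC of the ROC curve via the rank-sum (Mann-Whitney) identity:
    the probability that a random positive case scores higher than a
    random negative case, counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()   # positive outranks negative
    ties = (pos[:, None] == neg[None, :]).sum()     # tied pairs count as 0.5
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the malignant from the benign cases.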
2.2. Feature Extraction Methods
Numerous methods have been developed for the feature extraction step of the ML pipeline. One is three-dimensional (3D) gray-level co-occurrence matrix (GLCM) texture analysis, which extracts the Haralick image texture features (HFs) related to measures of contrast, correlation, homogeneity, local homogeneity, entropy, etc., from the GLCMs,15,16 as shown in Fig. 2. Classification of these HFs is described in the next section on feature classification methods, e.g., the random forest (RF) classifier.
Fig. 2.
Diagram of image texture feature extraction via the GLCM calculation.16
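To make the GLCM step concrete, the sketch below builds a co-occurrence matrix for one direction and derives three of the Haralick-style measures named above. It is a simplified 2D illustration under our own naming (the study uses 3D GLCMs over 13 directions, covered in Sec. 3.2.1).

```python
import numpy as np

def glcm(img, offset, levels):
    """Gray-level co-occurrence matrix for one direction (offset),
    made symmetric and normalized to a joint probability table."""
    P = np.zeros((levels, levels))
    dr, dc = offset
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[img[r, c], img[r2, c2]] += 1
    P = P + P.T                      # symmetric co-occurrence
    return P / P.sum()

def texture_measures(P):
    """Three representative Haralick-style measures from a GLCM."""
    i, j = np.indices(P.shape)
    contrast = ((i - j) ** 2 * P).sum()
    homogeneity = (P / (1.0 + np.abs(i - j))).sum()
    entropy = -(P[P > 0] * np.log2(P[P > 0])).sum()
    return contrast, homogeneity, entropy
```

For a perfectly uniform region, all co-occurrences fall on the matrix diagonal, giving zero contrast and zero entropy; texture moves probability mass off the diagonal.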
Another method is DL feature extraction, performed through a 3D CNN architecture that feeds a corresponding DL feature classification stage. Figure 3 shows an example,17 where the DL features (DFs) are extracted at the Dense 512 layer (a layer of 512 neurons) just before the feature classification layers. The extraction process starts at the Input layer (inputting images) and ends at the Dense 512 layer (outputting features). The process is repeated for each randomized sample of data drawn from the CT image datasets. Classification of the extracted DFs is performed by the ML classifier shown in the box at the top right of Fig. 3; more details are given in the next section on feature classification methods, e.g., the RF classifier.
Fig. 3.
Deep feature extraction and classification. The combination of the DL classifier layers on the bottom right with the DL feature extraction layers on the left is generally viewed as a black box.
Note that the DL model architecture of Fig. 3 combines the two modules of feature extraction and feature classification without user intervention and is thus generally viewed as a black box.3 In this study, we examine the two modules separately. The features generated by the DL feature extraction method can be treated as drawn from the DL model architecture: a CNN works essentially like an artificial neural network while additionally convolving the input data with kernels, at least once, to extract features within the data.18 In this study, the input layer, consisting of the CT images of the pulmonary nodules and colorectal polyps, is processed through four such convolutional stages before reaching the Dense 512 layer, from which the features were extracted.17,19
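The idea of reading the DFs at the penultimate dense layer can be sketched with a toy forward pass. This is only an illustration under strong simplifications: a single 2D convolution with random weights stands in for the four convolutional stages of the actual 3D network,17,19 and all names and sizes are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(x, k):
    """Single-channel 'valid' convolution (cross-correlation) with kernel k."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = (x[r:r + kh, c:c + kw] * k).sum()
    return out

def extract_deep_features(image, kernel, W):
    """Forward pass up to the penultimate dense layer: the returned
    vector plays the role of the DFs read out before classification."""
    fmap = np.maximum(conv2d_valid(image, kernel), 0.0)   # conv + ReLU
    pooled = fmap[::2, ::2]                               # stride-2 subsampling
    flat = pooled.ravel()
    return np.maximum(W @ flat, 0.0)                      # dense feature layer

image = rng.standard_normal((16, 16))      # toy stand-in for a lesion image
kernel = rng.standard_normal((3, 3))       # random, untrained weights
flat_dim = 7 * 7                           # 14x14 conv output subsampled by 2
W = rng.standard_normal((8, flat_dim))     # toy 8-dim "Dense" feature layer
features = extract_deep_features(image, kernel, W)
```

In the real pipeline the features would be read from a trained Dense 512 layer; here the point is only that the extractor is the network truncated before its classifier head.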
In addition to the above two methods, which extract image abstractive or characteristic (e.g., texture) features, a third method extracts tissue biological characteristic features from the lesions’ CT images.20 Because CT image contrast depends on the X-ray energy, this method incorporates the energy information.21 The lesions’ CT images are reconstructed from raw data acquired with an X-ray tube emitting a wide energy spectrum, so a reconstructed lesion CT image is an average across that spectrum; energy-resolved CT images can therefore improve tissue characterization. Generating virtual monoenergetic images (VMIs) from the conventional screening CT images is one way to obtain such energy-resolved images22 and is the choice used in this study. Further improvement in VMI generation may be gained with multiple energy spectral data acquisition or photon-counting detector technologies. Several tissue biological characteristics can be considered for computing the features; this study used the tissue elasticity model20 to compute the corresponding tissue-energy specific characteristic features (TFs) from each VMI. Classification of the TFs from all VMIs is based on the RF classifier and is described in Sec. 3.2.3.
2.3. Feature Classification Methods
2.3.1. Random forest classifier
The RF algorithm is an ensemble ML method widely used for feature classification. Its output results from a group of decision trees, each grown from random selections of the features produced in the feature extraction step of the ML pipeline.23 Each decision tree makes its own prediction, and the majority vote over all trees is the classifier’s output. This combination of decision trees surpasses a single decision tree in important respects, as it is less prone to overfitting. RFs also have advantages over deep neural networks; e.g., they do not rely on gradient descent to find local minima, which may miss the global minimum. In this study, 100 different RF models were generated for each of the three input datasets (two sets of polyps and one set of nodules) to classify the malignancy of the colorectal polyps and pulmonary nodules. From the 512 extracted features presented to each model, a stepwise feature selection function tested whether each candidate feature benefited the RF classification model and, only if so, added it. This algorithm reduces redundancy and bias among the features by ensuring that each selected feature strictly benefits the model.
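The stepwise rule described above can be sketched as follows. In this illustration, a simple difference-of-means projection scores each candidate subset as a lightweight stand-in for refitting the RF model; the function names and the scoring shortcut are ours, not the study's implementation.

```python
import numpy as np

def auc(scores, y):
    """Rank-based AUC; ties between cases count as one half."""
    pos, neg = scores[y == 1], scores[y == 0]
    g = (pos[:, None] > neg[None, :]).sum()
    t = (pos[:, None] == neg[None, :]).sum()
    return (g + 0.5 * t) / (len(pos) * len(neg))

def subset_score(X, y, idx):
    """Score a feature subset with a difference-of-means projection
    (a stand-in here for refitting the RF on the candidate subset)."""
    Xs = X[:, idx]
    w = Xs[y == 1].mean(axis=0) - Xs[y == 0].mean(axis=0)
    return auc(Xs @ w, y)

def stepwise_select(X, y):
    """Greedy forward selection: a feature is kept only if adding it
    strictly improves the subset score, mirroring the rule that each
    feature must benefit the model."""
    selected, best = [], 0.0
    for f in range(X.shape[1]):
        s = subset_score(X, y, selected + [f])
        if s > best:
            selected.append(f)
            best = s
    return selected, best
```

With an informative feature and a noise feature, the procedure keeps the former and discards the latter, since the noise feature cannot strictly improve the score.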
2.3.2. Deep learning classifier
The DL classifier, shown at the bottom right of Fig. 3, takes the 512 generated features of each model as input. Together with the DL feature extraction layers on the left of Fig. 3, it makes up the so-called black box.3 The classifier applies dropout regularization followed by the widely used softmax activation layer, which assigns probabilities to the classes being predicted; the dropout regularization helps to further prevent overfitting of the model. The probabilities generated by the softmax layer are then used to compute the model’s output predictions for evaluation.
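The two operations of this classifier head can be sketched as below. This is an illustration with our own function names; the dropout rate and layer sizes of the actual network are not reproduced.

```python
import numpy as np

def dropout(x, rate, rng, train=True):
    """Inverted dropout: randomly zero a fraction `rate` of the activations
    during training and rescale the survivors; identity at inference."""
    if not train:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def softmax(logits):
    """Numerically stable softmax turning logits into class probabilities."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()
```

The softmax output sums to one, so the larger logit directly yields the predicted class probability (e.g., malignant versus benign).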
3. Experimental Design and Results
3.1. Lesion Image Datasets
The three sets of lesion image data in this study were obtained from a database acquired under patient informed consent. The patient informed consent forms were reviewed and approved by the Institutional Review Board, and the requirement for additional informed consent for this study's use of the three image datasets was waived. All identifiable patient information was removed from the CT scans before the image datasets were loaded onto the computer for data processing. The datasets comprise two sets of abdominal CT images and one set of chest CT images.
3.1.1. Two sets of colorectal polyps
The abdominal CT images were taken from 282 patients, each with one detected polyp. One set includes 100 polyps larger than 3 cm, called polyp masses. Although these polyp masses are at an advanced stage of development, their pathological status of malignant versus benign remains indeterminate on imaging. Based on the pathological reports after resection and pathological analysis, 48 were labeled malignant and 52 benign. The other set contains 182 polyps of 1 to 3 cm, called large polyps, for which the clinical task is to differentiate neoplastic from non-neoplastic. Based on the pathological reports after resection, this set includes 158 neoplastic and 24 non-neoplastic polyps.
3.1.2. One set of pulmonary nodules
This set of chest CT images was acquired from 68 patients, each including an indeterminate pulmonary nodule. The nodules’ sizes are in the range from 1 to 13 cm. Based on their pathological reports after tissue biopsies and pathological analysis, 50 were labeled as malignant and 18 as benign.
3.2. Experimental Design
3.2.1. Preparation of inputs for HF extraction and classification
The border of each lesion was drawn manually on each CT image slice by a lab researcher and approved by a radiologist. From the drawn volume of each lesion, a gray-level co-occurrence matrix (GLCM) was generated along each of the 13 directions defined by the neighboring image voxels. A total of 14 statistical measures were computed from each GLCM.15,16 By taking the average and the range of each measure over the 13 directions, a total of 28 HFs were extracted. The extracted HFs were classified by the RF classifier to predict lesion malignancy.
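The aggregation from 13 directional GLCMs and 14 measures down to 28 HFs can be sketched as follows; the direction enumeration uses the standard half of the 26 voxel-neighbor offsets, and the function name is ours.

```python
import numpy as np
from itertools import product

# The 13 unique voxel-neighbor directions in 3D: half of the 26 nonzero
# offsets, since opposite offsets define the same co-occurrence direction.
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]
assert len(DIRECTIONS) == 13

def haralick_features(measures_by_direction):
    """Collapse a (13 directions x 14 measures) table into 28 HFs:
    the average and the range of each measure over the directions."""
    m = np.asarray(measures_by_direction)      # shape (13, 14)
    avg = m.mean(axis=0)
    rng_ = m.max(axis=0) - m.min(axis=0)
    return np.concatenate([avg, rng_])         # 14 averages + 14 ranges
```

Averaging makes the features approximately rotation invariant, while the range retains how directionally anisotropic the lesion texture is.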
3.2.2. Preparation of inputs for DF extraction and classification
A rectangular volume mask was generated to enclose the largest lesion in each of the three lesion datasets. For the smaller lesions in each dataset, the surrounding space in the mask was filled with zeroes. The masked lesion CT volumes were input to the 3D DL-CNN pipeline17 to generate predictions of lesion malignancy. Two classification outcomes were obtained.
One outcome is the output from the 3D DL-CNN pipeline where the DL classification is built inside a box including both feature extraction and classification modules. This 3D DL-CNN pipeline is shown in Fig. 3 by combining the DL Classifier layers on the bottom right with all the DL features layers on the left in a single box.3,19
The other classification outcomes were obtained by outputting the DFs before they enter the DL classification layers (see Fig. 3) and then classifying these DFs with the RF classifier.
3.2.3. Preparation of inputs for TF extraction and classification
From the drawn volume of each lesion in the CT images of Sec. 3.2.1, a series of N VMIs of that lesion was generated at the corresponding energy values.21,22 From each VMI, a volumetric tissue elastic map was computed.20 The computation uses a formula that is a function of the first- and second-order derivatives of the volumetric VMI with respect to the spatial coordinates; the formula is derived from the elastic affine transformation of soft tissue movement.20 By applying GLCM analysis to each tissue elastic map, a set of tissue-energy specific characteristic features (TFs) was extracted from the corresponding VMI. The N sets of TFs were then classified by the RF classifier to predict lesion malignancy.
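A hedged sketch of this step follows. The true elasticity formula is given in Ref. 20 and is not reproduced here, so a simple combination of first- and second-order gradient magnitudes stands in as a placeholder map, followed by the quantization needed before GLCM analysis. All names are ours.

```python
import numpy as np

def derivative_map(vmi):
    """Placeholder for the tissue-elasticity map: the actual formula (Ref. 20)
    combines first- and second-order spatial derivatives of the VMI; here a
    gradient-magnitude surrogate stands in for it."""
    grads = np.gradient(vmi.astype(float))                          # first order
    second = [np.gradient(g, axis=i) for i, g in enumerate(grads)]  # second order
    mag1 = np.sqrt(sum(g ** 2 for g in grads))
    mag2 = np.sqrt(sum(h ** 2 for h in second))
    return mag1 + mag2                                              # illustrative mix

def quantize(vol, levels=8):
    """Rescale a real-valued map to integer gray levels so that GLCM
    analysis can be applied to it."""
    v = vol - vol.min()
    if v.max() > 0:
        v = v / v.max()
    return np.minimum((v * levels).astype(int), levels - 1)
```

Each of the N VMIs would pass through this map-then-quantize step before the directional GLCM analysis of Sec. 3.2.1 is applied to its elastic map.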
In the classification implementation stage, a cross-validation strategy was used to test robustness and avoid data bias. Each of the three lesion datasets was randomly divided into three groups of 60% for training, 20% for testing, and 20% for validation, maintaining the same ratio of lesion pathology classes (e.g., malignant versus benign) in all three groups. The randomization was repeated 100 times, and the average and standard deviation of the 100 outcomes were reported as the results for that dataset.
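One such stratified 60/20/20 split can be sketched as follows (our own function; repeating it 100 times and averaging the outcomes reproduces the protocol above):

```python
import numpy as np

def stratified_split(y, fractions=(0.6, 0.2, 0.2), rng=None):
    """One stratified 60/20/20 split: each class is shuffled and divided
    in the same proportions, preserving the malignant/benign ratio
    across the training, testing, and validation groups."""
    rng = rng or np.random.default_rng()
    train, test, val = [], [], []
    for cls in np.unique(y):
        idx = np.flatnonzero(y == cls)
        rng.shuffle(idx)
        n = len(idx)
        a = int(round(fractions[0] * n))
        b = a + int(round(fractions[1] * n))
        train += list(idx[:a])
        test += list(idx[a:b])
        val += list(idx[b:])
    return np.array(train), np.array(test), np.array(val)
```

Because the split is drawn per class, small datasets with skewed labels (e.g., 158 neoplastic versus 24 non-neoplastic) keep the same class ratio in every group.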
3.3. Results
The experimental outcomes from the three lesion CT image datasets, using the three feature extraction methods and two classifiers, are reported in Table 1. Using the method of deep feature extraction with the RF classifier (DF-RF) as a reference, a two-tailed t-test was performed against each of the other three methods, yielding the p-values. The quantitative results are presented graphically in Fig. 4.
Table 1.
AUCs (mean ± standard deviation) of the baseline (DL-CNN) and the three different feature extraction methods (HFs, DFs, and TFs) from the three lesion CT image datasets, where the total number of samples: malignant/benign or neoplastic/non-neoplastic are provided.
| ML model | Nodules #1 (68: 50/18) | Polyp masses (100: 48/52) | Large polyps (182: 158/24) |
|---|---|---|---|
| DF-RF | 0.652 ± 0.131 | 0.965 ± 0.031 | 0.682 ± 0.107 |
| DL-CNN | 0.694 ± 0.101 | 0.964 ± 0.025 | 0.712 ± 0.131 |
| HF-RF | 0.727 ± 0.134 | 0.878 ± 0.070 | 0.724 ± 0.122 |
| TF-RF | 0.993 ± 0.023 | 0.996 ± 0.008 | 0.985 ± 0.025 |
Note that for the large colorectal polyps, the comparison is neoplastic versus non-neoplastic rather than malignant versus benign.
Fig. 4.
Graphical presentation of the quantitative results in Table 1.
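The two-tailed t-test over the repeated outcomes can be sketched as below. We compute only the Welch t statistic and degrees of freedom in plain NumPy; the two-tailed p-value would then follow from the t distribution (e.g., via scipy.stats.t.sf). The function name is ours, and the sketch does not claim to reproduce the study's exact test settings.

```python
import numpy as np

def welch_t(a, b):
    """Welch's two-sample t statistic and degrees of freedom for samples
    with possibly unequal variances; the two-tailed p-value would come
    from the t distribution with `df` degrees of freedom."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Here `a` and `b` would be the 100 repeated AUC values of the reference method and a competing method on the same dataset.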
Comparing the reference DF-RF (i.e., feature extraction by CNN followed by RF classification) with the DL-CNN method, where different classifiers classify the same DL features, the outcomes show no significant difference.
Comparing the reference with HF-RF (i.e., feature extraction by GLCM followed by RF classification), where the same classifier classifies different learned features, the outcomes approach a significant difference.
Comparing the reference with TF-RF (i.e., extraction of tissue-energy specific characteristic features followed by RF classification), where the same classifier classifies different learned features, the outcomes show a significant difference.
Based on the AUC mean, standard deviation, and p-value, we make the following observations.
1. Since DF-RF and HF-RF apply the same classifier to different sets of features, the gain in classification performance is due mainly to the improved feature extraction. In other words, the DL feature extraction method extracts more distinguishable image characteristic features than the GLCM-based image texture extraction method. The gain approaches significance.
2. When the same DL features are classified by both the DL and RF classifiers, the outcomes show no significant difference. This further indicates that the feature extraction module plays a more important role than the feature classification module in the diagnosis of lesions.
3. When the tissue-energy specific characteristic features (TFs) are classified by the same RF classifier, the outcomes show a dramatic difference. This indicates that extraction of tissue biological characteristic features is necessary for diagnosing lesions of in vivo tissues.
4. Discussion and Conclusions
This study reveals that the DL-CNN feature classification module performed similarly to the RF classifier when classifying the same image abstractive features extracted by the DL-CNN architecture. It further reveals that DL-CNN extraction of image abstractive features outperformed GLCM-based extraction of image texture characteristic features, indicating that the DL-CNN feature extraction module produces more distinguishable features than image texture analysis. Overall, tissue-energy specific characteristic feature extraction vastly outperformed both of the other extraction methods.
For the diagnosis of in vivo tissues, DL can extract more distinguishable image abstractive features than traditional ML, which usually extracts image texture characteristic features. However, extraction of tissue-energy specific characteristic features is necessary for CADx of lesions consisting of in vivo tissues. Although DL feature extraction can significantly surpass other popular ML feature extraction methods (such as GLCM-based image texture characteristic feature extraction), the DL classifier shows little to no benefit over other popular ML classification techniques (such as the RF classifier) for the CADx task. Hence, this study suggests that the ML pipeline benefits far more from improved feature extraction, above all tissue-energy specific characteristic feature extraction, than from DL-based classification, as the former vastly surpassed all other feature extraction techniques presented in this study while the latter offered no significant advantage over modern classification methods. In other words, the diagnosis of in vivo tissues should be based on tissue characteristic features rather than image characteristic features.
Acknowledgments
Part of the contents of this paper was presented in the SPIE Medical Imaging Conference 2024. This work was partially supported by the Professional Staff Congress - City University of New York (PSC-CUNY) Research Award Program award (Grant No. 65231-00 53).
Biographies
Daniel D. Liang is currently a student at Ward Melville High School and is scheduled to graduate in 2025. He has been conducting research in medical imaging for the past 2 years. He has given poster and oral presentations at conferences and published them as conference papers.
David D. Liang is currently a sophomore completing his BS degree in computer science and mathematics at the University of Chicago. He has 1 year of internship experience in medical imaging research. His other research experiences and interests include AI/machine learning and computational chemistry.
Marc J. Pomeroy is a diagnostic medical physics resident at Columbia University Irving Medical Center and a PhD candidate at Stony Brook University. His research interests are in machine learning models and computer-aided diagnosis for early cancer diagnosis. His current research projects involve using machine learning models to diagnose colorectal polyps from computed tomographic colonography (virtual colonoscopy) images.
Yongfeng Gao completed her PhD in the School of Physics at Peking University, Beijing, China, in 2018. She was a visiting scholar in the Collider-Accelerator Department of Brookhaven National Laboratory from 2016 to 2017. In 2017, she joined Stony Brook University to further her research in medical imaging. Since 2022, she has been with United Imaging Healthcare America as a collaborating research scientist. Her research interests include low-dose CT reconstruction and artificial intelligence.
Licheng R. Kuo received his BS degree in radiologic technology and his MS degree in radiological sciences from National Yang Ming University, Taiwan. He served for one and a half years as a medical physicist at the University of Pittsburgh Medical Center before joining Memorial Sloan-Kettering Cancer Center (MSKCC) in 2011. He currently serves as a senior medical physicist at MSKCC and is pursuing a PhD in the Department of Biomedical Engineering at Stony Brook University.
Lihong C. Li is a professor in the Department of Engineering and Environmental Science at the City University of New York/College of Staten Island. She received her PhD in electrical engineering from Stony Brook University. Her research focuses on image processing, computer-aided diagnosis, and medical informatics. She currently serves on the editorial boards of the Journal of X-Ray Science and Technology and the Journal of Imaging.
Contributor Information
Daniel D. Liang, Email: dliang1670@gmail.com.
David D. Liang, Email: dliang7234@gmail.com.
Marc J. Pomeroy, Email: marc.pomeroy@stonybrook.edu.
Yongfeng Gao, Email: yongfenggao08@gmail.com.
Licheng R. Kuo, Email: ryan.kuo@stonybrook.edu.
Lihong C. Li, Email: lihong.li@csi.cuny.edu.
Disclosures
The authors declare that they have no conflicts of interest.
Code and Data Availability
The data supporting the findings of this study are available upon request.
References
- 1. Han F., et al., “Texture feature analysis for CADx on pulmonary nodules,” J. Digit. Imaging 28(1), 99–115 (2015). 10.1007/s10278-014-9718-8
- 2. Wang H., et al., “A hybrid CNN feature model for pulmonary nodule malignancy risk differentiation,” J. X-ray Sci. Technol. 25(6), 673–688 (2017). 10.3233/XST-17302
- 3. Ardila D., et al., “End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography,” Nat. Med. 25, 954–961 (2019). 10.1038/s41591-019-0447-x
- 4. Baldwin D., et al., “External validation of a convolutional neural network artificial intelligence tool to predict malignancy in pulmonary nodules,” Thorax 75, 306–312 (2020). 10.1136/thoraxjnl-2019-214104
- 5. Pooler B., et al., “Volumetric textural analysis of colorectal masses at CT colonography: differentiating benign versus malignant pathology and comparison with human reader performance,” Acad. Radiol. 26(1), 30–37 (2018). 10.1016/j.acra.2018.03.002
- 6. Tan J., et al., “3D-GLCM CNN: a 3-dimensional gray-level co-occurrence matrix-based CNN model for polyp classification via CT colonography,” IEEE Trans. Med. Imaging 39(6), 2013–2024 (2019). 10.1109/TMI.2019.2963177
- 7. Grosu S., et al., “Machine learning–based differentiation of benign and premalignant colorectal polyps detected with CT colonography in an asymptomatic screening population: a proof-of-concept study,” Radiology 299, 326–335 (2021). 10.1148/radiol.2021202363
- 8. Wesp P., et al., “Deep learning in CT colonography: differentiating premalignant from benign colorectal polyps,” Eur. Radiol. 32(7), 4749–4759 (2022). 10.1007/s00330-021-08532-2
- 9. Aberle D., et al., “Reduced lung-cancer mortality with LdCT screening,” New Engl. J. Med. 365, 395–409 (2011). 10.1056/NEJMoa1102873
- 10. McWilliams A., et al., “Probability of cancer in pulmonary nodules detected on first screening CT,” New Engl. J. Med. 369, 910–919 (2013). 10.1056/NEJMoa1214726
- 11. Massion P., et al., “Assessing the accuracy of a deep learning method to risk stratify indeterminate pulmonary nodules,” Am. J. Respir. Crit. Care Med. 202, 241–249 (2020). 10.1164/rccm.201903-0505OC
- 12. Adams S., et al., “Clinical impact and generalizability of a computer-assisted diagnostic tool to risk-stratify lung nodules with CT,” J. Am. Coll. Radiol. 20(2), 232–242 (2022). 10.1016/j.jacr.2022.08.006
- 13. The U.S. Preventive Services Task Force (US-PSTF), Guide to Clinical Preventive Services, Williams & Wilkins (1996).
- 14. American Cancer Society (ACS), Cancer Facts and Figures, American Cancer Society, Atlanta, GA, USA (2022).
- 15. Mansour I. R., Thomson R. M., “Haralick texture feature analysis for characterization of specific energy and absorbed dose distributions across cellular to patient length scales,” Phys. Med. Biol. 68, 075006 (2023). 10.1088/1361-6560/acb885
- 16. Löfstedt T., et al., “Gray-level invariant Haralick texture features,” PLoS ONE 14(2), e0212110 (2019). 10.1371/journal.pone.0212110
- 17. Kuo L., et al., “Development of 3D CNN models for colorectal polyps classification in CT colonography,” in Annu. Meet. of AAPM (2023).
- 18. Goodfellow I., Bengio Y., Courville A., Deep Learning, MIT Press (2016).
- 19. Zunair H., et al., “Uniformizing techniques to process CT scans with 3D CNNs for tuberculosis prediction,” arXiv:2007.13224 (2020).
- 20. Cao W., et al., “Lesion classification by model-based feature extraction: a differential affine invariant model of soft tissue elasticity,” arXiv:2205.14029 (2023).
- 21. Chang S., et al., “Exploring dual-energy CT spectral information for machine learning-driven lesion diagnosis in pre-log domain,” IEEE Trans. Med. Imaging 42(6), 1835–1845 (2023). 10.1109/TMI.2023.3240847
- 22. Chang S., et al., “Using tissue-energy response to generate virtual monoenergetic images from conventional CT for computer-aided diagnosis of lesions,” Proc. SPIE 12304, 123041L (2022). 10.1117/12.2646551
- 23. Breiman L., “Random forests,” Mach. Learn. 45, 5–32 (2001). 10.1023/A:1010933404324