Abstract.
Purpose: To clarify whether and to what extent a three-dimensional (3D) convolutional neural network (CNN) is superior to a 2D CNN when applied to reduce false-positive nodule detections in low-dose computed tomography (CT) lung cancer screening.
Approach: We established a dataset consisting of 1600 chest CT examinations acquired on different subjects from various sources. These examinations contained a total of 18,280 candidate nodules, of which 9185 were nodules and 9095 were not. For each candidate nodule, we extracted a number of cubic subvolumes of a fixed dimension by randomly rotating the CT examinations 25 times prior to extracting the axis-aligned subvolumes. These subvolumes were split into three groups for training, validation, and independent testing purposes. We developed a multiscale CNN architecture and implemented its 2D and 3D versions to classify pulmonary nodules into two categories, namely true positive and false positive. The performance of the 2D-/3D-CNN classification schemes was evaluated using the area under the receiver operating characteristic curve (AUC). The p-values and the 95% confidence intervals (CI) were calculated.
Results: The AUC for the optimal 2D-CNN model was 0.9307 (95% CI: 0.9285 to 0.9330), with a sensitivity of 92.70% and a specificity of 76.21%. The 3D-CNN model with the best performance had an AUC of 0.9541 (95% CI: 0.9495 to 0.9583), with a sensitivity of 89.98% and a specificity of 87.30%. In the 3D setting, the developed multiscale CNN architecture performed better than the vanilla architecture did.
Conclusions: The 3D-CNN model performs better in false-positive reduction than its 2D counterpart; however, the improvement is relatively limited and comes at the cost of substantially greater computational resources for training.
Keywords: pulmonary nodule, classification, convolutional neural network, 3D/2D comparison
1. Introduction
Lung cancer is the leading cause of cancer-associated death, exceeding the next three cancers combined (i.e., colon, breast, and prostate cancer).1 The high mortality is partly attributed to the lack of early-stage symptoms and limited access to screening.2 The landmark National Lung Screening Trial (NLST) reported a 20% reduction in lung cancer mortality among smokers screened with low-dose computed tomography (LDCT) compared with chest radiography.3–6 The decision by the Centers for Medicare & Medicaid Services to cover annual lung cancer screening with chest LDCT for asymptomatic adults with a history of tobacco smoking was motivated in part by the NLST findings. The strength of LDCT scanning is primarily attributed to its high spatial resolution in visualizing small nodules, which may be the manifestation of early lung cancer. In practice, a single chest CT examination contains hundreds of CT images. It is time-consuming for a radiologist to visually interpret this large volume of images, which can lead to missed suspicious nodules. To facilitate accurate and efficient interpretation of chest CT examinations, a large number of computer algorithms have been developed to automatically detect nodules depicted on CT scans.7–12 Unfortunately, few algorithms have demonstrated satisfactory performance in practice, thereby limiting their clinical application. The primary issue is the relatively high rate of false-positive nodule detections required to maintain a satisfactory sensitivity, which can increase the time needed to interpret an LDCT scan without any benefit.
A variety of image features have been explored to reduce the false-positive rate.13–22 It is not easy to accurately and consistently identify the critical image features associated with true nodules. Despite extensive investigative effort dedicated to false-positive reduction, limited progress has been made in this regard. Deep learning technology, such as the convolutional neural network (CNN), has emerged as a novel solution for many medical image analysis problems and demonstrated remarkable performance, particularly in lung nodule detection and false-positive reduction.23–30 Both two- and three-dimensional (2D and 3D) deep learning architectures have been developed to reduce false-positive nodule detections by classifying a candidate nodule as nodule or non-nodule.31–44 It is somewhat intuitive that a 3D CNN could achieve better performance than its 2D counterpart because 3D images contain significantly more information. However, the performance reported in the literature varies significantly.22,25 The underlying variations could be caused by differences in the training/testing datasets and/or the CNN architectures. Training and inference of a 3D CNN require much greater computational resources than those of a 2D CNN. It remains unclear whether the performance improvement of a 3D CNN as compared with a 2D CNN outweighs the additional computational cost.31 In this study, we established a relatively large dataset consisting of 1600 chest CT examinations acquired on different subjects from various sources, which contained 18,280 nodule candidates. Of these candidates, 9185 were nodules ("true positive") and 9095 were not nodules ("false positive"). Our objective is to verify whether 3D-CNN architectures perform better than 2D-CNN architectures do for classifying candidate nodules as nodules or non-nodules.
2. Methods and Materials
2.1. Image Datasets
We established an image dataset consisting of 1600 chest LDCT examinations from different sources, including the NLST,3 the Lung Image Database Consortium image collection (LIDC-IDRI),45 and our own database (PLuSS, Table 1).46 These CT scans were acquired at different medical centers using different protocols, which can be found in Refs. 3, 45, and 46. The image slice thickness ranged from 0.625 to 2.5 mm. The size of the identified nodules ranged from 3 to 30 mm. We excluded CT scans with a slice thickness greater than 2.5 mm and nodules (or masses) larger than 30 mm. With the help of a previously developed computer algorithm,12 we processed these CT examinations and identified a large number of suspicious candidates. The computer-detected candidate nodules were reviewed separately by radiologists in past years and grouped into true-positive (true nodule) and false-positive (non-nodule) categories. To balance the number of cases in the two categories, we selected 9185 nodules and 9095 non-nodules, respectively, from the two categories. For each candidate nodule, we extracted a cubic subvolume large enough to fully enclose the suspicious nodule. To provide a means of data augmentation and thus increase diversity, we randomly rotated the CT examinations about the center of the candidate nodule and then extracted the axis-aligned subvolumes. The random rotations were performed 25 times. As a result, we had a total of 227,051 subvolumes in the false-positive group and 229,404 in the true-positive group. For each subvolume, we also generated its 2D version as a pseudo-RGB image by stacking the three images extracted along the orthogonal axes (Fig. 1).
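The rotation-based augmentation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the subvolume size, padding margin, angle sampling, and interpolation order are all assumptions, and `scipy.ndimage.rotate` is used as a stand-in for whatever rotation routine was actually employed.

```python
import numpy as np
from scipy.ndimage import rotate


def rotated_subvolume(volume, center, size, rng):
    """Extract one randomly rotated, axis-aligned cubic subvolume centered on a
    candidate nodule. A larger cube is cropped first so that the final crop is
    unaffected by corners lost during the rotation."""
    margin = size  # generous padding around the final crop (assumption)
    big = size + 2 * margin
    half = big // 2
    # pad the volume so that crops near the border stay in bounds
    padded = np.pad(volume, half, mode="edge")
    c = [int(ci) + half for ci in center]
    cube = padded[c[0] - half:c[0] + half,
                  c[1] - half:c[1] + half,
                  c[2] - half:c[2] + half]
    # random rotation about the nodule center, one angle per orthogonal plane
    for axes in [(0, 1), (0, 2), (1, 2)]:
        cube = rotate(cube, rng.uniform(0.0, 360.0), axes=axes,
                      reshape=False, order=1, mode="nearest")
    # final axis-aligned crop around the (still centered) nodule
    s = half - size // 2
    return cube[s:s + size, s:s + size, s:s + size]
```

Repeating this call 25 times per candidate with fresh random angles would yield the augmented subvolume counts reported in Table 1.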
Table 1.
Summary of the datasets.
| | NLST | LIDC-IDRI | PLuSS | Total |
|---|---|---|---|---|
| Nodules | 2994 | 3771 | 2420 | 9185 |
| Non-nodules | 2755 | 3962 | 2378 | 9095 |
| 3D/2D postaugmented nodules | 74,717 | 94,215 | 60,472 | 229,404 |
| 3D/2D postaugmented non-nodules | 68,693 | 98,997 | 59,361 | 227,051 |
Fig. 1.
(a) Illustration of the 2D image generation procedure based on the 3D subvolume that fully encloses a nodule; (b) three images were extracted along the orthogonal axes and (c) their stacking formed a pseudo-RGB image (red, green, and blue channels).
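The pseudo-RGB construction of Fig. 1 amounts to stacking the three central orthogonal slices of the subvolume as color channels. A minimal sketch (the choice of the central slice along each axis is an assumption; the paper does not specify which slice positions were used):

```python
import numpy as np


def to_pseudo_rgb(subvolume):
    """Stack the three orthogonal central slices of a cubic subvolume into a
    single pseudo-RGB image (axial, coronal, sagittal -> R, G, B channels)."""
    d, h, w = subvolume.shape
    axial = subvolume[d // 2, :, :]
    coronal = subvolume[:, h // 2, :]
    sagittal = subvolume[:, :, w // 2]
    return np.stack([axial, coronal, sagittal], axis=-1)
```

The resulting H x W x 3 array can be fed to any off-the-shelf 2D CNN exactly as a natural RGB image would be.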
2.2. CNN Architectures
For comparison purposes, we developed two types of CNN architectures. The first is a vanilla CNN architecture (Fig. 2), where the 2D pseudo-RGB image or the 3D image of a fixed dimension is passed to a backbone network followed by a flattening layer and the combination of a fully connected (FC) layer and a rectified linear unit (ReLU). The output layer is activated by a softmax function to obtain the predicted probability of each class. This network does not account for the significant variation in pulmonary nodule size, which may range from 3 to 50 mm in diameter. For classification purposes, we used a cubic subvolume large enough to ensure that a nodule could be fully enclosed. The convolution or pooling operations may smear out the critical details of the enclosed nodule while smoothing the images, thereby weakening the discriminative capability of the network toward small nodules. Hence, we developed a multiscale CNN architecture (Fig. 3).
Fig. 2.

A vanilla CNN architecture for the classification of the detected pulmonary nodules.
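A minimal PyTorch sketch of the vanilla 3D architecture with the basic backbone (Figs. 2 and 4): convolution, batch normalization, ReLU, and pooling of size 2 in each block, followed by flattening, an FC layer with ReLU, and a softmax output. The filter counts, FC width, and input size are assumptions, as the paper does not list them; the softmax is placed inside the forward pass only to mirror the figure (in practice one would train on logits with a cross-entropy loss).

```python
import torch
import torch.nn as nn


def conv_block(cin, cout):
    # convolution (no padding) -> batch norm -> ReLU -> pooling of size 2,
    # as in the basic backbone of Fig. 4
    return nn.Sequential(
        nn.Conv3d(cin, cout, kernel_size=3),
        nn.BatchNorm3d(cout),
        nn.ReLU(inplace=True),
        nn.MaxPool3d(2),
    )


class VanillaCNN3D(nn.Module):
    """Vanilla framework with basic backbone (model-VB-3D), hypothetical sizes."""

    def __init__(self, n_classes=2):
        super().__init__()
        self.backbone = nn.Sequential(
            conv_block(1, 16), conv_block(16, 32), conv_block(32, 64)
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(128),  # infers the flattened size at first forward
            nn.ReLU(inplace=True),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        return torch.softmax(self.head(self.backbone(x)), dim=1)
```

The 2D counterpart is obtained by swapping `Conv3d`/`BatchNorm3d`/`MaxPool3d` for their 2D versions and feeding the pseudo-RGB image instead of the subvolume.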
Fig. 3.
Illustration of the developed multiscale CNN architecture formed by three branches. The volumetric input image is cropped to create two additional branches that are centered on a given nodule but have progressively smaller dimensions. Average-pooling and up-sampling operations are used to resize these subvolumes to a uniform dimension.
The multiscale CNN architecture is formed by three branches (Fig. 3). These branches take image inputs with the same center but different dimensions. We resized all of these images to a uniform size via pooling and up-sampling operations on the fly. The resized images are passed separately to a backbone network and later concatenated and flattened, followed by the FC layers and the ReLU. Similarly, the output layer is activated by a softmax function to obtain the predicted probability of each class.
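The branch preparation can be sketched with plain array operations. The concrete dimensions (a 64-voxel input, a 32-voxel target, and crops of 32 and 16 voxels) are hypothetical stand-ins for the unspecified values in the paper; only the pattern (pool the wide view down, pass the middle crop through, up-sample the narrow crop) follows Fig. 3.

```python
import numpy as np


def avg_pool3d(x, k):
    """Non-overlapping average pooling with cube size k."""
    d, h, w = x.shape
    return x.reshape(d // k, k, h // k, k, w // k, k).mean(axis=(1, 3, 5))


def upsample3d(x, k):
    """Nearest-neighbor up-sampling by an integer factor k."""
    return x.repeat(k, 0).repeat(k, 1).repeat(k, 2)


def center_crop3d(x, size):
    s = [(n - size) // 2 for n in x.shape]
    return x[s[0]:s[0] + size, s[1]:s[1] + size, s[2]:s[2] + size]


def multiscale_branches(volume, target=32):
    """Produce the three same-size branch inputs of the multiscale network.
    Hypothetical dims: 64 -> pooled to 32; 32 crop kept; 16 crop up-sampled."""
    wide = avg_pool3d(volume, volume.shape[0] // target)      # coarse context
    mid = center_crop3d(volume, target)                       # native scale
    narrow = upsample3d(center_crop3d(volume, target // 2), 2)  # magnified core
    return wide, mid, narrow
```

Each branch is then fed to its own copy of the backbone before concatenation, so the network sees the nodule at three effective magnifications without changing the backbone's input size.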
The backbone network is a critical component of the aforementioned CNN architectures. A variety of backbone networks have been developed.47,48 In this study, we used a basic backbone network formed by a sequence of convolution blocks with different numbers of filters (Fig. 4). For a fair comparison, the 2D version of each model was obtained by replacing the 3D convolutional kernels with their 2D counterparts.
Fig. 4.
Illustration of the basic backbone network formed by a sequence of convolution blocks with different numbers of filters (without padding). Each convolutional layer has a fixed filter size (with the corresponding 2D kernel in the 2D version) and a different number of filters. A batch normalization layer, the ReLU, and a pooling size of 2 were used.
2.3. Training of the 2D- and 3D-CNN Models
On the basis of the aforementioned CNN architectures, we developed and trained four models (Table 2). We split the generated 2D/3D samples at the patient level into three groups: (1) training, (2) validation, and (3) independent testing. The voxel intensities were first clipped to a fixed Hounsfield unit window and then normalized to [0, 1]. To improve reliability and prevent overfitting, we performed online data augmentation during training, including affine transformations (e.g., flip, scale, translation, and rotation), Gaussian additive noise, and intensity shifts. All learnable parameters were initialized with Xavier initialization49 and updated with the Adam stochastic optimization algorithm.50 Learning-rate reduction on plateau and early stopping were adopted during training. The parameters for training the 2D- and 3D-CNN models are summarized in Table 2. The development of these models was performed on a PC with a 3.50-GHz Intel® Xeon® E5-1620 CPU, 8 GB of RAM, and an NVIDIA Quadro P5000 GPU.
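The intensity preprocessing step amounts to a clip-then-rescale. The concrete window bounds below (-1000 to 400 HU, a common lung window) are an assumption; the paper's actual bounds are not stated in this excerpt.

```python
import numpy as np

# hypothetical Hounsfield window; the paper's clipping bounds are unspecified
HU_MIN, HU_MAX = -1000.0, 400.0


def preprocess(volume_hu):
    """Clip to a fixed Hounsfield unit window, then scale linearly to [0, 1]."""
    v = np.clip(volume_hu.astype(np.float32), HU_MIN, HU_MAX)
    return (v - HU_MIN) / (HU_MAX - HU_MIN)
```

Applying the same transform to training, validation, and test samples keeps the input distribution consistent across the three splits.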
Table 2.
Developed classification models.
| | Model-VB-2D/Model-MB-2D | Model-VB-3D/Model-MB-3D |
|---|---|---|
| Batch size | 128 | 8 |
| Initial learning rate | 0.001 | 0.001 |
| Learning rate decay patience | 2 | 1 |
| Learning rate decay factor | 0.2 | 0.1 |
| Early stop patience | 15 | 5 |
Note. VB: vanilla framework with basic backbone; and MB: multiscale framework with basic backbone.
2.4. Performance Validation
We assessed the performance of the developed classification models on the independent testing dataset using nonparametric receiver operating characteristic (ROC) analysis. We note that no image in the testing dataset was involved in algorithm development or the training procedures. We also summarized the average training/inference time, memory requirement, and floating-point operations (FLOPs) of the 2D- and 3D-CNN models for comparison. The number of FLOPs was used as an index of the computational complexity of the CNN models. Whether there was a significant difference in performance among the developed CNN models was verified using DeLong's test.51 The 95% confidence intervals (CI) and p-values were computed in all analyses. A p-value below 0.05 was considered statistically significant. The analyses were performed using IBM SPSS Version 24.
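For orientation, the nonparametric AUC is the Mann-Whitney statistic over all (positive, negative) score pairs. The sketch below computes it directly and adds a bootstrap CI; note that this bootstrap is a simple substitute for illustration only, since the paper's CIs come from DeLong's method, and all parameter values here (resample count, seed) are arbitrary.

```python
import numpy as np


def auc(labels, scores):
    """Nonparametric AUC: fraction of (positive, negative) score pairs ranked
    correctly, with ties counted as half."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (illustrative stand-in for DeLong)."""
    rng = np.random.default_rng(seed)
    stats = []
    n = len(labels)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if labels[idx].min() == labels[idx].max():
            continue  # a resample must contain both classes
        stats.append(auc(labels[idx], scores[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi
```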
3. Results
Model-MB-3D had the best classification performance, with an area under the receiver operating characteristic curve (AUC) of 0.9541 (95% CI: 0.9495 to 0.9583), a sensitivity of 89.98%, and a specificity of 87.30% (Table 3). In contrast, the best 2D-CNN model was model-VB-2D, with an AUC of 0.9307, a sensitivity of 92.70%, and a specificity of 76.21%. Although DeLong's test showed a statistically significant difference in classification performance between the 2D- and 3D-CNN models, the performance difference was fairly limited. In addition, the proposed multiscale strategy outperformed its single-scale counterpart in the 3D setting (Table 3).
Table 3.
Comparison of the classification performance of the 2D-/3D-CNN models and the involved numbers of the parameters.
| Model | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUC (95% CI) |
|---|---|---|---|---|
| Model-VB-2D | 92.70 | 76.21 | 84.63 | 0.9307 (0.9285, 0.9330) |
| Model-MB-2D | 77.48 | 88.21 | 82.73 | 0.9205 (0.9181, 0.9229) |
| Model-VB-3D | 85.12 | 88.87 | 86.99 | 0.9375 (0.9324, 0.9430) |
| Model-MB-3D | 89.98 | 87.30 | 88.64 | 0.9541 (0.9495, 0.9583) |
We also summarized the computational cost of the 2D-/3D-CNN models on our desktop (GPU: NVIDIA RTX 2070S; CPU: 3.00-GHz Intel® Core™ i7-9700) in Table 4. The number of FLOPs was much higher for the 3D models than for the 2D models. The batch size for training the two proposed 2D models could be set to a maximum of 800 and 1000, respectively, whereas the maximum batch size for the two proposed 3D models was only 32 and 24, respectively. It took more than 10 h to train a 3D model, whereas the training time of a 2D model was only about 4 to 6 h. Once the trained models were used for prediction, the difference in computational cost between the 2D and 3D models was narrow, namely 0.02 to 0.03 s to process a single image.
Table 4.
Summary of the computational cost of the 2D-/3D-CNN models.
| Model | Maximum batch size in training | Training time (h) | Prediction time per image (s) | #Params (M) | #FLOPs (M) |
|---|---|---|---|---|---|
| Model-VB-2D | 800 | ~4 to 6 | 0.0158 | 0.65 | 1.31 |
| Model-MB-2D | 1000 | ~4 to 6 | 0.0200 | 1.37 | 2.74 |
| Model-VB-3D | 32 | >10 | 0.0210 | 1.69 | 3.39 |
| Model-MB-3D | 24 | >10 | 0.0260 | 3.69 | 7.40 |
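The parameter gap between 2D and 3D models comes directly from the kernel volume: a k x k x k kernel carries k times the weights of a k x k kernel. A small worked example (the channel counts 16 and 32 and kernel width 3 are illustrative, not the paper's actual layer sizes):

```python
def conv_params(c_in, c_out, k, dims):
    """Learnable parameters in one convolution layer: weights plus biases."""
    return c_out * (c_in * k ** dims + 1)


# hypothetical layer: 16 -> 32 channels, kernel width 3
p2d = conv_params(16, 32, 3, dims=2)  # 3x3 kernels
p3d = conv_params(16, 32, 3, dims=3)  # 3x3x3 kernels
```

For this layer, the 3D version carries roughly three times the parameters of the 2D one, and the FLOPs grow further still because each kernel also slides over an extra spatial dimension, consistent with the trend in Table 4.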
4. Discussion
In this study, we used a relatively large and diverse dataset (1600 CT scans), acquired using different protocols and from various medical centers, to compare the performance of 3D-CNN models and their 2D counterparts in classifying computer-detected nodule candidates. To our knowledge, no prior investigation has used as large a dataset as this study did. We expect that this relatively large and diverse dataset could support a reliable conclusion. In practice, we always need to decide which type of CNN model (i.e., 2D or 3D) should be used for a specific application. Our results demonstrate that a 3D-CNN model narrowly outperformed a 2D-CNN model in classifying computer-detected candidate nodules as true- or false-positive detections. Since it is much easier to train and implement a 2D-CNN model than a 3D-CNN model (Table 4), it is desirable to understand the trade-off between the performance gain, which may or may not be statistically significant, and the increased computational cost, such as memory and training time.
In addition, we developed a multiscale strategy in this study in an attempt to maximize the potential of both 2D and 3D images for classification. As demonstrated by our experiments (Table 3), the multiscale strategy improved performance for the 3D classification task, although the gain did not carry over to the 2D task. Notably, the objective of this study is not to develop a novel CNN model that maximizes classification performance but to clarify whether and to what extent a 3D CNN is superior to a 2D CNN when applied to reduce false-positive detections in LDCT lung cancer screening.
We are aware that different implementations of the CNN architectures and the training methods (e.g., data and parameters) may lead to different results. Hence, to perform a relatively fair comparison of the 2D and 3D models, we used the same dataset and the same CNN architecture. In past years, a large number of investigations have classified 2D or 3D images,31–35,38,39,44 as summarized in Table 5. Although these studies used different datasets, they did not report the expected clear advantage of 3D CNNs over their 2D counterparts, which is consistent with our findings. Our study used a much larger dataset than other investigations did, yet the differences between 2D- and 3D-CNN architectures in classifying lung nodules remain limited. Given the much richer information in 3D images, we expected obvious differences in classification performance between the 3D-CNN and 2D-CNN architectures. Unfortunately, our experimental results do not support this. Additional investigative efforts are needed to clarify the underlying reasons and to develop novel 3D-CNN architectures that could maximize the potential of the rich information in 3D volumetric images.
Table 5.
Comparison of the classification performance of the previous 2D-/3D-CNN models and the involved number of nodules.
| Model | Dim | #Candidates/#nodules/#non-nodules | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUC (%) |
|---|---|---|---|---|---|---|
| MV-ResNet38 | 2D | 831/421/410 | 91.1 | 88.6 | 89.9 | 94.6 |
| SL-CNN31 | 2D | 1882/—/— | 78.6 | 91.2 | 86.7 | — |
| NL-CNN31 | 2D | 1882/—/— | 88.5 | 86.0 | 87.3 | — |
| CNN31 | 3D | 1882/—/— | 89.4 | 85.2 | 87.4 | — |
| PN-SAMP32 | 3D | 1404/898/506 | — | — | 97.6 | — |
| MC-CNN33 | 3D | 1375/496/880 | 77 | 93 | 87.1 | 93 |
| MS-CNN34 | 3D | 1375/496/880 | — | — | 86.8 | — |
| HS-CNN35 | 3D | 4252/3212/1040 | 70.5 | 88.9 | 84.2 | 85.6 |
| CNN39 | 3D | 326/163/163 | — | — | — | 73.2 |
| MO-DenseNet44 | 3D | 686/316/370 | 90.5 | 90.3 | 90.4 | — |
Note: — denotes not available.
There are limitations to this study. First, over the past decade, we have been actively working on medical image analysis related to lung cancer. More than 10 thoracic radiologists have been involved in reviewing the chest CT images from the different sources presented in this study to establish a dataset with "ground truth." Considering that the interpretation of chest CT images for the purpose of nodule detection is not very challenging for a thoracic radiologist, we did not ask them to read the images as a group but independently. As a result, some interpretation errors are unavoidable; however, these errors and their effect on the results should be very limited and should not change our conclusion. Second, the developed multiscale strategy may not be the optimal solution for the nodule classification problem. As we explained, our emphasis is not on the development of classification models but on the clarification of the performance difference between 3D and 2D CNNs in classification. Hence, we did not perform an extensive comparison with other available CNN architectures. However, compared with other available methods (Table 5), our developed models demonstrated very promising classification performance. Third, the parameters involved in the CNN models may not be fully optimized for this particular classification task. The training of a CNN model involves a number of parameters, such as the learning rate, batch size, filter numbers, and the choice of optimizer. Tuning these parameters may somewhat affect model performance. However, the most important factor affecting the performance of a CNN model is the scale and diversity of the dataset used for model development. Given the relatively large and diverse dataset in this study, we believe that the impact of the training parameter choices on our conclusion should be limited.
5. Conclusions
We developed a multiscale CNN framework to perform binary classification of computer detected lung nodule candidates (i.e., false positive and true positive) depicted on LDCT scans. We implemented the 2D and 3D versions of this framework to clarify whether and to what extent a 3D CNN outperforms its 2D counterpart in classifying lung nodules. Our results on a relatively large and diverse dataset showed that the proposed 2D-CNN and 3D-CNN models had comparable performance to classify candidate lung nodules, albeit the latter one yielded slightly better performance. We believe that the existing CNN architectures may not fully utilize the rich information in 3D images. Additional investigative effort may be needed to develop sophisticated CNN models to maximize the rich information in 3D images.
Acknowledgments
This work was supported in part by the National Institutes of Health (NIH) (Grant Nos. R01-CA237277 and R01-HL096613).
Biographies
Juezhao Yu is a research assistant in the Department of Radiology at the University of Pittsburgh. His research interests are in machine learning, medical image analysis, and computer-aided diagnosis.
Bohan Yang is a research assistant in the Department of Radiology at the University of Pittsburgh. His research interest is primarily related to the development of machine learning algorithms for quantitative image analysis.
Jing Wang is a research assistant in the Department of Radiology at the University of Pittsburgh. Her research interest focuses on quantitative medical image analysis and computer graphics.
Joseph Leader is a research assistant professor of radiology at the University of Pittsburgh. His primary research emphasis is on quantitative lung image analysis.
David Wilson is an associate professor of medicine at the University of Pittsburgh. His primary clinical practice and research interest include lung cancer screening and chemoprevention, diagnosis, staging, and treatment; COPD, especially as it relates to lung cancer.
Jiantao Pu is an associate professor of radiology and bioengineering at the University of Pittsburgh. The primary objective of his research is to develop innovative computational tools for non-invasive, quantitative, and accurate assessment of pathological conditions.
Disclosures
The authors have no relevant financial interests in the manuscript and no other potential conflicts of interest to disclose.
Contributor Information
Juezhao Yu, Email: yujuezhao1997@gmail.com.
Bohan Yang, Email: boy18@pitt.edu.
Jing Wang, Email: vwjing@163.com.
Joseph Leader, Email: leaderjk@upmc.edu.
David Wilson, Email: wilsondo@upmc.edu.
Jiantao Pu, Email: puj@upmc.edu.
References
- 1.Siegel R. L., Miller K. D., Jemal A., “Cancer statistics,” CA Cancer J. Clin. 69(1), 7–34 (2019). 10.3322/caac.21551 [DOI] [PubMed] [Google Scholar]
- 2.Midthun D. E., “Early diagnosis of lung cancer,” F1000Prime Rep. 5, 12 (2013). 10.12703/P5-12 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.National Lung Screening Trial Research Team et al. , “Reduced lung-cancer mortality with low-dose computed tomographic screening,” N. Engl. J. Med. 365(5), 395–409 (2011). 10.1056/NEJMoa1102873 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Pinsky P. F., et al. , “Performance of lung-RADS in the National Lung Screening Trial: a retrospective assessment,” Ann. Intern Med. 162(7), 485–491 (2015). 10.7326/M14-2086 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.National Lung Screening Trial Research Team et al. “The National Lung Screening Trial: overview and study design,” Radiology 258(1), 243–253 (2011). 10.1148/radiol.10091808 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Aberle D. R., et al. , “Results of the two incidence screenings in the National Lung Screening Trial,” N. Engl. J. Med. 369(10), 920–931 (2013). 10.1056/NEJMoa1208962 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Murphy K., et al. , “A large-scale evaluation of automatic pulmonary nodule detection in chest CT using local image features and k-nearest-neighbour classification,” Med. Image Anal. 13(5), 757–770 (2009). 10.1016/j.media.2009.07.001 [DOI] [PubMed] [Google Scholar]
- 8.Choi W. J., Choi T. S., “Automated pulmonary nodule detection based on three-dimensional shape-based feature descriptor,” Comput. Methods Prog. Biomed. 113(1), 37–54 (2014). 10.1016/j.cmpb.2013.08.015 [DOI] [PubMed] [Google Scholar]
- 9.Setio A. A., et al. , “Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks,” IEEE Trans. Med. Imaging 35(5), 1160–1169 (2016). 10.1109/TMI.2016.2536809 [DOI] [PubMed] [Google Scholar]
- 10.Sim Y., et al. , “Deep convolutional neural network-based software improves radiologist detection of malignant lung nodules on chest radiographs,” Radiology 294(1), 199–209 (2020). 10.1148/radiol.2019182465 [DOI] [PubMed] [Google Scholar]
- 11.Halder A., Dey D., Sadhu A. K., “Lung nodule detection from feature engineering to deep learning in thoracic CT images: a comprehensive review,” J. Digital Imaging 33, 655–677 (2020). 10.1007/s10278-020-00320-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Pu J., et al. , “An automated CT based lung nodule detection scheme using geometric analysis of signed distance field,” Med. Phys. 35(8), 3453–3461 (2008). 10.1118/1.2948349 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Gurcan M. N., et al. , “Lung nodule detection on thoracic computed tomography images: preliminary evaluation of a computer-aided diagnosis system,” Med. Phys. 29(11), 2552–2558 (2002). 10.1118/1.1515762 [DOI] [PubMed] [Google Scholar]
- 14.Messay T., Hardie R. C., Rogers S. K., “A new computationally efficient CAD system for pulmonary nodule detection in CT imagery,” Med. Image Anal. 14(3), 390–406 (2010). 10.1016/j.media.2010.02.004 [DOI] [PubMed] [Google Scholar]
- 15.Tan M., et al. , “A novel computer-aided lung nodule detection system for CT images,” Med. Phys. 38(10), 5630–5645 (2011). 10.1118/1.3633941 [DOI] [PubMed] [Google Scholar]
- 16.Choi W.-J., Choi T.-S., “Automated pulmonary nodule detection based on three-dimensional shape-based feature descriptor,” Comput. Methods Prog. Biomed. 113(1), 37–54 (2014). 10.1016/j.cmpb.2013.08.015 [DOI] [PubMed] [Google Scholar]
- 17.Jacobs C., et al. , “Automatic detection of subsolid pulmonary nodules in thoracic computed tomography images,” Med. Image Anal. 18(2), 374–384 (2014). 10.1016/j.media.2013.12.001 [DOI] [PubMed] [Google Scholar]
- 18.Han F., et al. , “Texture feature analysis for computer-aided diagnosis on pulmonary nodules,” J. Digital Imaging 28(1), 99–115 (2015). 10.1007/s10278-014-9718-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Teramoto A., et al. , “Automated detection of pulmonary nodules in PET/CT images: ensemble false-positive reduction using a convolutional neural network technique,” Med. Phys. 43(6), 2821–2827 (2016). 10.1118/1.4948498 [DOI] [PubMed] [Google Scholar]
- 20.Firmino M., et al. , “Computer-aided detection (CADe) and diagnosis (CADx) system for lung cancer with likelihood of malignancy,” Biomed. Eng. Online 15, 2 (2016). 10.1186/s12938-015-0120-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Pehrson L. M., Nielsen M. B., Ammitzbøl Lauridsen C., “Automatic pulmonary nodule detection applying deep learning or machine learning algorithms to the LIDC-IDRI database: a systematic review,” Diagnostics (Basel) 9(1), 29 (2019). 10.3390/diagnostics9010029 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Wang X., et al. , “An appraisal of lung nodules automatic classification algorithms for CT images,” Sensors (Basel) 19(1), 194 (2019). 10.3390/s19010194 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Zheng S., et al. , “Automatic pulmonary nodule detection in CT scans using convolutional neural networks based on maximum intensity projection,” IEEE Trans. Med. Imaging 39(3), 797–805 (2020). 10.1109/TMI.2019.2935553 [DOI] [PubMed] [Google Scholar]
- 24.Xie H., et al. , “Automated pulmonary nodule detection in CT images using deep convolutional neural networks,” Pattern Recognit. 85, 109–119 (2019). 10.1016/j.patcog.2018.07.031 [DOI] [Google Scholar]
- 25.Wu J., Qian T., “A survey of pulmonary nodule detection, segmentation and classification in computed tomography with deep learning techniques,” J. Med. Artif. Intell. 2(8), 1–12 (2019). 10.21037/jmai.2019.04.01 [DOI] [Google Scholar]
- 26.Winkels M., Cohen T. S., “Pulmonary nodule detection in CT scans with equivariant CNNs,” Med. Image Anal. 55, 15–26 (2019). 10.1016/j.media.2019.03.010 [DOI] [PubMed] [Google Scholar]
- 27.Liao F., et al. , “Evaluate the malignancy of pulmonary nodules using the 3-D deep leaky noisy-OR network,” IEEE Trans. Neural Networks Learn. Syst. 30(11), 3484–3495 (2019). 10.1109/TNNLS.2019.2892409 [DOI] [PubMed] [Google Scholar]
- 28.Jiang H., et al. , “An automatic detection system of lung nodule based on multigroup patch-based deep learning network,” IEEE J. Biomed. Health Inf. 22(4), 1227–1237 (2018). 10.1109/JBHI.2017.2725903 [DOI] [PubMed] [Google Scholar]
- 29.Dou Q., et al., Eds., Automated Pulmonary Nodule Detection via 3D ConvNets with Online Sample Filtering and Hybrid-Loss Residual Learning, Springer International Publishing, Cham: (2017). [Google Scholar]
- 30.da Silva G. L. F., et al. , “Convolutional neural network-based PSO for lung nodule false positive reduction on CT images,” Comput. Methods Prog. Biomed. 162, 109–118 (2018). 10.1016/j.cmpb.2018.05.006 [DOI] [PubMed] [Google Scholar]
- 31.Yan X., et al., Eds., Classification of Lung Nodule Malignancy Risk on Computed Tomography Images Using Convolutional Neural Network: A Comparison Between 2D and 3D Strategies, Springer International Publishing, Cham: (2017). [Google Scholar]
- 32.Wu B., et al., Eds., “Joint learning for pulmonary nodule segmentation, attributes and malignancy prediction,” in IEEE 15th Int. Symp. Biomed. Imaging (ISBI 2018) (2018). 10.1109/ISBI.2018.8363765 [DOI] [Google Scholar]
- 33.Shen W., et al. , “Multi-crop convolutional neural networks for lung nodule malignancy suspiciousness classification,” Pattern Recognit. 61, 663–673 (2017). 10.1016/j.patcog.2016.05.029 [DOI] [Google Scholar]
- 34.Shen W., et al., Eds., Multi-scale Convolutional Neural Networks for Lung Nodule Classification, Springer International Publishing, Cham: (2015). [DOI] [PubMed] [Google Scholar]
- 35.Shen S., et al. , “An interpretable deep hierarchical semantic convolutional neural network for lung nodule malignancy classification,” Expert Syst. Appl. 128, 84–95 (2019). 10.1016/j.eswa.2019.01.048 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Setio A. A. A., et al. , “Automatic detection of large pulmonary solid nodules in thoracic CT images,” Med. Phys. 42(10), 5642–5653 (2015). 10.1118/1.4929562 [DOI] [PubMed] [Google Scholar]
- 37.Onishi Y., et al. , “Automated pulmonary nodule classification in computed tomography images using a deep convolutional neural network trained by generative adversarial networks,” Biomed. Res. Int. 2019, 6051939 (2019). 10.1155/2019/6051939 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Nibali A., He Z., Wollersheim D., “Pulmonary nodule classification with deep residual networks,” Int. J. Comput. Assist. Radiol. Surg. 12(10), 1799–1808 (2017). 10.1007/s11548-017-1605-6 [DOI] [PubMed] [Google Scholar]
- 39.Liu S., et al. , “Pulmonary nodule classification in lung cancer screening with three-dimensional convolutional neural networks,” J. Med. Imaging (Bellingham) 4(4), 041308 (2017). 10.1117/1.JMI.4.4.041308 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Kim B.-C., et al. , “Multi-scale gradual integration CNN for false positive reduction in pulmonary nodule detection,” Neural Networks 115, 1–10 (2019). 10.1016/j.neunet.2019.03.003 [DOI] [PubMed] [Google Scholar]
- 41.Jung H., et al. , “Classification of lung nodules in CT scans using three-dimensional deep convolutional neural networks with a checkpoint ensemble method,” BMC Med. Imaging 18(1), 48 (2018). 10.1186/s12880-018-0286-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Jin H., et al. , “A deep 3D residual CNN for false-positive reduction in pulmonary nodule detection,” Med. Phys. 45(5), 2097–2107 (2018). 10.1002/mp.12846 [DOI] [PubMed] [Google Scholar]
- 43.Dou Q., et al. , “Multilevel contextual 3-D CNNs for false positive reduction in pulmonary nodule detection,” IEEE Trans. Biomed. Eng. 64(7), 1558–1567 (2017). 10.1109/TBME.2016.2613502 [DOI] [PubMed] [Google Scholar]
- 44.Dey R., Lu Z., Hong Y., Eds., “Diagnostic classification of lung nodules using 3D neural networks,” in IEEE 15th Int. Symp. Biomed. Imaging (ISBI 2018) (2018). [Google Scholar]
- 45.Armato S. G., 3rd, et al. , “The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans,” Med. Phys. 38(2), 915–931 (2011). 10.1118/1.3528204 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Wilson D. O., et al. , “The Pittsburgh Lung Screening Study (PLuSS): outcomes within 3 years of a first computed tomography scan,” Am. J. Respir. Crit. Care Med. 178(9), 956–961 (2008). 10.1164/rccm.200802-336OC [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Huang G., et al., Eds., “Densely connected convolutional networks,” in IEEE Conf. Comput. Vision and Pattern Recognit. (CVPR) (2017). [Google Scholar]
- 48.He K., Ed., “Deep residual learning for image recognition,” in IEEE Conf. Comput. Vision and Pattern Recognit. (CVPR) (2016). [Google Scholar]
- 49.Glorot X., Bengio Y., “Understanding the difficulty of training deep feedforward neural networks,” in Proc. Thirteenth Int. Conf. Artif. Intell. and Stat.; Proc. Mach. Learn. Res., Yee Whye T., Mike T., Eds., pp. 249–256 (2010). [Google Scholar]
- 50.Kingma D. P., Ba J., “Adam: a method for stochastic optimization,” arXiv:1412.6980 (2014). [Google Scholar]
- 51.DeLong E. R., DeLong D. M., Clarke-Pearson D. L., “Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach,” Biometrics 44(3), 837–845 (1988). 10.2307/2531595 [DOI] [PubMed] [Google Scholar]