Abstract
Lung cancer is one of the leading causes of cancer-related fatality in the world. Patients display few or even no signs or symptoms in the early stages, resulting in up to 75% of patients being diagnosed in the later stages of the disease. Consequently, there has been a call for lung cancer screening amongst at-risk populations. The early detection of malignant pulmonary nodules on CT is one of the methods proposed to diagnose early-stage lung cancer; however, the reported sensitivity of radiologists in detecting pulmonary nodules ranges widely, from 30 to 97%. In 2012, Alex Krizhevsky and colleagues presented a paper titled “ImageNet Classification with Deep Convolutional Neural Networks”, in which a multilayered computational model known as a convolutional neural network (CNN) classified 1.2 million images to a previously unseen level of accuracy. Since then, CNNs have gained attention as a potential tool to aid radiologists’ detection of pulmonary nodules in CT imaging. This review found that the use of CNNs is a viable strategy to increase the overall sensitivity of pulmonary nodule detection. Small, non-validated data sets, computational constraints, and incomparable studies are currently limiting factors of the existing research.
Introduction
Lung cancer is one of the most common causes of cancer-related death in the world.1 A number of surgical, percutaneous, and medical therapies can achieve a cure in many individuals, provided the tumour is small and localised at diagnosis.2 Unfortunately, since there are few to no symptoms in the early stages of the disease, 75% of lung cancers are diagnosed in the later stages, with advanced local disease, nodal spread, and/or metastatic disease.3 Thus, according to Australian research, patients diagnosed with lung cancer have an overall 5-year survival rate of 15%.4
Fundamental to the early diagnosis of lung cancer is the detection of malignant pulmonary nodules.3, 5 Pulmonary nodules have traditionally been defined radiologically as intraparenchymal circular opacities less than 3 cm in diameter. Unfortunately, not all pulmonary nodules represent malignancy, and not all lung cancers appear as well-defined nodules. Additionally, end-on pulmonary vessels can mimic the appearance of solitary nodules.6 Because of these multiple confounding factors in detecting and classifying pulmonary nodules, false-positives have become problematic in large-scale screening programs, with one low-dose CT program reporting that around 96.4% of positive results were false-positives.7 Thus, the intrinsic appearance of the nodule, in conjunction with the patient's history and risk factors, must be considered.
It is essential to establish whether a pulmonary nodule is benign or malignant while maintaining a low false-positive rate. Since the advent of lung cancer screening via CT, radiologists have spent countless working hours detecting, evaluating and characterising pulmonary nodules.8 This has prompted targeted research into the sensitivity of radiologists in identifying and characterising pulmonary nodules, with reported sensitivities (dependent on departmental reference standards) ranging from 30 to 97% and a false-positive rate as high as 2.1 per scan.8
The potential of CNNs drew heightened attention in 2012, when the paper “ImageNet Classification with Deep Convolutional Neural Networks” reported the classification of 1.2 million high-resolution images, sourced from the internet, to a previously unseen level of accuracy.9 Although the paper’s subject was the classification of everyday images rather than anything radiological, it sparked interest in broader use.9–15
This review explores and delineates the current literature on the use of CNNs to improve pulmonary nodule detection with CT.
Computer-aided detection
The escalation in radiology workload due to higher volumes of lung screening CT scans, combined with a variable sensitivity rate in assessing for pulmonary nodules, has resulted in substantial research into the use of computer-aided detection (CAD) systems.5, 16,17 These systems aim to reduce reporting time, increase sensitivity, and potentially reduce the rate of false-positives.18 CAD is utilised to “flag” pulmonary nodules that meet pre-set criteria based on elements such as shape index and curvedness. Flagged nodules are then reviewed by a radiologist to determine whether or not they should be considered suspicious, requiring further assessment or follow-up.17
CAD in pulmonary nodule assessment with CT has been utilised in radiology departments since as early as 2002.19 Traditional CAD systems have low specificity, resulting in a considerable quantity of false-positives requiring manual rejection, with some systems citing a false-positive rate of up to 25.4 per scan.20 Although CAD has been proven to improve reporting efficiency, its high dependence on image processing and its false-positive rate remain crucial flaws.12, 17
Convolutional neural network
The application of CAD is an example of carefully crafted task-specific engineering, with a well-designed feature-set, and considerable human skill and time employed to teach the system how to interpret various features of a nodule, and when to flag it as suspicious.17–19,21
Recent advances in computing power and graphics processing have, however, enabled a method of computational modelling known as the CNN, a form of neural network designed to process data that come in the form of multiple arrays, such as images.9
Unlike bespoke CAD systems, CNNs can self-determine previously unknown features, maximising classification with limited direct supervision.
Based on a hierarchical structure of analysis and retrospective adjustment, its primary advantage is in its ability to learn from abundant sources of verified data, differing from the fundamental processes of CAD.13
A CNN is purpose-built to compare and assess images portion by portion. Once provided an input (an image), the CNN will apply multiple feature-based convolution filters to the matrix, resulting in a set of filtered images otherwise known as feature maps. This is the first layer, known as the “convolution layer” (Figure 1).
Figure 1.
Typical steps of a simple CNN, depicting (a) segmentation of the region of interest; (b) convolved feature maps and pooling for feature extraction; (c) the fully connected layers; (d) prediction based on a predetermined variable. CNN, convolutional neural network.
Once the set of filtered images is created, they are processed through the second layer, known as “pooling”, where they are downsized to a smaller matrix. An advantage of pooling is its ability to preserve important information by retaining the maximum pixel value from each small window of a filtered image. During the convolution process, pixels may take negative values. The additional “rectified linear unit” layer sets all negative values to zero, introducing the non-linearity the network needs while keeping the matrix simple to analyse.22, 23
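The convolution, rectified linear unit and pooling steps described above can be illustrated with a minimal sketch in plain Python. The 6 × 6 "image", the 3 × 3 edge filter and the function names are illustrative assumptions rather than values or code from any of the reviewed studies; a practical CNN learns many such filters and would use an optimised library.

```python
# Toy pass through convolution -> rectified linear unit -> max pooling.
# The image and filter values are made up for illustration (not CT data).

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most deep learning libraries, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Set every negative value to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# A bright blob on a dark background, standing in for a nodule candidate.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 9, 9, 9, 9, 0],
    [0, 9, 9, 9, 9, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Hypothetical filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)  # 4 x 4, contains negative values
rectified = relu(feature_map)            # negatives replaced by zero
pooled = max_pool(rectified)             # 2 x 2 -> [[27, 0], [27, 0]]
```

The pooled output keeps the strong response on the left edge of the blob and discards the rest, which is the sense in which pooling shrinks the matrix while preserving the most salient value in each window.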
The final results of the convolution feed into a “fully connected layer” that presents each separate filtered image as a “vote”. Each vote has a “weight” in determining the category of an image. Before a CNN can detect nodules in practice, it is presented with a large number of input images for which the answer is already known, allowing it to learn via backpropagation. Each time the CNN makes an error, it will self-adjust, assigning higher or lower weights to the “vote” of each feature pixel in the process that feeds into the fully connected layer.24 Each CNN must experience this period of adjustment, known as the “training phase”.23 This is, of course, easier said than done, as it is no small feat to establish an extensive image database with genuinely accurate diagnoses for a CNN to learn from. However, the advent of lung cancer screening has provided researchers with a substantial amount of validated data, much of which has been used to train CNNs in the learning phase of pulmonary nodule detection.
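As a rough sketch of this error-driven adjustment, the example below trains a single layer of weighted "votes" by gradient descent. It is a deliberate simplification: a real CNN backpropagates the error through every convolution and fully connected layer, whereas this sketch updates only one layer. The feature values, labels, learning rate and function names are all hypothetical.

```python
import math

def sigmoid(z):
    """Squash a weighted sum into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Weighted vote of the features, expressed as P(nodule)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def train(samples, labels, epochs=200, learning_rate=0.5):
    """Adjust the weights a little after every labelled example."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            # Positive error: the vote was too high; negative: too low.
            error = predict(weights, bias, features) - label
            # Move each weight against its contribution to the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical pooled feature values; label 1 = nodule, 0 = non-nodule.
samples = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.7]]
labels = [1, 1, 0, 0]

weights, bias = train(samples, labels)
```

After training, the first feature carries a positive weight and the second a negative one, so `predict(weights, bias, [0.9, 0.1])` exceeds 0.5 while `predict(weights, bias, [0.1, 0.7])` falls below it.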
Training a CNN on labelled examples in this way is known as supervised learning, a process that has succeeded in achieving accurate large-scale photographic image classification.23
The recent success of a CNN in photographic image classification9 has led researchers to explore its viability as a diagnostic tool to detect and characterise pulmonary nodules on CT images.
CNN methods
Each paper featuring the use of CNNs as a tool for detecting and evaluating pulmonary nodules has employed a different methodology to train, assess, and validate data sets. The architecture of a CNN is dependent on the size of the input image, often referred to as the region of interest (ROI). The ROI must be pre-determined and all images resized to fit that value before beginning the study. Factors that differ from study to study include the number of convolution layers in the CNN, differences in CNN architecture, the layer depth, and testing methods. These are summarised in Table 1.
Table 1.
| Study | Region of interest size | Convolution layers | Testing method |
| Li et al12 | 32 × 32 pixels | 2 | 10-fold cross-validation test and a data set divided into both training data and testing data |
| Hussein et al27 | 0.5 × 0.5 mm | 5 + Gaussian process regression | 10-fold cross-validation test and a data set divided into both training data and testing data |
| Cheng et al26 | 28 × 28 pixels | 2 | 10-fold cross-validation test |
| Gruetzemacher and Gupta25 | 36 × 36 pixels | 3–6 | Cross-validation test |
| Nibali et al13 | 64 × 64 pixels | 3-column network that is “fully convolutional” | Modified k-fold cross-validation |
CNN, convolutional neural networks.
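Several of the studies in Table 1 report k-fold cross-validation. A minimal sketch of how such a split can be generated is given below. The function name, the fixed random seed and the figure of 100 samples are assumptions for illustration; each study's exact splitting procedure is not reproduced here.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds.

    Every sample is used for testing exactly once and for training
    in the remaining k - 1 folds.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k roughly equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx
                     for j, fold in enumerate(folds) if j != i
                     for idx in fold]
        yield train_idx, test_idx

# Hypothetical data set of 100 regions of interest.
splits = list(k_fold_indices(n_samples=100, k=10))
```

A model would be trained and tested once per fold, with the reported figure being the mean (and often the spread) of the per-fold results.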
Lung image database consortium image collection
The Lung Image Database Consortium (LIDC) Image Collection is an open-source, globally available resource of 1018 chest CTs, collected during lung cancer screening in the USA. The purpose of the database is to provide a web-accessible resource in a format suitable to aid and test the development of CAD of pulmonary nodules.
Each image in the database has undergone a two-phase review process, first blinded and then unblinded, conducted by four experienced consultant thoracic radiologists.
For each nodule, information regarding how the diagnosis was made is provided, ranging from unknown diagnosis, review of radiological appearance over two years, biopsy, surgical resection, to the progression or response. Where possible, diagnosis at both the patient level and the nodule level was recorded and included in the database.28
Each nodule that measured 3 mm or more in diameter included freehand annotations on each CT slice, including a subjective rating on a 5 to 6 point scale regarding the calcification, internal structure, lobulation, margin sharpness, texture, and spiculation.28
This wealth of validated annotated data is the foundation of the majority of the literature reviewed regarding supervised training of a CNN to detect pulmonary nodules.
Current literature
Throughout the literature evaluated, a primary motive remained evident: to achieve a sensitive yet accurate CNN that can detect and classify pulmonary nodules on chest CT scans to a similar level to that of the expert radiologist results in the LIDC.12,13,25–27
The results of the individual CNN papers in comparison to the current state-of-the-art CAD can be observed in Table 2.
Table 2.
Summary of CNN performance in accurately detecting pulmonary nodules compared to the industry leaders in CAD12,13,18,25–27,29
| Study | System used | Database used | Output value | Results |
| Li et al12 | CNN | LIDC | Nodule vs Non-nodule | 87.1% sensitivity with a false-positive rate of 4.622 per scan |
| Hussein et al27 | TumorNet (CNN) | LIDC | Malignancy score 1–5 | 92.31% regression accuracy with a standard error of 1.59% |
| Cheng et al26 | OverFeat (CNN) | LIDC | Benign vs malignant | 90.8% ± 5.3 sensitivity |
| Gruetzemacher and Gupta25 | CNN | LIDC | Benign vs malignant | 78.2% sensitivity with a specificity of 86.13% and a classification accuracy of 82.10% |
| Nibali et al13 | Modified residual network (CNN) | LIDC | Malignancy probability | 91.07% sensitivity with an 89.90% accuracy |
| Murphy et al29 | Shape index and curvedness algorithm (CAD) | The Nelson Trial | Nodule vs non-nodule | 80% sensitivity with a false-positive of 4.2 per case |
| Ye et al18 | Shape-based detection (CAD) | A validated clinical data set | Nodule vs non-nodule | 90.2% sensitivity with a false-positive of 8.2 per case |
CAD, computer-aided detection; CNN, convolutional neural networks; LIDC, Lung Image Database Consortium.
In simplified terms, the studies assessed the sensitivity and accuracy of the computational models on the LIDC data set, segmenting ROIs into training, validation and testing portions. Because the LIDC is an already validated source of data, measuring the effectiveness of a CNN is somewhat more straightforward.23, 28 The methods by which these data were inputted, scaled and measured demonstrate a high degree of variability throughout the research, with some studies focusing solely on accurate nodule detection, and others on nodule detection in conjunction with malignancy classification.
A notable observation throughout the literature was the inherent difficulty in comparing different studies due to the differences in algorithms and input data style. Although most studies drew from the same image database (LIDC), each study utilised varying amounts of the data in different ways. For example, a study by Li et al produced promising results. However, when attempting to compare their research with other methods, it was found that similar studies used a comparatively smaller portion of the data for training and validation, and hence comparison would be inaccurate.12
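The performance figures quoted in Table 2 follow from standard confusion-matrix definitions, sketched below. The counts are invented purely to show the arithmetic and do not correspond to any reviewed study; the function name is likewise an assumption.

```python
def detection_metrics(tp, fp, tn, fn, n_scans):
    """Summarise detection performance from confusion-matrix counts.

    tp/fn: nodules correctly flagged / missed
    tn/fp: non-nodules correctly rejected / wrongly flagged
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "false_positives_per_scan": fp / n_scans,
    }

# Hypothetical counts over 50 scans.
metrics = detection_metrics(tp=90, fp=20, tn=180, fn=10, n_scans=50)
# sensitivity 0.9, specificity 0.9, accuracy 0.9, 0.4 false-positives per scan
```

Because sensitivity ignores false-positives altogether, a sensitivity figure is only comparable between studies when it is read alongside the false-positive rate or specificity.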
In contrast, other research yielded superficially less accurate results. However, a detailed examination of their methodology reveals that the manner in which they tested their algorithm was significantly more stringent, not only exploring the detection of pulmonary nodules but accurate characterisation as well.30
The overall accuracy in detection and classification of pulmonary nodules on CT using a CNN was comparable and in most cases superior to the traditional methods of CAD used in lung screening today. Most notable is the reduction of false-positives, a factor that has the potential to improve radiologists’ reporting workflow by decreasing the volume of examinations that require detailed analysis and ultimately rejection.26
An advantage of a CNN over CAD is its ability to “learn from data”.22–24 When provided with validated data, a CNN will undergo automated iterations to adjust individual weights within layers based on the parameters of the supervised learning. All of the papers reviewed concluded that as iterations increased, sensitivity initially rose considerably (i.e. during the learning phase), followed by a period of incremental improvements.11, 13
The work of Gruetzemacher & Gupta observed that a CNN with three convolution layers possessed an absolute classification accuracy of 81.08%, compared to a CNN with five convolution layers which had a classification accuracy of 82.50%.25 However, when that same model possessed six convolution layers, the classification accuracy dropped to 81.50%, raising the point that although there has been a general move towards more complicated architectures, merely adding additional layers does not necessarily impart an advantage.
While some articles explored the influence the number of layers had on learning, others investigated the effect of tailoring a CNN to address a particular element such as nodule shape or texture.
Discussion
This literature review found that using a CNN as a tool for image recognition and classification in medical imaging is a relatively new technology with most practical applications still in the proposed phase.
Given the burden lung cancer will have on society, a reliable detection system is needed to help address the rate of false-positives seen in the current lung screening literature.4, 7,8
The current literature confirms that if trained correctly in the context of lung cancer screening, CNNs could play an active role in reducing radiology workload via a precise detection system, as well as potentially increasing diagnostic accuracy. However, the defining feature of neural networks is also their limitation. Neural networks require an extraordinary amount of validated data to aid in the supervised learning phase of implementation. It is not an entirely automated process and still requires expert human input to verify.9,22–24 The use of the LIDC as an initial training tool for CNN research is practical, given the volume of cases prepared and authorised by specialists. Nevertheless, this may pose a problem in the future, as a well-documented flaw of CNNs is a phenomenon known as overfitting, a result of overtraining on a limited data set, in which a CNN functions well within the training data (LIDC) but is unable to replicate similar results on unseen test data.31 This could have significant implications due to the variability of anatomy and scan quality at individual centres.
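One common way to recognise overfitting is to track accuracy on the training data alongside accuracy on held-out data that the network never learns from. The sketch below assumes made-up per-epoch accuracies and a hypothetical helper name; it illustrates the idea only and is not drawn from the reviewed studies.

```python
def overfitting_report(train_accuracy, held_out_accuracy):
    """Compare per-epoch accuracy on training and held-out data.

    A gap that keeps widening after held-out accuracy has peaked
    suggests the network is memorising the training set.
    """
    best_epoch = max(range(len(held_out_accuracy)),
                     key=lambda epoch: held_out_accuracy[epoch])
    return {
        "best_epoch": best_epoch,
        "gap_at_best": round(
            train_accuracy[best_epoch] - held_out_accuracy[best_epoch], 4),
        "gap_at_end": round(
            train_accuracy[-1] - held_out_accuracy[-1], 4),
    }

# Hypothetical learning curves: training accuracy keeps climbing while
# held-out accuracy peaks at epoch 3 and then declines.
train_accuracy = [0.70, 0.80, 0.88, 0.94, 0.98, 0.99]
held_out_accuracy = [0.68, 0.77, 0.82, 0.83, 0.81, 0.78]

report = overfitting_report(train_accuracy, held_out_accuracy)
```

Here the gap between the two curves grows from 0.11 at the best epoch to 0.21 by the final epoch. In the setting described above, the held-out data would ideally come from a different centre or database rather than from the LIDC itself.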
Another notable shortcoming in using CNNs is the inability of an observer to reasonably determine how a system came to a conclusion without analysing hundreds of thousands of weighted vectors. Here lies a conundrum, as a CNN requires large amounts of accurate, validated data to train the original algorithm. If this does not occur, the result may be highly inaccurate, but the reasons for this may be so complicated as to be practically imperceptible.23
The advantage that a CNN has over the traditional methods of CAD is in the design of the algorithm. CAD requires constant human input and engineering to ensure it functions at an acceptable level.19 In comparison, via the use of iterative self-learning, a CNN can improve the overall detection rate of pulmonary nodules every time an output is flagged as incorrect. A synergised program where radiologists correct disagreed-upon data has worked in different areas of medical imaging with impressive results.32 The ability of CNNs to adapt based on corrected data provides an opportunity for radiologists to use this method with a continually growing data set of true-positives and true-negatives.
Conclusion
Preliminary studies performed in the last decade have found the sensitivity of CNNs in detecting pulmonary nodules on CT to range from 73.4 to 92.31%.
One significant advantage in using a CNN as a diagnostic tool is its ability to learn as it is supplied with validated data; however, the capacity for a CNN to learn is dependent on the quality and quantity of these data. Substantial research has gone into the use of CNNs as a computer model to detect and characterise pulmonary nodules on CT, with excellent results; however, to date, most of the relevant research has been conducted on the same database, the LIDC.
Further research should be conducted over much larger validated databases under similar experimental conditions for ease of comparison.
Contributor Information
Andrew Murphy, Email: aandrewfmurphy@gmail.com.
Matthew Skalski, Email: docskalski@gmail.com.
Frank Gaillard, Email: frank.gaillard@gmail.com.
REFERENCES
- 1. Morampudi S, Das N, Gowda A, Patil A. Estimation of lung cancer burden in Australia, the Philippines, and Singapore: an evaluation of disability adjusted life years. Cancer Biol Med 2017; 14: 74–82. doi: 10.20892/j.issn.2095-3941.2016.0030
- 2. Gadgeel SM, Ramalingam SS, Kalemkerian GP. Treatment of lung cancer. Radiol Clin North Am 2012; 50: 961–74. doi: 10.1016/j.rcl.2012.06.003
- 3. Humphrey LL, Deffebach M, Pappas M, Baumann C, Artis K, Mitchell JP, et al. Screening for lung cancer with low-dose computed tomography: a systematic review to update the US Preventive Services Task Force recommendation. Ann Intern Med 2013; 159: 411–20. doi: 10.7326/0003-4819-159-6-201309170-00690
- 4. Yang P. Epidemiology of lung cancer prognosis: quantity and quality of life. Methods Mol Biol 2009; 471: 469–86. doi: 10.1007/978-1-59745-416-2_24
- 5. Kazerooni EA, Armstrong MR, Amorosa JK, Hernandez D, Liebscher LA, Nath H, et al. ACR CT accreditation program and the lung cancer screening program designation. J Am Coll Radiol 2015; 12: 38–42. doi: 10.1016/j.jacr.2014.10.002
- 6. MacMahon H, Naidich DP, Goo JM, Lee KS, Leung ANC, Mayo JR, et al. Guidelines for management of incidental pulmonary nodules detected on CT images: from the Fleischner Society 2017. Radiology 2017; 284: 228–43. doi: 10.1148/radiol.2017161659
- 7. Aberle DR, Adams AM, Berg CD, Black WC, Clapp JD, Fagerstrom RM, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 2011; 365: 395–409. doi: 10.1056/NEJMoa1102873
- 8. Rubin GD. Lung nodule and cancer detection in CT screening. J Thorac Imaging 2015; 30: 130–8.
- 9. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Proceedings of the 25th international conference on neural information processing systems. Lake Tahoe, Nevada: Curran Associates Inc.; 2012. 1097–105.
- 10. Anirudh R, Thiagarajan JJ, Bremer T, Kim H. Lung nodule detection using 3D convolutional neural networks trained on weakly labeled data. Proc. SPIE 9785, Medical Imaging 2016: Computer-Aided Diagnosis 978532.
- 11. Cicero M, Bilbily A, Colak E, Dowdell T, Gray B, Perampaladas K, et al. Training and validating a deep convolutional neural network for computer-aided detection and classification of abnormalities on frontal chest radiographs. Invest Radiol 2017; 52: 281–7. doi: 10.1097/RLI.0000000000000341
- 12. Li W, Cao P, Zhao D, Wang J. Pulmonary nodule classification with deep convolutional neural networks on computed tomography images. Comput Math Methods Med 2016; 2016: 1–7. doi: 10.1155/2016/6215085
- 13. Nibali A, He Z, Wollersheim D. Pulmonary nodule classification with deep residual networks. Int J Comput Assist Radiol Surg 2017; 12: 1799–808. doi: 10.1007/s11548-017-1605-6
- 14. Oda S, Awai K, Suzuki K, Yanaga Y, Funama Y, MacMahon H, et al. Performance of radiologists in detection of small pulmonary nodules on chest radiographs: effect of rib suppression with a massive-training artificial neural network. AJR Am J Roentgenol 2009; 193: W397–W402. doi: 10.2214/AJR.09.2431
- 15. Suzuki K. A supervised ‘lesion-enhancement’ filter by use of a massive-training artificial neural network (MTANN) in computer-aided diagnosis (CAD). Phys Med Biol 2009; 54: S31–45. doi: 10.1088/0031-9155/54/18/S03
- 16. Rubin GD, Roos JE, Tall M, Harrawood B, Bag S, Ly DL, et al. Characterizing search, recognition, and decision in the detection of lung nodules on CT scans: elucidation with eye tracking. Radiology 2015; 274: 276–86. doi: 10.1148/radiol.14132918
- 17. Roos JE, Paik D, Olsen D, Liu EG, Chow LC, Leung AN, et al. Computer-aided detection (CAD) of lung nodules in CT scans: radiologist performance and reading time with incremental CAD assistance. Eur Radiol 2010; 20: 549–57. doi: 10.1007/s00330-009-1596-y
- 18. Ye X, Lin X, Dehmeshki J, Slabaugh G, Beddoe G. Shape-based computer-aided detection of lung nodules in thoracic CT images. IEEE Trans Biomed Eng 2009; 56: 1810–20. doi: 10.1109/TBME.2009.2017027
- 19. Armato SG, Altman MB, La Rivière PJ. Automated detection of lung nodules in CT scans: effect of image reconstruction algorithm. Med Phys 2003; 30: 461–72. doi: 10.1118/1.1544679
- 20. Suzuki K, Armato SG, Li F, Sone S, Doi K. Massive training artificial neural network (MTANN) for reduction of false positives in computerized detection of lung nodules in low-dose computed tomography. Med Phys 2003; 30: 1602–17. doi: 10.1118/1.1580485
- 21. Tan M, Deklerck R, Jansen B, Bister M, Cornelis J. A novel computer-aided lung nodule detection system for CT images. Med Phys 2011; 38: 5630–45. doi: 10.1118/1.3633941
- 22. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. Proc. International Conference on Learning Representations; 2014. Available from: http://arxiv.org/abs/1409.1556
- 23. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015; 521: 436–44. doi: 10.1038/nature14539
- 24. Karpathy A. Convolutional Neural Networks (CNNs / ConvNets). CS231n: Convolutional Neural Networks for Visual Recognition. 2017; Course notes. Stanford University. Available from: cs231n.github.io/convolutional-networks/.
- 25. Gruetzemacher R, Gupta A. Using deep learning for pulmonary nodule detection & diagnosis. Twenty-second Americas conference on information systems, San Diego; 2016.
- 26. Cheng JZ, Ni D, Chou YH, Qin J, Tiu CM, Chang YC, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep 2016; 6: 24454. doi: 10.1038/srep24454
- 27. Hussein S, Gillies R, Cao K, Song Q, Bagci U. TumorNet: lung nodule characterization using multi-view convolutional neural network with Gaussian process. arXiv 2017; arXiv:1703.00645v1.
- 28. Armato SG, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, et al. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans. Med Phys 2011; 38: 915–31. doi: 10.1118/1.3528204
- 29. Murphy K, van Ginneken B, Schilham AM, de Hoop BJ, Gietema HA, Prokop M. A large-scale evaluation of automatic pulmonary nodule detection in chest CT using local image features and k-nearest-neighbour classification. Med Image Anal 2009; 13: 757–70. doi: 10.1016/j.media.2009.07.001
- 30. Chen H, Wang XH, Ma DQ, Ma BR. Neural network-based computer-aided diagnosis in distinguishing malignant from benign solitary pulmonary nodules by computed tomography. Chin Med J 2007; 120: 1211–5.
- 31. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 2014; 15: 1929–58.
- 32. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 2017; 284: 574–82. doi: 10.1148/radiol.2017162326

