Skip to main content
Thoracic Cancer logoLink to Thoracic Cancer
. 2022 Feb 8;13(5):664–677. doi: 10.1111/1759-7714.14333

Lung cancer risk prediction models based on pulmonary nodules: A systematic review

Zheng Wu 1, Fei Wang 1, Wei Cao 1, Chao Qin 1, Xuesi Dong 1, Zhuoyu Yang 1, Yadi Zheng 1, Zilin Luo 1, Liang Zhao 1, Yiwen Yu 1, Yongjie Xu 1, Jiang Li 1,2, Wei Tang 3, Sipeng Shen 4,5, Ning Wu 3,6, Fengwei Tan 7,, Ni Li 1,2,, Jie He 1,7,
PMCID: PMC8888150  PMID: 35137543

Abstract

Background

Screening with low‐dose computed tomography (LDCT) is an efficient way to detect lung cancer at an earlier stage, but has a high false‐positive rate. Several pulmonary nodules risk prediction models were developed to solve the problem. This systematic review aimed to compare the quality and accuracy of these models.

Methods

The keywords “lung cancer,” “lung neoplasms,” “lung tumor,” “risk,” “lung carcinoma” “risk,” “predict,” “assessment,” and “nodule” were used to identify relevant articles published before February 2021. All studies with multivariate risk models developed and validated on human LDCT data were included. Informal publications or studies with incomplete procedures were excluded. Information was extracted from each publication and assessed.

Results

A total of 41 articles and 43 models were included. External validation was performed for 23.2% (10/43) models. Deep learning algorithms were applied in 62.8% (27/43) models; 60.0% (15/25) deep learning based researches compared their algorithms with traditional methods, and received better discrimination. Models based on Asian and Chinese populations were usually built on single‐center or small sample retrospective studies, and the majority of the Asian models (12/15, 80.0%) were not validated using external datasets.

Conclusion

The existing models showed good discrimination for identifying high‐risk pulmonary nodules, but lacked external validation. Deep learning algorithms are increasingly being used with good performance. More researches are required to improve the quality of deep learning models, particularly for the Asian population.

Keywords: early detection and early diagnosis, lung cancer, prediction, pulmonary nodule, screening


Pulmonary nodules risk prediction models were developed to reduce the high false‐positive rate of lung cancer screening. A total of 41 articles and 43 models were systematically identified and assessed. The existing models showed good discrimination, but lacked external validation. Deep learning algorithms were increasingly being used with good performance. More researches were required to improve the quality of deep learning models, particularly for the Asian population.

graphic file with name TCA-13-664-g004.jpg

INTRODUCTION

Lung cancer causes a significant burden on health care systems. In 2020, lung cancer resulted in the death of 1.8 million people worldwide. In China, lung cancer remains the most commonly diagnosed cancer and the leading cause of cancer death. 1

The overall 5‐year survival rate of lung cancer ranges from 10% to 20% in most countries. 2 However, the prognosis of lung cancer largely depends on the stage of the disease at diagnosis. Although the 5‐year survival rate of lung cancer at stage I is above 80%, it is close to 0% for stage IV disease. 3 Therefore, early diagnosis and treatment are important to reduce mortality from lung cancer, improve the quality of life and reduce the economic burden from this disease.

Screening with low‐dose computed tomography (LDCT) has been shown to be an efficient way to detect lung cancer at an earlier stage and reduce lung cancer mortality. 4 Several lung cancer screening trials have been conducted worldwide. 4 , 5 , 6 , 7 , 8 , 9 The national lung cancer screening trial (NLST) of the United States has shown that early LDCT screening can detect potentially cancerous lung nodules at an early stage leading to a reduction in lung cancer mortality by 20%. Nevertheless, the false‐positive nodule detection rate by LDCT was extremely high at 96.4%, 4 eventually leading to unnecessary radiation exposure from further follow‐up imaging tests, invasive biopsies, medical expenses, and anxiety among patients. 6 Therefore, it is of paramount importance to identify the individuals at higher risk of developing lung cancer based on the pulmonary nodules identified on LDCT scans to recommend appropriate examination and management.

Further examinations in current lung cancer screening programs are recommended solely based on the nodule sizes on the LDCT scans. However, although this method of categorizing pulmonary nodules is easy to implement clinically, it may lead to a high rate of false‐positive results. On the contrary, risk prediction models based on pulmonary nodule size, calcification, density, and other relevant imaging information may facilitate the identification of high‐risk groups, significantly reduce the false positive rate, and improve the screening program's efficiency. 7 Therefore, this method is now recommended by several clinical guidelines to reduce the high false‐positive rate of LDCT screening. 8 , 9

As a result, several statistical models have been developed in recent years to predict the risk of developing lung cancer based on the identification of pulmonary nodules on LDCT. However, without a systematic evaluation of the relevant models, it remains unclear which, if any of these models should be used clinically. Therefore, in this study, we reviewed the contemporary published literature to identify current multivariable statistical models used to predict the risk of developing lung cancer from the pulmonary nodules identified on LDCT. In addition, the effectiveness, reliability, bias, and extrapolation of the different models used in these studies were also compared.

METHODS

Search strategy

A literature search was conducted using the PubMed, Cochrane, Embase, and Web of Science electronic databases. The keywords “lung cancer” or “lung neoplasms” or “lung tumor” or “lung carcinoma” and “predict” or “assessment” or “risk” and “nodule” were used to identify all relevant articles published in English from January 1960 to February 2021. We also hand‐searched the reference lists of eligible studies to identify additional relevant publications. Further detail about the search strategy used in this study is available in Table S1.

Review methods and selection criteria

Two reviewers independently screened all titles and abstracts and made decisions regarding the potential eligibility of the research articles for full text review. Discrepancies in judgment were resolved by a third reviewer. Studies were eligible if they reported on the development of multivariable risk prediction models for the development of lung cancer based on the pulmonary nodules identified on LDCT and included a detailed description of the procedures used to evaluate and validate the model. Studies with an incomplete description of the procedures used to develop, validate, and evaluate the model were excluded. Informal publications such as conference abstracts were also excluded.

Data extraction

The models used in the studies were divided into two categories; traditional and deep learning models. In the traditional models, raw data (i.e., original image features) were translated into a finite number of feature descriptors (i.e., size, type, or density of nodules) that could be used as predictors for lung cancer. The association between lung cancer risk and each descriptor was tested, quantified, and subsequently developed into an appropriate statistical risk model. In the deep learning algorithm‐based models, the use of raw data was allowed and representations needed for detection or classification were automatically discovered, and the association between lung cancer risk and descriptors is partly unexplainable. 10 , 11

For each of the included studies, basic information about the research methodology, variables used to develop the models, and the methods used to evaluate the models were extracted. The basic information included the first author, publication year, study design, study method, target population, inclusion criteria of participants and nodules, and the number of normal and lung cancer cases used for modeling. The model variables extracted from the studies included: basic information about the clinical and epidemiological characteristics, such as age, sex, smoking, family history, occupational exposure, or history of chronic respiratory diseases; and imaging nodule characteristics, like size, density or shape; other tumor biomarkers like neuron‐specific enolase (NSE), or carcinoembryonic antigen (CEA). For the studies based on the deep learning algorithm, it was not possible to extract these variables because of the method used to develop the risk model. The model evaluation criteria included the type of validation (external or internal), the sample size used for verification, the area under the curve (AUC), model calibration slope results, sensitivity, specificity, and the risk threshold. The findings of either the Hosmer‐Lemeshow test or the expected to observe ratio (excellent, poor, or uncalibrated) were also recorded. Furthermore, we used the same dataset to compare the performance (AUC, sensitivity, or specificity) of all deep learning models with existing prediction methods or clinically based guidelines published by professional bodies such as the American College of Radiology Lung Imaging Reporting and Data System (ACR Lung‐RADS) based on the conclusion in the original text.

Quality assessment

The Grading of Recommendations, Assessment, Development and Evaluation (GRADE) method 12 was used to evaluate the quality of evidence in traditional models. This method assesses the quality of the publication based on the risk of bias, consistency, accuracy, directness, and publication bias.

Data synthesis

The sample size used in each study was recorded when available and estimated for evaluation purposes when not available. If several models were used to train the algorithm on the same data set, the model with the highest AUC was selected.

Limited statistical power may lead to insufficient power to detect a significant association, resulting in unstable models. To overcome this problem, we calculated the events per variable (EPV) for traditional models. EPV was defined as the number of events divided by the number of predictor variables included in the multivariable model. An EPV value <10 suggests limited statistical power. 13 Because it was not possible to record and name the variables used in the deep learning models, 11 the EPV could not be calculated.

RESULTS

Study characteristics and quality assessment

The literature search revealed a total of 3230 publications, of which 630 were found to be duplicated and were, therefore, removed from the evaluation. A total of 2293 articles that did not meet our criteria were excluded from the screening. After evaluating the full texts of the remaining 307 articles, 41 articles met the eligibility criteria and were included for further analysis (Figure 1).

FIGURE 1.

FIGURE 1

Flow chart of literature search

After evaluating the articles, 43 models were identified. Overall the models were based on more than 20 000 Asian, North American, and European participants (Figure 2(a)). After 2018, the number of relevant studies grew rapidly. As a result, over half (67.4%, 29/43) of all models were released after 2018 (Figure 3).

FIGURE 2.

FIGURE 2

Characters of existing models; (a) size and distribution of training sets used for modeling; (b) number and distribution of existing models; (c) number and distribution of models seeking validation in different ways; (d) number and distribution of models from different regions and data sources; and (e) frequency of risk factors used in traditional final models

FIGURE 3.

FIGURE 3

AUCs and confidence intervals of existing models by regions and time periods

Most models (58.1%, 25/43) were developed based on deep learning algorithms, and the remaining (41.9%, 18/43) were developed using traditional models (Figure 2(b)) such as logistic regression. However, in recent years, the use of deep learning algorithms increased significantly (Table 2).

TABLE 2.

Basic information and development of models based on the deep learning algorithm

First author Year Study design Targeted population Inclusion criteria of participants Inclusion criteria of nodules Sample size Cases of lung cancer Data source
Yoganand Balagurunathan 14 2019 Screening trial American 55–74 years old and smoker ≥4 mm 244 78 Multicenter
Gerard A. Silvestri 15 2018 Cohort study American and Canadian >40 years old 8–30 mm 178 29 Multicenter
Chao Zhang 16 2019 Cohort study American and Chinese Unspecified Unspecified Multicenter
Johanna Uthoff 17 2019 Cohort study American 363 74 Multicenter
Ilaria Bonavita 18 2020 Cohort study American Unspecified Unspecified Multicenter
Parnian Afshar 19 2020 Cohort study American 1010 Unspecified Multicenter
Huafeng Wang 20 2018 Cohort study American 1018 Unspecified Multicenter
Jason L. Causey 21 2018 Cohort study American 1018 Unspecified Multicenter
Samuel Hawkins 1 22 2016 Screening trial American 55–74 years old and smoker ≥4 mm 600 200 Multicenter
Samuel Hawkins 2 22 2016 Screening trial American 55–74 years old and smoker ≥4 mm 600 200 Multicenter
Andrew V. Kossenkov 23 2019 Cohort study American smoker 6–20 mm 583 293 Multicenter
G. A. Soardi 24 2015 Cohort study American ≤30 mm 311 199 Single‐center
Zuohong Wu 25 2021 Cohort study Chinese ≤30 mm 995 772 Single‐center
Stéphane Chauvie 26 2020 Screening trial Chinese 45–75 years old and smoker 234 32 Multicenter
Shulong Li 27 2019 Cohort study American 1010 Unspecified Multicenter
Rekka Mastouri 28 2021 Cohort study American Unspecified Unspecified Multicenter
Yin‐Chen Hsu 29 2020 Cohort study Chinese 836 27 Single‐center
Jiabao Liu 30 2020 Cohort study Chinese 6–30 mm 879 601 Multicenter
Rahul Paul 31 2020 Cohort study American 55–74 years old and smoker ≥4 mm 261 85 Multicenter
Muahammad Bilal Zia 32 2020 Cohort study American 1010 Unspecified Multicenter
Yi‐Ming Xu 33 2020 Cohort study American 55–74 years old and smoker ≥4 mm 1109 926 Multicenter
Subba R. Digumarthy 34 2019 Cohort study American 36 Unspecified Single‐center
Yangwei Xiang 35 2019 Cohort study Chinese 588 462 Single‐center
Liting Mao 36 2019 Cohort study Chinese 294 61 Single‐center
Shaun Daly 37 2013 Cohort study American 136 69 Single‐center

Only 23% (10/43) of the models were externally validated (Figure 2(c)). Data from multiple sources were used to develop the models in half of the studies (Figure 2(d)). Thirty‐three studies used data from cohort studies to develop the models, whereas in eight studies, the models were constructed using the data from screening trials (Tables 3 and 4). Almost all studies (97.6%, 40/41) had medium to very low credibility, largely because of publication bias, indirectly, and imprecision (Table S2).

TABLE 3.

Validation of traditional models

First author Year Type of validation Calibration Sample size AUC a Thresholds Sensitivity Specificity
Annette McWilliams 38 2013 External Excellent 1090 0.970 0.05 0.71 0.96
Barbara Nemesure 39 2019 Internal Not calibrated 1455 0.860 0.73 0.81
Michael W. Marcus 40 2019 Internal Excellent 1013 0.882
Martin T. ammemagi 41 2018 External Excellent 3680 0.947
Vineet K. Raghu 42 2019 External Not calibrated 126 0.882 0.61 0.28 1.00
Joan E Walter 43 2018 Internal Excellent 809 0.850
Xianfeng Li 44 2017 Internal Not calibrated 39 0.921
Michal Reid 45 2019 External Excellent 45 0.810
Michael K. Gould 46 2007 Internal Excellent 375 0.790
Sungmin Zo 47 2020 Internal Excellent 157 0.952
Xiao‐Bo Chen 48 2019 External Excellent 216 0.848
Stephen J. Swensen 49 1997 Internal Excellent 210 0.833 0.10 0.93 0.47
0.40 0.51 0.90
Man Zhang 50 2015 Internal Not calibrated 120 0.910 0.55 0.87 0.85
Bin Zheng 1 51 2015 Internal Not calibrated 198 0.808
Bin Zheng 2 51 2015 Internal Not calibrated 84 0.845
Jingsi Dong 52 2014 Internal Not calibrated 1679 0.935
Yun Li 53 2012 External Not calibrated 145 0.874 0.46 0.95 0.70
Li Yang 54 2017 Internal Not calibrated 344 0.784 0.70 0.79
a

AUC, area under curve.

TABLE 4.

Validation of models based on the deep learning algorithm

First author Year Sample size Type of validation AUC a Threshold Sensitivity Specificity
Yogan and Balagurunathan 14 2019 235 Internal 0.850 0.54 0.91
Gerard A. Silvestri 15 2018 178 Internal 0.760 0.05 0.97 0.44
Chao Zhang 16 2019 Unspecified External 0.855 0.84 0.83
Johanna Uthoff 17 2019 100 External 0.965 0.38 1.00 0.96
Ilaria Bonavita 18 2020 Unspecified Internal Unspecified
Parnian Afshar 19 2020 1010 Internal 0.964 0.95 0.90
Huafeng Wang 20 2018 1018 Internal 0.970
Jason L. Causey 21 2018 1018 Internal 0.993
Samuel Hawkins 1 39 2016 600 Internal 0.83
Samuel Hawkins 2 39 2016 600 Internal 0.79
Andrew V. Kossenkov 23 2019 158 External 0.825 0.69 0.84
G. A. Soardi 24 2015 311 Internal 0.893
Zuohong Wu 25 2021 995 Internal 0.851 0.88 0.64
Stéphane Chauvie 26 2020 234 Internal Unspecified 0.90 1.00
Shulong Li 27 2019 1010 Internal 0.931 0.83 0.92
Rekka Mastouri 28 2021 Unspecified Internal 0.92 0.92 0.92
Yin‐Chen Hsu 29 2020 836 Internal 0.873 0.75 0.85
Jiabao Liu 30 2020 879 Internal 0.938 0.58 0.84 0.91
Rahul Paul 31 2020 261 Internal 0.960
Muahammad Bilal Zia 32 2020 1010 Internal Unspecified 0.91 0.91
Yi‐Ming Xu 33 2020 1109 Internal Unspecified 0.93 0.89
Subba R. Digumarthy 34 2019 36 Internal 0.708
Yangwei Xiang 35 2019 588 Internal 0.890 0.90 0.80
Liting Mao 36 2019 294 Internal 0.970 0.81 0.92
Shaun Daly 37 2013 81 External 0.676 0.95 0.25
a

AUC, area under curve.

Development and performance of traditional models

The model from the Mayo clinic in the United States published in 1997 49 was the first model used to predict the risk of developing cancer from pulmonary nodules. Since then, 18 traditional models have been developed to predict the pathological characteristics of pulmonary nodules. Seven of these models were based on the North American population; two models were based on the European population, and nine models were based on the Asian population. Of the nine Asian models evaluated in this review, eight models were based on the Chinese population (Table 1).

TABLE 1.

Basic information and development of traditional models

First author Year Study design Study method Target population Inclusion criteria of participants Inclusion criteria of nodules Sample size Cases of lung cancer EPVb Data source
Annette McWilliams 38 2013 Screen trial Logistic regression Canadian 50–74 years old ≥1 mm 1871 102 11.33 Multicenter
Barbara Nemesure 39 2019 Cohort study Cox regression American 1469 85 a 6.54 Single‐center
Michael W. Marcus 40 2019 Screen trial Logistic regression English 50–75 years old ≥3 mm 1013 52 2.60 Multicenter
Martin Tammemagi 41 2018 Screen trial Logistic regression Canadian 50–74 years old ≥1 mm 1871 111 10.10 Multicenter
Vineet K. Raghu 42 2019 Cohort study Logistic regression American Smoker 92 50 10.00 Multicenter
Joan E. Walter 43 2018 Screen trial Logistic regression Dutch/Belgian 50–75 years old and smoker 809 50 a 7.14 Multicenter
Xianfeng Li 44 2017 Cohort study Fisher discriminant analysis Chinese 20–80 years old 5–30 mm 39 20 1.00 Single‐center
Michal Reid 45 2019 Cohort study Logistic regression American ≥18 years old ≤30 mm 301 200 10.00 Single‐center
Michael K. Gould 46 2007 Cohort study Logistic regression American 7–30 mm 375 204 13.60 Multicenter
Sungmin Zo 47 2020 Cohort study Logistic regression Korean 157 90 5.29 Single‐center
Xiao‐Bo Chen 48 2019 Cohort study Logistic regression Chinese 8–20 mm 493 214 11.26 Single‐center
Stephen J. Swensen 49 1997 Cohort study Logistic regression American 4‐30 mm 419 145 a 8.06 Single‐center
Man Zhang 50 2015 Cohort study Logistic regression Chinese ≤30 mm 314 248 14.59 Multicenter
Bin Zheng 1 51 2015 Cohort study Logistic regression Chinese ≤30  mm and GCO b <50% 405 367 11.84 Single‐center
Bin Zheng 2 51 2015 Cohort study Logistic regression Chinese ≤30 mm and GCO ≥50% 159 166 5.35 Single‐center
Jingsi Dong 52 2014 Cohort study Logistic regression Chinese 1679 1296 58.91 Single‐center
Yun Li 53 2012 Cohort study Logistic regression Chinese 371 229 15.27 Unspecified
Li Yang 54 2017 Cohort study Logistic regression Chinese 1078 721 65.55 Single‐center
a

Approximate number.

b

EPV, events per variable; GCO, ground glass opacity.

Traditional models included numerous imaging features such as nodule size, type, location, shape, and margin to determine the pathological characteristics of the pulmonary nodules. In addition, basic information such as age, gender, family history of cancer, and smoking status was also commonly used. However, biomarkers were used in only seven models (Figure 2(e)).

Logistic regression analysis was used to develop most (16/18) traditional models. The models in the other two studies were developed using either Cox regression analysis or Fisher linear discriminant analysis. Most models (14/18) were cohort studies, and the remaining four were constructed using screening test results (Table 1). Based on the regression analysis, the size, margin of the nodules, smoking status, and age of patients were statistically significant in more than half of all models. The addition of biomarkers to tumor markers improved the AUC and statistical significance in three of the seven evaluated models, as shown in Table 5. These findings suggest that although biomarkers were not widely used to develop traditional models, they may have an important role in improving the accuracy of these models.

TABLE 5.

Variables of traditional models

Variables a First authors of models
Annette McWilliams 38 Barbara Nemesure 39 Michael W. Marcus 40 Martin Tammemagi 41 Vineet K. Raghu 42 Joan E. Walter 43 Xianfeng Li 44 Michal Reid 45 Michael K. Gould 46 Sungmin Zo 47 Xiao‐Bo Chen 48 Stephen J. Swensen 49 Man Zhang 50 Bin Zheng 1 51 Bin Zheng 2 51 Jingsi Dong 52 Yun Li 53 Li Yang 54
Basic character Age 0 1 1 0 1 1 1 1 0 1 1 1 0 0 1 1 1
Sex 1 0 1 1 0 0 0 0 0 0 1 1 0 0 1
Personal history of other cancer 1 1 0 0 1 0 0 0 0 1
Family history of lung cancer 0 0 1 1 0 0 0 0 0 0 1 1 0
Family history of other cancer 0 0 0 1 0 0 0 0 1 1 0
BMI b 0 0 0 0
Exposure of asbestos 0 1 0
FVC b 1
History of respiratory diseases 1 1 0 0 0 0 0
Smoke 1 1 0 0 1 0 0 0 0 1 1 1 0 1 0 1
Clinical symptoms 0 0 0
Time since previous lung cancer was diagnosed 0
FEV1 b 0 1 1
Biomarkers Squamous cell carcinoma antigen 0
NSE b 0 0
CEA b 1 0 0 0 0 0 1
CYFRA21‐1 b 1 0 1 1
MiRNA‐21‐5p b 1 0
MiR‐574‐5p 1 0
Laboratory indicators 0 0
Ferritin 0
Imaging information Size 1 1 0 1 1 1 0 0 1 1 1 1 1 1 1
Volume 1 1 1
Density 1 1 1 0 1 0 0 0 0
Location 0 0 1 1 1 0 1 0 1 0 1 0 0 1 0 0
Count 0 0 0 1 0
Margin (spiculate) 1 1 0 1 0 1 1 1 1 1 1 0 0 1 1
Satellite lesions 1 1 0 0 1
Calcification 0 0 1 1 0 1 1
Cavitation 0 0 0
Shape 0 0 0 0 0 0 0 0 1 0 1
Enhancement 1 0 0
Pleural indentation 1 0 0 0
Bronchus sign 1 0
Vascular signs 0 0
Enphysema 0 0 1 1 0 0
Vessels sign 0
Vessel number 1
Tracheal signs
Previous CT scan 0
Previous X‐ray 0
Vacuole signs
Associated pleural effusion 0 0
Enlarged hilar or mediastinal lymph nodes 0 0
Visibility in retrospect 0
Carbohydrate antigen 0
Neuron‐specific enolase 0
a

0 depicts the inclusion of a variable into the model as a candidate variable; 1 depicts retention in the final model.

bBMI, body mass index; FVC, forced vital capacity; FEV1, forced expiratory volume in one second; NSE, neuron‐specific enolase; CEA, carcinoembryonic antigen; CEFRA21‐1, cytokeratin fragment antigen 21‐1; MiR(NA), MicroRNA.

The AUCs of the models ranged from 0.676 to 0.970. Most models (77.8%, 14/18) performed well on discrimination, with an AUC higher or equal to 0.8. Calibration was assessed in nine models, and the results indicated a good fit. Most studies (61.1%, 11/18) had an EPV higher than 10, suggesting sufficient statistical power. Only six of the 18 models were validated using external datasets. However, five of these models were validated using external data from a similar population from the same countries, and only one model 38 was verified using data of participants from different origins. The latter model achieved good discrimination with an AUC of 0.970 (Tables 1 and 3).

Compared with the European and American models, the Chinese models lack external validation. Most of the data used to develop the Chinese models were obtained from a single‐center or small sample retrospective cohort studies and only two of these studies were validated using an external dataset. However, the discrimination ability of the Chinese models was good, with seven of eight models achieving an AUC higher than 0.8, whereas two models reported excellent calibration. In addition, all Chinese models had an EPV higher than 10. More details can be found in Tables 1, 3, and Figures 2 and 3.

Development and performance of the deep learning algorithms

The first study reporting on the development and performance of a deep learning algorithm for the discrimination of pulmonary nodules was published in 2013. 37 Only biomarkers were included in the development of this model, and the prediction ability was limited, with an AUC of 0.676. The majority of the deep learning models (84%, 21/25) were developed after 2018 and were based on the imaging features of the nodules. This improved the models' prediction ability, especially when the model was supplemented by epidemiological parameters and biomarkers (Figure 3).

The AUC of the deep learning models was reported in 21 of 25. However, only half of these models (12 of 21) reported the confidence intervals (Table 4). The reported AUCs ranged from 0.676 to 0.970. Most of the deep learning models (68.0%, 17/25) had a good discrimination ability with an AUC higher than 0.8, whereas the other four models (16.0%) had an AUC below 0.8. The majority of the models (84.0%, 21/25 were not validated externally [Table 2]).

Only seven of 18 deep learning models were developed in Asia. Furthermore, all Asian models achieved high discrimination with an AUC above 0.8. However, the sample size of the Asian models was generally small, and only one of these models was validated using an external dataset (Tables 2 and 4).

Comparison of deep learning models with traditional models

The discrimination ability of 60.0% (15/25) of the deep learning models was compared with traditional methods. All deep learning models achieved higher or similar discrimination abilities when compared with traditional methods (Table 6).

TABLE 6.

Comparison between existing methods and models based on the deep learning algorithm

First author Objects for comparison Indicators for comparison Superior methods
Yogan and Balagurunathan 14 None
Gerard A. Silvestri 15 Traditional models AUCa Deep learning
Gerard A. Silvestri 15 Clinician AUC Deep learning
Chao Zhang 16 Clinician Accuracy, sensitivity, and specificity Deep learning
Johanna Uthoff 17 None
Ilaria Bonavita 18 Clinician F1 score Deep learning
Parnian Afshar 19 None
Huafeng Wang 20 None
Jason L. Causey 21 Clinician AUC Similar
Samuel Hawkins 1,2 39 Lung‐RADS AUC Deep learning
Samuel Hawkins 1,2 39 Traditional models AUC Similar
Andrew V. Kossenkov 23 Traditional models AUC Deep learning
G. A. Soardi 24 None
Zuohong Wu 25 Traditional models AUC Deep learning
Stéphane Chauvie 26 Lung‐RADS PPVa, sensitivity, and specificity Deep learning
Stéphane Chauvie 26 Traditional models PPV, sensitivity, and specificity Deep learning
Shulong Li 27 None
Rekka Mastouri 28 None
Yin‐Chen Hsu 29 Lung‐RADS AUC Deep learning
Jiabao Liu 30 Clinician AUC Deep learning
Rahul Paul 31 None
Muahammad Bilal Zia 32 None
Yi‐Ming Xu 33 Clinician Sensitivity Deep learning
Subba R. Digumarthy 34 None
Yangwei Xiang 35 Traditional models AUC Deep learning
Liting Mao 36 ACR‐lung RADSa Accuracy, sensitivity, and specificity Deep learning
Shaun Daly 37 Traditional models AUC Deep learning
a

AUC, area under curve; ACR‐Lung‐RADS, American College of Radiology Lung Imaging Reporting and Data System; PPV, positive predictive value.

DISCUSSION

LDCT can be used to diagnose lung cancer at an early stage via the identification and classification of pulmonary nodules into different risk categories. However, current pulmonary nodules classification guidelines are based solely on nodule size and density. Other important biomarkers and patient characteristics are mostly ignored, resulting in a very high false‐positive rate, over diagnosis, and unnecessary treatment. 55 , 56 , 57 Various traditional and deep learning models based on clinical, biological, and epidemiological factors have been developed to overcome this problem. To our knowledge, in this manuscript, we present the first systemic review comparing the development, validation, and performance of these models in the characterization of pulmonary nodules identified on LDCT.

In this systemic review, we evaluated the performance of 43 models derived from 41 research articles based on over 20 000 subjects. Our findings indicate that the majority of the traditional and deep learning models achieved an AUC higher than 0.8, suggesting that these models can be used to identify the high‐risk population effectively and hence, reduce the false‐positive rate and the harms of over diagnosis and treatment.

Since 1997, the development of pulmonary nodule risk prediction models has increased rapidly. Most early models were developed using statistical methods such as regression analysis. Although imaging features such as nodule size, type, location, shape, and margin provide valuable information on the pathological characteristics of the nodules, our findings indicate that the incorporation of clinical characteristics such as age and smoking status can significantly improve the performance of these models. The first study confirming this finding was performed at the Mayo Clinic. 48 Since then, various traditional statistic‐based models incorporating both imaging and patient characteristics have been developed. Subsequent models also incorporated clinical indicators such as forced vital capacity (FVC) and forced expiratory volume (FEV)1, and serum biomarkers such as CEA and NSE, to further improve the prediction efficacy on the models. 39 , 40 , 50 , 51 , 52 Variables including age, size of the nodules, and margin of the nodules should be considered as a priory in machine‐learning analyses, as they were consistently considered as predictors of lung cancer in traditional studies.

A limited number of studies incorporated other risk factors such as exposure of asbestos, satellite lesions, bronchus sign, and volume of nodules (Table 5). However, the main limitation of these risk factors is the limited sample size that limits the generalizability of the model. A large number of models were based on single‐center and retrospective studies with small sample sizes or data obtained from old studies. Biomarkers were not commonly used in the development of the predictive risk factor model (Table 5, Figure 2(e)). Nodule volume might have been an effective predictor, 40 , 42 but was generally not taken into consideration by current models. Because most studies were retrospective, it was not possible to incorporate time‐dependent variables such as variations in biomarkers and nodule size over time into the model. Therefore, time‐dependent factors, such as the nodule volume growth rate, were also ignored by most studies.

Deep learning models can learn from various heterogeneous variables to generate homogeneous groups with similar features. These features can be mapped with similar survival models to obtain accurate predictions. Various studies 15 , 20 , 23 , 29 also suggest that compared with the traditional pulmonary nodule prediction models or expert judgment by clinicians, the use of deep learning algorithms has obvious advantages on discrimination (Table 6). However, although pulmonary nodule risk models based on deep learning algorithms have been used as early as 1993, 58 they have not been widely used to predict pulmonary nodules until recent years as they still have several limitations. One of the main limitations of deep learning algorithms is that they require large amounts of data, advanced imaging equipment, top‐ranked statisticians, and research funds to develop. Despite the high discrimination ability of the deep learning algorithm models evaluated in our systemic review, the GRADE scores of these models were generally low because of their limited sample size, high level of bias, inaccuracy, and indirectness (Table S2). Furthermore, it is difficult to identify the specific variables used to develop the deep learning prediction model, potentially limiting the quality and authenticity of these models.

Few studies were based on the Asian population. The majority of the Asian studies were based on a single center, had a limited sample size, and lacked external validation, which limited the quality of evidence (Tables 3 and 4, Figure 2). It is important to note that the accepted European and United States models may not be suitable for the Asian and Chinese populations because of large population differences, as suggested by Uthoff et al. 59 and Nair et al. 60

Our systemic review has several limitations that have to be acknowledged. First of all, variations between studies, including sample size, research design, data source, and imaging acquisition criteria, made it difficult to quantify, integrate, and extrapolate the results of the different studies. Some of the studies included in our analysis had high publication bias, particularly those that lacked external validity. Additionally, cultural and social risk factors were ignored by most models. Studies evaluating a single risk factor were also excluded from this analysis although these variables were highly predictive of lung cancer and represent the latest trend in the field.

Furthermore, most of the existing models were based on the entire population. Therefore, subgroup analysis based on important risk factors such as smoking status and tumor histology is recommended to improve the prediction performance of current models and adapt these tools according to the specific characteristics of the population being studied. However, this type of research requires large datasets, highlighting the need for further large‐scale multicenter prospective studies. Future studies should also focus on developing deep learning based models based on decentralized and deparametric data. 61 These methods process the raw data directly and therefore, reduce the heterogeneity while improving the models' performance compared with traditional models.

CONCLUSION

The incidence of lung cancer is increasing, particularly in developing countries. The models evaluated in our study were all developed in Europe, Asia, and the United States. These models showed good discrimination for identifying high‐risk pulmonary nodules, particularly when these models combined imaging features with clinical, behavioral characteristics, and other biomarkers. This highlights the need to develop models based on the unique characteristics of different populations, particularly those in developing countries, to reduce the global lung cancer burden. The use of deep learning algorithms increased significantly during the last few years and generally performed better than traditional models. However, more research is required to improve the quality of the deep learning models, particularly for the Asian population, because these models were often based on single‐center studies and lacked external validation. Further research should also focus on improving the quality of current screening guidelines by incorporating clinical and epidemiological factors into the evaluation of pulmonary nodules.

CONFLICT OF INTEREST

The author declares that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.

Supporting information

Appendix S1 Supporting Information

ACKNOWLEDGMENTS

This study was funded by grants from the National Key Research and Development Program of China (2018YFC1315000), Non‐profit Central Research Institute Fund of Chinese; National Natural Science Foundation of China (8187102812); Non‐profit Central Research Institute Fund of Chinese Academy of Medical Sciences (2020PT330001, 2019PT320027, 2019PT320023, 2018RC320010, and 3332019005).

Wu Z, Wang F, Cao W, Qin C, Dong X, Yang Z, et al. Lung cancer risk prediction models based on pulmonary nodules: A systematic review. Thorac Cancer. 2022;13:664–677. 10.1111/1759-7714.14333

Zheng Wu and Fei Wang equally contributed to this work.

Funding information National Key Research and Development Program of China, Grant/Award Number: 2018YFC1315000; National Natural Science Foundation of China, Grant/Award Number: 8187102812; Non‐profit Central Research Institute Fund of Chinese Academy of Medical Sciences, Grant/Award Numbers: 2018RC320010, 2019PT320027, 2019PT320023, 2020‐PT330‐001, 3332019005

Contributor Information

Fengwei Tan, Email: tanfengwei@cicams.ac.cn.

Ni Li, Email: nli@cicams.ac.cn.

Jie He, Email: hejie@cicams.ac.cn.

REFERENCES

  • 1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71:209–49. [DOI] [PubMed] [Google Scholar]
  • 2. Santucci C, Carioli G, Bertuccio P, Malvezzi M, Pastorino U, Boffetta P, et al. Progress in cancer mortality, incidence, and survival: a global overview. Eur J Cancer Prev. 2020;29:367–81. [DOI] [PubMed] [Google Scholar]
  • 3. Goldstraw P, Chansky K, Crowley J, Rami‐Porta R, Asamura H, Eberhardt WE, et al. The IASLC lung cancer staging project: proposals for revision of the TNM stage groupings in the forthcoming (eighth) edition of the TNM classification for lung cancer. J Thorac Oncol. 2016;11:39–51. [DOI] [PubMed] [Google Scholar]
  • 4. Aberle DR, Adams AM, Berg CD, Black WC, Clapp JD, Fagerstrom RM, et al. Reduced lung‐cancer mortality with low‐dose computed tomographic screening. N Engl J Med. 2011;365:395–409. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Henschke CI, Naidich DP, Yankelevitz DF, McGuinness G, McCauley DI, Smith JP, et al. Early lung cancer action project: initial findings on repeat screenings. Cancer. 2001;92:153–9. [DOI] [PubMed] [Google Scholar]
  • 6. Church TR, Black WC, Aberle DR, Berg CD, Clingan KL, Duan F, et al. Results of initial low‐dose computed tomographic screening for lung cancer. N Engl J Med. 2013;368:1980–91. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Swensen SJ, Jett JR, Sloan JA, Midthun DE, Hartman TE, Sykes AM, et al. Screening for lung cancer with low‐dose spiral computed tomography. Am J Respir Crit Care Med. 2002;165:508–13. [DOI] [PubMed] [Google Scholar]
  • 8. Baldwin DR, Callister ME. The British Thoracic Society guidelines on the investigation and management of pulmonary nodules. Thorax. 2015;70:794–8. [DOI] [PubMed] [Google Scholar]
  • 9. Oudkerk M, Devaraj A, Vliegenthart R, Henzler T, Prosch H, Heussel CP, et al. European position statement on lung cancer screening. Lancet Oncol. 2017;18:e754–66. [DOI] [PubMed] [Google Scholar]
  • 10. Chan HP, Samala RK, Hadjiiski LM, Zhou C. Deep learning in medical image analysis. Adv Exp Med Biol. 2020;1213:3–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–44. [DOI] [PubMed] [Google Scholar]
  • 12. Iorio A, Spencer FA, Falavigna M, Alba C, Lang E, Burnand B, et al. Use of GRADE for assessment of evidence about prognosis: rating confidence in estimates of event rates in broad categories of patients. BMJ. 2015;350:h870. [DOI] [PubMed] [Google Scholar]
  • 13. Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996;49:1373–9. [DOI] [PubMed] [Google Scholar]
  • 14. Balagurunathan Y, Schabath MB, Wang H, Liu Y, Gillies RJ. Quantitative imaging features improve discrimination of malignancy in pulmonary nodules. Sci Rep. 2019;9:8528. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Silvestri GA, Tanner NT, Kearney P, Vachani A, Massion PP, Porter A, et al. Assessment of plasma proteomics biomarker's ability to distinguish benign from malignant lung nodules: results of the PANOPTIC (pulmonary nodule plasma proteomic classifier) trial. Chest. 2018;154:491–500. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Zhang C, Sun X, Dang K, Li K, Guo XW, Chang J, et al. Toward an expert level of lung cancer detection and classification using a deep convolutional neural network. Oncologist. 2019;24:1159–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Uthoff J, Stephens MJ, Newell JD Jr, Hoffman EA, Larson J, Koehn N, et al. Machine learning approach for distinguishing malignant and benign lung nodules utilizing standardized perinodular parenchymal features from CT. Med Phys. 2019;46:3207–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Bonavita I, Rafael‐Palou X, Ceresa M, Piella G, Ribas V, González Ballester MA. Integration of convolutional neural networks for pulmonary nodule malignancy assessment in a lung cancer classification pipeline. Comput Methods Programs Biomed. 2020;185:105172. [DOI] [PubMed] [Google Scholar]
  • 19. Afshar P, Oikonomou A, Naderkhani F, Tyrrell PN, Plataniotis KN, Farahani K, et al. 3D‐MCN: a 3D multi‐scale capsule network for lung nodule malignancy prediction. Sci Rep. 2020;10:7948. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Wang H, Zhao T, Li LC, Pan H, Liu W, Gao H, et al. A hybrid CNN feature model for pulmonary nodule malignancy risk differentiation. J Xray Sci Technol. 2018;26:171–87. [DOI] [PubMed] [Google Scholar]
  • 21. Causey JL, Zhang J, Ma S, Jiang B, Qualls JA, Politte DG, et al. Highly accurate model for prediction of lung nodule malignancy with CT scans. Sci Rep. 2018;8:9286. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Hawkins S, Wang H, Liu Y, Garcia A, Stringfield O, Krewer H, et al. Predicting malignant nodules from screening CT scans. J Thorac Oncol. 2016;11:2120–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Kossenkov AV, Qureshi R, Dawany NB, Wickramasinghe J, Liu Q, Majumdar RS, et al. A gene expression classifier from whole blood distinguishes benign from malignant lung nodules detected by low‐dose CT. Cancer Res. 2019;79:263–73. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Soardi GA, Perandini S, Motton M, Montemezzi S. Assessing probability of malignancy in solid solitary pulmonary nodules with a new Bayesian calculator: improving diagnostic accuracy by means of expanded and updated features. Eur Radiol. 2015;25:155–62. [DOI] [PubMed] [Google Scholar]
  • 25. Wu Z, Huang T, Zhang S, Cheng D, Li W, Chen B. A prediction model to evaluate the pretest risk of malignancy in solitary pulmonary nodules: evidence from a large Chinese southwestern population. J Cancer Res Clin Oncol. 2021;147:275–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Chauvie S, De Maggi A, Baralis I, Dalmasso F, Berchialla P, Priotto R, et al. Artificial intelligence and radiomics enhance the positive predictive value of digital chest tomosynthesis for lung cancer detection within SOS clinical trial. Eur Radiol. 2020;30:4134–40. [DOI] [PubMed] [Google Scholar]
  • 27. Li S, Xu P, Li B, Chen L, Zhou Z, Hao H, et al. Predicting lung nodule malignancies by combining deep convolutional neural network and handcrafted features. Phys Med Biol. 2019;64:175012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Mastouri R, Khlifa N, Neji H, Hantous‐Zannad S. A bilinear convolutional neural network for lung nodules classification on CT images. Int J Comput Assist Radiol Surg. 2021;16:91–101. [DOI] [PubMed] [Google Scholar]
  • 29. Hsu YC, Tsai YH, Weng HH, Hsu LS, Tsai YH, Lin YC, et al. Artificial neural networks improve LDCT lung cancer screening: a comparative validation study. BMC Cancer. 2020;20:1023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Liu J, Zhao L, Han X, Ji H, Liu L, He W. Estimation of malignancy of pulmonary nodules at CT scans: effect of computer‐aided diagnosis on diagnostic performance of radiologists. Asia Pac J Clin Oncol. 2021;17:216–21. [DOI] [PubMed] [Google Scholar]
  • 31. Paul R, Schabath M, Gillies R, Hall L, Goldgof D. Convolutional neural network ensembles for accurate lung nodule malignancy prediction 2 years in the future. Comput Biol Med. 2020;122:103882. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Zia MB, Juan ZJ, Zhou XJ, Xiao N, Wang JW, Khan A. Classification of malignant and benign lung nodule and prediction of image label class using multi‐deep model. Int J Adv Comp Sci Appl. 2020;11:35–41. [Google Scholar]
  • 33. Xu YM, Zhang T, Xu H, Qi L, Zhang W, Zhang YD, et al. Deep learning in CT images: automated pulmonary nodule detection for subsequent management using convolutional neural network. Cancer Manag Res. 2020;12:2979–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Digumarthy SR, Padole AM, Rastogi S, Price M, Mooradian MJ, Sequist LV, et al. Predicting malignant potential of subsolid nodules: can radiomics preempt longitudinal follow up CT? Cancer Imaging. 2019;19:36. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Xiang YW, Sun YF, Liu Y, Han BH, Chen QH, Ye XD, et al. Development and validation of a predictive model for the diagnosis of solid solitary pulmonary nodules using data mining methods. J Thorac Dis. 2019;11:950–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Mao LT, Chen H, Liang MZ, Li KW, Gao JB, Qin PX, et al. Quantitative radiomic model for predicting malignancy of small solid pulmonary nodules detected by low‐dose CT screening. Quant Imaging Med Surg. 2019;9:263–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Daly S, Rinewalt D, Fhied C, Basu S, Mahon B, Liptay MJ, et al. Development and validation of a plasma biomarker panel for discerning clinical significance of indeterminate pulmonary nodules. J Thorac Oncol. 2013;8:31–6. [DOI] [PubMed] [Google Scholar]
  • 38. McWilliams A, Tammemagi MC, Mayo JR, Roberts H, Liu G, Soghrati K, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med. 2013;369:910–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Nemesure B, Clouston S, Albano D, Kuperberg S, Bilfinger TV. Will that pulmonary nodule become cancerous? A risk prediction model for incident lung cancer. Cancer Prev Res (Phila). 2019;12:463–70. [DOI] [PubMed] [Google Scholar]
  • 40. Marcus MW, Duffy SW, Devaraj A, Green BA, Oudkerk M, Baldwin D, et al. Probability of cancer in lung nodules using sequential volumetric screening up to 12 months: the UKLS trial. Thorax. 2019;74:761–7. [DOI] [PubMed] [Google Scholar]
  • 41. Tammemagi M, Ritchie AJ, Atkar‐Khattra S, Dougherty B, Sanghera C, Mayo JR, et al. Predicting malignancy risk of screen‐detected lung nodules‐mean diameter or volume. J Thorac Oncol. 2019;14:203–11. [DOI] [PubMed] [Google Scholar]
  • 42. Raghu VK, Zhao W, Pu J, Leader JK, Wang R, Herman J, et al. Feasibility of lung cancer prediction from low‐dose CT scan and smoking factors using causal models. Thorax. 2019;74:643–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Walter JE, Heuvelmans MA, Bock GH, Yousaf‐Khan U, Groen HJM, Aalst CMV, et al. Characteristics of new solid nodules detected in incidence screening rounds of low‐dose CT lung cancer screening: the NELSON study. Thorax. 2018;73:741–7. [DOI] [PubMed] [Google Scholar]
  • 44. Li X, Zhang Q, Jin X, Cao L. Combining serum miRNAs, CEA, and CYFRA21‐1 with imaging and clinical features to distinguish benign and malignant pulmonary nodules: a pilot study: Xianfeng Li et al.: combining biomarker, imaging, and clinical features to distinguish pulmonary nodules. World J Surg Oncol. 2017;15:107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Reid M, Choi HK, Han X, Wang X, Mukhopadhyay S, Kou L, et al. Development of a risk prediction model to estimate the probability of malignancy in pulmonary nodules being considered for biopsy. Chest. 2019;156:367–75. [DOI] [PubMed] [Google Scholar]
  • 46. Gould MK, Ananth L, Barnett PG. A clinical model to estimate the pretest probability of lung cancer in patients with solitary pulmonary nodules. Chest. 2007;131:383–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Zo S, Woo SY, Kim S, Lee JE, Jeong BH, Um SW, et al. Predicting the risk of malignancy of lung nodules diagnosed as indeterminate on radial endobronchial ultrasound‐guided biopsy. J Clin Med. 2020;9:3652. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Chen XB, Yan RY, Zhao K, Zhang F, Li YJ, Wu L, et al. Nomogram for the prediction of malignancy in small (8‐20 mm) indeterminate solid solitary pulmonary nodules in Chinese populations. Cancer Manag Res. 2019;11:9439–48. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Swensen SJ, Silverstein MD, Ilstrup DM, Schleck CD, Edell ES. The probability of malignancy in solitary pulmonary nodules ‐ application to small radiologically indeterminate nodules. Arch Intern Med. 1997;157:849–55. [PubMed] [Google Scholar]
  • 50. Zhang M, Zhuo N, Guo ZL, Zhang XG, Liang WH, Zhao S, et al. Establishment of a mathematic model for predicting malignancy in solitary pulmonary nodules. J Thorac Dis. 2015;7:1833–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Zheng B, Zhou XW, Chen JH, Zheng W, Duan Q, Chen C. A modified model for preoperatively predicting malignancy of solitary pulmonary nodules: an Asia cohort study. Ann Thorac Surg. 2015;100:288–94. [DOI] [PubMed] [Google Scholar]
  • 52. Dong JS, Sun N, Li JG, Liu ZY, Zhang BH, Chen ZL, et al. Development and validation of clinical diagnostic models for the probability of malignancy in solitary pulmonary nodules. Thorac Cancer. 2014;5:162–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53. Li Y, Wang J. A mathematical model for predicting malignancy of solitary pulmonary nodules. World J Surg. 2012;36:830–5. [DOI] [PubMed] [Google Scholar]
  • 54. Yang L, Zhang Q, Bai L, Li TY, He C, Ma QL, et al. Assessment of the cancer risk factors of solitary pulmonary nodules. Oncotarget. 2017;8:29318–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Wood DE, Kazerooni EA, Baum SL, Eapen GA, Ettinger DS, Hou L, et al. Lung cancer screening, version 3.2018, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2018;16:412–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Donnelly EF, Kazerooni EA, Lee E, Henry TS, Boiselle PM, Crabtree TD, et al. ACR appropriateness criteria® lung cancer screening. J Am Coll Radiol. 2018;15:S341–6. [DOI] [PubMed] [Google Scholar]
  • 57. Mazzone PJ, Silvestri GA, Souter LH, Caverly TJ, Kanne JP, Katki HA, et al. Screening for lung cancer: CHEST guideline and expert panel report. Chest. 2021;160:e427–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58. Gurney JW. Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian analysis. Part I. Theory. Radiology. 1993;186:405–13. [DOI] [PubMed] [Google Scholar]
  • 59. Uthoff J, Koehn N, Larson J, Dilger SKN, Hammond E, Schwartz A, et al. Post‐imaging pulmonary nodule mathematical prediction models: are they clinically relevant? Eur Radiol. 2019;29:5367–77. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Nair VS, Sundaram V, Desai M, Gould MK. Accuracy of models to identify lung nodule cancer risk in the national lung screening trial. Am J Respir Crit Care Med. 2018;197:1220–3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Warnat‐Herresthal S, Schultze H, Shastry KL, Manamohan S, Mukherjee S, Garg V, et al. Swarm learning for decentralized and confidential clinical machine learning. Nature. 2021;594:265–70. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Appendix S1 Supporting Information


Articles from Thoracic Cancer are provided here courtesy of Wiley

RESOURCES