Abstract
Objectives
Classification of histologic subgroups has significant prognostic value for lung adenocarcinoma patients who undergo surgical resection. However, clinical histopathology assessment is generally performed on only a small portion of the overall tumor from biopsy or surgery. Our objective is to identify a noninvasive quantitative imaging biomarker (QIB) for the classification of histologic subgroups in lung adenocarcinoma patients.
Methods
We retrospectively collected and reviewed 1313 CT scans of patients with resected lung adenocarcinomas from two geographically distant institutions who were seen between January 2014 and October 2017. Three study cohorts, the training, internal validation, and external validation cohorts, were created, within which lung adenocarcinomas were divided into two disease-free-survival (DFS)-associated histologic subgroups, the mid/poor and good DFS groups. A comprehensive machine learning– and deep learning–based analytical system was adopted to identify reproducible QIBs and help to understand QIBs’ significance.
Results
Intensity-Skewness, a QIB quantifying tumor density distribution, was identified as the optimal biomarker for predicting histologic subgroups. Intensity-Skewness achieved high AUCs (95% CI) of 0.849 (0.813, 0.881), 0.820 (0.781, 0.856), and 0.863 (0.827, 0.895) on the training, internal validation, and external validation cohorts, respectively. A criterion of Intensity-Skewness ≤ −1.5, which indicated high tumor density, showed high specificity of 96% (sensitivity 46%) and 99% (sensitivity 53%) on predicting the mid/poor DFS group in the training and external validation cohorts, respectively.
Conclusions
A QIB derived from routinely acquired CT was able to predict lung adenocarcinoma histologic subgroups, providing a noninvasive method that could potentially benefit personalized treatment decision-making for lung cancer patients.
Keywords: Adenocarcinoma of lung, Histological types of neoplasms, Tomography, X-ray computed, Deep learning, Machine learning
Introduction
Lung cancer is the most frequently diagnosed cancer and the leading cause of cancer-related deaths worldwide, and adenocarcinoma is its most common histologic subtype [1]. In 2011, the International Association for the Study of Lung Cancer (IASLC), the American Thoracic Society (ATS), and the European Respiratory Society (ERS) proposed a lung cancer histologic subtype classification system [2]. According to the IASLC/ATS/ERS classification system, lung adenocarcinoma is classified as adenocarcinoma in situ (AIS); minimally invasive adenocarcinoma (MIA); invasive adenocarcinoma, which is subdivided into lepidic predominant (LEP), acinar predominant (ACI), papillary predominant (PAP), micropapillary predominant (MIP), and solid predominant (SOL) adenocarcinoma; and invasive mucinous (MUC) adenocarcinoma. The IASLC/ATS/ERS classification system has demonstrated significant prognostic and predictive value for patients with resected lung adenocarcinoma [3]. Patients with AIS/MIA/LEP and a complete surgical resection have an excellent prognosis, with a 5-year disease-free survival (DFS) higher than 90%. In contrast, patients with ACI/PAP/MIP/SOL/MUC subtypes have a worse prognosis, with an average 5-year DFS of less than 65% [3]. Successful classification of lung adenocarcinoma patients into subtypes with associated differences in DFS may be important in stratifying patients either prior to surgery or for adjuvant therapy after surgical resection [4].
In current clinical practice, information regarding histologic subtypes of lung adenocarcinoma is based on analysis of pathology specimens. However, for this type of analysis, generally only a small portion of the entire tumor is sampled and analyzed. A comprehensive characterization of the entire tumor is generally not performed. Technological advances in medical imaging and image-based noninvasive quantitative imaging biomarkers (QIBs) hold promise in addressing limitations of sample size [5], principally through the ability to interrogate the entire tumor on imaging. The use of QIBs has shown promise in many thoracic oncologic applications [6–11]. We hypothesize that noninvasive QIBs might provide complementary information to histopathology and ultimately could help improve the classification of histologic subgroups of lung adenocarcinoma patients.
Therefore, in this study, we developed a robust and generalizable CT-based QIB to predict histologic subgroups that have an association with DFS in patients with resected lung adenocarcinoma. The QIB was rigorously validated in an external validation using a consecutively collected dataset from a geographically distant institution. To identify the optimal QIB, we used our highly automated analytical pipeline, which consisted of deep learning–based [12] tumor segmentation, well-validated and reproducible imaging features, and an optimally constructed machine-learning model.
Materials and methods
Patient data collection
One thousand three hundred thirteen lung adenocarcinoma patients were retrospectively collected from two geographically distinct institutions, A and B, located in South and East China [13], respectively. The study was approved by the Institutional Review Boards of the two institutions. Written informed consent was waived. At institution A, training and internal validation cohorts were created, which included 463 and 428 patients consecutively collected from January 2016 to October 2017 and from January 2014 to December 2015, respectively. At institution B, a separate external validation cohort was created, which included 422 patients consecutively collected from January 2017 to December 2017. The inclusion criteria were patients (a) who underwent complete surgical resection of a solitary lung lesion, (b) who had no prior history of primary lung cancer, (c) whose pathology reports were available for review, (d) who had CT scans of the chest performed within 4 weeks of surgery, and (e) who had no treatment for lung cancer prior to surgery. The exclusion criteria were (a) patients who did not have contrast-enhanced CT images of minimal acceptable image quality and (b) CT images that contained artifacts that were too severe to be analyzed. The flow diagram of patient collection is provided in Supplement S1. All clinical data used in this work were deposited at the Research Data Deposit public platform (www.researchdata.org.cn) under the approval number RDDA2019001083.
The pathology reports of patients were retrieved from the respective institutional databases. Based on the pathology reports, each tumor was assigned to a predominant histological subtype according to the component which accounted for the greatest percentage within the specimen [2]. For instance, if a specimen was reported as “70% ACI, 20% LEP, and 10% PAP,” it was assigned the histological subtype of ACI predominant. Using this assignment, all patients were then divided into good and mid/poor 5-year DFS associated with their corresponding histologic subtypes [3, 4, 14, 15], i.e., the good DFS group consisting of subtypes of AIS, MIA, and LEP, and the mid/poor DFS group consisting of subtypes of ACI, PAP, MIP, SOL, and MUC. Cases that contained equal percentage of components belonging to two histologic subgroups respectively were assigned to the mid/poor DFS group. This histologic subgroup assignment was performed by an experienced blinded independent pathologist.
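The assignment rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the pathologist's actual workflow; the function name `assign_dfs_group` and the dictionary input format are assumptions made for the example.

```python
# Subtype-to-subgroup mapping as stated in the text.
GOOD_DFS = {"AIS", "MIA", "LEP"}
MID_POOR_DFS = {"ACI", "PAP", "MIP", "SOL", "MUC"}


def assign_dfs_group(components):
    """Map a pathology report to its predominant subtype(s) and DFS subgroup.

    components: dict of subtype -> percentage, e.g. {"ACI": 70, "LEP": 20, "PAP": 10}
    (hypothetical input format). Returns (predominant_subtypes, group).
    """
    unknown = set(components) - GOOD_DFS - MID_POOR_DFS
    if unknown:
        raise ValueError(f"unknown subtype(s): {sorted(unknown)}")

    top = max(components.values())
    # More than one leader means the report has equal top percentages.
    leaders = sorted(s for s, pct in components.items() if pct == top)

    # A tie that involves any mid/poor subtype is assigned to mid/poor,
    # following the tie rule given in the text.
    if any(s in MID_POOR_DFS for s in leaders):
        group = "mid/poor"
    else:
        group = "good"
    return leaders, group


print(assign_dfs_group({"ACI": 70, "LEP": 20, "PAP": 10}))
```

The example call reproduces the "70% ACI, 20% LEP, and 10% PAP" case from the text, which is assigned to ACI predominant and therefore to the mid/poor DFS group.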
System design
The QIB identification system is presented in Fig. 1. The workflow to identify QIBs consisted of five steps: (1) region of interest (ROI) initialization, (2) tumor segmentation, (3) feature extraction, (4) QIB identification, and (5) biomarker analysis (see Fig. 1 and Supplement S2).
Fig. 1.

Design of the quantitative imaging biomarker identification and analytical system
ROI initialization and tumor segmentation
Radiologists reviewed all slices of the CT scan and drew an initial ROI on one CT slice where the tumor appeared to have the largest diameter (see Supplement S3 for more details).
For automated three-dimensional (3D) tumor segmentation, a deep-learning segmentation technique based on the “Multi-View Aggregation” (MVA) scheme [16] was utilized. The MVA scheme aggregates multiple two-dimensional (2D) single-view and random-view segmentations to achieve 3D segmentation. The tumor segmentation was composed of three steps: (1) determine the tumor center from the initial ROI via a regular 2D segmentation, (2) perform multiple 2D single-view and random-view segmentations around the tumor center, and (3) fill the gaps between 2D segmentations by tri-linear interpolation. In our study, the 2D segmentor was a 2D SegNet [17] pretrained on ImageNet [18] and then fine-tuned on the training cohort. See Supplement S4 for details on training the segmentation network.
Quantitative imaging biomarker extraction and identification
In total, 1164 quantitative imaging features were extracted from each segmented tumor [19]. An integrated machine-learning system was then adopted to identify QIBs out of the 1164 imaging features [20, 21].
The machine-learning system was based on a coarse-to-fine two-stage strategy. In the coarse stage, a large number of redundant and non-informative features were removed using unsupervised clustering and feature ranking algorithms. The unsupervised hierarchical clustering was performed in three steps: (1) Spearman’s rank correlations were calculated between features, (2) features were organized into a hierarchical clustering tree based on these correlations, and (3) features were separated into groups based on a set correlation threshold. Within each feature group, six feature ranking algorithms were applied to rank the correlated features, and only the most informative feature (i.e., the top-ranked feature according to the feature ranking algorithm) was kept while the others were excluded from further analysis. The six feature ranking algorithms we used were t test score, Wilcoxon-test score, univariate AUC, mutual information, mRMR [22], and ReliefF [23].
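The coarse stage can be illustrated with a short Python sketch using NumPy and SciPy. It is a minimal sketch under stated assumptions, not the authors' implementation (which is described in Supplement S5): the function name `coarse_select`, the average-linkage method, the distance 1 − |ρ|, and the threshold value 0.8 are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.stats import rankdata
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def coarse_select(X, scores, corr_threshold=0.8):
    """Keep one representative feature per group of correlated features.

    X              : array of shape (n_samples, n_features), no constant columns
    scores         : per-feature relevance from one ranking algorithm (higher = better)
    corr_threshold : hypothetical cut; the value used in the study is not given here
    """
    X = np.asarray(X, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # (1) Spearman correlation equals the Pearson correlation of per-feature ranks.
    ranks = np.apply_along_axis(rankdata, 0, X)
    rho = np.corrcoef(ranks, rowvar=False)

    # (2) Build a hierarchical clustering tree on the distance 1 - |rho|.
    dist = np.clip(1.0 - np.abs(rho), 0.0, None)
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")

    # (3) Cut the tree: clusters merge only while their average-linkage
    #     distance is at most 1 - corr_threshold.
    groups = fcluster(tree, t=1.0 - corr_threshold, criterion="distance")

    # Within each group, keep only the top-ranked feature.
    keep = []
    for g in np.unique(groups):
        members = np.flatnonzero(groups == g)
        keep.append(int(members[np.argmax(scores[members])]))
    return sorted(keep)


# Toy data: feature 1 is a monotone transform of feature 0, feature 2 is independent.
rng = np.random.default_rng(0)
f0 = rng.normal(size=200)
X_toy = np.column_stack([f0, np.exp(f0), rng.normal(size=200)])
selected = coarse_select(X_toy, scores=[0.6, 0.9, 0.7])
print(selected)
```

In the toy data, features 0 and 1 fall into one group because their rank correlation is perfect, so only the higher-scoring of the two survives alongside the independent feature 2. In the study this step would be repeated once per ranking algorithm, each supplying its own `scores`.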
In the fine stage, an Incremental Forward Search (IFS) approach [24] and six state-of-the-art machine-learning classification algorithms were applied to evaluate the individual/combined classification ability of the candidate features attained from the coarse stage. IFS is an approach that begins with an empty feature set and then iteratively includes a feature if and only if its addition increases the classification performance. The six machine learning–based classification algorithms used for building the classification model were k-nearest neighbor, naïve Bayes, least absolute shrinkage and selection operator (LASSO), support vector machine (SVM) [25], bagging [26], and random forests [27].
Finally, a total of 6 × 6 = 36 candidate classification models (six feature ranking algorithms by six machine-learning classification algorithms) were attained after the two-stage feature selection. Among the 36 candidate models, the model that achieved the highest area under the receiver operating characteristic curve (AUC) was selected as the optimal model. Features selected to build the optimal model were identified as optimal features.
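The IFS rule (start from an empty set, add a feature only if performance increases) can be written as a short greedy loop. The sketch below is illustrative only: `incremental_forward_search`, the `score_fn` callback, and the chance-level baseline of 0.5 are assumptions for the example. In practice `score_fn` would be something like the cross-validated AUC of one of the six classifiers trained on the given feature subset.

```python
def incremental_forward_search(ranked_features, score_fn, baseline=0.5):
    """Greedy forward selection over features already sorted by rank.

    ranked_features : candidate features, best-ranked first
    score_fn        : callable(list_of_features) -> performance (e.g., AUC)
    baseline        : score of the empty feature set (0.5 = chance-level AUC, assumed)
    """
    selected, best = [], baseline
    for feature in ranked_features:
        trial_score = score_fn(selected + [feature])
        # Include the feature if and only if it strictly improves performance.
        if trial_score > best:
            selected.append(feature)
            best = trial_score
    return selected, best


# Toy scoring function: each feature adds a fixed, made-up amount of AUC.
toy_gain = {"a": 0.30, "b": 0.00, "c": 0.05}
selected, best = incremental_forward_search(
    ["a", "b", "c"], lambda feats: 0.5 + sum(toy_gain[f] for f in feats)
)
print(selected, round(best, 2))
```

Recording `best` after each accepted feature gives the kind of IFS curve on which the marginal contribution of every added feature can be read off.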
More details for feature extraction and identification are provided in Supplement S5.
Biomarker analysis
Stability analysis
Two types of reproducibility of the imaging features were evaluated: (1) test-retest reproducibility, which tests the measurement stability under various CT imaging parameters, and (2) inter-reader reproducibility, which tests the measurement stability between radiologists. The test-retest reproducibility was evaluated using the publicly available same-day repeat CT scan dataset [28]. The inter-reader reproducibility was evaluated on QIBs computed from two different ROI initializations, which were performed by two radiologists of different experience (E.L. with more than 20 years of experience and W.D. with approximately 3 years of experience) using two different image visualization platforms, Weasis [29] and 3D Slicer [30], respectively.
Benchmarking
Identified QIBs were compared with common clinical and radiological parameters. Clinical parameters included age, gender, TNM stage (according to the American Joint Committee on Cancer TNM system, the 8th edition), and tumor volume. Radiological parameters, also called CT signs, were qualitative signs on CT images defined by radiologists. In our study, 11 lung CT signs were defined by a retrospective review of prespecified radiology studies (see definitions in Table S1 in Supplement S6). The most significant radiological parameters were identified for the benchmarking based on the criteria that radiological parameters should be inter-reader reproducible (i.e., Kappa-index > 0.6) and with predictive ability (i.e., distribution between the dichotomous data should be significantly different with p value < 0.01).
Cutoff analysis
Different selections of cutoff values for the QIBs were evaluated in terms of true positive, false positive, true negative, false negative, sensitivity, and specificity. In our study, the mid/poor and good DFS groups were defined as positive and negative groups, respectively.
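A cutoff analysis of this kind reduces to counting a confusion matrix at each candidate threshold. The following sketch assumes that a case is called positive (mid/poor DFS) when its QIB value is at or below the cutoff; the function name `cutoff_metrics`, the label coding (1 = mid/poor, 0 = good), and the toy values are illustrative and not taken from the study data.

```python
def cutoff_metrics(values, labels, cutoff):
    """Confusion counts, sensitivity, and specificity for the rule `value <= cutoff`.

    values : QIB value per patient
    labels : 1 for the positive group (mid/poor DFS), 0 for the negative group (good DFS)
    """
    tp = fp = tn = fn = 0
    for value, label in zip(values, labels):
        predicted_positive = value <= cutoff
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn,
            "sensitivity": sensitivity, "specificity": specificity}


# Made-up values: sweep a few candidate cutoffs.
toy_values = [-3.0, -2.0, -1.0, 0.0, 0.5, -1.6]
toy_labels = [1, 1, 1, 0, 0, 0]
for c in (-2.5, -1.0, 0.0):
    print(c, cutoff_metrics(toy_values, toy_labels, c))
```

Lowering the cutoff trades sensitivity for specificity, which is the trade-off the cutoff analysis is meant to expose.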
Stratification analysis
Patients were stratified based upon age, gender, and TNM stage as well as CT scanner device (manufacturer and version) and reconstruction slice thickness.
Biomarker interpretation
After identification of optimal QIBs, we attempted to better understand the radiologic/pathologic correlation of these by evaluating their association with actual radiology features (i.e., CT sign), as interpreted by expert radiologists.
Multi-category analysis
The identified QIBs were applied to predict the eight histologic subtypes directly as a multi-category problem, given that it would be of clinical importance if each histologic subtype could be correctly identified.
Statistical analysis
Statistical analysis was performed with use of MedCalc (version 15.8), SPSS (version 22.0) and MATLAB (version 2018b; Mathworks). Chi-square test and t test were employed to evaluate statistical significance for categorical data and continuous data, respectively. A two-sided p value < 0.001 indicated significance. Area under the receiver operating characteristic curve (AUC) and AUC’s 95% confidence interval (CI) were used to indicate classification performance. In the training cohort, AUCs were evaluated by tenfold cross-validation. This technique splits the data into ten subgroups, using nine subgroups for training and the remaining subgroup for testing. The subgroup used for testing is rotated until all cases have been used for testing exactly once. The final AUC of the training cohort was the average of the ten testing AUCs, while in the internal and external validation cohorts, the AUCs were directly computed. Concordance correlation coefficient (CCC) was applied to evaluate reproducibility for QIBs. For tumor segmentation, Intersection over Union (IoU) was applied to evaluate the segmentation performance.
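For readers who want to reproduce the three performance measures named above, the sketch below gives plain-Python versions of the AUC (computed as the Mann-Whitney pair statistic), Lin's concordance correlation coefficient, and the Intersection over Union. These are textbook definitions written for illustration, not the MedCalc, SPSS, or MATLAB routines used in the study; the function names and the voxel-set representation of a segmentation are assumptions.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5.

    Higher score must mean "more likely positive". For a biomarker where LOWER
    values indicate the positive class, pass the negated values.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ccc(x, y):
    """Lin's concordance correlation coefficient with population (1/n) moments."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


def iou(mask_a, mask_b):
    """Intersection over Union of two segmentations given as sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    return len(a & b) / len(a | b)


print(auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0]))          # 3 of 4 pairs ordered correctly
print(ccc([1, 2, 3], [2, 3, 4]))                        # perfect correlation, constant offset
print(iou({(0, 0, 0), (0, 0, 1)}, {(0, 0, 1), (0, 0, 2)}))
```

The second call shows why CCC is used for reproducibility rather than a plain correlation: two measurements that differ by a constant offset are perfectly correlated, yet their CCC falls well below 1.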
Results
Patient characteristics and CT examination
Patient characteristics are provided in Table 1. The numbers of patients in the two outcome subgroups, mid/poor vs. good DFS, within the training, internal validation, and external validation cohorts were 350 (76%) vs. 113 (24%), 288 (67%) vs. 140 (33%), and 349 (83%) vs. 73 (17%), respectively. The external validation cohort collected from East China showed significant differences from the training cohort collected from South China. Patients’ CT examination characteristics are provided in Table S2 in Supplement S7. All CT imaging characteristics showed significant differences between the training and validation cohorts.
Table 1.
Patient characteristics
| | | Training cohort | Internal validation cohort | p ° | External validation cohort | p ° |
|---|---|---|---|---|---|---|
| Year | | 2016~2017 | 2014~2015 | / | 2017 | / |
| Collection region | | South China | South China | / | East China | / |
| Gender | Male | 238 (51%) | 220 (51%) | 0.947 | 170 (40%) | 0.0012* |
| | Female | 225 (49%) | 208 (49%) | | 252 (60%) | |
| Age | Mean ± std | 59.8 ± 9.3 | 58.3 ± 9.5 | 0.4708 | 58.7 ± 9.7 | 0.5752 |
| Tumor size (mm) | Mean ± std | 28.8 ± 13.1 | 26.4 ± 11.8 | 0.0052* | 24.4 ± 11.6 | <0.0001* |
| Histologic subgroups based on 5-year DFS | Mid/Poor | 350 (76%) | 288 (67%) | 0.0075* | 349 (83%) | 0.0121* |
| | Good | 113 (24%) | 140 (33%) | | 73 (17%) | |
| Histologic subtypes | AIS | 24 (5%) | 28 (7%) | <0.0001* | 10 (2%) | <0.0001* |
| | MIA | 7 (2%) | 3 (1%) | | 36 (9%) | |
| | LEP | 82 (18%) | 109 (25%) | | 27 (6%) | |
| | ACI | 232 (50%) | 126 (29%) | | 233 (55%) | |
| | PAP | 61 (13%) | 65 (15%) | | 40 (9%) | |
| | MIP | 11 (2%) | 20 (5%) | | 18 (4%) | |
| | SOL | 33 (7%) | 20 (5%) | | 34 (8%) | |
| | MUC | 13 (3%) | 57 (13%) | | 24 (6%) | |
| TNM stage | 0 | 10 (2%) | 17 (4%) | 0.0007* | 15 (4%) | 0.0023* |
| | I | 354 (76%) | 278 (65%) | | 311 (74%) | |
| | II | 41 (9%) | 69 (16%) | | 44 (10%) | |
| | IIIa | 58 (13%) | 64 (15%) | | 52 (12%) | |

° Comparisons with training set. Chi-square test for categorical data and t test for continuous data
* Significant differences p < 0.01
Tumor segmentation
The proposed automated 3D tumor segmentation achieved a state-of-the-art IoU of 70.04 ± 11.68%. Comparisons with other 3D lung tumor segmentation algorithms are provided in Table S3 in Supplement S8.
Results of quantitative imaging biomarker identification
Six quantitative imaging features were selected to build the optimal predictive model based on the mRMR feature ranking and the LASSO classification algorithm (see Table S4 in Supplement S9). On the corresponding IFS curve (see Fig. S5 in Supplement S9), the single top feature contributed the most to the model, while the other features provided only marginal contributions. This observation was confirmed by the ROC curve comparison between the model built with the single best feature and the model built with the top six features (see Fig. S6 in Supplement S9). Therefore, in our study, the top feature, defined as Intensity-Skewness, was identified as the optimal QIB to predict lung adenocarcinoma histologic subgroups. More details on the biomarker identification are provided in Supplement S9.
Results of biomarker analysis
Stability analysis
The Intensity-Skewness achieved high test-retest reproducibility, with CCC (95% CI) = 0.941 (0.882, 0.973), and high inter-reader reproducibility, with CCC (95% CI) = 0.952 (0.942, 0.960) (see Fig. S7 and Fig. S8 in Supplement S10.1).
Benchmarking
The QIB, Intensity-Skewness, was benchmarked in terms of AUC against common clinical parameters and tumor Density, the most significant of the radiologic parameters (see Fig. S9 and Table S5 in Supplement S10.2). As shown in Fig. 2, Intensity-Skewness achieved AUCs (95% CI) of 0.849 (0.813, 0.881), 0.820 (0.781, 0.856), and 0.863 (0.827, 0.895) on the training, internal validation, and external validation cohorts, respectively.
Fig. 2.

Benchmarking results. a–c ROCs and AUCs (95% CI) on the training, internal validation, and external validation cohorts, respectively
Cutoff analysis
The mean ± std. of Intensity-Skewness in the three data cohorts for the good DFS group were −0.05 ± 0.77, −0.71 ± 1.32, and 0.35 ± 0.72, respectively, and for the mid/poor DFS group were −1.66 ± 1.37, −2.47 ± 1.34, and −1.41 ± 1.57, respectively (see Fig. 3), suggesting that the lower the Intensity-Skewness was, the poorer the DFS would be. Figure 4 shows the true positive, true negative, false negative, false positive, sensitivity, and specificity on different cutoffs of Intensity-Skewness. As shown in Fig. 4, the cutoff Intensity-Skewness ≤ −1.5 could achieve high specificity to predict the mid/poor DFS group, especially in the training and external validation cohorts, which had specificity of 96% (sensitivity 53%) and 99% (sensitivity 46%), respectively.
Fig. 3.

The distribution of Intensity-Skewness in training, internal validation, and external validation cohorts. Dashed line in the picture shows the mean of the distribution
Fig. 4.

Results of cutoff analysis on the identified radiomics biomarker Intensity-Skewness
Stratification analysis
The stratification analysis showed that the classification performance of Intensity-Skewness was not affected by patient age, gender, TNM stage, the manufacturer of the CT scanner, or the slice thickness of the CT reconstruction (see Table S6 in Supplement S10.3).
Biomarker interpretations
Skewness is a first-order histogram index that describes how far a distribution departs from a symmetric (normal) shape. In this section, "skewed to" refers to the side on which the bulk of the values lies: a distribution is skewed to the left when most values are low (positive skewness value) and to the right when most values are high (negative skewness value). The identified QIB, Intensity-Skewness, describes the distortion of the tumor pixel-density distribution, skewed either to the dark area (low radiation attenuation) or to the bright area (high radiation attenuation). An example is shown in Fig. 5: when a tumor is solid, its distribution of radiation attenuation is skewed to the right with a negative skewness value; in contrast, when a tumor is less solid (e.g., ground-glass opacity and subsolid nodule), its distribution of radiation attenuation is skewed to the left with a positive skewness value. The identification of Intensity-Skewness by AI was concordant with radiologists’ understanding that Density is the optimal radiologic biomarker to predict the histologic subgroups of adenocarcinoma [31]. The analysis of the association between the radiomic QIB Intensity-Skewness and the visual radiologist-assessed Density is shown in Fig. S10 in Supplement S10.4.
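The sign behavior described above can be checked with a small numerical sketch. It uses the standard moment-based skewness (third central moment divided by the 1.5 power of the second); the exact normalization in the authors' feature extraction software may differ, and the voxel values below are invented Hounsfield-like numbers chosen only to mimic a solid and a ground-glass lesion.

```python
def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical voxel intensities inside a segmented tumor.
solid_like = [40] * 8 + [-200, -600]   # most voxels dense, a few low-attenuation voxels
ggo_like = [-700] * 8 + [-400, 0]      # most voxels low attenuation, a few dense voxels

print(skewness(solid_like))  # negative: bulk at high attenuation, tail toward low
print(skewness(ggo_like))    # positive: bulk at low attenuation, tail toward high
```

The solid-like histogram yields a negative value and the ground-glass-like histogram a positive one, matching the direction reported for the mid/poor and good DFS groups.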
Fig. 5.

Example of Intensity-Skewness characterizing tumor density. a–c The three levels of human-expert visualization, “Solid,” “Partial Solid,” and “GGO,” and the corresponding intensity distributions and Intensity-Skewness values. The red contour in each picture shows the auto-segmentation
Multi-category analysis
The results of using Intensity-Skewness to predict the eight histologic subtypes are presented in Table S7 of Supplement S10.3. As shown in Table S7, the prediction of histologic subtypes for the good prognosis group achieved high/acceptable AUC, sensitivity, and specificity; however, the prediction of histologic subtypes for mid/poor prognosis group had low specificity.
A detailed description of the results of biomarker analysis can be found in Supplement S10.
Discussion
In this study, we identified an optimal QIB, the Intensity-Skewness, that achieves high performance in predicting DFS-associated histologic subgroups in lung adenocarcinoma. The performance, reproducibility, and generalizability of Intensity-Skewness are further confirmed on a distinct external validation cohort by separate independent blinded radiologists. By biomarker interpretation analysis, we found that the Intensity-Skewness derived from CT images reflects the association between lung adenocarcinoma histologic subtypes and tumor density; i.e., tissues of ACI/PAP/MIP/SOL/MUC were denser than tissues of AIS/MIA/LEP, which is consistent with the radiologic-pathologic observations [31]. This association is again observed in the multi-category analysis: when using Intensity-Skewness to predict the eight histologic subtypes directly, AIS/MIA/LEP can be predicted with high/acceptable AUC and specificity, whereas ACI/PAP/MIP/SOL/MUC cannot. This is because AIS/MIA/LEP usually appear as pure GGO and partial solid nodules with different density values, while ACI/PAP/MIP/SOL/MUC generally all appear as solid nodules with high density on CT images. Hence, the Intensity-Skewness, which quantifies tumor density distribution, performs well for AIS/MIA/LEP, but less well for ACI/PAP/MIP/SOL/MUC.
Researchers in the field of radiology and oncology have investigated the radiologic-pathologic association in early-stage lung adenocarcinoma. In 2003, Nomori et al [32] showed that peak position on the intensity histogram could differentiate between atypical adenomatous hyperplasia (AAH) and bronchioloalveolar carcinoma (BAC) on CT (p < 0.001). In 2007, Ikeda et al [33] showed that the 75th percentile of the intensity histogram was a better predictor to differentiate AAH and BAC, with a sensitivity of 0.90 and a specificity of 0.81. In 2016, Ko et al [34] proposed using the solid percentage of a lesion to predict the lung adenocarcinoma histologic subgroups and reported an accuracy of 75.6%. In 2017, a CT radiomics method [35] was proposed to predict lung adenocarcinoma histologic subgroups and reported an AUC of 80.5%. Most recently, CNN classifiers [36] were employed to predict lung nodule invasiveness and reported an AUC of 78.8%.
However, there are several limitations in the above methodologies which hamper their translation into clinical use. First, manual/semi-automatic segmentation was a prerequisite for most of those methods. Manual/semi-automatic segmentation techniques are fraught with high intra-/inter-observer variability [37], which was not fully studied in those cohorts. Second, those methods were developed based on limited data sets which were not necessarily prospectively collected or retrospective data which was not prospectively analyzed. Third, these studies were all conducted at a single institution without validation on external data, and thus without data from different CT scanners or different imaging parameters. Therefore, the generalizability of those methods is very uncertain, as image features (especially high-order histogram and textural image features) might be affected by the selected imaging parameters [19]. Even CNN classifiers, an emerging and promising technique, could fail when tested on external validation data due to their vulnerability to various adversarial attacks in the form of subtle changes to input images [38].
Unlike these prior studies, ours used automated segmentation for biomarker measurement, collected a large amount of consecutive data for training and validation, and incorporated rigorous independent external validation to test generalizability. Moreover, since we used a high-throughput integrated machine-learning analytical system, the previously evaluated imaging features, e.g., histogram peak position, 75th percentile, and solid percentage, were all included in the tested 1164 imaging features. Our results showed that the identified Intensity-Skewness is the optimal feature out of the 1164 imaging features. Furthermore, our radiologic/pathologic correlation confirmed that the Intensity-Skewness is essentially an indicator which reproducibly quantifies density distribution for early-stage lung adenocarcinoma tissue.
In clinical practice, a noninvasive and quantitative density distribution indicator for lung adenocarcinoma could be of great benefit for patients and their physicians as they contemplate treatment options. Our image-based adenocarcinoma density distribution biomarker could provide complementary information to assist a pathologist in determining the histologic subtype. For radiologists, these techniques could greatly aid in further characterization of the subsolid pulmonary nodule on CT and thus aid in lung nodule classification and ultimately patient management [39].
One limitation of our study is that the identified Intensity-Skewness and the corresponding cutoff criterion were learned from the association between CT images and patients’ pathological reports which were determined by pathologists subjectively. We hope to further validate and improve the Intensity-Skewness by studying its association with patients’ 5-year DFS directly which is a different “gold standard.” In fact, one of the challenges associated with DFS is that it is dependent upon surgical technique as well as subsequent therapies for an individual patient. Therefore, in some regards, the pathologic correlation, based upon large, known associations, such as histologic subtype, may be an equally meaningful endpoint.
In conclusion, our study identified a noninvasive and automated QIB which showed excellent prediction performance, high measurement reproducibility, and good external generalizability in predicting DFS-associated histologic subgroups in lung adenocarcinoma. Furthermore, the QIB may serve as a noninvasive and quantitative indicator to characterize tumor density distribution for lung adenocarcinoma, which is potentially valuable for personalized treatment decision-making for lung adenocarcinoma patients.
Key Points.
A noninvasive imaging biomarker, Intensity-Skewness, which described the distortion of pixel-intensity distribution within lesions on CT images, was identified as a biomarker to predict disease-free-survival-associated histologic subgroups in lung adenocarcinoma.
An Intensity-Skewness of ≤ −1.5 has high specificity in predicting the mid/poor disease-free survival histologic patient group in both the training cohort and the external validation cohort.
The Intensity-Skewness is a feature that can be automatically computed with high reproducibility and robustness.
Funding information
This study was supported in part by PT07092001 from the Ministry of Science and Technology of the People’s Republic of China, and in part by U01 CA225431 from the National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding sources.
Abbreviations
- AI: Artificial intelligence
- AUC: Area under the receiver operating characteristic curve
- CCC: Concordance correlation coefficient
- CI: Confidence interval
- DFS: Disease-free survival
- QIB: Quantitative imaging biomarker
- ROI: Region of interest
Footnotes
Electronic supplementary material The online version of this article (https://doi.org/10.1007/s00330-020-06663-6) contains supplementary material, which is available to authorized users.
Conflict of interest The authors of this manuscript declare no relationships with any companies, whose products or services may be related to the subject matter of the article.
Statistics and biometry One of the authors, Lin Lu, has significant statistical expertise. He is now an associate research scientist at the Department of Radiology, Columbia University Medical Center.
Informed consent Written informed consent was not required for this study because this study only used de-identified CTs for only the purpose of image (pixel-based) evaluation.
Ethical approval Institutional Review Board approvals were obtained from Sun Yat-sen University Cancer Center and the Affiliated Hospital of Qingdao University.
Methodology
- retrospective
- observational
- multicenter study
Guarantor The scientific guarantor of this publication is Binsheng Zhao, Professor of Department of Radiology New York Presbyterian Hospital, Columbia University Medical Center.
References
- 1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A (2018) Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 68(6):394–424
- 2. Travis WD (2011) Classification of lung cancer. Semin Roentgenol 46(3):178–186
- 3. Hung JJ, Yeh YC, Jeng WJ et al. (2014) Predictive value of the international association for the study of lung cancer/American Thoracic Society/European Respiratory Society classification of lung adenocarcinoma in tumor recurrence and patient survival. J Clin Oncol 32(22):2357–2364
- 4. Tsao MS, Marguet S, Le Teuff G et al. (2015) Subtype classification of lung adenocarcinoma predicts benefit from adjuvant chemotherapy in patients undergoing complete resection. J Clin Oncol 33(30):3439–3446
- 5. Aerts HJ, Velazquez ER, Leijenaar RT et al. (2014) Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 5:4006
- 6. Bi WL, Hosny A, Schabath MB et al. (2019) Artificial intelligence in cancer imaging: clinical challenges and applications. CA Cancer J Clin 69(2):127–157
- 7. Hawkins S, Wang H, Liu Y et al. (2016) Predicting malignant nodules from screening CT scans. J Thorac Oncol 11(12):2120–2128
- 8. Rios Velazquez E, Parmar C, Liu Y et al. (2017) Somatic mutations drive distinct imaging phenotypes in lung cancer. Cancer Res 77(14):3922–3930
- 9. Coroller TP, Agrawal V, Narayan V et al. (2016) Radiomic phenotype features predict pathological response in non-small cell lung cancer. Radiother Oncol 119(3):480–486
- 10. Coroller TP, Grossmann P, Hou Y et al. (2015) CT-based radiomic signature predicts distant metastasis in lung adenocarcinoma. Radiother Oncol 114(3):345–350
- 11. Sun R, Limkin EJ, Vakalopoulou M et al. (2018) A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: an imaging biomarker, retrospective multicohort study. Lancet Oncol 19(9):1180–1191
- 12. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
- 13. Chen W, Zheng R, Baade PD et al. (2016) Cancer statistics in China, 2015. CA Cancer J Clin 66(2):115–132
- 14. Yoshizawa A, Motoi N, Riely GJ et al. (2011) Impact of proposed IASLC/ATS/ERS classification of lung adenocarcinoma: prognostic subgroups and implications for further revision of staging based on analysis of 514 stage I cases. Mod Pathol 24(5):653–664
- 15. Ujiie H, Kadota K, Chaft JE et al. (2015) Solid predominant histologic subtype in resected stage I lung adenocarcinoma is an independent predictor of early, extrathoracic, multisite recurrence and of poor postrecurrence survival. J Clin Oncol 33(26):2877–2884
- 16. Roth HR, Lu L, Liu J et al. (2016) Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE Trans Med Imaging 35(5):1170–1181
- 17. Badrinarayanan V, Kendall A, Cipolla R (2017) SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481–2495
- 18. Russakovsky O, Deng J, Su H et al. (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
- 19. Zhao B, Tan Y, Tsai WY et al. (2016) Reproducibility of radiomics for deciphering tumor phenotype with imaging. Sci Rep 6:23428
- 20. E L, Lu L, Li L, Yang H, Schwartz LH, Zhao B (2018) Radiomics for classification of lung cancer histological subtypes based on nonenhanced computed tomography. Acad Radiol 26:1245–1252
- 21. Mokrane FZ, Lu L, Vavasseur A et al. (2019) Radiomics machine-learning signature for diagnosis of hepatocellular carcinoma in cirrhotic patients with indeterminate liver nodules. Eur Radiol 30:558–570
- 22. Peng H, Long F, Ding C (2005) Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27(8):1226–1238
- 23. Robnik-Šikonja M, Kononenko I (eds) (1997) An adaptation of Relief for attribute estimation in regression. Machine Learning: Proceedings of the Fourteenth International Conference (ICML’97)
- 24. Liu B, Li S, Wang Y, Lu L, Li Y, Cai Y (2007) Predicting the protein SUMO modification sites based on properties sequential forward selection (PSFS). Biochem Biophys Res Commun 358(1):136–139
- 25. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
- 26. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
- 27. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
- 28. Zhao B, James LP, Moskowitz CS et al. (2009) Evaluating variability in tumor measurements from same-day repeat CT scans of patients with non-small cell lung cancer. Radiology 252(1):263–272
- 29. Yang H, Schwartz LH, Zhao B (2016) A response assessment platform for development and validation of imaging biomarkers in oncology. Tomography 2(4):406–410
- 30. Fedorov A, Beichel R, Kalpathy-Cramer J et al. (2012) 3D slicer as an image computing platform for the quantitative imaging network. Magn Reson Imaging 30(9):1323–1341
- 31. Austin JH, Garg K, Aberle D et al. (2013) Radiologic implications of the 2011 classification of adenocarcinoma of the lung. Radiology 266(1):62–71
- 32. Nomori H, Ohtsuka T, Naruke T, Suemasu K (2003) Differentiating between atypical adenomatous hyperplasia and bronchioloalveolar carcinoma using the computed tomography number histogram. Ann Thorac Surg 76(3):867–871
- 33. Ikeda K, Awai K, Mori T, Kawanaka K, Yamashita Y, Nomori H (2007) Differential diagnosis of ground-glass opacity nodules: CT number analysis by three-dimensional computerized quantification. Chest 132(3):984–990
- 34. Ko JP, Suh J, Ibidapo O et al. (2016) Lung adenocarcinoma: correlation of quantitative CT findings with pathologic findings. Radiology 280(3):931–939
- 35. Yuan M, Zhang YD, Pu XH et al. (2017) Comparison of a radiomic biomarker with volumetric analysis for decoding tumour phenotypes of lung adenocarcinoma with different disease-specific survival. Eur Radiol 27(11):4857–4865
- 36. Zhao W, Yang J, Sun Y et al. (2018) 3D deep learning from CT scans predicts tumor invasiveness of subcentimeter pulmonary adenocarcinomas. Cancer Res 78(24):6881–6889
- 37. Welch ML, McIntosh C, Haibe-Kains B et al. (2019) Vulnerabilities of radiomic signature development: the need for safeguards. Radiother Oncol 130:2–9
- 38. Finlayson SG, Bowers JD, Ito J, Zittrain JL, Beam AL, Kohane IS (2019) Adversarial attacks on medical machine learning. Science 363(6433):1287–1289
- 39. Mets OM, de Jong PA, Chung K, Lammers JJ, van Ginneken B, Schaefer-Prokop CM (2016) Fleischner recommendations for the management of subsolid pulmonary nodules: high awareness but limited conformance - a survey study. Eur Radiol 26(11):3840–3849
