npj Biomedical Innovations. 2025 Jul 4;2:25. doi: 10.1038/s44385-025-00027-9

CT-derived functional imaging biomarkers combined with FEV1 for predicting 10-year all-cause mortality in COPDGene cohort

Girish Nair 1, Yuying Judy Xing 2, Aaron Luong 3, Faiza Bashar 4, Amanda Nowacki 3, Craig Stevens 5, Lili Zhao 6, Edward Castillo 3
PMCID: PMC12227309  PMID: 40620412

Abstract

This study evaluates the predictive power of CT-derived functional imaging (CTFI) combined with forced expiratory volume in 1 second (FEV1) for 10-year all-cause mortality in COPD patients. We analyzed 8583 participants from the COPDGene® cohort, focusing on 3550 participants with spirometric obstruction. CTFI metrics, including ventilation (CT-V) and perfusion (PBM), were computed from non-contrast CT scans at lobar resolution. Our findings show that regional and global CTFI scores decline with advancing GOLD stages. A Random Survival Forest model, adjusted for age, BMI, and scanner type, demonstrated significant improvement in mortality prediction when combining FEV1 with CTFI, compared to FEV1 alone, with an AUC increase from 0.71 to 0.76 over 10 years. The Net Reclassification Index further confirmed the added predictive value of CTFI. These results suggest that integrating CTFI with traditional lung function measures enhances mortality prediction in COPD, offering a promising tool for clinical risk assessment.

Subject terms: Predictive markers, Medical research

Introduction

Chronic obstructive pulmonary disease (COPD) is a heterogeneous, chronic respiratory disease characterized by poorly reversible airflow obstruction1. Early identification of those at high risk of mortality may facilitate earlier intervention. The BODE index, which combines Body mass index (BMI), airflow Obstruction, Dyspnea scale, and Exercise capacity, was originally developed as a mortality prediction model in this population2. Recent modeling studies for all-cause mortality incorporate clinical, spirometric, and CT imaging data and have shown improved predictive ability over a 10-year period3–5. The features used in these studies were selected based on clinician input and on prior studies that demonstrated their potential as mortality predictors. However, recent advances in image analysis have allowed novel functional lung information to be inferred from CT, and this information has not yet been used to improve mortality prediction.

Quantitative CT imaging variables have fundamentally changed the landscape with their ability to detect structural changes in lung parenchyma and airways that precede changes in spirometry6–9. CT-functional imaging (CTFI), a robust image processing-based modality derived from non-contrast inspiratory and expiratory (IE) CT, offers a comprehensive spatial and functional assessment of lung parenchyma. CTFI has yielded accurate and reproducible estimates of CT pulmonary ventilation (CT-V) and pulmonary blood mass changes (PBM), a surrogate for pulmonary perfusion10–12. Still, CT functional data are not routinely incorporated into COPD diagnostics or risk assessment8.

In this study, we determine the ability of CTFI combined with forced expiratory volume in 1 s (FEV1) at baseline to predict mortality in COPD patients with airflow obstruction. We hypothesize that combining functional imaging markers with FEV1 will significantly improve predictive accuracy.

Results

Baseline characteristics of the study population and CTFI global and regional values are shown in Tables 1 and 2. We included 3550 patients with definite spirometric obstruction in the study. The average age of study subjects was 63 years. The population consisted mostly of non-Hispanic White participants and smokers. With increasing GOLD stage, there was a significant trend toward higher supplemental oxygen requirements, worse dyspnea scores, lower BMI, shorter walk distance, and higher St. George’s Respiratory Questionnaire (SGRQ) scores. All-cause mortality was 34.7% and depended on GOLD stage (Fig. 1).

Table 1.

Baseline Demographics of Patients Included in the Study

| Characteristic | All patients (N = 3550) | GOLD 1 (N = 614) | GOLD 2 (N = 1533) | GOLD 3 (N = 931) | GOLD 4 (N = 472) | P-value |
|---|---|---|---|---|---|---|
| Age in years (SD) | 63.3 (8.53) | 62.2 (8.81) | 62.8 (8.77) | 64.4 (8.26) | 63.8 (7.56) | <0.001 |
| Male sex (%) | 55.2 | 55.5 | 52.6 | 57.6 | 58.9 | 0.029 |
| Non-Hispanic White race (%) | 79.9 | 80.8 | 77.7 | 81.0 | 83.7 | 0.020 |
| Height (cm) | 169.9 (9.58) | 169.8 (9.92) | 169.9 (9.50) | 169.8 (9.57) | 169.9 (9.47) | 0.963 |
| BMI (kg/m2) | 27.78 (5.88) | 27.04 (4.93) | 28.79 (5.96) | 27.82 (6.07) | 25.42 (5.57) | <0.001 |
| Supplemental O2 (L) | 3.6 (4.21) | 2.7 (2.74) | 2.8 (2.74) | 3.6 (4.84) | 4.2 (4.32) | 0.001 |
| 6MW distance (feet) | 1244 (403.16) | 1512 (335.27) | 1316 (362.08) | 1114 (369.21) | 893 (347.04) | <0.001 |
| Exacerbation frequency | 0.64 (1.17) | 0.18 (0.63) | 0.55 (1.05) | 0.83 (1.32) | 1.19 (1.47) | <0.001 |
| mMRC dyspnea index | 1.90 (1.46) | 0.79 (1.18) | 1.61 (1.39) | 2.47 (1.24) | 3.14 (0.89) | <0.001 |
| SGRQ total | 36.53 (22.62) | 17.96 (17.73) | 32.79 (21.28) | 45.46 (19.05) | 55.19 (16.01) | <0.001 |
| BODE index | 3.01 (2.58) | 0.60 (1.05) | 1.89 (1.68) | 4.74 (1.79) | 6.70 (1.57) | |
| Active smoker (%) | 97.6 | 95.1 | 91.9 | 86.7 | 94.3 | <0.001 |
| CAD (%) | 9.2 | 8.0 | 9.5 | 10.3 | 7.4 | 0.215 |
| HTN (%) | 48.3 | 37.9 | 51.5 | 53.8 | 40.5 | <0.001 |
| DM (%) | 12.0 | 8.5 | 12.7 | 15.0 | 8.1 | <0.001 |
| CHF (%) | 4.0 | 0.8 | 3.5 | 5.7 | 6.6 | <0.001 |
| Solid Cancer (%) | 6.4 | 5.4 | 6.0 | 7.6 | 6.6 | 0.279 |
| Thromboembolic (%) | 5.3 | 4.6 | 5.9 | 4.9 | 5.1 | 0.583 |
| CVA (%) | 5.7 | 5.0 | 5.9 | 6.6 | 3.8 | 0.168 |

Continuous variables are presented as means with standard deviations and categorical variables are summarized using frequencies and percentages.

BMI Body mass index, 6MW 6 min walk distance, mMRC modified medical research council, CAD coronary artery disease, HTN hypertension, DM Diabetes, CHF congestive heart failure, CVA cerebrovascular accident.

Table 2.

Spirometry and CTFI Variables

| Variable | All patients (N = 3550) | GOLD 1 (N = 614) | GOLD 2 (N = 1533) | GOLD 3 (N = 931) | GOLD 4 (N = 472) | Raw P-value | Adjusted P-value* |
|---|---|---|---|---|---|---|---|
| FEV1 (L) | 1.56 (1.05, 2.14) | 2.62 (2.13, 3.12) | 1.81 (1.50, 2.21) | 1.10 (0.93, 1.34) | 0.64 (0.51, 0.79) | <0.001 | <0.001 |
| FEV1/FVC | 0.55 (0.42, 0.64) | 0.66 (0.63, 0.68) | 0.60 (0.53, 0.65) | 0.43 (0.37, 0.49) | 0.30 (0.26, 0.36) | <0.001 | <0.001 |
| FVC | 2.98 (2.34, 3.72) | 4.08 (3.33, 4.79) | 3.13 (2.57, 3.79) | 2.62 (2.11, 3.21) | 2.12 (1.62, 2.61) | <0.001 | <0.001 |
| CT vendor (%) | | | | | | 0.026 | |
| GE | 37.2 | 39.3 | 38.2 | 35 | 35.8 | | |
| Philips | 5.4 | 3.6 | 4.8 | 6.6 | 7.6 | | |
| Siemens | 57.4 | 57.2 | 57 | 58.4 | 56.6 | | |
| CT-V (L) | | | | | | | |
| Global | 2.00 (1.46, 2.66) | 2.73 (1.99, 3.46) | 2.13 (1.65, 2.71) | 1.76 (1.32, 2.31) | 1.52 (1.11, 1.86) | <0.001 | <0.001 |
| RUL | 0.40 (0.28, 0.55) | 0.49 (0.35, 0.65) | 0.49 (0.35, 0.65) | 0.37 (0.26, 0.50) | 0.30 (0.21, 0.39) | <0.001 | <0.001 |
| RLL | 0.51 (0.35, 0.70) | 0.72 (0.54, 0.94) | 0.54 (0.39, 0.72) | 0.43 (0.29, 0.58) | 0.36 (0.24, 0.48) | <0.001 | <0.001 |
| RML | 0.12 (0.07, 0.18) | 0.17 (0.12, 0.24) | 0.13 (0.08, 0.19) | 0.10 (0.06, 0.15) | 0.08 (0.05, 0.12) | <0.001 | <0.001 |
| LUL | 0.46 (0.33, 0.63) | 0.59 (0.43, 0.78) | 0.49 (0.36, 0.66) | 0.42 (0.31, 0.56) | 0.34 (0.24, 0.46) | <0.001 | <0.001 |
| LLL | 0.48 (0.32, 0.67) | 0.69 (0.49, 0.90) | 0.51 (0.36, 0.69) | 0.42 (0.28, 0.57) | 0.34 (0.23, 0.49) | <0.001 | <0.001 |
| PBM | | | | | | | |
| Global | 68.2 (39.5, 104.9) | 95.6 (61.11, 130.73) | 73.8 (45.90, 110.54) | 57.5 (32.78, 90.10) | 46.9 (28.70, 69.54) | <0.001 | <0.001 |
| RUL | 12.1 (5.77, 20.27) | 15.3 (6.42, 24.29) | 13.4 (6.41, 21.17) | 11.1 (5.28, 19.37) | 8.92 (4.89, 14.95) | <0.001 | <0.001 |
| RLL | 20.1 (11.07, 31.23) | 30.4 (19.59, 41.15) | 22.8 (13.46, 32.62) | 15.9 (8.74, 24.73) | 13.1 (6.70, 19.70) | <0.001 | 0.001 |
| RML | 3.51 (1.56, 6.40) | 3.8 (1.93, 7.16) | 3.58 (1.57, 6.42) | 3.45 (1.51, 6.27) | 2.95 (1.20, 5.54) | <0.001 | <0.001 |
| LUL | 13.9 (6.71, 22.92) | 16.3 (7.62, 26.18) | 15.0 (7.51, 24.49) | 13.1 (6.06, 21.34) | 10.6 (5.48, 17.15) | <0.001 | <0.001 |
| LLL | 9.2 (5.00, 10.60) | 10.1 (7.40, 11.00) | 9.7 (5.70, 10.70) | 8.0 (4.80, 10.20) | 5.1 (2.70, 8.50) | <0.001 | <0.001 |

Values are medians (IQR). FEV1, FVC, and CT-V are in liters; FEV1/FVC is a ratio; PBM is in gm/L.

RUL right upper lobe, RML right middle lobe, RLL right lower lobe, LUL left upper lobe, LLL left lower lobe.

*P-values were adjusted using the Bonferroni method to account for multiple comparisons. Adjustments were performed separately for the Spirometry, CT-V, and PBM variable groups.
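
As an illustration of the Bonferroni adjustment described in the footnote above, the following minimal Python sketch multiplies each raw p-value by the number of comparisons in its group and caps the result at 1. The function name and the example p-values are hypothetical and are not taken from the study, whose analysis was run in R.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values for one group of comparisons."""
    m = len(p_values)  # number of comparisons in this group
    # Multiply each p-value by m and cap at 1.0, since a probability cannot exceed 1.
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for one variable group (not study data).
raw = [0.0004, 0.02, 0.3, 0.6]
adjusted = bonferroni_adjust(raw)
print(adjusted)
```

Adjusting each variable group separately, as the footnote describes, corresponds to calling the function once per group so that m reflects only that group's comparisons.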

Fig. 1.

Survival probability from baseline, stratified by GOLD stage. At six years, survival probabilities for GOLD I–IV are 0.92 (0.90, 0.94), 0.86 (0.84, 0.88), 0.75 (0.72, 0.78), and 0.49 (0.44, 0.54), respectively.

Correlation between CTFI and FEV1 to other predictors of COPD mortality

We noted good correlation between the CTFI parameters (CT-V and PBM) and FEV1. Spearman correlation coefficients (denoted by r) showed positive correlations between FEV1 and global CT-V (r = 0.60) and between FEV1 and PBM (r = 0.41).

FEV1 correlated more strongly than CTFI with other COPD mortality predictors. Distance walked in 6 minutes was positively correlated with FEV1, CT-V, and PBM (r = 0.53, 0.45, and 0.34, respectively). The BODE index showed a high negative correlation with FEV1 but only a moderate to low correlation with CT-V and PBM (r = −0.76, −0.46, and −0.34, respectively). Similarly, SGRQ and exacerbation frequency were negatively correlated with FEV1, CT-V, and PBM (SGRQ: r = −0.49, −0.33, and −0.23; exacerbation frequency: r = −0.29, −0.16, and −0.10, respectively).
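
For readers unfamiliar with the Spearman coefficient used in this section, the sketch below computes it in plain Python as the Pearson correlation of average ranks, which handles ties. This is only an illustration: the function names and the toy values are hypothetical, and the study's own correlations were computed in R.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values (not study data): a monotonic but non-linear relationship.
fev1 = [0.6, 1.1, 1.8, 2.6]
ct_v = [1.5, 1.7, 2.1, 2.8]
print(spearman(fev1, ct_v))
```

Because the coefficient depends only on rank order, it captures monotonic association without assuming linearity.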

Relationship of the CTFI scores with GOLD stage

Global CT-V scores in liters (L) were lower with increasing GOLD stage [GOLD I: 2.73 (1.99, 3.46), GOLD II: 2.13 (1.65, 2.71), GOLD III: 1.76 (1.32, 2.31), GOLD IV: 1.52 (1.11, 1.86)]. Similarly, PBM values in gm/mm3 were lower with advancing COPD stage [GOLD I: 95.60 (61.11, 130.73), GOLD II: 73.80 (45.90, 110.54), GOLD III: 57.55 (32.78, 90.10), GOLD IV: 46.89 (28.70, 69.54)]. There were significant differences in regional ventilation (p < 0.001) and blood mass changes (p < 0.001) across the GOLD stages, with the right middle lobe having the least ventilation and PBM across all stages (Table 2), which is consistent with known physiology13,14.

Model for mortality prediction on longitudinal follow-up

To test the predictive ability of baseline functional information derived from CTFI in addition to lung function, we built an RSF model as detailed in the Methods. The RSF model, using 5-fold cross-validation, showed significant improvement in AUC for FEV1 + CTFI compared with a model with FEV1 alone (Fig. 2). At year 2, the AUC for FEV1 was 0.678 compared with 0.704 for FEV1 + CTFI. This trend was similar for all subsequent years. The AUC for FEV1 alone peaked at 0.692 at years 5 and 7, whereas the model with FEV1 + CTFI showed an AUC over 0.73 from year 6 onwards (Fig. 2).

Fig. 2.

AUC of the Random Survival Forest model for CTFI + FEV1, without and with the inclusion of age, BMI, and scanner type.

We noted that age, BMI, and scanner type (Siemens, GE, or Philips) were other significant features on variable importance selection. We therefore developed a second RSF prediction model including age, BMI, and scanner type. The AUC for FEV1 + CTFI was higher than that of the model with FEV1 alone in all years. The RSF model with functional imaging variables and FEV1 obtained at baseline showed significant discriminative capacity for mortality prediction from year 2 onwards, and the trend continued for subsequent years of follow-up. AUC was highest at years 9 and 10 (0.757 and 0.755, respectively). This trend held true when restricting the model to a specific scanner type. For the GE scanner, AUC values for the RSF model with CTFI and FEV1 remained stable across time points. Similarly, the RSF model using CTFI and FEV1 from Siemens scanner data exhibited comparable AUC values, with minor fluctuations but overall consistent performance over time (Fig. 3).

Fig. 3. Violin plots for CT-V (top) and PBM regional changes (bottom) with increasing GOLD stage by lobe.

The violin plot displays a rotated kernel density plot on each side and a box plot in the middle, which visualizes the distribution and summary statistics of the data.

The Net Reclassification Index (NRI) quantifies how much a model with imaging variables improves the correct reclassification of individuals as deceased, compared with the model without imaging variables. The NRI for the prediction model with CTFI, FEV1, age, BMI, and scanner type showed significant improvement in mortality prediction over the model without CTFI (Fig. 4).

Fig. 4.

Net Reclassification Index for Random Survival Forest prediction model CTFI + FEV1 with age, BMI, and scanner type showed significant improvement in the mortality prediction over the model without CTFI.

Survival probability of the cohort on longitudinal follow-up

Survival probability decreases with increasing GOLD stage. Year 6 probabilities are as follows: GOLD I: 0.92 (0.90, 0.94), GOLD II: 0.86 (0.84, 0.88), GOLD III: 0.75 (0.72, 0.78), GOLD IV: 0.49 (0.44, 0.54). Similarly, survival probability decreases with increasing BODE index quantiles. At year 6, the probabilities are as follows: Quantile 1: 0.92 (0.90, 0.93), Quantile 2: 0.86 (0.84, 0.88), Quantile 3: 0.78 (0.74, 0.81), Quantile 4: 0.53 (0.49, 0.57). Survival probability also decreases with increasing quantiles of the RSF model, which includes age, BMI, scanner type, and FEV1 + CTFI. At year 6, the probabilities are as follows: Quantile 1: 0.92 (0.91, 0.94), Quantile 2: 0.89 (0.87, 0.91), Quantile 3: 0.79 (0.77, 0.82), Quantile 4: 0.56 (0.53, 0.60).

Thus, quantiles of both the BODE index and the RSF model with CTFI and FEV1 show survival probabilities similar to those of advancing GOLD stages. The RSF model including imaging and lung function obtained at baseline has good discriminative capacity for mortality prediction from year 2 onwards, with an increasing trend seen up to 10 years of follow-up.
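
The quantile grouping used in this section can be sketched as follows. This is a minimal Python illustration that assumes the four groups are formed by splitting model-predicted risk scores at their quartile cut points; the paper does not specify the exact quantile method, and the function name and scores are hypothetical.

```python
import bisect
import statistics


def quartile_groups(risk_scores):
    """Assign each risk score to quantile group 1 (lowest risk) to 4 (highest)."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    cuts = statistics.quantiles(risk_scores, n=4)
    # bisect_right counts how many cut points lie at or below each score.
    return [bisect.bisect_right(cuts, r) + 1 for r in risk_scores]


# Hypothetical predicted mortality risks (not study data).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
groups = quartile_groups(scores)
print(groups)
```

Each group can then be passed to a survival estimator to obtain a per-quantile survival curve.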

Discussion

The COPDGene® study is a multicenter cohort study of current and former smokers with at least a 10–pack-year smoking history enrolled at 21 centers across the United States15,16. We combined CT-derived lung function parameters with FEV1 to develop a robust RSF model for all-cause mortality in this population. The main findings of our study are the following: 1) CTFI scores (both global and regional) are worse at more advanced GOLD stages, 2) FEV1 is moderately correlated with CT-V and PBM, and 3) a model combining FEV1 and CTFI obtained at baseline provides important additional information on mortality in this cohort.

In our study, we examined the information gained from CT functional imaging at baseline, in addition to FEV1 at baseline, on long-term mortality in patients with COPD and spirometric obstruction. Prior research has focused on using known parameters such as degree of obstruction, walk distance, exacerbation frequency, and dyspnea score to help with COPD mortality prediction. Multidimensional methods such as ADO (age, dyspnea, airflow obstruction), COTE (COPD-specific comorbidity test), DOSE (dyspnea, airflow obstruction, smoking, exacerbations), and CODEX (comorbidity, dyspnea, airflow obstruction, exacerbations) provide good discrimination of short-term mortality in COPD patients17,18. Combining these predictive models improved the discriminative ability at 1 year (c-statistic 0.780 ADO + COTE; 0.727 DOSE + COTE)18. In the PROSPERO study, a large meta-analysis including 42 studies, multicomponent prognostic models showed only moderate discriminative ability, and the factors related to mortality were previous hospitalization for acute exacerbation, readmission within 30 days, cardiovascular comorbidity, age, male sex, and long-term oxygen therapy19. Our approach of using baseline FEV1 and CTFI showed both short-term and long-term mortality discrimination without the need for the other historical measures required by the ADO or DOSE algorithms.

Long-term COPD mortality is worse in patients over age 70 and in those with cardiovascular comorbidities, a history of diabetes, worse dyspnea scores, or FEV1 < 50% predicted20. More recent studies incorporating machine learning (ML) algorithms for all-cause mortality included spirometric and CT imaging data3–5. The features with the highest impact on COPD mortality risk were FEV1, 6-minute walk distance, and age4. CT imaging data used in those studies came from radiographic inputs. In contrast, we first used a physical model to derive lung function from imaging, so our approach does not rely on radiographic features. Furthermore, we prioritized RSF due to its strength in handling censored survival data and its interpretability when integrating both imaging and non-imaging variables. While RSF is simpler than deep learning models, it has proven effective for survival prediction in COPD. We used a minimalistic model with good discriminative capacity starting from year 2 that is comparable to other models for short-term COPD mortality. We adjusted for age, BMI, and scanner type based on the variable importance ranking. The RSF model can handle complex non-linear relationships and multicollinearity. Our results are robust and, as shown by the NRI, CTFI contributes significantly to predicting death in the COPD cohort.

Detailed spatial information and function derived from CT scans have been well studied in patients with COPD8. These studies were plagued by reproducibility and standardization issues due to differences in patient effort, lack of spirometric gating, CT manufacturers, lung segmentation algorithms, and reconstruction kernels8. CTFI uses the Integrated Jacobian Formulation (IJF) method for CT ventilation, which has previously been shown to have inherent stability and high reproducibility21. As such, a lack of patient effort, which appears to be a major issue with other methods of CT-derived lung function, should have minimal impact. Others have used densitometry [Hounsfield units (HU)] to determine lung function22,23. Parametric response mapping (PRM) uses coregistered inhalation and exhalation images to determine emphysema (<−950 HU on inhalation and <−856 HU on exhalation) and small airway disease (<−856 HU on exhalation and >−950 HU on inhalation CT) and has shown good correlation with lung function9. These methods are also subject to the issues described above, whereas CTFI provides lung function (volume and blood mass changes) to quantify disease globally and regionally with good reproducibility10,11,24. In comparison with the BODE index divided into quantiles, our model shows similar survival probabilities in advanced COPD. Despite similar survival probabilities in the studied population, the model with CTFI and FEV1 has good mortality discrimination from year 2 onwards.

We noted there was a significant difference in regional ventilation and PBM changes with advancing GOLD stages. Interestingly, PBM, a surrogate for pulmonary perfusion, is only moderately correlated with FEV1 but shows strong regional differences with advancing COPD stages. Physiologically, this could be related to worsening emphysema or pulmonary hypertension seen in patients with advanced GOLD stages.

Our study has a few limitations. First, although CTFI is robust, it may encounter problems similar to those affecting other quantitative CT imaging methods, such as lung segmentation accuracy, CT acquisition, and standardization. Our objective in this study was to determine whether additional information obtained from CTFI combined with FEV1 could strengthen mortality prediction. Study results indicate it does, but we plan to conduct future studies measuring ventilation-perfusion mismatch scores obtained from CTFI, which could potentially be better at separately identifying normally functioning and diseased lung. Second, we only included patients with obstructive spirometry from the COPDGene® cohort. An immediate area of future research will be to see whether the results are applicable to those with preserved ratio and impaired spirometry. Third, we did not include radiographic features obtained from CT images, such as pulmonary artery size (an indicator of pulmonary hypertension) or coronary artery calcium scoring, nor did we include small airway thickness, vessel segmentation, or the presence of mucus plugs. Because the purpose of this study was to gain insight into how much CT-derived lung function contributes to COPD mortality prediction, our model avoided radiologist readings. Future development could include a model incorporating a physical model obtained from CTFI and Artificial Intelligence derived parameters from IE-CT scans. Similarly, for this study we did not include dyspnea score, need for oxygen, or walk distance, as they were lower in the variable selection ranking. Moreover, the results of this study should be further validated in other COPD cohorts, such as SPIROMICS. Lastly, the CTFI measurements used in this study were average values taken over whole lung and lobe volumes. Another area of future work will be to develop modeling methods capable of using the full 3D spatial distribution of CTFI values when making mortality predictions.

In summary, to investigate mortality prediction for COPD patients, baseline clinical metrics, such as FEV1, were combined with average lobe and whole lung CTFI values. FEV1 correlated moderately with CT-V and PBM, and correlated more strongly than either CTFI measure with other COPD mortality predictors. Further, there were statistically significant differences in regional ventilation and blood mass changes across the GOLD stages, and we demonstrated that incorporating values derived from CTFI into predictive modeling improves mortality prediction for patients with COPD. Average lobe and whole lung CTFI values inherently provide less detail to a predictive model than the CTFI they are derived from. Thus, these promising findings suggest that future predictive models based on FEV1 and full resolution CTFI data (Fig. 5) could provide insights into disease progression and inform treatment decisions.

Fig. 5. PBM imaging in patients with increasing GOLD stages. Red areas show higher CT-perfusion and blue regions show low perfusion.

The global PBM decreases with advancing GOLD stages, but the regional distribution of PBM is characteristically different.

Methods

We performed a retrospective, longitudinal analysis of 8583 patients from the COPDGene® Project (www.clinicaltrials.gov [NCT00608764]) stratified according to severity of obstruction. The study was approved as ancillary study ANC 475 by COPDGene®.

All participants were coached to full inspiration and end expiration to obtain volumetric computed tomographic scans without spirometric gating25,26. In this study, we included only patients with spirometric obstruction (FEV1/FVC < 0.70), categorized by GOLD stage (Fig. 6), to avoid potential bias in interpretation from other parenchymal lung diseases.
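
To make the inclusion rule concrete, here is a small Python sketch of spirometric staging. It assumes the widely used GOLD grading thresholds for FEV1 percent predicted (at least 80, 50 to 79, 30 to 49, and below 30), which the paper does not spell out; the function name and inputs are hypothetical.

```python
def gold_stage(fev1_fvc_ratio, fev1_percent_predicted):
    """Return GOLD stage 1-4, or None when there is no spirometric obstruction.

    Assumption: thresholds follow the commonly cited GOLD spirometric grades;
    the source text only states the FEV1/FVC obstruction criterion.
    """
    if fev1_fvc_ratio >= 0.70:
        return None  # no obstruction, so the participant would be excluded
    if fev1_percent_predicted >= 80:
        return 1
    if fev1_percent_predicted >= 50:
        return 2
    if fev1_percent_predicted >= 30:
        return 3
    return 4


# Hypothetical participants: (FEV1/FVC ratio, FEV1 % predicted).
participants = [(0.66, 85.0), (0.60, 62.0), (0.43, 41.0), (0.30, 24.0), (0.78, 95.0)]
stages = [gold_stage(ratio, pct) for ratio, pct in participants]
print(stages)
```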

Fig. 6.

CONSORT diagram explaining inclusion and exclusion.

CT-V and PBM are image processing-based modalities that recover changes in local tissue volume (a ventilation surrogate) and the magnitude of pulmonary blood mass change (a perfusion surrogate), induced by respiratory motion, from an IE-CT scan. CT-V uses the Integrated Jacobian Formulation (IJF) method, which calculates volume changes with Monte Carlo techniques at quantifiable and controllable levels of uncertainty in the image processing pipeline. This allows for robust, reproducible ventilation calculations that correlate well with lung function parameters10,21,27,28. PBM leverages HU estimates of lung density and the robust CT-V measured volume change to compute the magnitude of mass change between the IE-CT scans as a surrogate for perfusion10,11. For each patient, we estimated CT-V and PBM at the voxel level and averaged them over each lobe and over the whole lung. For each lobe, we measured average CT-V in mm3 (liters) and average PBM in gm/mm3. This results in a total of 10 lobar values for each patient, plus average global (i.e., averaged over the whole lung volume) CT-V and PBM values. Software used for the study was written in MATLAB (release R2024b, The Mathworks Inc, Natick, Massachusetts, United States). Average computation runtime for one patient was 10 minutes using a Dell Precision Laptop with an Intel Core i7-6920HQ CPU and an Nvidia Quadro M5000 graphics processing unit (GPU).
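
The lobar and global averaging step described above can be sketched in a few lines of Python. This is an illustration only, not the study's MATLAB pipeline: it assumes each lung voxel already has a CTFI value and a lobe label from a prior segmentation, and the function name and toy values are hypothetical.

```python
from collections import defaultdict

LOBES = ("RUL", "RML", "RLL", "LUL", "LLL")


def lobar_and_global_means(voxel_values, voxel_lobes):
    """Average per-voxel values over each lobe and over the whole lung."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, lobe in zip(voxel_values, voxel_lobes):
        sums[lobe] += value
        counts[lobe] += 1
    # Lobar means, reported only for lobes that contain at least one voxel.
    lobar = {lobe: sums[lobe] / counts[lobe] for lobe in LOBES if counts.get(lobe, 0) > 0}
    # Global mean over every labelled lung voxel.
    global_mean = sum(sums.values()) / sum(counts.values())
    return lobar, global_mean


# Toy example with four voxels in two lobes (not study data).
values = [1.0, 3.0, 2.0, 4.0]
labels = ["RUL", "RUL", "LLL", "LLL"]
lobar, global_mean = lobar_and_global_means(values, labels)
print(lobar, global_mean)
```

Running this once for CT-V and once for PBM yields the five lobar values plus one global value per modality described in the text.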

Study Analysis

Descriptive analysis was used to summarize the patient characteristics. Continuous variables are presented as means with standard deviations and medians with interquartile ranges, while categorical variables are summarized using frequencies and percentages. To compare continuous variables among groups, we employed the ANOVA test while the Chi-square test was used for categorical variables. To assess the relationships between imaging and FEV1 variables, we calculated the Spearman correlation coefficient. To visualize the long-term survival probability of subjects over a span of 10 years, Kaplan-Meier plots were generated, stratified by four Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages. The primary outcome was time-to-death from any cause.
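
As a reference for the Kaplan-Meier curves mentioned above, the following Python sketch implements the product-limit estimator for right-censored data. It is a minimal illustration under the usual assumption of non-informative censoring, with hypothetical function and variable names; the study itself generated these plots in R.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    times: follow-up time for each subject.
    events: 1 if the subject died at that time, 0 if censored.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy follow-up data in years (not study data); the second subject is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Stratifying by GOLD stage amounts to calling the function separately on each stage's subjects.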

We used Random Survival Forests (RSF) with default parameters to compare the predictive ability of models with and without the imaging variables29. RSF can model complex non-linear relationships in the data, which traditional linear regression models struggle to capture. Additionally, RSF can handle data with a large number of predictors and automatically model complex interactions between these predictors. RSF also provides a ranking of variable importance, helping to identify which variables contribute most to the prediction. Standard Random Forest is designed for classification and regression and does not account for censoring, which is why RSF was preferred. Cox proportional hazards models were initially considered, but RSF was preferred because it automatically captures nonlinear interactions between variables, which Cox models may fail to capture.

To build a robust prediction model, we first selected important clinical, spirometric, and CTFI features based on univariate Cox regression analysis and RSF-derived variable importance (Supplemental Table 1) and then used these selected features as inputs to an RSF to predict mortality probability. Comorbidities closely related to smoking habits, such as high blood pressure, congestive heart failure, and coronary artery disease, did not greatly affect the model output. Time-dependent receiver operating characteristic (ROC) curves and area under the curve (AUC) values were used to assess model performance based on five-fold cross-validation30. Specifically, we compared models using FEV1 alone with those including additional imaging variables. To visualize the prediction performance of the RSF model with imaging variables, we used the model predictions to generate Kaplan-Meier plots stratified by four quantiles and used the log-rank test to assess whether survival differed significantly among the four quantile groups.
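
To illustrate what an AUC at a given follow-up year measures, the sketch below computes a simplified cumulative/dynamic AUC in Python. It is a deliberately naive version: subjects censored before the horizon are dropped rather than reweighted, so it is not necessarily the estimator used in the study, and the function name and data are hypothetical.

```python
def auc_at_horizon(risk, time, event, horizon):
    """Naive time-dependent AUC at a fixed horizon.

    Cases: subjects who died at or before the horizon.
    Controls: subjects still under follow-up after the horizon.
    Assumption: subjects censored before the horizon are excluded
    (no inverse-probability-of-censoring weights are applied).
    """
    cases = [r for r, t, e in zip(risk, time, event) if e == 1 and t <= horizon]
    controls = [r for r, t, e in zip(risk, time, event) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0   # case correctly ranked as higher risk
            elif case_risk == control_risk:
                concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and follow-up (years); not study data.
risk = [0.9, 0.1, 0.5, 0.3]
time = [1, 2, 8, 9]
event = [1, 1, 0, 0]
auc_year5 = auc_at_horizon(risk, time, event, horizon=5)
print(auc_year5)
```

In a five-fold cross-validation, a value like this would be computed on each held-out fold and then summarized across folds.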

To quantify the improvement in reclassification of individuals into deceased or surviving categories by the inclusion of imaging variables, we calculated the Net Reclassification Improvement (NRI) metric. Positive NRI values indicate enhanced reclassification, suggesting that the model including imaging variables is more accurate in identifying deceased and surviving individuals. We report the NRI value, 95% confidence interval, and p-value at each time point.
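
The idea behind the NRI can be shown with a short Python sketch of the category-free (continuous) variant: among those who died, it rewards upward moves in predicted risk, and among survivors, it rewards downward moves. The paper does not state which NRI variant or censoring adjustment was used, so this is an assumption, and the names and numbers below are hypothetical.

```python
def continuous_nri(p_old, p_new, died):
    """Category-free NRI comparing a new model's risks with an old model's.

    NRI = [P(up | died) - P(down | died)] + [P(down | survived) - P(up | survived)]
    """
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for old, new, d in zip(p_old, p_new, died):
        if d:
            n_event += 1
            up_event += new > old
            down_event += new < old
        else:
            n_nonevent += 1
            up_nonevent += new > old
            down_nonevent += new < old
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one death and one survivor")
    nri_events = (up_event - down_event) / n_event
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevent
    return nri_events + nri_nonevents


# Hypothetical predicted risks without and with imaging variables (not study data).
p_without = [0.5, 0.5, 0.5, 0.5]
p_with = [0.7, 0.6, 0.3, 0.6]
died = [1, 1, 0, 0]
nri = continuous_nri(p_without, p_with, died)
print(nri)
```

A positive value means the new model moved predicted risks in the correct direction more often than in the wrong one.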

All statistical tests were two-sided, and statistical significance was determined with a threshold of p-value < 0.05. The entire analysis was conducted using R-4.2.1, provided by the R Foundation for Statistical Computing.

Acknowledgements

Funding support provided by NIH/NHLBI award RO1HL169869 CT-Derived Functional Imaging for Predicting Disease Progression in COPD (PIs: Castillo, Nair, Zhao). The COPDGene® study is funded by National Heart, Lung, and Blood Institute grants U01 HL089897 and U01HL089856. The COPDGene® study (NCT00608764) is also supported by the COPD Foundation through contributions made to an Industry Advisory Committee comprised of AstraZeneca, Boehringer-Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer, Siemens, and Sunovion.

Abbreviation list

COPD: Chronic obstructive pulmonary disease
CTFI: CT-derived functional imaging
CT-V: CT-ventilation
FEV1: Forced expiratory volume in 1 s
GOLD stages: Global Initiative for Chronic Obstructive Lung Disease stages
IE CT scan: Inhalation-Exhalation CT scan
PBM: Pulmonary blood mass change
RSF: Random Survival Forests

Author contributions

G.N. wrote the initial draft of the manuscript and contributed to the statistical analysis. Y.X. and L.Z. conducted the main statistical analysis. A.L., F.B., and A.N. aided in data processing and writing. C.S. contributed to writing. E.C. contributed to writing and conducted the image processing.

Data availability

The imaging data used in this study are from the COPDGene® Project (www.clinicaltrials.gov [NCT00608764]). Access to the data is provided through: https://copdgene.org. The data that support the findings of this study are available from the corresponding author, Edward Castillo, upon reasonable request.

Competing interests

Edward Castillo reports financial support was provided by National Heart Lung and Blood Institute. Girish Nair reports financial support was provided by National Heart Lung and Blood Institute. Lili Zhao reports financial support was provided by National Heart Lung and Blood Institute. Edward Castillo reports a relationship with 4D Medicine Ltd that includes: consulting or advisory and funding grants. Edward Castillo has patent #10932744 licensed to 4D Medical LTD. All other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

The online version contains supplementary material available at 10.1038/s44385-025-00027-9.

References

  • 1.Han, M. K. et al. From GOLD 0 to Pre-COPD. Am. J. Respiratory Crit. Care Med.203, 414–423, 10.1164/rccm.202008-3328PP (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Celli Bartolome, R. et al. The Body-Mass Index, Airflow Obstruction, Dyspnea, and Exercise Capacity Index in Chronic Obstructive Pulmonary Disease. New Engl. J. Med.350, 1005–1012. 10.1056/NEJMoa021322. [DOI] [PubMed]
  • 3.Lowe, K. E. et al. COPDGene(®) 2019: Redefining the Diagnosis of Chronic Obstructive Pulmonary Disease. Chronic Obstr. Pulm. Dis.6, 384–399, 10.15326/jcopdf.6.5.2019.0149 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Strand, M. et al. A risk prediction model for mortality among smokers in the COPDGene® study. Chronic Obstr. Pulm. Dis.: J. COPD Found.7, 346 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Moll, M. et al. Machine Learning and Prediction of All-Cause Mortality in COPD. Chest158, 952–964, 10.1016/j.chest.2020.02.079 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Regan, E. A. et al. Clinical and Radiologic Disease in Smokers With Normal Spirometry. JAMA Intern. Med.175, 1539–1549, 10.1001/jamainternmed.2015.2735 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Oh, A. S. et al. Emphysema Progression at CT by Deep Learning Predicts Functional Impairment and Mortality: Results from the COPDGene Study. Radiology304, 672–679, 10.1148/radiol.213054 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Bakker, J. T., Klooster, K., Vliegenthart, R. & Slebos, D.-J. Measuring pulmonary function in COPD using quantitative chest computed tomography analysis. Eur. Respiratory Rev.30, 210031, 10.1183/16000617.0031-2021 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Galban, C. J. et al. Computed tomography-based biomarker provides unique signature for diagnosis of COPD phenotypes and disease progression. Nat. Med.18, 1711–1715. http://www.nature.com/nm/journal/v18/n11/abs/nm.2971.html#supplementary-information (2012). [DOI] [PMC free article] [PubMed]
  • 10. Castillo, E. et al. Robust CT ventilation from the integral formulation of the Jacobian. Med. Phys. 45, 2115–2125, 10.1002/mp.13453 (2019).
  • 11. Castillo, E. et al. Quantifying pulmonary perfusion from noncontrast computed tomography. Med. Phys. 48, 1804–1814, 10.1002/mp.14792 (2021).
  • 12. Myziuk, N. et al. Pulmonary blood mass dynamics on 4DCT during tidal breathing. Phys. Med. Biol. 64, 045014, 10.1088/1361-6560/aaff7b (2019).
  • 13. Terry, P. B. & Traystman, R. J. The Clinical Significance of Collateral Ventilation. Ann. Am. Thorac. Soc. 13, 2251–2257, 10.1513/AnnalsATS.201606-448FR (2016).
  • 14. Yamada, Y. et al. Differences in Lung and Lobe Volumes between Supine and Standing Positions Scanned with Conventional and Newly Developed 320-Detector-Row Upright CT: Intra-Individual Comparison. Respiration 99, 598–605, 10.1159/000507265 (2020).
  • 15. Bhatt, S. P. et al. Imaging Advances in Chronic Obstructive Pulmonary Disease. Insights from the Genetic Epidemiology of Chronic Obstructive Pulmonary Disease (COPDGene) Study. Am. J. Respiratory Crit. Care Med. 199, 286–301, 10.1164/rccm.201807-1351SO (2018).
  • 16. Lynch, D. A. et al. CT-Definable Subtypes of Chronic Obstructive Pulmonary Disease: A Statement of the Fleischner Society. Radiology 277, 192–205, 10.1148/radiol.2015141579 (2015).
  • 17. Corlateanu, A. et al. Multidimensional indices in the assessment of chronic obstructive pulmonary disease. Respir. Med. 185, 106519, 10.1016/j.rmed.2021.106519 (2021).
  • 18. Morales, D. R. et al. External validation of ADO, DOSE, COTE and CODEX at predicting death in primary care patients with COPD using standard and machine learning approaches. Respiratory Med. 138, 150–155, 10.1016/j.rmed.2018.04.003 (2018).
  • 19. Owusuaa, C., Dijkland, S. A., Nieboer, D., van der Rijt, C. C. D. & van der Heide, A. Predictors of mortality in chronic obstructive pulmonary disease: a systematic review and meta-analysis. BMC Pulm. Med. 22, 125, 10.1186/s12890-022-01911-5 (2022).
  • 20. Gagatek, S. et al. Validation of Clinical COPD Phenotypes for Prognosis of Long-Term Mortality in Swedish and Dutch Cohorts. COPD: J. Chronic Obstr. Pulm. Dis. 19, 330–338, 10.1080/15412555.2022.2039608 (2022).
  • 21. Castillo, E. et al. Technical Note: On the Spatial Correlation Between Robust CT-Ventilation Methods and SPECT Ventilation. Med. Phys. 47, 5731–5738, 10.1002/mp.14511 (2020).
  • 22. Mohamed Hoesein, F. A. A. et al. Contribution of CT Quantified Emphysema, Air Trapping and Airway Wall Thickness on Pulmonary Function in Male Smokers With and Without COPD. COPD: J. Chronic Obstr. Pulm. Dis. 11, 503–509, 10.3109/15412555.2014.933952 (2014).
  • 23. Schroeder, J. D. et al. Relationships Between Airflow Obstruction and Quantitative CT Measurements of Emphysema, Air Trapping, and Airways in Subjects With and Without Chronic Obstructive Pulmonary Disease. Am. J. Roentgenol. 201, W460–W470, 10.2214/AJR.12.10102 (2013).
  • 24. Castillo, E., Vinogradskiy, Y. & Castillo, R. Robust HU-Based CT-Ventilation from an Integrated Mass Conservation Formulation. Med. Phys. 46, 5036–5046 (2019).
  • 25. Regan, E. A. et al. Genetic Epidemiology of COPD (COPDGene) Study Design. COPD: J. Chronic Obstr. Pulm. Dis. 7, 32–43, 10.3109/15412550903499522 (2011).
  • 26. Mannino, D. M. GOLD Stage 0 COPD: Is it Real? Does it Matter? Chest 130, 309–310, 10.1016/S0012-3692(15)51839-4 (2006).
  • 27. Brennan, D. et al. Clinical Validation of 4-Dimensional Computed Tomography Ventilation With Pulmonary Function Test Data. Int. J. Radiat. Oncol. Biol. Phys. 92, 423–429, 10.1016/j.ijrobp.2015.01.019 (2015).
  • 28. Nair, G. B. et al. An assessment of the correlation between robust CT-derived ventilation and pulmonary function test in a cohort with no respiratory symptoms. Br. J. Radiol. 94, 20201218, 10.1259/bjr.20201218 (2020).
  • 29. Ishwaran, H., Kogalur, U. B., Blackstone, E. H. & Lauer, M. S. Random survival forests. Ann. Appl. Stat. 2, 841–860, 10.1214/08-AOAS169 (2008).
  • 30. Blanche, P., Dartigues, J.-F. & Jacqmin-Gadda, H. Estimating and comparing time-dependent areas under receiver operating characteristic curves for censored event times with competing risks. Stat. Med. 32, 5381–5397, 10.1002/sim.5958 (2013).

Data Availability Statement

The imaging data used in this study are from the COPDGene® Project (www.clinicaltrials.gov [NCT00608764]). Access to the data is provided through https://copdgene.org. The data that support the findings of this study are available from the corresponding author, Edward Castillo, upon reasonable request.


Articles from npj Biomedical Innovations are provided here courtesy of Nature Publishing Group
