European Journal of Medical Research. 2025 Dec 27;31:161. doi: 10.1186/s40001-025-03601-4

Construction of a severity prediction model for hospitalized patients with acute exacerbation of chronic obstructive pulmonary disease based on machine learning

Zian Liu 1,#, Shiyuan Gao 1,#, Zhe Ye 3, Qiong Pan 1, Yiwen Huang 1, Jiahui Yuan 1, Fengmei Li 2, Yixin Lian 1, Chen Geng 2
PMCID: PMC12853751  PMID: 41456012

Abstract

Background:

Chronic obstructive pulmonary disease (COPD) is a common respiratory disease. The severity of an acute exacerbation of COPD (AECOPD) is related to disease progression and risk of death. However, the existing grading standards depend mainly on indicators such as respiratory rate, use of accessory respiratory muscles, and changes in consciousness, which rely on subjective judgment. Radiomics (imaging omics) can extract quantitative muscle features for more complex analysis, which may give clinicians a more objective and accurate way to assess disease severity.

Objectives:

The purpose of this study is to construct a severity prediction model based on the combination of chest CT muscle imaging features and clinical data in hospitalized patients with AECOPD.

Methods:

A total of 234 hospitalized patients with AECOPD were retrospectively included: 79 with grade I, 74 with grade II, and 81 with grade III disease. Clinical data and chest CT images were collected. A clinical feature model was combined with a muscle radiomics model, both built on a Python machine learning platform.

Results:

The number of hospitalizations for acute exacerbation in the last year, disease course, risk of acute exacerbation in the stable stage, white blood cell count, neutrophil count, creatinine, and N-terminal pro-B-type natriuretic peptide differed significantly among the three severity grades of hospitalized patients with AECOPD (all P < 0.05). With the cascade probability combination method, the best model for predicting AECOPD severity was the Xgboost model, with an AUC of 0.890.

Conclusions:

The severity grading prediction model for AECOPD inpatients, constructed from clinical data and muscle radiomics features, performs well and has considerable potential to help clinicians stratify the risk of AECOPD inpatients more accurately.

Keywords: Acute exacerbation of chronic obstructive pulmonary disease, Imagomics, Machine Learning, Predictive models

Introduction

Chronic obstructive pulmonary disease (COPD) is a heterogeneous lung disease characterized by chronic respiratory symptoms, with persistent, progressive airflow obstruction due to abnormalities in the airways (bronchitis, bronchiolitis) and/or alveoli (emphysema) [1, 2]. Studies have shown that the need for hospitalization is independently associated with mortality in acute exacerbation of chronic obstructive pulmonary disease (AECOPD), and that the risk of death increases with the frequency of acute exacerbations [3]. At present, COPD treatment guidelines emphasize the daily management of chronic disease to prevent AECOPD in high-risk patients.

Although there are multiple criteria for stratifying the severity of AECOPD, the one most commonly used to guide treatment decisions is the protocol recommended by the GOLD 2022 guideline, which divides AECOPD severity into three levels based on respiratory rate, use of accessory respiratory muscles, altered state of consciousness, hypoxemia, and hypercapnia. However, these grading indicators are influenced by many factors, such as the subjective judgment of doctors and patients and the treatment already given. Imaging techniques such as computed tomography angiography (CTA) and duplex ultrasound provide valuable data, but they have limitations. CTA requires iodinated contrast agents, which carry a risk of contrast-induced nephropathy, especially in the elderly or in patients with pre-existing renal insufficiency, a common comorbidity in patients with severe AECOPD [4, 5]. Duplex ultrasound, on the other hand, is operator-dependent and may be limited by patient position. In contrast, non-contrast CT (NCCT) is widely performed on admission in patients with AECOPD, and using NCCT for radiomics analysis provides an objective assessment without additional contrast exposure or specialized ultrasound. This makes it a safer and more universally applicable tool for rapid risk stratification in vulnerable patient populations.

The prevalence of sarcopenia in COPD patients is high and increases with disease progression [6]. At the same time, the severity of sarcopenia in COPD patients is significantly negatively correlated with pulmonary function classification, disease grouping, and health-related clinical outcomes [7]. Sarcopenia in COPD arises from multiple overlapping mechanisms, including systemic inflammatory response, oxidative stress (for example, reactive oxygen species), hypoxemia, hypercapnia, and glucocorticoid use [8, 9]. Studying sarcopenia can therefore partially reflect the systemic state of COPD. Quantitative analysis of the pectoral muscles on chest CT has been extensively studied in COPD, and radiomics, which extracts muscle features for more complex analysis, may reveal predictive information about AECOPD severity that traditional assessment methods have missed.

Therefore, this study aimed to develop and validate a machine learning model that integrates clinical data and chest CT muscle imaging omics features to predict the severity of hospitalized patients with AECOPD.

Materials and Methods

Case data collection

Patients who met the GOLD 2022 AECOPD diagnostic criteria and were hospitalized at the Second Affiliated Hospital of Soochow University from December 2020 to December 2023 were retrospectively included.

Inclusion criteria: (1) age ≥ 40 years; (2) meeting the GOLD 2022 diagnostic criteria for AECOPD; (3) relevant laboratory examinations and chest CT scan completed within 24 hours before or after admission, with complete imaging data; (4) complete clinical data.

Exclusion criteria: (1) other concurrent respiratory diseases, such as asthma, pulmonary embolism, pneumothorax, lung cancer, interstitial lung disease, or active pulmonary infectious diseases; (2) concurrent consumptive or metabolic diseases, such as hyperthyroidism, malignant tumors and diabetes; (3) severe liver or kidney dysfunction, immune system diseases, cardiovascular diseases, etc., or the acute stage of such diseases; (4) missing or incomplete clinical data; (5) image quality that does not meet requirements, for example because of motion artifacts or insufficient resolution.

A total of 602 hospitalized patients with AECOPD who met the GOLD 2022 diagnostic criteria were screened. Of these, 368 were excluded: 40 without a chest CT examination within 24 hours, 121 with other respiratory diseases, 101 with wasting diseases, 45 with missing or incomplete clinical data, and 61 with poor CT image quality. The remaining 234 patients with AECOPD met the inclusion criteria and were included in this study. According to the GOLD 2022 AECOPD classification, 79 cases were grade I, 74 cases were grade II, and 81 cases were grade III (Fig. 1).
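
The cohort arithmetic in this paragraph can be checked directly. The short sketch below only re-adds the counts reported in the text; the dictionary keys are paraphrased labels, not terms taken from the flowchart.

```python
# Cohort flow reconstructed from the counts reported in the text.
screened = 602
exclusions = {
    "no chest CT within 24 hours": 40,
    "other respiratory diseases": 121,
    "wasting (consumptive) diseases": 101,
    "missing or incomplete clinical data": 45,
    "poor CT image quality": 61,
}
grades = {"I": 79, "II": 74, "III": 81}

excluded = sum(exclusions.values())
included = screened - excluded

print(excluded, included)  # 368 234
# The three severity grades should account for every included patient.
assert included == sum(grades.values())
```

The reported exclusions therefore account exactly for the difference between the 602 screened and the 234 included patients.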

Fig. 1. Flowchart of AECOPD patient inclusion

Clinical data were collected within 24 hours before or after admission, including age, gender, body mass index (BMI; weight (kg)/height² (m²)), smoking index (number of cigarettes per day × number of years of smoking), number of acute exacerbations in the last year, disease course, use of drug treatment, risk of acute exacerbation in the stable phase (divided into low-risk and high-risk groups according to the number of acute exacerbations in the previous year and the modified British Medical Research Council (mMRC) or COPD Assessment Test (CAT) score) [1], white blood cell (WBC) count, neutrophil (N) count, hemoglobin (Hb), eosinophil (EO) count, high-sensitivity C-reactive protein (hs-CRP), procalcitonin (PCT), creatinine (CRE), prealbumin, albumin, prothrombin time (PT), D-dimer, international normalized ratio (INR), antithrombin III (ATIII), fibrinogen, and N-terminal pro-B-type natriuretic peptide (NT-proBNP). If multiple examinations were performed within this window, the results of the first examination were used.
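
The two derived variables defined above are simple enough to state as code. This is a minimal sketch of the formulas as written in the text; the function names and the example patient are hypothetical.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def smoking_index(cigarettes_per_day: float, years_smoked: float) -> float:
    """Smoking index = cigarettes per day x years of smoking."""
    return cigarettes_per_day * years_smoked


# Hypothetical patient, for illustration only.
print(round(body_mass_index(65.0, 1.70), 2))  # 22.49
print(smoking_index(20, 30))                  # 600
```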

According to the severity grading method for AECOPD recommended by the GOLD 2022 guidelines, the study subjects were divided into three groups: no acute respiratory failure (grade I), acute respiratory failure that is not life-threatening (grade II), and life-threatening acute respiratory failure (grade III), as shown in Table 1.

Table 1.

GOLD 2022 AECOPD patient severity classification

Indicator | Grade I | Grade II | Grade III
Respiratory rate (breaths/min) | 20–30 | >30 | >30
Use of accessory respiratory muscles | No | Yes | Yes
Altered state of consciousness | No | No | Yes
Hypoxemia | Improved by oxygen via Venturi mask at 24–35% concentration | Improved by oxygen via Venturi mask at > 35% concentration | Not improved by oxygen via Venturi mask, or requiring > 40% oxygen concentration
Hypercapnia | No | Yes, PaCO2 increased from baseline or increased to 50–60 mmHg | Yes, PaCO2 increased from baseline or > 60 mmHg, or acidosis (pH ≤ 7.25)

Muscle segmentation on CT images

Regions of interest (ROIs) were delineated manually using ITK-SNAP software (version 4.0.0, https://sourceforge.net/projects/itk-snap/files/itk-snap/4.0.0/). The left and right muscles were identified using a predefined attenuation range of −50 to 90 HU and labeled manually. ROI1 comprised the left and right pectoralis major and pectoralis minor muscles on a single axial slice above the level of the aortic arch. ROI2 comprised the left and right erector spinae muscles on a single axial slice at the level of the twelfth thoracic vertebra. The total volume and density of the erector spinae and pectoral muscles on chest CT, and the product of the two, were added to the clinical data as ROI3 (volume is the volume of a single voxel multiplied by the number of voxels in the ROI; density is the average density, that is, the sum of voxel values in the ROI divided by the number of voxels). The delineations were jointly validated by two experienced radiologists. Inter-observer agreement was assessed using the intraclass correlation coefficient (ICC). Most features had ICC values greater than 0.85, indicating good inter-observer agreement and supporting the reliability of the extracted features.
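
The ROI3 quantities described above (volume, mean density, and their product) can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' code: the function name, the conversion to cm³, and the reuse of the −50 to 90 HU window inside the mask are all assumptions, and the toy array is made-up data.

```python
import numpy as np


def roi_summary(hu, mask, spacing_mm, hu_range=(-50, 90)):
    """Return (volume_cm3, mean_hu, product) for a masked muscle ROI.

    hu         : 3-D array of CT attenuation values in Hounsfield units
    mask       : boolean array of the same shape (manual segmentation)
    spacing_mm : voxel size along each axis, in millimetres
    hu_range   : attenuation window used to keep muscle voxels (assumed)
    """
    hu = np.asarray(hu, dtype=float)
    keep = np.asarray(mask, dtype=bool) & (hu >= hu_range[0]) & (hu <= hu_range[1])
    n_voxels = int(keep.sum())
    if n_voxels == 0:
        raise ValueError("ROI is empty after thresholding")
    voxel_volume_cm3 = float(np.prod(spacing_mm)) / 1000.0  # mm^3 -> cm^3
    volume = voxel_volume_cm3 * n_voxels        # single-voxel volume x voxel count
    density = float(hu[keep].sum()) / n_voxels  # sum of voxel values / voxel count
    return volume, density, volume * density


# Toy 2 x 2 x 2 volume (hypothetical values); one voxel (200 HU) is outside the window.
hu = np.array([[[40, 60], [50, 30]], [[200, 20], [10, 70]]])
mask = np.ones_like(hu, dtype=bool)
volume, density, product = roi_summary(hu, mask, spacing_mm=(1.0, 1.0, 5.0))
print(volume, density, product)
```

In the toy example seven voxels of 5 mm³ each survive the window, so the volume is about 0.035 cm³ and the mean density is 40 HU.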

After normalization and preprocessing, the radiomics features were extracted. A total of 1874 quantitative radiomics features were extracted from the regions of interest.

Feature screening and model construction

The synthetic minority oversampling technique (SMOTE) was used to balance the positive samples in the extracted radiomics features. After standardized preprocessing of the erector spinae and pectoral muscle features extracted from the CT images, the Least Absolute Shrinkage and Selection Operator (LASSO) regression model was applied for feature dimensionality reduction, yielding a set of imaging features for assessing AECOPD risk classification. Twenty-three clinical variables were also screened with the LASSO algorithm to obtain the clinical variables used to evaluate AECOPD risk classification. Missing clinical values were replaced by the median of the same batch of samples. Modeling was performed using the screened clinical data and radiomics features. Six common machine learning algorithms were used to train models: Nu-support vector classification (Nu-SVC), C-support vector classification (C-SVC), logistic regression (LR), random forest (RF), adaptive boosting (Adaboost) and eXtreme Gradient Boosting (Xgboost). The features were input into these six classifiers for model construction. Training and testing were carried out with five repeats of fivefold cross-validation.
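
The screening and validation steps above can be sketched with scikit-learn. This is a hedged illustration, not the authors' pipeline: the data are synthetic, the LASSO penalty `alpha=0.02` is an arbitrary choice, logistic regression stands in for all six classifiers, and SMOTE is omitted because it lives in the separate imbalanced-learn package. Placing every step inside the pipeline is also a design assumption; the paper does not say whether selection was done inside or outside the cross-validation folds.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real feature matrix (NOT study data).
X, y = make_classification(n_samples=234, n_features=40, n_informative=6,
                           n_redundant=0, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.02] = np.nan  # sprinkle in some missing values

pipeline = make_pipeline(
    SimpleImputer(strategy="median"),     # replace missing values by the median
    StandardScaler(),                     # standardize features
    SelectFromModel(Lasso(alpha=0.02)),   # keep features with non-zero LASSO weight
    LogisticRegression(max_iter=1000),    # one of the six classifier families
)

# Five repeats of stratified fivefold cross-validation -> 25 AUC estimates.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)
auc = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")
print(len(auc), round(auc.mean(), 3))
```

Keeping imputation, scaling and selection inside the pipeline means they are refit on each training fold, which avoids leaking information from the held-out fold into the selected feature set.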

In this study, a cascade classification strategy was used to establish a prediction model of AECOPD severity. First, a binary model was constructed to distinguish non-life-threatening cases (grades I and II) from life-threatening cases (grade III). Then, a respiratory failure grading model (grade I versus grade II) was established for the non-life-threatening cases. Finally, a three-class model of AECOPD severity was obtained by the cascade probability combination method.
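
The text does not spell out the cascade probability combination, so the sketch below shows one common way to do it, stated as an assumption: the stage-1 output is read as P(grade III), the stage-2 output as P(grade II given grade I or II), and the two are multiplied along the cascade. Function and argument names are hypothetical.

```python
def cascade_probabilities(p_life_threatening, p_grade2_given_not_lt):
    """Combine two binary models into three class probabilities.

    p_life_threatening    : stage-1 output, P(grade III)
    p_grade2_given_not_lt : stage-2 output, P(grade II | grade I or II)
    """
    p3 = p_life_threatening
    p2 = (1.0 - p3) * p_grade2_given_not_lt
    p1 = (1.0 - p3) * (1.0 - p_grade2_given_not_lt)
    return {"I": p1, "II": p2, "III": p3}


def predict_grade(p_life_threatening, p_grade2_given_not_lt):
    """Return the grade with the highest combined probability."""
    probs = cascade_probabilities(p_life_threatening, p_grade2_given_not_lt)
    return max(probs, key=probs.get)


# Hypothetical stage outputs for one patient.
probs = cascade_probabilities(0.20, 0.75)
print({k: round(v, 2) for k, v in probs.items()})  # {'I': 0.2, 'II': 0.6, 'III': 0.2}
print(predict_grade(0.20, 0.75))                   # II
```

Under this formulation the three probabilities always sum to one, so they can be used directly for one-versus-rest ROC analysis.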

Statistical analysis

SPSS 26.0 software was used for statistical analysis. The basic characteristics of the sample were summarized by descriptive statistics, and continuous variables were tested for normality. Normally distributed data were expressed as mean and standard deviation (x̄ ± s), and non-normally distributed data as median and interquartile range [M (Q1, Q3)]. Categorical variables were expressed as frequency and percentage, i.e., N (%). One-way ANOVA was used for normally distributed continuous variables, and the non-parametric Kruskal–Wallis H test for non-normally distributed continuous variables. The chi-square test or Fisher's exact test was used to compare categorical variables. Statistical significance was set at P < 0.05.

Receiver operating characteristic (ROC) curve analysis was used to evaluate the prediction performance of each model. The area under the curve (AUC), 95% confidence interval (CI), accuracy, sensitivity, specificity, and precision were calculated, and the classification performance of the different algorithms was compared on the test set.
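
For readers who want the metric definitions in executable form, the following sketch computes them from a 2 × 2 confusion table and computes AUC by pairwise ranking. The confusion counts used in the example are inferred, not reported: they are simply one set of counts (15 positive and 28 negative test cases) that reproduces the RF test-set row of Table 5.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and precision from a 2 x 2 table."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate (recall)
        "specificity": tn / (tn + fp),  # true-negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


def auc_from_scores(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Inferred counts (an assumption, not reported in the paper).
m = binary_metrics(tp=12, fp=3, tn=25, fn=3)
print({k: round(v, 3) for k, v in m.items()})
# {'accuracy': 0.86, 'sensitivity': 0.8, 'specificity': 0.893, 'precision': 0.8}

# Tiny hypothetical score vector to show the AUC calculation.
print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```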

Results

Baseline characteristics of hospitalized patients

A total of 234 hospitalized patients with AECOPD were included and divided into 79 cases with grade I, 74 cases with grade II, and 81 cases with grade III according to the GOLD 2022 AECOPD severity grading criteria. The clinical data of the patients are shown in Table 2.

Table 2.

Comparison results of clinical data among different severity degrees of hospitalized patients with AECOPD

Indicators | Grade I (n=79) | Grade II (n=74) | Grade III (n=81) | Test statistic | P
Gender (n, %) | | | | 0.078 | 0.962
Male | 66 (83.50) | 63 (85.10) | 68 (84.00) | |
Female | 13 (16.50) | 11 (14.90) | 13 (16.00) | |
Age (years, x̄ ± s) | 74.19 ± 8.48 | 77.05 ± 7.85 | 74.94 ± 8.37 | 2.461 | 0.088
BMI (kg/m², x̄ ± s) | 22.66 ± 3.49 | 21.93 ± 3.77 | 22.64 ± 4.81 | 0.729 | 0.484
Smoking index (cigarettes/day × years, x̄ ± s) | 593.18 ± 567.10 | 529.54 ± 468.95 | 533.22 ± 531.74 | 0.309 | 0.734
Number of hospitalizations for acute exacerbations in the last year (times, x̄ ± s) | 0.47 ± 0.68 | 0.75 ± 1.08 | 1.09 ± 1.10 a,b | 7.986 | < 0.001
Duration of disease (years, x̄ ± s) | 9.85 ± 11.26 | 13.61 ± 12.01 | 13.66 ± 9.37 | 3.144 | 0.045
Use of medication (n, %) | | | | 3.562 | 0.168
Yes | 28 (35.40) | 36 (48.60) | 39 (48.10) | |
No | 51 (64.60) | 38 (51.40) | 42 (51.90) | |
Risk of exacerbation in stable phase (n, %) | | | | 16.592 | < 0.001
Low | 41 (51.90) | 29 (39.20) | 17 (21.00) | |
High | 38 (48.10) | 44 (59.50) | 64 (79.00) a,b | |
WBC (×10⁹/L, x̄ ± s) | 7.38 ± 2.78 | 8.93 ± 3.54 | 9.28 ± 4.28 | 6.193 | 0.002
N (×10⁹/L, x̄ ± s) | 5.22 ± 2.55 | 6.81 ± 3.42 a | 7.22 ± 3.65 | 8.315 | < 0.001
PCT (ng/ml, x̄ ± s) | 0.08 ± 0.08 | 0.34 ± 1.55 | 0.16 ± 0.28 | 1.406 | 0.248
Hb (g/L, x̄ ± s) | 132.33 ± 15.65 | 137.09 ± 15.32 | 137.63 ± 21.28 | 2.118 | 0.123
EO (×10⁹/L, x̄ ± s) | 0.15 ± 0.17 | 0.15 ± 0.18 | 0.10 ± 0.19 | 1.890 | 0.153
hs-CRP (mg/L, x̄ ± s) | 16.29 ± 28.84 | 29.30 ± 56.11 | 20.79 ± 26.33 | 1.964 | 0.143
CRE (μmol/L, x̄ ± s) | 85.14 ± 27.16 | 76.66 ± 29.22 | 70.98 ± 23.70 | 5.684 | 0.004
Albumin (g/L, x̄ ± s) | 39.11 ± 4.10 | 39.89 ± 4.60 | 38.44 ± 5.08 | 1.854 | 0.159
Prealbumin (g/L, x̄ ± s) | 0.20 ± 0.07 | 0.18 ± 0.06 | 0.18 ± 0.06 | 1.215 | 0.300
PT (s, x̄ ± s) | 12.54 ± 1.20 | 12.62 ± 1.15 | 12.97 ± 1.38 | 2.490 | 0.085
D-dimer (μg/ml, x̄ ± s) | 0.79 ± 1.41 | 0.57 ± 0.55 | 0.84 ± 0.95 | 1.388 | 0.252
INR (x̄ ± s) | 1.01 ± 0.08 | 1.03 ± 0.09 | 1.04 ± 0.13 | 1.722 | 0.181
AT III (%, x̄ ± s) | 86.52 ± 10.82 | 85.75 ± 12.18 | 83.89 ± 12.91 | 0.925 | 0.398
Fibrinogen (g/L, x̄ ± s) | 3.71 ± 1.47 | 4.21 ± 1.39 | 3.86 ± 1.07 | 2.621 | 0.075
NT-proBNP (pg/ml, x̄ ± s) | 395.46 ± 988.56 | 424.02 ± 759.75 | 1291.92 ± 2084.92 a,b | 9.117 | < 0.001

*, P < 0.05; **, P < 0.01; ***, P < 0.001. a, P < 0.05 compared with grade I; b, P < 0.05 compared with grade II; c, P < 0.05 compared with grade III

The chi-square test and one-way analysis of variance showed that the number of hospitalizations for acute exacerbation in the last year (P < 0.001), disease course (P = 0.045), risk of exacerbation in the stable phase (P < 0.001), white blood cell count (P = 0.002), neutrophil count (P < 0.001), creatinine (P = 0.004), and NT-proBNP (P < 0.001) differed significantly among the three groups.
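
As a worked example of the chi-square comparison, the gender row of Table 2 can be recomputed by hand. The sketch below is plain Python rather than SPSS, and the closed-form P value applies only because a 2 × 3 table has two degrees of freedom.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Gender counts from Table 2: rows = male, female; columns = grade I, II, III.
gender = [[66, 63, 68], [13, 11, 13]]
stat, dof = chi_square_statistic(gender)

# For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
assert dof == 2
p_value = math.exp(-stat / 2)
print(round(stat, 3), round(p_value, 3))  # 0.078 0.962
```

The result agrees with the statistic (0.078) and P value (0.962) listed for gender in Table 2.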

Construction of joint model

Grades I, II and III are classified according to whether they are life-threatening

The LASSO algorithm was used to screen the extracted radiomics features, yielding 27 features with non-zero coefficients: 18 texture features (8 GLCM, 5 NGTDM, 3 GLSZM, 2 GLDM) and 9 first-order gray histogram features, as shown in Fig. 2 and Table 3.

Fig. 2. Imaging omics results after LASSO screening

Table 3.

Names and coefficients of 27 imagomics features with non-zero coefficients

Feature name | Image type | Feature type | Feature description | Weight coefficient
log-sigma-1-5-mm-3D_glszm_SmallAreaEmphasis | Laplacian of Gaussian | Texture | Small area emphasis | −0.124194090
squareroot_ngtdm_Busyness | Square root | Texture | Busyness | 0.053362835
wavelet-LLH_gldm_DependenceNonUniformityNormalized | Wavelet | Texture | Dependence non-uniformity (normalized) | 0.021936254
wavelet-LHH_firstorder_Median | Wavelet | First-order gray histogram | Median | 0.171763434
log-sigma-1-mm-3D_glcm_MCC | Laplacian of Gaussian | Texture | Maximal correlation coefficient | 0.088768429
logarithm_ngtdm_Strength | Logarithm | Texture | Strength | 0.075331320
original_firstorder_Mean | Original image | First-order gray histogram | Mean | −0.060474813
wavelet-LLH_glcm_MCC | Wavelet | Texture | Maximal correlation coefficient | −0.057410093
square_firstorder_InterquartileRange | Square | First-order gray histogram | Interquartile range | −0.050552331
lbp-3D-k_ngtdm_Busyness | Local binary pattern | Texture | Busyness | −0.050211373
exponential_glszm_SizeZoneNonUniformity | Exponential | Texture | Size-zone non-uniformity | 0.048018975
wavelet-LHH_glcm_ClusterShade | Wavelet | Texture | Cluster shade | 0.047864251
lbp-3D-m1_firstorder_Median | Local binary pattern | First-order gray histogram | Median | 0.045189763
wavelet-HLL_gldm_LargeDependenceHighGrayLevelEmphasis | Wavelet | Texture | Large dependence high gray-level emphasis | −0.038146203
wavelet-HHH_ngtdm_Contrast | Wavelet | Texture | Contrast | −0.032083419
gradient_glcm_Idmn | Gradient | Texture | Inverse difference moment normalized | 0.030651755
wavelet-HLL_firstorder_Mean | Wavelet | First-order gray histogram | Mean | 0.028701025
squareroot_glcm_ClusterProminence | Square root | Texture | Cluster prominence | −0.028196745
log-sigma-1-5-mm-3D_firstorder_RootMeanSquared | Laplacian of Gaussian | First-order gray histogram | Root mean squared | −0.025216849
square_glcm_MCC | Square | Texture | Maximal correlation coefficient | 0.021900238
wavelet-HLH_firstorder_Median | Wavelet | First-order gray histogram | Median | −0.019751195
wavelet-HHL_glszm_LargeAreaLowGrayLevelEmphasis | Wavelet | Texture | Large area low gray-level emphasis | 0.016001842
wavelet-LLH_glcm_Contrast | Wavelet | Texture | Contrast | 0.015893369
wavelet-HHH_firstorder_Median | Wavelet | First-order gray histogram | Median | −0.014626083
squareroot_ngtdm_Strength | Square root | Texture | Strength | 0.014449562
wavelet-LLH_firstorder_Median | Wavelet | First-order gray histogram | Median | 0.004796186
log-sigma-1-mm-3D_glcm_Idmn | Laplacian of Gaussian | Texture | Inverse difference moment normalized | −0.004576479
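
The per-class counts quoted for Table 3 can be recovered from the feature names themselves. The sketch below assumes the names follow an `imageType_featureClass_featureName` pattern (the convention used by PyRadiomics, which the paper does not name explicitly) and that the dash characters in the extracted names are plain hyphens.

```python
from collections import Counter

# Feature names as listed in Table 3, with dashes normalised to hyphens.
FEATURES = [
    "log-sigma-1-5-mm-3D_glszm_SmallAreaEmphasis",
    "squareroot_ngtdm_Busyness",
    "wavelet-LLH_gldm_DependenceNonUniformityNormalized",
    "wavelet-LHH_firstorder_Median",
    "log-sigma-1-mm-3D_glcm_MCC",
    "logarithm_ngtdm_Strength",
    "original_firstorder_Mean",
    "wavelet-LLH_glcm_MCC",
    "square_firstorder_InterquartileRange",
    "lbp-3D-k_ngtdm_Busyness",
    "exponential_glszm_SizeZoneNonUniformity",
    "wavelet-LHH_glcm_ClusterShade",
    "lbp-3D-m1_firstorder_Median",
    "wavelet-HLL_gldm_LargeDependenceHighGrayLevelEmphasis",
    "wavelet-HHH_ngtdm_Contrast",
    "gradient_glcm_Idmn",
    "wavelet-HLL_firstorder_Mean",
    "squareroot_glcm_ClusterProminence",
    "log-sigma-1-5-mm-3D_firstorder_RootMeanSquared",
    "square_glcm_MCC",
    "wavelet-HLH_firstorder_Median",
    "wavelet-HHL_glszm_LargeAreaLowGrayLevelEmphasis",
    "wavelet-LLH_glcm_Contrast",
    "wavelet-HHH_firstorder_Median",
    "squareroot_ngtdm_Strength",
    "wavelet-LLH_firstorder_Median",
    "log-sigma-1-mm-3D_glcm_Idmn",
]


def feature_class(name):
    """Return the feature-class token of an 'imageType_class_feature' name."""
    image_type, cls, feature = name.split("_")
    return cls


counts = Counter(feature_class(n) for n in FEATURES)
texture = sum(v for k, v in counts.items() if k != "firstorder")
print(len(FEATURES), counts["glcm"], counts["ngtdm"], counts["glszm"],
      counts["gldm"], counts["firstorder"], texture)
# 27 8 5 3 2 9 18
```

The tally matches the breakdown given in the text: 18 texture features (8 GLCM, 5 NGTDM, 3 GLSZM, 2 GLDM) and 9 first-order features.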

The LASSO algorithm was used to screen the 23 clinical variables, and 13 variables were retained: NT-proBNP, antithrombin III, D-dimer, prothrombin time, albumin, prealbumin, creatinine, eosinophil count, neutrophil count, use of medication, smoking index, BMI, and sex, as shown in Fig. 3 and Table 4.

Fig. 3. Clinical data results after LASSO screening

Table 4.

Names and coefficients of 13 clinical data with non-zero coefficients

Name | Weight coefficient
NT-proBNP | 0.086622414
CRE | −0.060036175
Drug | −0.0546340231
PT | 0.053007572
EO | −0.039905134
pre-ALB | 0.030660692
gender | −0.022259690
N | 0.021396957
BMI | 0.015585371
ATIII | −0.011841793
ALB | −0.007673772
D-dimer | 0.002599356
smoking index | −0.000825472

The 27 radiomics features and 13 clinical variables above, together with the total volume and density of the erector spinae and pectoral muscles on chest CT and their product, gave a total of 43 features, which were input into the six classifiers to construct a combined model. In the training group, the best model was Xgboost, with an AUC of 0.935 (95% CI, 0.921–0.948) and accuracy, sensitivity, specificity, and precision of 0.863, 0.910, 0.816, and 0.832, respectively. In the test group, the best model was RF, with an AUC of 0.931 (95% CI, 0.853–1.000) and accuracy, sensitivity, specificity, and precision of 0.860, 0.800, 0.893, and 0.800, respectively, as shown in Fig. 4 and Table 5.

Fig. 4. ROC curves of CT imaging omics combined with clinical data model for predicting life-threatening AECOPD. (a) ROC curve of the training cohort. (b) ROC curve of the test cohort.

Table 5.

Diagnostic performance of CT imaging omics combined with clinical data model in predicting whether AECOPD is life-threatening in the training and test groups

Model | Cohort | AUC (95% CI) | Accuracy | Sensitivity | Specificity | Precision
CSVC | Training | 0.921 (0.905, 0.936) | 0.844 | 0.836 | 0.852 | 0.850
CSVC | Test | 0.860 (0.751, 0.968) | 0.721 | 0.467 | 0.857 | 0.636
LR | Training | 0.912 (0.896, 0.928) | 0.838 | 0.813 | 0.862 | 0.855
LR | Test | 0.840 (0.705, 0.976) | 0.744 | 0.933 | 0.643 | 0.583
RF | Training | 0.931 (0.917, 0.945) | 0.861 | 0.846 | 0.877 | 0.873
RF | Test | 0.931 (0.853, 1) | 0.860 | 0.800 | 0.893 | 0.800
Adaboost | Training | 0.918 (0.903, 0.934) | 0.856 | 0.854 | 0.857 | 0.857
Adaboost | Test | 0.9 (0.799, 1) | 0.791 | 0.800 | 0.786 | 0.667
Xgboost | Training | 0.935 (0.921, 0.948) | 0.863 | 0.910 | 0.816 | 0.832
Xgboost | Test | 0.919 (0.840, 0.998) | 0.814 | 0.867 | 0.786 | 0.684
NuSVC | Training | 0.924 (0.910, 0.939) | 0.849 | 0.849 | 0.849 | 0.849
NuSVC | Test | 0.858 (0.749, 0.968) | 0.767 | 0.733 | 0.786 | 0.647

Grade I and Grade II were classified according to the presence or absence of respiratory failure

The LASSO algorithm was used to screen the extracted radiomics features, yielding 21 features with non-zero coefficients: 14 texture features (4 GLCM, 3 GLSZM, 3 GLDM, 3 NGTDM, 1 GLRLM), 6 first-order gray histogram features and 1 shape feature, as shown in Fig. 5 and Table 6.

Fig. 5. Imagomics results after LASSO screening

Table 6.

Names and coefficients of 21 imagomics features with non-zero coefficients

Feature name | Image type | Feature type | Feature description | Weight coefficient
exponential_firstorder_Range | Exponential | First-order gray histogram | Range | −0.098551556
wavelet-LLL_glszm_GrayLevelNonUniformityNormalized | Wavelet | Texture | Gray-level non-uniformity (normalized) | −0.089025666
original_shape_Maximum2DDiameterRow | Original image | Shape | Maximum 2D diameter (row) | 0.069989043
squareroot_ngtdm_Busyness | Square root | Texture | Busyness | 0.055137651
log-sigma-1-5-mm-3D_glcm_Idmn | Laplacian of Gaussian | Texture | Inverse difference moment normalized | 0.052796116
gradient_ngtdm_Coarseness | Gradient | Texture | Coarseness | −0.051926800
wavelet-LLL_glcm_MCC | Wavelet | Texture | Maximal correlation coefficient | 0.044209935
wavelet-LHL_firstorder_90Percentile | Wavelet | First-order gray histogram | 90th percentile | 0.043427594
wavelet-LHH_firstorder_Kurtosis | Wavelet | First-order gray histogram | Kurtosis | −0.043065732
wavelet-LHL_glrlm_LongRunHighGrayLevelEmphasis | Wavelet | Texture | Long-run high gray-level emphasis | 0.042273364
log-sigma-0-5-mm-3D_gldm_DependenceVariance | Laplacian of Gaussian | Texture | Dependence variance | 0.033456475
log-sigma-1-mm-3D_glszm_LargeAreaHighGrayLevelEmphasis | Laplacian of Gaussian | Texture | Large area high gray-level emphasis | 0.026504495
exponential_firstorder_Kurtosis | Exponential | First-order gray histogram | Kurtosis | −0.024836586
squareroot_glszm_ZoneEntropy | Square root | Texture | Zone entropy | 0.021905257
log-sigma-1-5-mm-3D_firstorder_Skewness | Laplacian of Gaussian | First-order gray histogram | Skewness | −0.020516155
wavelet-LLH_firstorder_Skewness | Wavelet | First-order gray histogram | Skewness | 0.014138977
wavelet-LHH_ngtdm_Contrast | Wavelet | Texture | Contrast | 0.013612707
log-sigma-1-mm-3D_glcm_MCC | Laplacian of Gaussian | Texture | Maximal correlation coefficient | 0.012286738
log-sigma-1-mm-3D_glcm_Imc2 | Laplacian of Gaussian | Texture | Informational measure of correlation 2 | 0.008612290
gradient_gldm_DependenceVariance | Gradient | Texture | Dependence variance | 0.005168462
wavelet-HLL_gldm_DependenceEntropy | Wavelet | Texture | Dependence entropy | −0.004487384

The LASSO algorithm was used to screen the 23 clinical variables, and 10 variables were retained: albumin, creatinine, procalcitonin, hemoglobin, neutrophil count, risk of exacerbation in the stable phase, use of medication, disease course, number of hospitalizations for acute exacerbation in the last year, and age, as shown in Fig. 6 and Table 7.

Fig. 6. Results of clinical data after LASSO screening

Table 7.

Names and coefficients of 10 clinical data with non-zero coefficients

Name | Weight coefficient
Age | 0.0952147161998937
N | 0.0903028993339369
Risk of exacerbation | 0.0612629371169276
PCT | 0.0480171498769332
CRE | −0.0449644822964078
ALB | 0.0363473267703825
Hb | 0.0319401213840267
Disease course | 0.0239407523510268
Drug | −0.0175317008883738
Hospitalization frequency | 0.0160001349474592

The 21 radiomics features and 10 clinical variables above, together with the erector spinae volume and density on chest CT and their product, gave a total of 34 features, which were input into the six classifiers to construct a combined model. In the training group, the best model was Nu-SVC, with an AUC of 0.889 (95% CI, 0.864–0.915) and accuracy, sensitivity, specificity, and precision of 0.812, 0.821, 0.803, and 0.793, respectively. In the test group, the best model was Xgboost, with an AUC of 0.871 (95% CI, 0.746–0.996) and accuracy, sensitivity, specificity, and precision of 0.7, 1.0, 0.4, and 0.625, respectively, as shown in Fig. 7 and Table 8.

Fig. 7. ROC curves of CT imaging omics combined with clinical data model for discriminating between Grade I and Grade II AECOPD. (a) ROC curve of the training cohort. (b) ROC curve of the test cohort.

Table 8.

Diagnostic performance of CT imaging omics combined with clinical data model in predicting AECOPD grades I and II in training and test groups

Model | Cohort | AUC (95% CI) | Accuracy | Sensitivity | Specificity | Precision
CSVC | Training | 0.591 (0.381, 0.801) | 0.533 | 0.533 | 0.533 | 0.533
CSVC | Test | 0.860 (0.751, 0.968) | 0.721 | 0.467 | 0.857 | 0.636
LR | Training | 0.858 (0.827, 0.890) | 0.817 | 0.814 | 0.819 | 0.805
LR | Test | 0.693 (0.498, 0.888) | 0.7 | 0.867 | 0.533 | 0.65
RF | Training | 0.761 (0.722, 0.799) | 0.683 | 0.662 | 0.702 | 0.671
RF | Test | 0.858 (0.725, 0.991) | 0.733 | 0.933 | 0.533 | 0.667
Adaboost | Training | 0.751 (0.712, 0.790) | 0.716 | 0.686 | 0.743 | 0.711
Adaboost | Test | 0.684 (0.487, 0.882) | 0.667 | 0.733 | 0.6 | 0.64
Xgboost | Training | 0.722 (0.682, 0.762) | 0.671 | 0.738 | 0.61 | 0.635
Xgboost | Test | 0.871 (0.746, 0.996) | 0.7 | 1.00 | 0.4 | 0.625
NuSVC | Training | 0.889 (0.864, 0.915) | 0.812 | 0.821 | 0.803 | 0.793
NuSVC | Test | 0.667 (0.465, 0.868) | 0.6 | 0.667 | 0.533 | 0.588

Construction of three-classification model

The results of the two test sets above were combined using the cascade probability method. The best model for predicting the severity of AECOPD was the Xgboost model, with an AUC of 0.890 (95% CI, 0.819–0.961) and accuracy, sensitivity, specificity and precision of 0.767, 0.933, 0.651 and 0.651, respectively, as shown in Fig. 8 and Table 9.

Fig. 8. ROC curves for the three-class prediction of AECOPD severity (Grade I, II, and III) in the test cohort

Table 9.

Diagnostic performance of predicted AECOPD three categories in the test group

Model | AUC (95% CI) | Accuracy | Sensitivity | Specificity | Precision
CSVC | 0.761 (0.648, 0.875) | 0.761 | 0.5 | 0.744 | 0.577
LR | 0.783 (0.676, 0.900) | 0.726 | 0.900 | 0.605 | 0.614
RF | 0.877 (0.798, 0.956) | 0.781 | 0.833 | 0.744 | 0.694
Adaboost | 0.798 (0.696, 0.900) | 0.740 | 0.767 | 0.721 | 0.657
Xgboost | 0.890 (0.819, 0.961) | 0.767 | 0.933 | 0.651 | 0.651
NuSVC | 0.785 (0.674, 0.895) | 0.699 | 0.667 | 0.721 | 0.625

Discussion

This study constructed a machine learning model that combined chest CT muscle radiomics and clinical data to predict disease severity in hospitalized patients with AECOPD. To our knowledge, this is the first study to combine clinical data with muscle radiomics features to predict AECOPD severity. The model showed good predictive performance for life-threatening status, for respiratory failure grade, and for the combined three-level classification.

Many previous studies have analyzed the correlation between CT imaging features and stable COPD. For example, texture analysis has proved effective in assessing the degree of emphysema. A study by Ginsburg et al. [10] showed that a texture-based approach was effective in classifying the lungs of never smokers, smokers without emphysema, and smokers with emphysema, suggesting that early stages of smoking-related lung injury can be identified before emphysema develops. Lafata et al. [11] showed that radiomics features extracted from CT images have the potential to quantify lung function changes and to be correlated with spirometry. The same radiomics approach can be extended to other gold-standard COPD markers, such as the FEV1/FVC ratio or the frequency of exacerbations, to support a more accurate assessment of COPD severity. Kazuya Tanimura et al. [12] conducted a 5–10-year follow-up study of 130 male patients with COPD and 20 male smoking controls to explore the relationship between the erector spinae, one of the anti-gravity muscle groups, and the survival of COPD patients. The cross-sectional area of the erector spinae was lower in COPD patients than in smoking controls, reflected both physical activity and COPD severity, and could be used as a predictor of mortality in COPD patients (HR, 0.85, 95% CI, 0.79–0.92, P < 0.001).

With the rapid development of medical imaging technology, many studies have clearly demonstrated lung structure, airway dilation, emphysema and other characteristics using high-resolution CT, providing a new basis and new means for the diagnosis and grading of COPD [12–14]. As a new research method, radiomics extracts a large number of quantitative features from medical images with the help of computer analysis, and has shown important value in the prediction, diagnosis and treatment evaluation of many diseases [15–18]. Radiomics can comprehensively reflect the morphology, texture and other properties of lesions, help to reveal potential disease patterns and pathological changes, and shows great potential in the clinical management of complex diseases. Using chest CT to detect changes in muscle quantity and quality can facilitate individualized intervention and help estimate the prognosis of COPD exacerbation [19].

Compared with previous studies, this study has several key novelties. Considering that sarcopenia is a common comorbidity of COPD [20], this study did not use indicators susceptible to subjective factors but systematically extracted chest CT muscle radiomics features, which ensured the use of objective physiological information and reduced potential prediction errors. In addition, previous prediction models mostly addressed stable COPD, whereas this model addresses the severity of acute exacerbation of COPD, which may benefit the daily management of patients and reduce disease burden.

In this study, machine learning methods showed that, among the imaging features of the pectoralis major, pectoralis minor, and erector spinae on chest CT, most of those strongly correlated with the severity of acute exacerbation in COPD patients were texture features. Among the 27 imaging omics features selected for the life-threatening classification, the wavelet-LHH first-order median feature had a high relative weight, and the remaining features were mainly wavelet-transform texture features. Among the 21 imaging omics features selected for distinguishing severity grade I from grade II, the exponential first-order range feature had a higher relative weight, and the remaining features were mainly wavelet-transform and Laplacian-of-Gaussian texture features. This is consistent with the view of Makimoto et al. [21] that radiomics can quantify subtle tissue heterogeneity imperceptible to the naked eye. For example, features such as "small area emphasis" may indicate the fragmentation of homogeneous muscle tissue into smaller, more distinct regions [17, 18, 22], which may correspond pathophysiologically to fat infiltration or fibrosis, a common feature of sarcopenia in COPD. Features such as "Busyness" and certain first-order statistical features (e.g., median and intensity) can reflect changes in muscle metabolic activity and structural integrity. Positively weighted features may be associated with preserved muscle structure and higher metabolic demands, whereas negatively weighted features may suggest degenerative changes and hypofunction. In addition, indicators such as size area inhomogeneity and dependence inhomogeneity can characterize the distribution of lesions, while parameters such as deficit moment normalization and contrast can be used to evaluate the local inflammatory state.

Radiomics features are obtained by applying various quantification methods to image characteristics that are difficult to recognize with the naked eye. They allow changes on the muscle cross section caused by muscle gain or loss to be quantified and expressed as radiological indicators. Image texture features reflect the gray-level distribution and imply the heterogeneous composition or spatial distribution of lesions [23]. Notably, most of the features selected for the radiomics model are high-order statistical features, that is, indicators of texture and heterogeneity that show subtle changes in tissue morphology more clearly [24]. In this study, multiple muscle radiomics features were ultimately used to construct the prediction models, and most of them describe the spatial distribution of voxels (high-order texture features and wavelet features), which can amplify subtle differences and offer more accurate evaluation than general radiological feature parameters.
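To make the two first-order features named above (median and range) concrete, the following is a minimal sketch that computes them from a flat list of voxel intensities inside a muscle ROI. The function name and the sample values are hypothetical and are not taken from the study. The feature names in the text suggest these statistics were computed on filtered images (wavelet-LHH and exponential); that filtering step is omitted here.

```python
import statistics
from typing import Dict, Sequence


def first_order_features(voxels: Sequence[float]) -> Dict[str, float]:
    """Return the first-order median and range of ROI voxel intensities.

    `voxels` is assumed to be a flat sequence of intensity values that
    were already masked to the muscle ROI (hypothetical input format).
    """
    if len(voxels) == 0:
        raise ValueError("ROI contains no voxels")
    return {
        # Median: the middle intensity value, robust to outliers.
        "median": statistics.median(voxels),
        # Range: spread between the brightest and darkest voxel.
        "range": max(voxels) - min(voxels),
    }


# Illustrative values only, not patient data.
features = first_order_features([10, 40, 20, 30])
print(features)
```

Texture features such as those from wavelet or Laplacian-of-Gaussian filtered images add spatial information that these two summary statistics alone cannot capture.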

COPD is a systemic inflammatory disease involving multiple systems. The severity of an acute exacerbation is related not only to respiratory rate, degree of hypoxemia, and degree of consciousness disturbance, but also to the combined effects of systemic inflammatory response, skeletal muscle wasting, and multi-organ functional status. In this study, a machine learning prediction model was constructed by combining multi-dimensional data, such as patient clinical data and chest CT muscle imaging omics features, which offers a new direction for accurate stratification of acute exacerbation severity and a basis for individualized treatment strategies.

Clinical feasibility and implementation prospects

The models constructed in this study show good potential for clinical translation. The proposed clinical implementation workflow can be briefly described as follows: for patients hospitalized with AECOPD, after completing routine admission chest CT and blood tests, (1) radiologists or trained AI algorithms quickly outline the ROIs of the pectoralis major, pectoralis minor, and erector spinae on CT images; (2) automated software extracts radiomics features from the ROIs and integrates them with the patient's clinical data; (3) the integrated data are fed into a pre-trained RF model, which outputs the probability that the patient's severity is grade III; (4) if the severity is not grade III, the data are fed into a pre-trained Xgboost model, which outputs the probabilities that the patient's severity is grade I or grade II. This provides clinicians with an objective decision support tool. The workflow uses existing clinical data without the need for additional imaging, demonstrating high feasibility and cost-effectiveness.
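Steps (3) and (4) of this workflow can be sketched as a two-stage cascade. The code below is an illustration under stated assumptions, not the authors' implementation: the function names, the 0.5 decision threshold, and the multiplication rule used to combine the two stage outputs are all hypothetical, since this section does not spell out the exact combination formula. The two inputs stand in for the outputs of the pre-trained RF and Xgboost models.

```python
from typing import Dict


def combine_cascade(p_grade3: float, p_grade2_given_not3: float) -> Dict[str, float]:
    """Combine two binary-stage outputs into three grade probabilities.

    p_grade3: stage-1 output, P(grade III).
    p_grade2_given_not3: stage-2 output, P(grade II | not grade III).
    Assumption: the stages are combined with the chain rule of probability.
    """
    for name, p in (("p_grade3", p_grade3),
                    ("p_grade2_given_not3", p_grade2_given_not3)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {p}")
    p_not3 = 1.0 - p_grade3
    return {
        "I": p_not3 * (1.0 - p_grade2_given_not3),
        "II": p_not3 * p_grade2_given_not3,
        "III": p_grade3,
    }


def predict_grade(p_grade3: float, p_grade2_given_not3: float,
                  threshold: float = 0.5) -> str:
    """Decide grade III first; only otherwise separate grade I from II."""
    if p_grade3 >= threshold:
        return "III"
    return "II" if p_grade2_given_not3 >= threshold else "I"


# Illustrative probabilities only, not model outputs from the study.
probs = combine_cascade(p_grade3=0.2, p_grade2_given_not3=0.75)
print(probs, predict_grade(0.2, 0.75))
```

Under this reading, the three grade probabilities always sum to one, so the output of the cascade can be reported to clinicians as a single probability distribution over grades.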

Limitations

This study has several limitations. First, its single-center retrospective design and limited sample size are important limitations. Although we employed rigorous methods, such as LASSO regression and cross-validation, to mitigate the risk of overfitting, external validation in an independent cohort from different institutions remains a crucial step for assessing the generalizability and robustness of a clinical prediction model, and it constitutes a core component of our future research plan. Second, this study lacks pulmonary function data. This is primarily due to the retrospective design and the clinical reality of managing AECOPD: during acute exacerbations, patients are often too ill to perform the forced expiratory maneuvers required for spirometry, so pulmonary function tests are commonly unavailable during this phase. Consequently, our model is based on CT and blood parameters that are readily available at admission, which in turn gives it particular value for acute-phase assessment. However, the inability to correlate our model with the gold standard of pulmonary function is a drawback. Future studies that include pulmonary function tests during the stable phase and investigate their relationship with the acute-phase radiomics model would be of great significance.

Conclusion

This study innovatively combines muscle imaging omics with clinical data, avoids the interference of subjective factors in traditional grading methods, and provides a more objective and quantitative severity assessment tool. This model shows good application potential in assisting clinicians in risk stratification and individualized treatment of AECOPD patients.

Acknowledgements

The authors thank the patients who participated in this study, the funding bodies that supported it, and the professors who assisted with the research.

Abbreviations

AECOPD: Acute exacerbation of chronic obstructive pulmonary disease
AUC: Area under the curve
CI: Confidence interval
COPD: Chronic obstructive pulmonary disease
CT: Computed tomography
LASSO: Least Absolute Shrinkage and Selection Operator
MRI: Magnetic resonance imaging
NT-proBNP: N-terminal pro-B-type natriuretic peptide
ROC: Receiver operating characteristic curve
ROI: Region of interest

Author contributions

All authors contributed to study conception and design. Zian Liu, Shiyuan Gao, and Qiong Pan formulated the research questions, screened variables, collected data, and wrote the manuscript; Zhe Ye and Fengmei Li screened variables; Yiwen Huang and Jiahui Yuan collected data; Yixin Lian and Chen Geng supervised the research and revised the manuscript.

Funding

This work was supported in part by the National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2023ZD0503606); in part by the National Natural Science Foundation of China (62441114); in part by the Nuclear Medicine Technology Innovation Research Project (ZHYLZD 2025006); in part by Suzhou Science & Technology Projects (SSD2023008, SYW20240238, SKY2023223, SYW2025002); and in part by the Suzhou University Suzhou Medical College - Qilu Medical Research Fund Project (24QL200217).

Data availability

No data sets were generated or analysed during the current study.

Declarations

Ethics approval and consent to participate

This retrospective study did not require further ethics committee approval because it did not involve animal experiments or human clinical trials. Patient information was anonymized before analysis. All methods were performed in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Zian Liu and Shiyuan Gao contributed equally to this work.

References

1. Agustí A, Celli BR, Criner GJ, Halpin D, Anzueto A, Barnes P, et al. Global initiative for chronic obstructive lung disease 2023 report: GOLD executive summary. J Pan Afr Thorac Soc. 2022;4(2):58–80.
2. Liu Y, Liu T, Ruan L, Zhu D, He Y, Jia J, et al. Cilia plays a pivotal role in the hypersecretion of airway mucus in mice. Curr Mol Pharmacol. 2024;17(1):18761429368288.
3. Haile SR, Guerra B, Soriano JB, Puhan MA. Multiple score comparison: a network meta-analysis approach to comparison and external validation of prognostic scores. BMC Med Res Methodol. 2017;17(1):172.
4. Kwok WC, Tam TC, Ho JC, Lam DC, Ip MS, Yap DY. Hospitalized acute exacerbation in chronic obstructive pulmonary disease-impact on long-term renal outcomes. Respir Res. 2024;25(1):36.
5. Liu Z, Ma Z, Ding C. Association between COPD and CKD: a systematic review and meta-analysis. Front Public Health. 2024. 10.3389/fpubh.2024.1494291.
6. Lipovec NC, Schols AM, Borst B, Beijers RJ, Kosten T, Omersa D, et al. Sarcopenia in advanced COPD affects cardiometabolic risk reduction by short-term high-intensity pulmonary rehabilitation. J Am Med Dir Assoc. 2016;17(9):814–20.
7. Araújo BE, Teixeira PP, Valduga K, Silva Fink J, Silva FM. Prevalence, associated factors, and prognostic value of sarcopenia in patients with acute exacerbated chronic obstructive pulmonary disease: a cohort study. Clin Nutr ESPEN. 2021;42:188–94.
8. Matera MG, Page C, Cazzola M. Sarcopenia as a treatable trait in COPD: from mechanisms to management. Respir Med. 2025;248:108401.
9. Yan Y, Hu J, Han N, Li HT, Yang X, Li LG, et al. Sorafenib-loaded metal-organic framework nanoparticles for anti-hepatocellular carcinoma effects through synergistically potentiating ferroptosis and remodeling tumor immune microenvironment. Mater Today Bio. 2025;32:101848.
10. Ginsburg SB, Lynch DA, Bowler RP, Schroeder JD. Automated texture-based quantification of centrilobular nodularity and centrilobular emphysema in chest CT images. Acad Radiol. 2012;19(10):1241–51.
11. Lafata KJ, Zhou Z, Liu J-G, Hong J, Kelsey CR, Yin F-F. An exploratory radiomics approach to quantifying pulmonary function in CT images. Sci Rep. 2019;9(1):11509.
12. Tanimura K, Sato S, Fuseya Y, Hasegawa K, Uemasu K, Sato A, et al. Quantitative assessment of erector spinae muscles in patients with chronic obstructive pulmonary disease. Novel chest computed tomography-derived index for prognosis. Ann Am Thorac Soc. 2016;13(3):334–41.
13. Pishgar F, Shabani M, Silva TQAC, Bluemke DA, Budoff M, Barr RG, et al. Quantitative analysis of adipose depots by using chest CT and associations with all-cause mortality in chronic obstructive pulmonary disease: longitudinal analysis from MESArthritis ancillary study. Radiology. 2021;299(3):703–11.
14. Attaway AH, Welch N, Yadav R, Bellar A, Hatipoğlu U, Meli Y, et al. Quantitative computed tomography assessment of pectoralis and erector spinae muscle area and disease severity in chronic obstructive pulmonary disease referred for lung volume reduction. COPD: Journal of Chronic Obstructive Pulmonary Disease. 2021;18(2):191–200.
15. Ebadi M, Bhanji RA, Dunichand-Hoedl AR, Mazurak VC, Baracos VE, Montano-Loza AJ. Sarcopenia severity based on computed tomography image analysis in patients with cirrhosis. Nutrients. 2020;12(11):3463.
16. Kim YJ. Machine learning models for sarcopenia identification based on radiomic features of muscles in computed tomography. Int J Environ Res Public Health. 2021;18(16):8710.
17. Jong EE, Sanders KJ, Deist TM, Elmpt W, Jochems A, Timmeren JE, et al. Can radiomics help to predict skeletal muscle response to chemotherapy in stage IV non-small cell lung cancer? Eur J Cancer. 2019;120:107–13.
18. Dong X, Dan X, Yawen A, Haibo X, Huan L, Mengqi T, et al. Identifying sarcopenia in advanced non-small cell lung cancer patients using skeletal muscle CT radiomics and machine learning. Thorac Cancer. 2020;11(9):2650–9.
19. Qiao X, Hou G, Kang J, Wang Q-Y, Yin Y. CT attenuation and cross-sectional area of the pectoralis are associated with clinical characteristics in chronic obstructive pulmonary disease patients. Front Physiol. 2022;13:833796.
20. Seymour J, Spruit M, Hopkinson N, Natanek S, Man W-C, Jackson A, et al. The prevalence of quadriceps weakness in COPD and the relationship with disease severity. Eur Respir J. 2010;36(1):81–8.
21. Makimoto K, Hogg J, Bourbeau J, Tan W, Kirby M. CT imaging with machine learning for predicting progression to COPD in individuals at risk. Chest. 2023. 10.1016/j.chest.2023.06.008.
22. Shahzadi I, Zwanenburg A, Frohwein LJ, Schramm D, Meyer HJ, Hinnerichs M, et al. Short-term mortality prediction in acute pulmonary embolism: radiomics values of skeletal muscle and intramuscular adipose tissue. J Cachexia Sarcopenia Muscle. 2024;15(4):1430–40.
23. Li Z, Liu L, Zhang Z, Yang X, Li X, Gao Y, et al. A novel CT-based radiomics features analysis for identification and severity staging of COPD. Acad Radiol. 2022;29(5):663–73.
24. Huynh E, Coroller TP, Narayan V, Agrawal V, Hou Y, Romano J, et al. CT-based radiomic analysis of stereotactic body radiation therapy patients with lung cancer. Radiother Oncol. 2016;120(2):258–66.


Articles from European Journal of Medical Research are provided here courtesy of BMC
