Abstract
Purpose
Chronic obstructive pulmonary disease (COPD) is a prevalent and preventable condition that typically worsens over time. Acute exacerbations of COPD significantly impact disease progression, underscoring the importance of prevention efforts. This observational study aimed to achieve two main objectives: (1) identify patients at risk of exacerbations using an ensemble of clustering algorithms, and (2) classify patients into distinct clusters based on disease severity.
Methods
Data from portable medical devices were analyzed post-hoc using hyperparameter optimization with Self-Organizing Maps (SOM), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Isolation Forest, and Support Vector Machine (SVM) algorithms, to detect flare-ups. Principal Component Analysis (PCA) followed by KMeans clustering was applied to categorize patients by severity.
Results
Twenty-five patients were included in the study population; data from 17 of them had the required reliability. Five patients were identified in the highest deterioration group, with one clinically confirmed exacerbation accurately detected by our ensemble algorithm. PCA and KMeans clustering then grouped patients into three clusters based on severity: Cluster 0 started with the least severe characteristics but experienced decline, Cluster 1 consistently showed the most severe characteristics, and Cluster 2 showed slight improvement.
Conclusion
Our approach effectively identified patients at risk of exacerbations and classified them by disease severity. Although promising, the approach would need to be verified on a larger sample with a larger number of recorded clinically verified exacerbations.
Keywords: COPD, Data analysis, Machine learning, Clustering
Introduction
Chronic obstructive pulmonary disease (COPD) is a common, mostly preventable respiratory disease. The recent GOLD 2024 guidelines (Global Initiative for Chronic Obstructive Lung Disease) [1] define it as a heterogeneous respiratory condition characterized by chronic respiratory symptoms (dyspnea, cough, sputum production and/or exacerbations) due to abnormalities of the airways (bronchitis, bronchiolitis) and/or alveoli, caused by chronic inflammation, that lead to persistent, often progressive airflow obstruction. COPD results from the interaction between the genome and the environment. The most important extrinsic factor is the inhalation of toxic gases (most importantly tobacco smoke); however, other risk factors, such as alpha-1-antitrypsin deficiency or impaired development of the lung (e.g., premature birth), are also being recognized [2, 3]. A significant proportion of COPD patients experience acute exacerbations, which are defined as an acute worsening of respiratory symptoms (especially cough and dyspnea). Exacerbations frequently lead to hospital admissions, emergency department visits and persistent, clinically significant deterioration of the patient's condition. Importantly, all-cause mortality (including non-respiratory causes) is increased during the exacerbation and for a certain period following the event. Additionally, the overall health condition of the patient gets progressively worse with each COPD exacerbation, contributing to the significant burden of disability and health-care costs [4].
Consequently, one of the key clinical aims with regard to COPD is to reduce the number of exacerbations. The optimal outcome would be the prevention of the very first severe exacerbation (an exacerbation leading to hospitalization), an event often considered the beginning of a spiral of rapid disease progression, manifested as a series of exacerbations separated by ever shorter time intervals [5].
While the maintenance of stable COPD mostly relies on lifestyle changes (most importantly smoking cessation and regular exercise) and adherence to pharmacological therapy, the key to successful treatment of a COPD exacerbation lies in early detection and initiation of appropriate therapy. Early detection of exacerbations is often the pitfall in real-life clinical practice, as nearly half of all exacerbations are not reported by the patients, regardless of exacerbation severity [6].
The universally accepted COPD classification system proposed by GOLD [1] is a welcome tool for risk-stratification and estimation of prognosis. However, in the struggle for a more precise and personalized clinical approach, additional classifications were proposed. One such approach is the phenotyping of patients, which relies on the assessment of the most prominent clinical features of each patient, such as the presence of chronic sputum production, the presence of emphysema, eosinophilic type of inflammation etc. The phenotypes proposed differ across pneumological societies with varying overlap. For example, the Czech Pneumological and Phthiseological Society recognizes 6 different COPD phenotypes [7]. While both the GOLD staging and the phenotyping approach are already greatly helping the decision making in clinical practice, novel approaches are necessary to aid the early detection and treatment of exacerbations.
Several previous studies were designed to predict COPD exacerbations, using different methods and designs. Some studies have identified risk factors for exacerbation within a defined time period (i.e., male sex, lower ventilatory parameters, gastroesophageal reflux etc.) [8, 8]. A more ambitious goal was to predict the risk of acute COPD exacerbation (AECOPD) in real time. These studies usually employed machine learning models incorporating various clinical features to estimate said risk with varying accuracy (68–80.5%) [9–12].
The field of COPD predictors is steadily improving, and a variety of approaches were described in recent years. One of the early approaches employed heart rate variability (HRV) and Principal Component Analysis and Cluster Analysis (PCA-CA). This approach relied on real physiological data and cross-spectral analysis, specifically the coherence and partial coherence between heart rate, blood pressure and respiratory signals, and it was successful in differentiating between normal (8) and COPD (47) subjects. Moreover, the authors were able to classify exacerbation severity with greater than 88% accuracy [13]. The appropriate selection of features is crucial as increasing the number of features may lead to less meaningful clustering and poorer severity stratification of the patients. Feature space can hence be extremely rich, especially when a larger number of modalities are employed. Merone et al. [14] present an approach to overcoming the limitations of unsupervised feature selection, including the PCA based on variance. 150 COPD patients were enrolled and stratified into COPD stages 1 to 4 based on FEV1 (Forced expiratory volume in the first second) with cut-off values at 80, 50 and 30% of predicted values, respectively. This stratification relies only on FEV1, however, the dataset created by the researchers included 20 clinical, 28 sensor and 6 image-based features. To identify distinct groups of COPD patients, simultaneous feature selection and clustering (SFC) algorithm selecting the most informative features was applied. The SFC algorithm was based on the Expectation Maximization (EM) algorithm, which estimates the salience of each feature. The minimum message length (MML) penalty was computed to determine the number of clusters. The algorithm simultaneously selected the best discriminative features and performed Gaussian mixture clustering. The study was able to identify five distinct COPD phenotypes based on clinical and physiological characteristics. 
These phenotypes were distinct from the GOLD 2023 COPD stages, which may have implications for personalized medicine. Another decision support system to aid COPD diagnosis relies on a supervised Random Forest classifier [15] that employs two distinct feature selection mechanisms, namely the Correlation-based Feature Subset Selection algorithm and the GainRatioAttributeEval algorithm. The first algorithm evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them, while the latter calculates the worth of an attribute by measuring the gain ratio with respect to the class. This is especially advantageous considering the number of data sources: the data are extracted from six types of sensors and two database types, producing 24 distinct features. However, even larger feature spaces occur. A study conducted in Korea [16] worked with 144 features that were subsequently reduced to 71 features via Recursive Feature Elimination with Cross-Validation (RFECV). The features were further reduced following the recommendation of clinicians who were experts in the field; this variant was denoted FRDR (features reduction with doctor recommendation). With RFECV, the SVM (Support Vector Machine) classifier achieved an accuracy of 96%, whereas with FRDR the accuracy was 90%. These results suggest the described method may be equal or superior to expert opinion. One of the most useful indicators would be the prediction of readmission in COPD patients, which could facilitate early adequate treatment, possibly improving clinical outcomes as well as reducing associated healthcare costs. A comparison of three different classification models was published in 2022, assessing Artificial Neural Networks (ANNs), Decision Trees (DTs) and Support Vector Machines (SVMs).
The proposed methodology was more reliable in predicting readmission within one month from hospital discharge, rather than for three months and longer.
All the aforementioned works have two important characteristics in common: the need for a large amount of data and the use of supervised learning. The data requirement increases the complexity of the problem, either by increasing the necessary data acquisition time or by necessitating the simultaneous use of multiple devices. In order to apply supervised learning, the data need to be labeled, which significantly contributes to the complexity of the experiment.
In the present study, we offer a novel perspective on data analysis for COPD patients by applying a comprehensive and integrative approach. We performed detailed monitoring of these patients using a variety of data sources, including home spirometer readings, blood pressure monitoring, night oximetry, questionnaires, blood tests, and air pollution data. This multifaceted dataset comprised 58 features from 17 COPD individuals.
Our study has two primary objectives: (i) to detect when a patient experiences an exacerbation and (ii) to identify clusters of patients based on disease severity. To achieve the first objective, we developed a hybrid system that integrates multiple established machine learning algorithms: Self-Organizing Maps (SOM), Support Vector Machine (SVM), Isolation Forest (IF), and DBSCAN. While these algorithms are well-known individually, their combined application to the specific problem of COPD exacerbation prediction is unprecedented in the literature. This integrative approach leverages the capabilities and strengths of several machine learning algorithms to address the complex problem of predicting flare-ups, providing robust and accurate prediction. For the second objective, we employed Principal Component Analysis (PCA) followed by KMeans clustering to categorize patients based on the severity of their condition. The interdisciplinary nature of our work bridges the gap between artificial intelligence and medical research, offering practical insights and potentially transformative applications in healthcare.
Problem Statement
As described in the introduction, machine learning and artificial intelligence are powerful techniques that can help the experts to predict when an exacerbation might occur, as well as to determine the most relevant features that can be used to better understand a patient’s health status. Since the data used to train ML models is a fundamental component, we designed an architecture that can perform the entire ETL (extract, transform and load) process. The pipeline for data gathering, transmission, storage and analysis used in our experiment consisted of several components. Figure 1 shows an overview of our experimental framework, describing the network of connections and dependencies between the components. These components span a wide spectrum, encompassing diverse devices, database systems, message-queuing software and a preprocessing pipeline, which will be explained throughout this section.
Fig. 1.

Schema of the architecture designed for data ingestion and processing
The first component (A) comprises several medical and smart devices used by the patients that collect relevant variables for the machine learning module and for the care providers. Special attention was given to the easy and intuitive usage of these devices to ensure reliable data collection, especially in elderly patients. The specific details of the relevant data collected will be explained in the datasets section.
The front-end mHealth application that was provided to COPD care receivers is called Medimonitor. Medimonitor application (B) facilitates the collection of health-related data. All the data and measurements are collected by the devices that care receivers use in their home environment. Medimonitor provides its users with an overview of their daily health status, tasks and treatment plan assigned by the care giver, medication administration and requests, notifications, questionnaires, and it also offers the possibility to communicate with the care givers and to manage medical appointments through videocalls.
Unobtrusive Bluetooth connectivity facilitates the transfer of data from the medical devices to the patient’s mobile device (tablet). The patient can enrich these measurements with contextual text notes, providing a comprehensive understanding of their health status. Upon submission, the data is securely routed to a dedicated API endpoint, which deposits it in the organization’s MS SQL database on the FNOL (University Hospital Olomouc) server.
To ensure efficient and consistent data exchange, a C# application, the FHIR Data Sender, operates on the FNOL server (C). Following the architecture diagram presented in Fig. 1, once the data is stored on the FNOL server, it needs to be stored in tables so that it can be accessed not only by FNOL or TREE but also by the other partners involved; this is the reason for the Data Lake (E).
The Data Lake is a centralized repository that allows the storage of both structured and unstructured data at any scale. It includes a series of components such as the ingestion layer, the processing layer, and the Data Analytics layer (F). Once the data ingestion and pre-processing are complete, the data is prepared for use by machine learning algorithms that enable the extraction of valuable knowledge to support the decision making by domain experts.
The RabbitMQ (D) message broker was used for the data exchange between the two main components and the additional parties involved in the SHAPES project.
This perpetual process of data collection, preprocessing, transmission, sharing, processing and transmission of results ensures a continuous unobstructed flow of data from the Bluetooth-connected medical devices to the centralized database and to the hands of healthcare providers, empowering them to make informed decisions which may ultimately improve clinical outcomes for the patients.
Materials and methods
This section outlines the methodology followed in this study, including feature selection, preprocessing and design of the proposed algorithms. Figure 2 illustrates the overall lifecycle of our work. The flowchart describes the different stages followed during data gathering, pre-processing and analysis. Firstly, the datasets were downloaded from two main databases: the FNOL database, obtained by applying the procedure described in the previous section, and the database of the Czech Hydrometeorological Institute, which contains, among others, historical pollution data registered by each meteorological station in the Czech Republic. Secondly, both datasets were pre-processed and merged into a form usable by machine learning algorithms. Thirdly, the proposed clustering ensemble method was employed to find out which care receivers might suffer an exacerbation and when. Next, a hyperparameter optimization procedure was performed to find the best parameters for each of the SOM, DBSCAN, Isolation Forest and SVM algorithms. Finally, the exacerbation results were obtained and further analysed by experts.
Fig. 2.
General scheme describing the overall procedure followed in the study
Datasets
In this study, we used data stored in the Data Lake coming from two different sources. The first one, named “FNOL Database” in Fig. 2, refers to all data gathered by the FNOL institution (A) and is described below. Secondly, we included air-pollution data recorded in the district of primary residence of each participant.
The air-pollution data were obtained from the Czech Hydrometeorological Institute (B). The parameters of interest were the concentrations of particulate matter with diameters of 2.5 and 10 micrometers (PM2.5 and PM10, respectively), and the concentration of the chemical compounds NO2, SO2 and O3 from eight meteorological stations designated: BMOCA, MOLSA, BMOKA, MOLJA, MPRRA, MSMSA, MVBYA, ZSNVA. A total of 984 observations were obtained, spanning from 1 January 2022 to 31 January 2023. To make sense of these data and to extract useful insights, we have calculated the Air Quality Index (AQI) [17] which allowed for the stratification of the data samples into categories with decreasing air quality. More specifically, the index corresponds to the worst value for each of the five pollutants and it can take values from 0 (good air quality) to 5 (extremely poor air quality). See the pre-processing subsection for more detailed description of the data operations performed.
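The "worst pollutant wins" rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the breakpoints are illustrative values modelled on the European Air Quality Index bands (in µg/m³), which should be checked against reference [17] before any real use.

```python
from bisect import bisect_left

# Upper bounds (in ug/m3) of index levels 0..4; anything above the last
# bound falls into level 5 (extremely poor).
# ASSUMPTION: illustrative breakpoints, to be verified against reference [17].
BREAKPOINTS = {
    "PM2.5": [10, 20, 25, 50, 75],
    "PM10": [20, 40, 50, 100, 150],
    "NO2": [40, 90, 120, 230, 340],
    "O3": [50, 100, 130, 240, 380],
    "SO2": [100, 200, 350, 500, 750],
}


def pollutant_level(pollutant, concentration):
    """Map one concentration to a level from 0 (good) to 5 (extremely poor)."""
    return bisect_left(BREAKPOINTS[pollutant], concentration)


def air_quality_index(concentrations):
    """Return the worst (highest) level across the measured pollutants."""
    return max(pollutant_level(p, c) for p, c in concentrations.items())


# Hypothetical daily values for one station.
day = {"PM2.5": 8, "PM10": 45, "NO2": 30, "O3": 60, "SO2": 5}
print(air_quality_index(day))
```

In this made-up example PM10 is the limiting pollutant, so the station-day is assigned the PM10 level even though the other four pollutants are at lower levels.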
Finally, with regard to the FNOL database, the data are organized in four main groups:
- Vital signs. This group of data contains information about:
  - Systolic and diastolic blood pressure (SBP, DBP), measured once every day.
  - Heart rate (HR) and blood oxygen saturation, registered continuously during sleep each day.
- Inhaler. The device monitors the time scheduled for using the inhaler. From this device, we received data on two types of inhalations:
  - Regular inhalations. The patient used the inhaler when it was planned. The device records the time delay in seconds (positive values indicate that the inhaler was used later than planned, and vice versa) and the number of inhalations administered versus the planned number of inhalations.
  - Non-regular inhalations. The patient used the inhaler outside the schedule, which could indicate breathing problems. The device records the time and the number of inhalations.
- Qualitative data. This group contains both clinical data and the subjective health status of each participant:
  - Baseline data. The clinical data of the patients collected at the hospital (age, gender, smoking status, weight, height, the date of first diagnosis, heart rhythm, supplemental oxygen and diabetes mellitus).
  - Questionnaires. Once daily throughout the pilot study, the patients filled in the COPD Assessment Test (CAT) questionnaire [18] to gauge subjective symptoms and estimated their level of breathlessness on the mMRC (modified Medical Research Council) scale. The CAT questionnaire assesses the intensity of COPD symptoms (cough, sputum production, chest tightness, exercise tolerance, sleep quality and energy), while the mMRC scale gauges subjective dyspnea.
- Spirometer. The data obtained from the portable spirometer provided a set of parameters describing the quality of breathing. Spirometry plays a pivotal role in the diagnosis of respiratory diseases. Although the spirometry data involve a total of 30 features, only the most relevant ones are mentioned for the sake of clarity.
Study group and ethics
All patients were recruited at the Department of Respiratory Diseases and Tuberculosis of the University hospital Olomouc. Initially, 25 patients were recruited (22 males and 3 females) as required by the project plan. The data from 8 participants were incomplete due to poor adherence to the study protocol and were thus excluded from further analyses. Age structure and GOLD staging of the included participants is shown in Table 1.
Table 1.
Study group age description
| Gender | GOLD | No. of samples | Mean age (years) | SD | Min | Max |
|---|---|---|---|---|---|---|
| F | 1 | 1 | 76.00 | NaN | 76 | 76 |
| M | 1 | 1 | 74.00 | NaN | 74 | 74 |
| M | 2 | 12 | 67.83 | 5.006 | 60 | 76 |
| M | 3 | 4 | 68.00 | 6.683 | 60 | 74 |
All participants signed informed consent prior to inclusion in the study. The study was conducted in accordance with the ethical principles and approved by the local ethics committee under Ref. no. 50/21 and registered on ClinicalTrials.gov with ID NCT05269043.
Preprocessing
The missing values were handled slightly differently for each data source. For the FNOL database, the missing values were treated as follows: (I) the number of missing values was calculated for each feature, and features with more than 70% of missing data were removed; (II) for the remaining features, missing values were imputed using linear regression. To handle missing values in the pollution dataset, we took the geographical location of the station expected to provide the data point and searched for other stations within a 70 km radius that could supply the missing values. For each nearby station, we then calculated the RMSE between the historical data of our station and those of the nearby station, and we selected the nearby station whose historical data were closest to those of our station of interest (lowest RMSE) to fill the missing values. Once the datasets were clean, a feature engineering procedure was performed with the aim of creating new and hidden features that could be of interest to our problem.
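The donor-station step for the pollution data can be illustrated as follows. This is a minimal sketch under stated assumptions: the helper names and station codes are hypothetical, the candidate stations are assumed to have already been restricted to the 70 km radius, and all series are assumed to be aligned day by day with at least one overlapping observation.

```python
import numpy as np


def rmse(a, b):
    """Root-mean-square error over the days on which both series have a value."""
    both = ~np.isnan(a) & ~np.isnan(b)
    return float(np.sqrt(np.mean((a[both] - b[both]) ** 2)))


def fill_from_best_neighbour(target, neighbours):
    """Fill NaNs in `target` with values from the lowest-RMSE neighbour.

    `neighbours` maps a station code to a series aligned with `target`.
    """
    best = min(neighbours, key=lambda code: rmse(target, neighbours[code]))
    filled = np.where(np.isnan(target), neighbours[best], target)
    return best, filled


# Hypothetical PM10 series; the station codes are placeholders.
target = np.array([20.0, 22.0, np.nan, 25.0, np.nan])
neighbours = {
    "STATION_A": np.array([21.0, 23.0, 24.0, 26.0, 27.0]),
    "STATION_B": np.array([35.0, 10.0, 50.0, 40.0, 12.0]),
}
best, filled = fill_from_best_neighbour(target, neighbours)
print(best, filled)
```

Only the gaps are replaced; observed values of the station of interest are left untouched.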
Most measurements were acquired at a temporal resolution of one measurement per day, with the exception of oxygen saturation and heart rate, which were measured once per second during the night. Several metrics were selected to summarize this information at an hourly resolution, because large gaps of missing data made it impossible to aggregate all the data into a single data point. The aggregation metrics were mean, standard deviation, min, max, multiscale entropy (MSE) [19] and Higuchi Fractal Dimension (HFD) [20]. The latter two are used to measure the amount of information in the time series. The multiscale entropy is computed as follows:
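$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \qquad 1 \le j \le \lfloor N/\tau \rfloor \tag{1}$$
where $\tau$ represents the time scale and $y^{(\tau)}$ is the new time series generated from the original series $x_1, \ldots, x_N$. The Sample Entropy is then computed for this new time series, and the process is repeated at several time scales. For scale one, the new series simply corresponds to the original time series. Finally, the Higuchi Fractal method is implemented as follows. The original signal is discretized in the form of a time series $X(1), X(2), \ldots, X(N)$, from which a new time series $X_k^m$ is derived, defined as: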
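$$X_k^m:\; X(m),\, X(m+k),\, X(m+2k),\, \ldots,\, X\!\left(m + \left\lfloor \frac{N-m}{k} \right\rfloor k\right) \tag{2}$$
where $N$ is the length of the original series, $m = 1, 2, \ldots, k$ corresponds to the start time and $k$ is the time interval, with $k = 1, \ldots, k_{max}$, where $k_{max}$ is a free tuning parameter. As a result, each time interval $k$ yields $k$ sets of new time series. The resulting curves have lengths given by: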
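$$L_m(k) = \frac{1}{k} \left[ \left( \sum_{i=1}^{\lfloor (N-m)/k \rfloor} \left| X(m+ik) - X(m+(i-1)k) \right| \right) \frac{N-1}{\lfloor (N-m)/k \rfloor \, k} \right] \tag{3}$$
For the time interval $k$, the length of the curve $L(k)$ is defined as the average of $L_m(k)$ over the $k$ sets. The Higuchi Fractal Dimension is then obtained as the slope of the linear function that best fits the points $(\ln(1/k), \ln L(k))$ for $k = 1, \ldots, k_{max}$.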
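The two complexity measures described above can be sketched with NumPy. The code follows the textbook definitions of coarse-graining, Sample Entropy and Higuchi's curve length rather than the authors' implementation. The function names, the defaults (m = 2, r = 0.2, k_max = 8) and the choice to fix the tolerance from the original series are assumptions, and the pairwise-distance step uses memory quadratic in the series length, so it suits short series only.

```python
import numpy as np


def coarse_grain(x, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    x = np.asarray(x, dtype=float)
    n = len(x) // scale
    return x[: n * scale].reshape(n, scale).mean(axis=1)


def sample_entropy(x, m, tol):
    """SampEn = -ln(A/B) for template lengths m+1 (A) and m (B)."""
    x = np.asarray(x, dtype=float)

    def matching_pairs(length):
        # Same number of templates (N - m) for both template lengths.
        t = np.array([x[i : i + length] for i in range(len(x) - m)])
        dist = np.abs(t[:, None, :] - t[None, :, :]).max(axis=2)  # Chebyshev
        return (np.sum(dist <= tol) - len(t)) / 2  # drop self-matches

    b, a = matching_pairs(m), matching_pairs(m + 1)
    return float(-np.log(a / b)) if a > 0 and b > 0 else float("inf")


def multiscale_entropy(x, max_scale=5, m=2, r=0.2):
    """Sample Entropy of the coarse-grained series at scales 1..max_scale."""
    tol = r * np.std(x)  # tolerance fixed from the original series
    return [sample_entropy(coarse_grain(x, s), m, tol)
            for s in range(1, max_scale + 1)]


def higuchi_fd(x, k_max=8):
    """Slope of ln L(k) against ln(1/k); assumes a non-constant series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):           # start times m = 1..k
            idx = np.arange(m - 1, n, k)    # X(m), X(m+k), ... (0-based)
            steps = len(idx) - 1            # floor((N - m) / k)
            if steps < 1:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            lengths.append(dist * (n - 1) / (steps * k) / k)
        log_inv_k.append(np.log(1.0 / k))
        log_len.append(np.log(np.mean(lengths)))
    slope, _ = np.polyfit(log_inv_k, log_len, 1)
    return float(slope)


print(coarse_grain([1, 2, 3, 4, 5, 6], 2))
print(higuchi_fd(np.arange(100.0)))
```

A straight line is a useful sanity check for the fractal dimension, since its curve length scales exactly as 1/k and the fitted slope should therefore be 1.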
In addition, for oxygen saturation some additional measures were computed: time (in minutes) under 90% saturation, time (in minutes) with desaturation greater than 4% and the oxygen desaturation index [21].
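As an illustration of the simplest of these oximetry measures, the time spent below 90% saturation can be derived directly from the sample count. This is a minimal sketch assuming one sample per second, as stated for the night recordings; the function name is hypothetical, missing samples are encoded as NaN, and the desaturation-based measures are not covered because they depend on the baseline definition given in [21].

```python
import numpy as np


def minutes_below(spo2, threshold=90.0, sampling_hz=1.0):
    """Minutes spent with SpO2 strictly below `threshold`; NaN gaps are ignored."""
    spo2 = np.asarray(spo2, dtype=float)
    below = np.count_nonzero(spo2 < threshold)  # NaN < threshold is False
    return below / sampling_hz / 60.0


# Hypothetical recording: 10 min at 95%, 2 min at 88%, then a 30 s gap.
night = np.concatenate([np.full(600, 95.0), np.full(120, 88.0), np.full(30, np.nan)])
print(minutes_below(night))
```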
In relation to the spirometer values some additional indexes were computed to reduce the number of features to be used in the model. The indexes calculated are global (or central) concavity [22],
| 4 |
Tiffeneau Index (TI):
$$\mathrm{TI} = \frac{\mathrm{FEV}_1}{\mathrm{FVC}} \times 100\,\% \tag{5}$$
and beta angle [19]
| 6 |
where PEFR is the peak expiratory flow rate and FEF50 is the forced expiratory flow at 50% of forced vital capacity (FVC). These metrics can be used to summarize the health of the lungs. For the pollution dataset, we summarised the pollutant features into a single feature, the Air Quality Index, that better describes the pollution state. To do this, we took the following steps: for each station and day, we selected the five pollutants and mapped each of them to an index level according to the concentration thresholds given in the European Union reference guide [17]. The pollutant with the worst (highest) level defined the air quality for that specific day and station. After that, we merged the two datasets into one. This operation is possible because each patient record in the FNOL dataset carries a unique code that refers to the station code of the patient's location. At this point, our dataset contained a total of 58 features (57 coming from FNOL and 1 from the pollution dataset). The final step of the data preparation was the selection of the most significant features. The Pearson correlation criterion was applied to discard features that can be explained by other features. This was the case for the data coming from questionnaires: the questionnaire features show a high correlation, as depicted in Fig. 3, so a representative subset of the features could be selected.
Fig. 3.

Pearson correlation matrix from questionnaires data
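One simple way to apply a Pearson correlation criterion is to keep a feature only if it is not strongly correlated with any feature already kept. The sketch below is an illustration under stated assumptions: the function name, the greedy order-dependent strategy and the 0.9 threshold are hypothetical choices, since the text does not state the cut-off used, and the questionnaire scores are made up.

```python
import numpy as np


def drop_correlated(data, names, threshold=0.9):
    """Greedily keep features whose |Pearson r| with every kept one is <= threshold."""
    corr = np.abs(np.corrcoef(data, rowvar=False))
    kept = []
    for j in range(len(names)):
        if all(corr[j, i] <= threshold for i in kept):
            kept.append(j)
    return [names[j] for j in kept]


# Hypothetical scores: the second column is an exact multiple of the first.
names = ["Cough", "Chesttightness", "Activities"]
data = np.array([
    [1, 2, 3],
    [2, 4, 1],
    [3, 6, 4],
    [4, 8, 0],
    [5, 10, 2],
], dtype=float)
print(drop_correlated(data, names))
```

Because the procedure is greedy, the result depends on the column order; in practice the retained representative could instead be chosen on clinical grounds.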
After performing the feature selection step, the number of features was reduced from 58 to 11: (i) beta angle, (ii) Tiffeneau index and (iii) global concavity from spirometry flow-volume curve, (iv) Air Quality Index (AQI) from pollution dataset, (v) Activities, (vi) Chesttightness, (vii) Cough and (viii) DyspneaScale from the categorical patient tests and (ix) smoking status, (x) Height and (xi) Weight from clinical data. Finally, a 2D scatterplot was constructed to compare the visual representation of the data using all features (Fig. 4) and the subset of selected features (Fig. 5). Note that PCA was used to reduce the dimensionality to two components to facilitate this representation.
Fig. 4.

Dimensionality reduction to two components using PCA over the whole dataset
Fig. 5.

Dimensionality reduction to two components using PCA over the selected subset of features
Proposed clustering ensemble approach
As mentioned previously, we are faced with an unsupervised learning problem where the objective is to find, among a set of COPD patients, those who have suffered one or more exacerbations during the pilot phase. To better understand our proposed algorithm, we suggest that the reader view the problem from a geometric point of view. Let us assume that the dataset can be represented in an N-dimensional space.
In this sense, each point in the space represents the health status of a given patient on a given date. Our hypothesis is based on the fact that the set of points representing the health status of each patient will form a set of clusters in the N–D space, where groups of similar health states can define clusters that are considered as normal. We can then hypothesize that the points that do not belong to any cluster represent anomalous behaviours or events. The identification of these anomalous events is of great interest as these may represent an acute exacerbation, prompting the medical experts to inspect the health status of the patient as recorded in a set of selected features. A detailed explanation of how the proposed algorithm works is given below:
During the design of the proposed algorithm, we conducted extensive experiments using various clustering algorithms, such as Self-Organizing Maps (SOM), Support Vector Machine (SVM), Isolation Forest (IF), and DBSCAN, with different parameter configurations. While an in-depth discussion of these algorithms is beyond the scope of this paper, interested readers can refer to the foundational works of [23–25] and [26] for more comprehensive details.
To accurately predict patient exacerbations, we developed a hybrid system that integrates multiple clustering algorithms. The system leverages the strengths of SOM, SVM, DBSCAN, and IF to enhance robustness and prediction accuracy. The process is as follows:
- Filtering Step: SOM is applied to the entire dataset to identify patients at potential risk of exacerbation. This step reduces the general dataset to a focused subset of patients.
- Clustering Step: The filtered patient data is then used to train the SVM, DBSCAN, and IF algorithms. Each algorithm independently analyzes the data to predict potential exacerbation dates.
- Ensemble Step: The results from the clustering algorithms are combined. A data point is considered an anomaly (potential exacerbation) if it is identified as such by at least three of the four algorithms (SOM, SVM, DBSCAN, and IF).
The following pseudocode outlines the ensemble approach:
Algorithm 1.
Ensemble Clustering Algorithm for Predicting Patient Exacerbations
The outline of the designed system is illustrated in Fig. 2 and Algorithm 1. Initially, SOM is applied to the entire dataset to identify patients at risk. The data for these patients is then processed by SVM, DBSCAN, and IF to predict specific exacerbation dates. Finally, the ensemble step consolidates the predictions, ensuring a data point is flagged as an anomaly only if it is consistently identified by at least three of the four algorithms.
This approach ensures that the system benefits from the combined strengths of each clustering algorithm, resulting in a robust and accurate method for predicting patient exacerbations.
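The ensemble step lends itself to a compact illustration. The sketch below covers only the voting rule, assuming that each of the four algorithms has already produced one boolean anomaly flag per observation. The function name is hypothetical, and reporting the share of agreeing algorithms as a percentage is an assumption about how a confidence value could be derived, not a description of the authors' code.

```python
import numpy as np


def ensemble_vote(flags, min_votes=3):
    """Combine per-detector anomaly flags by voting.

    `flags` maps a detector name to a boolean sequence with one entry per
    observation (True = flagged as anomalous by that detector).
    Returns the anomaly mask and the share of agreeing detectors in percent.
    """
    votes = np.sum([np.asarray(f, dtype=bool) for f in flags.values()], axis=0)
    confidence = 100.0 * votes / len(flags)
    return votes >= min_votes, confidence


# Hypothetical flags for four observations of one patient.
flags = {
    "SOM":    [False, True, True, False],
    "SVM":    [False, True, True, True],
    "DBSCAN": [False, True, False, False],
    "IF":     [False, True, True, False],
}
is_anomaly, confidence = ensemble_vote(flags)
print(is_anomaly, confidence)
```

With four detectors and a three-vote minimum, a flagged observation can only carry a value of 75 or 100 under this reading.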
Experiments
Data from 17 COPD patients were used in the experiment. Patients who suffered a COPD exacerbation during the pilot were identified by the clinician. However, only exacerbations that led the patient to seek medical attention could be registered; it is therefore possible that some patients suffered a flare-up that went unnoticed. In fact, only the patient with id #medimonitor_patient_71 registered flare-ups, two in total, on 2022/12/19 and 2023/01/13. This limits the reliability of the clinician-assigned labels. In this preliminary work, we had to deal with this incomplete or missing data. Furthermore, in addition to the preprocessing step described in Section 3, we note that not all patients registered the same amount of data. Because including patients with a large proportion of missing data could skew the results, we opted to include only the patients who had registered at least 60 samples. Finally, after applying the preprocessing steps described above to this subset, the data was prepared for ingestion by the proposed ensemble clustering model. The results of the experiment are shown in Table 2, where the user id is shown in the first column, the suggested exacerbation date found by the ensemble algorithm in the second column, and the confidence level of the detection in the last column. Considering these results, three patients with unreported exacerbations were detected, along with one patient who sought medical attention a few days after the detection of an anomaly. These samples were identified as anomalies, and then as possible flare-ups, due to errors in some of the registered features. For example, in the results for patient #medimonitor_patient_75 (Table 2 and Fig. 6), the anomaly was identified from the behaviour of the Tiffeneau index and was linked to a false reading rather than an exacerbation.
In the case of patient #medimonitor_patient_71, both detections were linked to the first registered exacerbation; the second one could not be detected because of a gap in the data of almost two weeks prior to the flare-up. Figure 7 presents the time series of the features used in the ML model, highlighting the detected anomalies. False positives presented as large peaks in the registered values and were linked to false readings rather than exacerbations.
Table 2.
Results of the machine learning model
| User id | Date | Exacerbation confidence (%) |
|---|---|---|
| #medimonitor_patient_76 | 2022-10-28 | 100 |
| #medimonitor_patient_75 | 2022-11-16 | 100 |
| #medimonitor_patient_71 | 2022-12-12 | 100 |
| #medimonitor_patient_71 | 2022-12-15 | 75 |
| #medimonitor_patient_85 | 2022-12-20 | 75 |
Fig. 6.
Time series data of Tiffeneau index for patient #medimonitor_patient_75. The suggested exacerbation is shown as a red dashed line
Fig. 7.
Time series data of each feature used to train the clustering ensemble algorithm for patient #medimonitor_patient_71. Exacerbations found by the algorithm are shown as red dashed lines
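The confidence values in Table 2 (75 and 100%) would be consistent with a vote among four detectors, although this section does not state how the authors computed them. The following is a minimal sketch under that assumption, together with the inclusion rule of at least 60 samples per patient. The function names, the detector keys and the example flags are hypothetical and serve only as an illustration.

```python
from typing import Dict, List

MIN_SAMPLES = 60  # inclusion threshold stated in the text


def eligible_patients(samples_per_patient: Dict[str, int],
                      min_samples: int = MIN_SAMPLES) -> List[str]:
    """Keep only patients who registered at least `min_samples` samples."""
    return sorted(pid for pid, n in samples_per_patient.items() if n >= min_samples)


def vote_confidence(flags: Dict[str, bool]) -> float:
    """Percentage of detectors flagging a sample as anomalous.

    ASSUMPTION: a simple unweighted vote; the paper's actual rule may differ.
    """
    if not flags:
        raise ValueError("at least one detector is required")
    return 100.0 * sum(flags.values()) / len(flags)


# Hypothetical detector outputs for a single sample (not study data).
sample_flags = {"som": True, "dbscan": True, "isolation_forest": True, "svm": False}
confidence = vote_confidence(sample_flags)  # three of four detectors agree
```

Under this reading, a sample flagged by all four detectors would receive 100% and a sample flagged by three of them 75%.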
In conclusion, the model detected the clinically registered COPD exacerbation a few days before the patient sought medical attention; the detected anomalies could therefore serve as a notification to the patient, prompting them to check with their physician. In addition, the model could be used to detect faulty readings when an anomaly is detected in the absence of clinical symptoms.
Throughout the first part of the experiment, similarities in the time series were noticed between different patients. To investigate further, we examined whether all the users could be classified into clusters, in order to analyze the patients as groups with similar observed properties, especially the status of and trends in their health condition. Unlike in the first part of this experiment, we now consider the data from all users. More specifically, we first applied Principal Component Analysis to the set of 11 features, reducing them to 3. Then, the K-Means algorithm [27] was applied with three clusters, a number selected based on the analysis of the data. This clustering aimed at creating a custom classification of the patients rather than at identifying anomalies. The color-coded representation of the resulting clusters is presented in Fig. 8.
Fig. 8.
Clustering results: cluster 0 (red), cluster 1 (green), cluster 2 (purple)
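The dimensionality reduction and clustering step described above can be sketched with scikit-learn as follows. This is an illustrative sketch rather than the authors' pipeline: the data are synthetic, the standardization step and all parameter values other than the 11 features, 3 components and 3 clusters are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for the real observations: 200 rows x 11 features.
X = rng.normal(size=(200, 11))

# ASSUMPTION: features are standardized first, since PCA is scale-sensitive.
X_scaled = StandardScaler().fit_transform(X)

# Reduce the 11 features to 3 principal components, as described in the text.
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_scaled)

# Assign every observation to one of three clusters. scikit-learn's default
# initialisation is k-means++, the seeding scheme of reference [27].
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)

print(X_reduced.shape, np.bincount(labels, minlength=3))
```

Because each row is a single observation, the labels produced here are per observation, not per patient.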
Importantly, as the clustering was performed for each observation individually, a single patient may have been assigned to different clusters. The cluster occurring most often for a given patient was considered that patient's true cluster and was used as a label. For a more detailed analysis of the behaviour of each cluster, Fig. 9 presents the time series of all the features calculated from spirometry measurements (Tiffeneau index, global concavity, and beta angle), while Fig. 10 presents those related to the CAT and mMRC questionnaires. The data in both figures are presented as mean and SD, separately for each cluster.
The Tiffeneau index, a ratio calculated as FEV1/FVC, is key for the diagnosis of diseases with bronchial obstruction such as COPD. In Cluster 0, the mean values of FEV1/FVC mostly fluctuate between 50 and 60% and the envelope of the standard deviation is rather constant. Clusters 1 and 2 both start with values near 80%; however, the standard deviation decreases over time in Cluster 1, whereas Cluster 2 shows the opposite trend. Mean values in Cluster 1 can be as low as 40%, while in Cluster 2 they do not fall below 60%.
Two parameters of curvilinearity were used to assess the spirometry data: global concavity and beta angle. To date, these experimental metrics are not routinely used in clinical practice. Considering first the global concavity, Cluster 0 fluctuated steadily around the value 0.5, while Clusters 1 and 2 showed significant fluctuation. The highest values of global concavity were observed in Cluster 0. The one patient who experienced a clinically confirmed COPD exacerbation was a member of Cluster 0.
Fig. 9.
Mean and standard deviation for spirometer features by cluster
Fig. 10.
Mean and standard deviation for questionnaire features by cluster
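The patient-level labelling rule described above (the most frequent cluster among a patient's observations) can be sketched as follows. The function name and the example data are hypothetical, and the tie-breaking behaviour is an assumption, since the text does not say how ties were resolved.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def majority_cluster(observations: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Map each patient id to the cluster assigned to most of their observations.

    ASSUMPTION: ties go to the cluster that was seen first for that patient,
    which is how Counter.most_common orders equal counts.
    """
    per_patient: Dict[str, Counter] = {}
    for patient_id, cluster in observations:
        per_patient.setdefault(patient_id, Counter())[cluster] += 1
    return {pid: counts.most_common(1)[0][0] for pid, counts in per_patient.items()}


# Hypothetical per-observation cluster assignments (not study data).
example = [("p1", 0), ("p1", 0), ("p1", 2), ("p2", 1), ("p2", 1), ("p2", 1)]
patient_labels = majority_cluster(example)
```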
Secondly, considering the beta angle, most of the values in cluster 1 were above . Mean values fluctuating around were characteristic of cluster 2. Conversely, the lowest values of beta angle were observed in Cluster 0. The mathematical description of these curvilinearity metrics is provided in the Materials and Methods section. The analysis suggests that the global concavity and beta angle metrics might provide additional insight alongside the routinely used Tiffeneau index.
Besides the objective metrics derived from the spirometry measurements, subjective responses gathered with the CAT and mMRC questionnaires were also analyzed. Cluster 1 reported the most severe symptoms, with little fluctuation of the mean value throughout the pilot. Cluster 2 was characterized by a decrease in reported values, signifying an improvement of clinical symptoms. In contrast, the symptoms in Cluster 0 worsened, as illustrated by increasing mean values. Similar trends were observed for chest tightness and cough. Notably, chest tightness deteriorated significantly in Cluster 0.
Responses to the mMRC scale, which is analogous to the New York Heart Association (NYHA) classification for heart failure, showed similar trends. Cluster 1 reported the most prominent symptoms, with little variation throughout the pilot. Cluster 2 was less symptomatic, with mMRC values of 0 and 1 being the most frequently reported, signifying symptoms mostly during heavy exercise or strenuous activity. In contrast, the patients in Cluster 0 initially reported a mean mMRC score of 0, with a gradual trend towards a score of 2 throughout the pilot.
Summarizing these observations, throughout the pilot the patients in Cluster 1 steadily displayed the most severe clinical characteristics, the patients in Cluster 2 showed slight improvement, and the patients in Cluster 0 started off with the least severe characteristics but experienced clinical decline. To check whether the acquired clusters provide new information, the correlation between the globally endorsed GOLD classification and the newly obtained clusters was estimated. No statistically significant correlation was found, as demonstrated by Kendall's B (Table 3).
Table 3.
Correlation coefficients between GOLD classification and proposed clusters
| | | Cluster |
|---|---|---|
| GOLD | Kendall's B | −0.057 |
| | p-value | 0.798 |
| | N | 18 |
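For readers unfamiliar with this statistic, the sketch below computes a rank correlation with a correction for ties, assuming that "Kendall's B" in Table 3 refers to Kendall's tau-b (an assumption, as the text does not spell it out). The function is a plain pairwise implementation written for illustration and the input values are hypothetical, not the study data.

```python
import math
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes to no term
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x_only)
                      * (concordant + discordant + ties_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom


# Hypothetical GOLD stages and cluster labels for four patients.
tau = kendall_tau_b([1, 1, 2, 2], [1, 2, 1, 2])
```

A value near zero, as in Table 3, indicates that the ordering of patients by one classification carries little information about their ordering by the other.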
Additionally, two multinomial logistic regression models were constructed to assess the relationship between laboratory results and, respectively, the established clusters (Model 1) and the GOLD classification (Model 2). The most relevant results were observed for blood hemoglobin and serum potassium levels, so both models compare the effect of hemoglobin and potassium on the respective classification. The results are presented in Table 4.
Table 4.
Comparison of multinomial logistic regression models based on the proposed clusters (Model 1) and on the GOLD classification (Model 2)
| Model | Deviance | AIC | BIC | McFadden's R² | Overall model test: χ² | Overall model test: df | Overall model test: p |
|---|---|---|---|---|---|---|---|
| 1 | 17.031 | 29.031 | 34.373 | 0.518 | 18.285 | 4 | 0.001 |
| 2 | 19.819 | 31.819 | 37.161 | 0.351 | 10.734 | 4 | 0.03 |
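The quantities in Table 4 are linked by standard definitions, which the short check below reproduces. It reads the 0.518/0.351 column as McFadden's R² and the 18.285/10.734 column as the χ² statistic of the overall model test. Two inputs are assumptions rather than values stated in the text: six estimated parameters per model (an intercept and two predictors for each of two non-reference outcome categories, which matches df = 4 and AIC minus deviance = 12) and N = 18 observations, taken from Table 3.

```python
import math

N_OBS = 18    # N from Table 3; ASSUMED to apply to both regression models
N_PARAMS = 6  # ASSUMPTION: 2 non-reference outcomes x (intercept + 2 predictors)


def fit_statistics(deviance: float, chi2: float,
                   n_params: int = N_PARAMS, n_obs: int = N_OBS) -> dict:
    """Recompute AIC, BIC and McFadden's R^2 from deviance and model chi-square."""
    null_deviance = deviance + chi2  # chi2 = null deviance - model deviance
    return {
        "aic": deviance + 2 * n_params,
        "bic": deviance + n_params * math.log(n_obs),
        "mcfadden_r2": 1.0 - deviance / null_deviance,
    }


model_1 = fit_statistics(deviance=17.031, chi2=18.285)  # proposed clusters
model_2 = fit_statistics(deviance=19.819, chi2=10.734)  # GOLD classification

for name, stats in (("Model 1", model_1), ("Model 2", model_2)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Under these assumptions the recomputed values agree with the tabulated AIC, BIC and R² to three decimal places.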
Overall, Model 1 (established clusters) showed a better fit to the data than Model 2 (GOLD). This was indicated by the lower deviance, Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC) values for Model 1. Additionally, Model 1 had a higher McFadden's R² value, which suggests that this model was able to explain more of the variation in the dependent variable. Nevertheless, it is important to note that the overall tests for both models are statistically significant, so both models could be useful for predicting the dependent variable. Significant differences in the concentrations of potassium and hemoglobin were observed between the individual clusters, as well as between the individual GOLD stages, despite the absence of a significant correlation between the two classification methods.
Discussion
Numerous prior studies have attempted to identify potential risk factors of AECOPD using clustering, big data analysis and AI algorithms. Adibi et al. [8] found that individuals with a previous history of acute exacerbation had a higher risk of further exacerbations (a mean of 1.82 exacerbations per year compared with a mean pooled exacerbation rate of 1.2 in the entire study population). Yoon et al. [8] used seven factors (age, body mass index, smoking status, history of asthma, COPD assessment test (CAT) score, post-bronchodilator FEV1 expressed as a percentage of the predicted value, and the diffusing capacity of the lungs for carbon monoxide (DLCO)) to cluster COPD patients and analyze the risk factors of AECOPD across the clusters. The study identified four main risk factors of acute COPD exacerbation: asthma-COPD overlap (ACO), COPD severity, a higher score in the St. George's Respiratory Questionnaire (i.e., worse overall quality of life) and gastroesophageal reflux.
The first studies aiming to predict AECOPD in real time began in the 2010s. Fernandez-Granero et al. [9] performed home-based remote monitoring using electronic questionnaire data. The study used machine-learning approaches to analyze data obtained over 6 months, with excellent results (100% accuracy for event-based prediction and 80.5% for symptom-based prediction). The study by Goto et al. [11] used demographic features, vital functions, and electronic medical records to predict AECOPD resulting in an examination in the emergency department. A more complex study by Peng et al. [12] used 28 features (vital functions, medical history, laboratory markers of inflammation) analyzed by tree-based machine learning to predict the prognosis of individuals admitted to the hospital due to AECOPD. Wu et al. [10] used machine learning approaches (Random Forest, Decision Tree, k-nearest neighbor, linear discriminant analysis, AdaBoost and deep learning) to analyze complex parameters and predict the risk of acute hospitalization in discharged individuals. The authors monitored daily activities and environmental information as predictors, reaching a sensitivity of 94.5% and a specificity of 92.5% in the analysis using all features.
The recent meta-analysis of Smith et al. [28] concludes that the certainty of the evidence supporting machine learning approaches to predict AECOPD or mortality due to COPD is very low and that none of the published studies presents a model relevant for clinical practice. Moreover, existing scoring systems have a strong correlation with COPD mortality and AECOPD. However, by comparison with other areas of medicine (radiology, pulmonary function tests, ECG, etc.), the development of clinically relevant tools for COPD patients is likely. We therefore argue that the present clustering approach is novel and may contribute to the advancement of this field.
Firstly, the clustering was assessed from the perspective of the Tiffeneau index. Chronic airflow limitation is defined by FEV1/FVC < 0.70, as recommended by the Global Initiative for Chronic Obstructive Lung Disease (GOLD). Different trends of mean values and standard deviations were observed in Clusters 1 and 2. In Cluster 0, however, the signal fluctuates mainly between 50 and 60% and does not exhibit any specific trend, though the mean values are always below the defined threshold for chronic airflow limitation, which is to be expected among COPD patients. It would be difficult, if not impossible, to predict an exacerbation from such a noisy signal.
Two additional spirometry-based descriptors were calculated. The concavity index is dimensionless and ranges from zero (no concavity) to a limiting value of 100, corresponding to maximum concavity. Negative values are also possible and indicate convexity. This was observed for the mean values in Cluster 1, which are well below zero. In Cluster 2, by contrast, the mean values approach zero: the mean is positive, but zero lies within the SD. Limits for normal concavity are estimated separately for males and females. Johns et al. [29] define an abnormal degree of concavity as greater than 34.8% for males and 26.3% for females. However, these values were devised for a Tasmanian population and are likely to differ across populations. Normal values therefore need to be established for any given population prior to clinical use. The degree of global concavity typically decreases following the administration of a bronchodilator. The concavity index might be useful in cases where other spirometry metrics fail, as shown in Smith et al. [28], where 16.6% of patients reported limitations during exercise caused by breathlessness despite normal values of routine spirometry indices. Of these individuals, 21% had abnormal concavity indices, while 90% of those without limitations had normal concavity, making the specificity of this approach high.
As greater concavity reflects increasing obstruction, it might be expected to predict an exacerbation. Though the observed values of global concavity were not abnormal as defined in the cited articles, the differences in mean values and trends among the established clusters are prominent. The exacerbating patient was a member of the cluster with the highest mean value of global concavity, although in this case the mean values were rather steady in comparison to the other clusters.
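The two thresholds quoted above can be expressed as simple checks. This sketch only encodes the cut-offs named in the text (FEV1/FVC < 0.70 and the sex-specific concavity limits from Johns et al. [29]); the function names are hypothetical, and, as noted above, the concavity limits would need to be re-established for any other population.

```python
FEV1_FVC_LIMIT = 0.70  # GOLD criterion for chronic airflow limitation
# Upper limits of normal concavity (%), Johns et al. [29]; population-specific.
CONCAVITY_LIMIT = {"male": 34.8, "female": 26.3}


def tiffeneau_index(fev1_l: float, fvc_l: float) -> float:
    """FEV1/FVC ratio, with both volumes given in litres."""
    if fvc_l <= 0:
        raise ValueError("FVC must be positive")
    return fev1_l / fvc_l


def has_airflow_limitation(fev1_l: float, fvc_l: float) -> bool:
    """True when FEV1/FVC falls below the 0.70 threshold."""
    return tiffeneau_index(fev1_l, fvc_l) < FEV1_FVC_LIMIT


def has_abnormal_concavity(concavity_pct: float, sex: str) -> bool:
    """True when the concavity index exceeds the sex-specific limit."""
    return concavity_pct > CONCAVITY_LIMIT[sex]


# Hypothetical measurement: FEV1 = 1.5 L, FVC = 3.0 L gives a ratio of 0.5.
limited = has_airflow_limitation(1.5, 3.0)
```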
Beta angle is lower than in patients with obstruction of large airway [30]. Patients in Cluster 0 consistently presented with values lower than . The patients in the remaining clusters exhibited an increasing trend during the course of the study. Most notably, the patients in Cluster 1 almost exclusively recorded values higher than .
These observations suggest that, although all the spirometry metrics are based on the same raw data, they provide different and complementary information, and that a combination of these parameters might improve the predictability of exacerbations. As the trends observed in spirometry-derived features were mirrored by the trends in self-reported symptoms, the latter may prove to be highly valuable in AECOPD prediction.
Besides the objective and subjective measures related directly to the respiratory system, various laboratory biomarkers were assessed at baseline, including blood hemoglobin and serum potassium levels. Previous studies have identified anemia as a common feature of patients with AECOPD and a risk factor for death after discharge from hospital. Kollert et al. [31] found that normal values of hemoglobin were associated with improved life expectancy in COPD patients with respiratory failure. Toft-Petersen et al. [32] reported a high incidence of anemia in exacerbated COPD patients, predicting higher long-term mortality in these patients. In the present study, Cluster 0 had a higher probability of lower hemoglobin levels than the other clusters.
Furthermore, serum electrolytes might also be useful correlates. Low levels of serum potassium were observed in COPD patients in comparison to healthy controls. Hypokalemia might be attributed to respiratory acidosis and metabolic alkalosis, as well as to long-term glucocorticoid therapy. Additionally, serum potassium was lower in patients who died than in those who survived (p = 0.009) [33]. A high prevalence of hypokalemia in patients with AECOPD was also observed by Lindner et al. [34], though no association with adverse outcome was found. In the present work, the probability of a low potassium level was linked to Cluster 0. However, it needs to be stressed that the patients in our cohort may have been using supplemental potassium or other medications affecting potassium levels. Hence it is difficult to draw any definitive conclusions, even though these observations are consistent with previous works.
The clusters presented here may have real-life clinical implications, as they address the different clinical behaviour of COPD in individual patients. This provides a new piece of information not implicitly included in the GOLD staging, which serves as an estimate of morbidity and mortality, or in the phenotyping, which considers the different underlying pathological mechanisms and the most prominent clinical features. In a sense, the GOLD stages and phenotypes may be considered descriptors of the disease "state", while the proposed clusters may reflect more on the disease "dynamic". However, the authors acknowledge that the small sample size and relatively short follow-up mean that the proposed clusters may be artificial. Additionally, different sets of inputs, different study populations and different techniques may result in different clusters with varying overlap.
The prediction of exacerbations would certainly make the treatment of COPD more efficient and would lead to improved clinical outcomes for patients due to early intervention. The authors argue that the proposed method of predicting exacerbations via anomaly detection in the time series of individuals' biological signals and supplementary information could be effective, as was demonstrated in the patient with a clinically verified COPD exacerbation. A larger cohort with longer follow-up would be necessary to verify this claim.
The question of which combination of biological signals (and potentially also ambient signals) would be the most effective remains unanswered. Additionally, it is challenging to estimate to what degree the overall poor adherence in our cohort could be attributed to the technical complexity of active self-measurement (spirometry) and the necessity of repeated self-assessment using questionnaires.
Study limitations
The main limitations of the presented study are the small sample size, the occurrence of only one physician-confirmed exacerbation during the pilot study, and the underrepresentation of females (n = 1). The overall poor adherence led to further exclusion of patients, limiting the available data. These constraints limit the statistical strength of our findings; consequently, this research should be regarded as a preliminary investigation. Despite these limitations, we believe this study provides an important proof of concept worthy of a larger-scale follow-up study or replication by other teams.
Another limitation was the deviation from the original plan of using PM2.5 sensors, caused by the global chip shortage in the aftermath of the COVID-19 pandemic. This is partially offset by the inclusion of publicly available outdoor air pollution data.
Conclusions and future work
This work presented a multi-modal approach to COPD exacerbation detection in a small cohort of 17 patients. The data used included biological signals measured by personal medical devices (including a portable spirometer), air pollution data from meteorological stations, laboratory findings and self-reported symptoms (CAT, mMRC). A novel approach to clustering was demonstrated by applying three separate clustering algorithms (DBSCAN, Isolation Forest and SVM) to the data filtered by SOM. This approach should improve the robustness of the proposed system design.
Despite the enrolment of only 17 patients, the authors were able to identify three distinct clusters: patients with a consistently severe clinical presentation, patients with a tendency towards clinical improvement, and patients with a tendency towards fast clinical deterioration. The only patient who experienced a clinically confirmed COPD exacerbation was a member of the rapidly deteriorating cluster, and the exacerbation was correctly identified by observing an anomaly in the respective time series. These results are not to be considered definitive, but rather a proof of concept. It is to be expected that different patient populations, different sample sizes and, perhaps most importantly, different combinations of biological and ambient signals and clinical parameters may yield different clusters with varying overlap. The authors argue that the three presented clusters may be clinically meaningful, identifying distinct types of disease dynamics. This may provide another important marker besides the GOLD stages and clinical phenotypes.
Perhaps more importantly, a pipeline was proposed for real-time estimation of the onset of COPD exacerbation via anomaly detection in the recorded time series of each individual patient. Although promising, the approach would need to be verified on a larger sample with a larger number of recorded clinically verified exacerbations.
The authors believe that the early detection of exacerbations could bring significant clinical benefits, while the ultimate aspiration should be the detection and prevention of the very first severe exacerbation, as this often marks the beginning of a spiral of rapid clinical deterioration. Important considerations for future work include the need for larger cohorts and longer follow-up, the use of a more robust combination of biological signals (potentially including pulse rate and its variability, pulse wave characteristics, daytime blood oxygen saturation, or the assessment of physical activity via wearable accelerometers), and the potential for measuring indoor air quality with particle analyzers. That being said, an important pitfall of the present study was the poor adherence of the patients to the study protocol. To mitigate this, future work will likely need to address the least invasive and least demanding means of acquiring the desired signals and clinical information, as well as the appropriate education and motivation of the patients.
Acknowledgements
The authors are especially indebted to all the participants who were open to trying new approaches to the daily monitoring of their health status with personal devices.
Funding
The research leading to these results received funding from the Horizon 2020 (H2020) Framework Programme of the European Union for Research and Innovation under Grant Agreement No 857159-SHAPES-H2020-SC1-FA-DTS-2018-2020.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Venkatesan P. GOLD COPD report: 2024 update. Lancet Respir Med. 2024;12(1):15–6. 10.1016/S2213-2600(23)00461-7.
- 2. He Y, Qian D, Diao J, Cho M, Silverman E, Gusev A, Manrai A, Martin A, Patel C. Prediction and stratification of longitudinal risk for chronic obstructive pulmonary disease across smoking behaviors. Nat Commun. 2023;14(1):8297. 10.1038/s41467-023-44047-8.
- 3. Lee S, Lee I, Kim S. Predicting development of chronic obstructive pulmonary disease and its risk factor analysis. In: Annu Int Conf IEEE Eng Med Biol Soc. 2023. 10.1109/EMBC40787.2023.10340286.
- 4. Mah J, Ritchie A, Finney L. Selected updates on chronic obstructive pulmonary disease. Curr Opin Pulm Med. 2023. 10.1097/MCP.0000000000001042.
- 5. Hurst J, Vestbo J, Anzueto A, et al. Susceptibility to exacerbation in chronic obstructive pulmonary disease. N Engl J Med. 2010;363(12):1128–38. 10.1056/NEJMoa0909883.
- 6. Langsetmo L, Platt R, Ernst P, Bourbeau J. Underreporting exacerbation of chronic obstructive pulmonary disease in a longitudinal cohort. Am J Respir Crit Care Med. 2008;177(4):396–401. 10.1164/rccm.200708-1290OC.
- 7. Zatloukal J, Brat K, Neumannova K, Volakova E, Hejduk K, Kocova E, Kudela O, Kopecky M, Plutinsky M, Koblizek V. Chronic obstructive pulmonary disease - diagnosis and management of stable disease; a personalized approach to care, using the treatable traits concept based on clinical phenotypes. Position paper of the Czech pneumological and phthisiological society. Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub. 2020;164(4):325–56. 10.5507/bp.2020.056.
- 8. Adibi A, Sin D, Safari A, Johnson K, Aaron S, FitzGerald J, Sadatsafavi M. The acute COPD exacerbation prediction tool (ACCEPT): a modelling study. Lancet Respir Med. 2020;8(10):1013–21. 10.1016/S2213-2600(19)30397-2.
- 9. Fernandez-Granero M, Sanchez-Morillo D, Lopez-Gordo M, Leon A. A machine learning approach to prediction of exacerbations of chronic obstructive pulmonary disease. In: Artificial computation in biology and medicine. Lecture notes in computer science, vol. 9107. 2015. p. 305–11. 10.1007/978-3-319-18914-7_32.
- 10. Wu Y, Lan C, Tzeng I, Wu C. The COPD-readmission (CORE) score: a novel prediction model for one-year chronic obstructive pulmonary disease readmissions. J Formos Med Assoc. 2021;120(3):1005–13. 10.1016/j.jfma.2020.08.043.
- 11. Goto T, Camargo C, Faridi M, Yun B, Hasegawa K. Machine learning approaches for predicting disposition of asthma and COPD exacerbations in the ED. Am J Emerg Med. 2018;36(9):1650–4. 10.1016/j.ajem.2018.06.062.
- 12. Peng J, Chen C, Zhou M, Xie X, Zhou Y, Luo C. A machine-learning approach to forecast aggravation risk in patients with acute exacerbation of chronic obstructive pulmonary disease with clinical indicators. Sci Rep. 2020;10(1):3118. 10.1038/s41598-020-60042-1.
- 13. Newandee DA, Reisman SS, Bartels AN, De Meersman RE. COPD severity classification using principal component and cluster analysis on HRV parameters. In: 2003 IEEE 29th annual proceedings of bioengineering conference. 2003. 10.1109/nebc.2003.1216028.
- 14. Merone M et al. Discovering COPD phenotyping via simultaneous feature selection and clustering. In: 2018 IEEE international conference on bioinformatics and biomedicine (BIBM). 2018. 10.1109/bibm.2018.8621443.
- 15. Bellos C, Papadopoulos A, Rosso R, Fotiadis DI. Categorization of patients’ health status in COPD disease using a wearable platform and random forests methodology. In: Proceedings of 2012 IEEE-EMBS international conference on biomedical and health informatics. 2012. 10.1109/bhi.2012.6211600.
- 16. Hussain A et al. Detection of different stages of COPD patients using machine learning techniques. In: 2021 23rd international conference on advanced communication technology (ICACT). 2021. 10.23919/icact51234.2021.9370958.
- 17. ECMWF projects: Copernicus training - CAMS. https://ecmwf-projects.github.io/copernicus-training-cams/proc-aq-index.html.
- 18. CatestOnline. https://www.catestonline.org/. Accessed 7 Aug 2024.
- 19. Costa M, Goldberger A, Peng C. Multiscale entropy analysis of biological signals. Phys Rev E. 2005;71(2 Pt 1):021906.
- 20. Higuchi T. Approach to an irregular time series on the basis of the fractal theory. Physica D. 1988;31(2):277–83. 10.1016/0167-2789(88)90081-4.
- 21. He S, Cistulli P, Chazal P. A review of novel oximetry parameters for the prediction of cardiovascular disease in obstructive sleep apnoea. Diagnostics. 2023;13(21):3323. 10.3390/diagnostics13213323.
- 22. Alowiwi H, Watson S, Jetmalani K, et al. Relationship between concavity of the flow-volume loop and small airway measures in smokers with normal spirometry. BMC Pulm Med. 2022;22(1):211. 10.1186/s12890-022-01998-w.
- 23. Kohonen T. The self-organizing map. Proc IEEE. 1990;78(9):1464–80. 10.1109/5.58325.
- 24. Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intell Syst Their Appl. 1998;13(4):18–28.
- 25. Liu FT, Ting KM, Zhou ZH. Isolation forest. In: 2008 Eighth IEEE international conference on data mining. IEEE; 2008. p. 413–22.
- 26. Schubert E, Sander J, Ester M, Kriegel HP, Xu X. DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans Database Syst (TODS). 2017;42(3):1–21.
- 27. Arthur D, Vassilvitskii S. K-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms. Society for Industrial and Applied Mathematics; 2007. p. 1027–35.
- 28. Smith L, Oakden-Rayner L, Bird A, Zeng M, To M, Mukherjee S, Palmer L. Machine learning and deep learning predictive models for long-term prognosis in patients with chronic obstructive pulmonary disease: a systematic review and meta-analysis. Lancet Digit Health. 2023;5(12):872–81. 10.1016/S2589-7500(23)00177-2.
- 29. Johns D, Walters J, Walters E. Diagnosis and early detection of COPD using spirometry. J Thorac Dis. 2014;6(11):1557–69. 10.3978/j.issn.2072-1439.2014.08.18.
- 30. Hoesterey D, Das N, Janssens W, et al. Spirometric indices of early airflow impairment in individuals at risk of developing COPD: spirometry beyond FEV1/FVC. Respir Med. 2019;156:58–68. 10.1016/j.rmed.2019.08.004.
- 31. Kollert F, Tippelt A, Müller C, et al. Hemoglobin levels above anemia thresholds are maximally predictive for long-term survival in COPD with chronic respiratory failure. Respir Care. 2013;58(7):1204–12. 10.4187/respcare.01961.
- 32. Toft-Petersen A, Torp-Pedersen C, Weinreich U, Rasmussen B. Association between hemoglobin and prognosis in patients admitted to hospital for COPD. Int J Chron Obstruct Pulm Dis. 2016;11:2813–20. 10.2147/COPD.S116269.
- 33. Deep A, Behera P, Subhankar S, Rajendran A, Rao C. Serum electrolytes in patients presenting with acute exacerbation of chronic obstructive pulmonary disease (COPD) and their comparison with stable COPD patients. Cureus. 2023;15(4):38080. 10.7759/cureus.38080.
- 34. Lindner G, Herschmann S, Funk G, et al. Sodium and potassium disorders in patients with COPD exacerbation presenting to the emergency department. BMC Emerg Med. 2022;22(1):49. 10.1186/s12873-022-00607-7.







