Abstract
The rapidly increasing prevalence of debilitating breathing disorders, such as chronic obstructive pulmonary disease (COPD), calls for a meaningful integration of artificial intelligence (AI) into respiratory healthcare. Deep learning techniques are “data hungry” whilst patient-based data is invariably expensive and time-consuming to record. To this end, we introduce a novel COPD-simulator, a physical apparatus with an easy-to-replicate design which enables rapid and effective generation of a wide range of COPD-like data from healthy subjects, for enhanced training of deep learning frameworks. To ensure the faithfulness of our domain-aware COPD surrogates, the generated waveforms are examined through both flow waveforms and photoplethysmography (PPG) waveforms (as a proxy for intrathoracic pressure) in terms of duty cycle, sample entropy, FEV1/FVC ratios and flow-volume loops. The proposed simulator operates on healthy subjects and is able to generate FEV1/FVC obstruction ratios ranging from greater than 0.8 to less than 0.2, mirroring values that can be observed in the full spectrum of real-world COPD. As a final stage of verification, a simple convolutional neural network is trained on surrogate data alone, and is used to accurately detect COPD in real-world patients. When training solely on surrogate data, and testing on real-world data, a comparison of true positive rate against false positive rate yields an area under the curve of 0.75, compared with 0.63 when training solely on real-world data.
Keywords: COPD, deep learning, photoplethysmography, surrogate data, wearable health
I. Introduction
The prevalence of obstructive breathing disorders, such as chronic obstructive pulmonary disease (COPD) and asthma, is increasing rapidly [1], which calls for the employment of techniques within the realm of artificial intelligence (AI). The understanding of breathing mechanics and resulting respiratory waveforms for different breathing disorders is paramount for the automatic classification of breathing disorders, both in terms of screening and identifying their severity. However, despite the promise of AI in this context, in the often very busy hospital lung function units it can be practically difficult to record enough data for viable machine learning (ML) algorithms. This issue has been further compounded by the COVID-19 pandemic. To this end, we propose an apparatus for the artificial generation of obstructive breathing disorder waveforms through healthy subjects, and the corresponding mechanisms for reliably generating the whole spectrum of disease severities.
A. Changes to Breathing With Obstruction and Restriction
Chronic obstructive pulmonary disease (COPD) is caused by an increased inflammatory response in the lungs which leads to obstructed airflow [2]. It encompasses both emphysema, defined by a breakdown in the elastic structure of the alveolar walls [3], and bronchitis, defined by increased mucus secretion in the lungs [4]. When we exhale, the airways narrow due to reduced pressure, and thus any existing airway obstruction is exaggerated during expiration. This explains why patients with COPD generally take longer to breathe out than to breathe in, and can generate higher inspiratory peak flows than expiratory peak flows. COPD can be diagnosed with a spirometry test, which measures the ratio of the volume exhaled in the first second of forced expiration (FEV1) to the forced vital capacity (FVC). In practice, those with COPD usually exhibit FEV1/FVC ratios of less than 0.7 [5], but COPD is more specifically defined by different severities. According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD), COPD can be split into four major categories based on FEV1. Mild COPD is defined as an FEV1 of ≥ 80% of a patient's predicted FEV1 based on height, age and sex. Moderate COPD is defined as an FEV1 that is 50-79% of its predicted value, severe COPD is 30-49% and finally very severe is defined as less than 30% [6]. The increased effects of obstruction during expiration also lead to a decreased inspiration time (TI) in comparison with the overall breathing time (TTOT) as it takes longer to breathe out. The ratio TI/TTOT, known as the inspiratory duty cycle, is lower in patients with COPD [7], [8].
This is in contrast to restrictive lung disease, an example of which is pulmonary fibrosis (scarring of the lungs). In this case, there is no obstruction of airways, but a restriction that applies equally to both inspiration and expiration. Whilst diagnosis of pulmonary fibrosis requires a multidisciplinary approach, such as the use of CT scans [9], spirometry tests will generally show healthy FEV1/FVC ratios, but with a lower peak flow for both inspiration and expiration as well as a greatly reduced vital lung capacity.
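The spirometric criteria described above for COPD can be sketched in a few lines of Python. This is a minimal illustration rather than a clinical tool: the function names are hypothetical, and the assignment of values between 79% and 80% to the moderate band is an assumption, since the text gives the bands only as whole-number ranges.

```python
def gold_severity(fev1_percent_predicted: float) -> str:
    """Map FEV1 (as a percentage of the predicted value) to a GOLD severity band."""
    if fev1_percent_predicted < 0:
        raise ValueError("FEV1 percentage must be non-negative")
    if fev1_percent_predicted >= 80:
        return "mild"
    if fev1_percent_predicted >= 50:  # assumption: 79-80% is treated as moderate
        return "moderate"
    if fev1_percent_predicted >= 30:
        return "severe"
    return "very severe"


def is_obstructed(fev1_litres: float, fvc_litres: float, threshold: float = 0.7) -> bool:
    """Flag airflow obstruction when the FEV1/FVC ratio falls below the threshold."""
    if fvc_litres <= 0:
        raise ValueError("FVC must be positive")
    return (fev1_litres / fvc_litres) < threshold
```

Note that the severity band is meaningful only once obstruction has been established, so in practice `is_obstructed` would be checked before `gold_severity`.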
B. Artificial Changes to Breathing Resistance
Resistance to breathing has been considered both to measure the strength and endurance of lungs in subjects, and also as a potential avenue to train lungs for increases in strength and endurance. A portable apparatus for collecting respiratory gas was designed in the early 1970s, comprising tubes with 32 mm diameter (incurring negligible resistance to breathing) and a one-way valve so that gas could be stored when breathing out, but new air would be breathed in [10]. This apparatus was adapted in the mid to late 1970s by replacing the 32 mm inspiratory tube with different smaller tube diameters (14 mm, 11 mm or 8 mm), and breathing under different inspiratory resistances was examined in endurance athletes [11]. A similar apparatus with four different inspiratory tube sizes was used to investigate the lung strength of a group of British coal miners over the age of 45 [12]. More recently, multiple valves on a single mask have been implemented for variable resistance to inspiration and expiration, with the desire to train lungs for increased strength and endurance [13].
In contrast to existing set-ups, the COPD-simulator presented in this paper is capable of providing different resistances to both inspiration and expiration independently, with the aim of simulating the respiratory waveforms of different breathing disorders.
The so enabled simulation of breathing disorders through healthy subjects has the following benefits:
- The ability to collect vast amounts of data by expanding the subject pool to include healthy individuals;
- Full control over breathing resistances for both inspiration and expiration;
- Multiple obstructive breathing disorders of different severities can be investigated on the same healthy individual, thus keeping individual physiological differences constant;
- A controlled environment which makes it easier to investigate how other physiological measures vary with resistance to breathing;
- A physically meaningful way to generate surrogate breathing disorder waveform data for both training and testing machine learning models.
C. Respiration and the Photoplethysmogram
Photoplethysmography (PPG) refers to a non-invasive measurement of blood volume using light. By emitting light through tissue and measuring the amount of light transmitted through (transmission PPG) or reflected back (reflectance PPG) to a photodiode, we can infer the amount of light absorbed by the blood. For this reason, PPG is commonly implemented in wearable devices to measure changes in blood volume such as the pulsatile changes with each heart beat.
The heart and lungs both lie within the thoracic cavity, and the pressure inside the thoracic cavity changes when breathing to allow for the flow of air in and out of the lungs. In order to breathe out passively, the diaphragm relaxes which reduces the space in the thoracic cavity, in turn increasing intrathoracic pressure that pushes air out of the lungs. To breathe out forcefully, the abdominal muscles push up against the diaphragm to increase the intrathoracic pressure further. Both types of exhalation decrease the magnitude of the pressure gradient between the heart and peripheral veins at the site of the PPG sensor. A shallower pressure gradient means less venous blood flow back to the heart (venous return), and thus an increase or “pooling” of venous blood at the site of the PPG sensor. This mode of respiration is referred to as a respiration induced intensity variation (RIIV) [14]. In the presence of increased lung resistance, as is the case with COPD, forced expiration would require a larger increase in pressure to overcome this obstruction. The PPG waveform can act as a measure of peripheral venous pressure [15] and in turn be a proxy for intrathoracic pressure, which directly impacts peripheral venous pressure. It should be noted that this is theoretical, and to our knowledge no work exists that analyses the manifestation of the spatial distribution of pressures in the thoracic cavity and the PPG waveform. In addition, the photoplethysmogram can contain information on the timing between inspiratory and expiratory phases [16].
II. Design
The proposed COPD-simulator consists of 3D printed parts and PVC tubes. It has a single input tube for the subject to breathe through, connected to two one-way valves facing in opposite directions which switch the airflow path depending on inspiration and expiration. Each valve consists of a low-density foam plug in a 3D printed cone-shaped funnel with a hole slightly smaller than the diameter of the plug. The plug is secured in the funnel by a fine mesh through which air can pass but the plug cannot. Depending on the orientation of the valve, either positive or negative airflow will seal the hole with the plug, thus preventing air from passing through. It is important that the plug is light so that it will move easily to the hole under low pressures. The switch in the airflow path between inhalation and exhalation allows us to change the resistance to inhalation and exhalation independently. It also has the added benefit of not meaningfully increasing the dead space, as new air must travel through the inspiratory tube, and air from the lungs can only travel through the expiratory tube.
Connected to the inspiratory valve is an inspiratory tube which can be varied in diameter, as is the case with the expiratory valve and expiratory tube. The largest tube diameter is 25 mm, which is considered as very low resistance to breathing. The smallest tube diameter used is 3 mm, which provides very challenging resistance to breathing. To minimise the resistance of the whole apparatus, 3D printed parts also have an internal diameter of 25 mm. Both the inspiratory and expiratory tubes are then connected to an output tube which leads into an SFM3200 digital flow meter by Sensirion (Stäfa, Switzerland) to record the breathing flow. The entire apparatus is shown in Fig. 1. The digital flow meter was connected to an Arduino Uno (Somerville, MA, USA), which sampled flow values at a sampling frequency of 20 Hz and displayed them on a computer monitor. It should be noted that the valve system that switches between inhalation and exhalation requires a small pressure gradient to activate. In Fig. 2(a), which shows flow values as a result of breathing through the simulator, we can see that the transitions from inhalation to exhalation are smooth and with no overshoot in flow. This indicates that, compared with the pressures involved in tidal breathing, this activation pressure is negligible.
Fig. 1.
Proposed breathing disorder simulation apparatus: block diagram (top) and physical realisation (bottom). (a) The mouth input. (b) One-way valves in different directions for inspiration and expiration, each comprising a low-density foam plug, a cone-shaped funnel with a hole that is slightly smaller in diameter than the plug, and a fine mesh which allows air to travel through but keeps the plug in position. (c) Tubes for both inspiration and expiration, in this case 300 mm in length, can be easily replaced with tubes of different diameter; this allows for independent control of resistances to inspiration and expiration. (d) A digital flow meter, which records spirometry waveforms, and the supporting electronics module.
Fig. 2.
Results of tidal breathing through the apparatus with four different inspiration to expiration obstruction ratios, and across 10 subjects. Stars represent statistical significance between boxplots as calculated by ANOVA, with * corresponding to P < 0.05, ** to P < 0.01, and *** to P < 0.001. (a) Exemplar spirometry waveform with an 8 mm diameter inspiratory tube and 8 mm diameter expiratory tube, giving a balanced obstruction ratio. Positive flow corresponds to inspiration and negative flow to expiration. (b) Exemplar spirometry waveform with an 8 mm inspiratory tube and 3 mm diameter expiratory tube, giving an unbalanced obstruction ratio. (c) Box plots of the inspiratory duty cycle (%), referring to the proportion of overall breathing duration spent in inspiration, across 10 subjects and 4 different inspiration:expiration tube diameter ratios. (d) Box plots of the inspiratory amplitude ratio, referring to peak inspiratory flow divided by peak expiratory flow, across 10 subjects and 4 different inspiration:expiration tube diameter ratios. (e) Boxplots of sample entropy (scale 1, tolerance = 0.2) across 10 subjects and 4 different inspiration:expiration tube diameter ratios.
Resistance values of the COPD-simulator were calculated with the assumption that, given the 25 mm diameter of all the 3D printed parts (excluding the smaller diameter of the valve), the vast majority of resistance in the simulator stems from the variable diameter tubes of 300 mm length. If we assume that the flow through the tubes is laminar, we can use Poiseuille's law to calculate resistance values. These values are calculated for each tube and presented in Table I, with an assumed air temperature of 25 °C. Experimental results using impulse oscillometry by Paredi et al. [17] found total lung resistance to have a mean of 0.36 kPa·s/L during inspiration and 0.38 kPa·s/L during expiration in healthy participants, compared with 0.54 kPa·s/L during inspiration and 0.63 kPa·s/L during expiration in COPD patients with an FEV1 less than 60%. With the exception of the 3 mm diameter tube, which adds a resistance beyond the maximum value found by Paredi et al. [17] by roughly 2-fold, the 8 mm, 6 mm and 5 mm tubes provide resistances that are expected in the context of COPD when combined with the expected healthy lung resistances.
TABLE I. Tube Resistances With Length 300 mm.
| Tube Diameter (mm) | Resistance (kPa·s·L⁻¹) |
|---|---|
| 25 | 0.0006 |
| 19 | 0.0017 |
| 16 | 0.0034 |
| 8 | 0.0549 |
| 6 | 0.1735 |
| 5 | 0.3598 |
| 3 | 2.7766 |
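The laminar-flow resistances in Table I can be approximated with Poiseuille's law, R = 8μL/(πr⁴). The sketch below is an illustration under stated assumptions: the dynamic viscosity of air (1.84 × 10⁻⁵ Pa·s at 25 °C) is not given in the text and is assumed here, and the function name is hypothetical.

```python
import math

AIR_VISCOSITY_PA_S = 1.84e-5  # assumed dynamic viscosity of air at 25 degrees C
TUBE_LENGTH_M = 0.300         # tube length stated in the text


def poiseuille_resistance_kpa_s_per_l(diameter_mm: float,
                                      length_m: float = TUBE_LENGTH_M,
                                      viscosity: float = AIR_VISCOSITY_PA_S) -> float:
    """Laminar-flow resistance R = 8*mu*L / (pi*r^4), returned in kPa.s/L."""
    radius_m = diameter_mm / 2.0 / 1000.0
    resistance_pa_s_per_m3 = 8.0 * viscosity * length_m / (math.pi * radius_m ** 4)
    # 1 Pa.s/m^3 equals 1e-6 kPa.s/L (1e-3 for Pa to kPa, 1e-3 for per-m^3 to per-L)
    return resistance_pa_s_per_m3 * 1e-6


for diameter in (25, 19, 16, 8, 6, 5, 3):
    resistance = poiseuille_resistance_kpa_s_per_l(diameter)
    print(f"{diameter:>2} mm: {resistance:.4f} kPa.s/L")
```

With this assumed viscosity the computed values appear to agree with Table I, and the fourth-power dependence on radius makes clear why the 3 mm tube is so much more resistive than the 5 mm tube.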
It is important to note that, as is the case within the lungs, flow cannot always be assumed to be laminar. In the case of the 8 mm, 6 mm, 5 mm and 3 mm diameter tubes, the Reynolds number was calculated to exceed 2000 (indicating turbulent flow) at flow rates which could occur during tidal breathing. During turbulent flow, the flow increases proportionally to the square root of the pressure gradient. Given this nonlinear (square-root) relationship between pressure and flow, the pressure-flow relationship of the COPD simulator is best thought of as linear up to the point at which flow becomes turbulent, and then approximately constant beyond this point. The flow rate at which this transition from laminar to turbulent flow occurs decreases as the tube diameter decreases. The same relationship between pressure and flow also exists in the lungs [18]. Given that the resistance is difficult to define beyond the transition to turbulent flow, it was prudent to evaluate the COPD simulator through spirometry and thus confirm that the resistance of the tubes had the same end effect on flow as COPD.
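The laminar-to-turbulent transition discussed above can be illustrated by computing the Reynolds number, Re = ρvD/μ, for a given flow rate and tube diameter. This is a rough sketch: the air density and viscosity values are assumptions (typical values at 25 °C, not stated in the text), and the function names are hypothetical.

```python
import math

AIR_DENSITY = 1.184        # kg/m^3, assumed value for air at 25 degrees C
AIR_VISCOSITY = 1.84e-5    # Pa.s, assumed value for air at 25 degrees C
CRITICAL_REYNOLDS = 2000   # transition threshold quoted in the text


def reynolds_number(flow_l_per_s: float, diameter_mm: float) -> float:
    """Re = rho * v * D / mu, with mean velocity v = Q / (pi * D^2 / 4)."""
    diameter_m = diameter_mm / 1000.0
    flow_m3_per_s = flow_l_per_s / 1000.0
    area_m2 = math.pi * diameter_m ** 2 / 4.0
    velocity = flow_m3_per_s / area_m2
    return AIR_DENSITY * velocity * diameter_m / AIR_VISCOSITY


def critical_flow_l_per_s(diameter_mm: float) -> float:
    """Flow rate at which Re reaches the critical value for a given tube diameter."""
    diameter_m = diameter_mm / 1000.0
    flow_m3_per_s = (CRITICAL_REYNOLDS * math.pi * diameter_m * AIR_VISCOSITY
                     / (4.0 * AIR_DENSITY))
    return flow_m3_per_s * 1000.0


for diameter in (8, 6, 5, 3):
    print(f"{diameter} mm: Re = 2000 at about {critical_flow_l_per_s(diameter):.3f} L/s")
```

Because the critical flow rate scales linearly with diameter, narrower tubes reach the transition at lower flows, which is consistent with the trend described in the text.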
The apparatus was evaluated with tidal breathing over 10 subjects (5 male, 5 female) aged 18-30 years, across 4 different inspiration to expiration tube diameter ratios, and further assessed with maximal forced breathing in 8 subjects (4 male, 4 female) across 6 different tube diameters for measurements of FEV1/FVC ratios. Photoplethysmography (PPG) was recorded from the ear simultaneously [16], [19] during all recordings to gain insight into the effects of varying obstruction on thoracic pressure waveforms [20]. Tidal PPG waveforms were then used to assess if a deep learning model that was trained exclusively on healthy and simulated disease data could be deployed to detect chronic obstructive pulmonary disease in real world PPG data. A diagram of the experimental set up for these recordings is provided in the Supplementary Material.
The recordings were performed under the Imperial College London ethics committee approval JRCO 20IC6414 and the NHS Health Research Authority 20/SC/0315. All subjects gave full informed consent.
III. COPD-Simulator Waveforms
Small decreases in the expiratory tube diameter in relation to the inspiratory tube diameter resulted in changes to tidal breathing waveforms that are typical of patients with obstructive breathing disorders such as chronic obstructive pulmonary disease (COPD). Example spirometry waveforms in Fig. 2 show the roughly symmetric breathing patterns when obstruction to inspiration and expiration is balanced (a), and the characteristic longer expiration time and reduced expiratory flow when obstruction to expiration is exaggerated with a tube diameter of only 3 mm (b). Furthermore, these results are consistent across all 10 subjects, with Fig. 2(c) showing a decrease in inspiratory duty cycle (percentage of overall breathing time spent inspiring) as obstruction to expiration is increased and Fig. 2(d) showing an increase in the inspiratory amplitude compared with expiratory amplitude with increased obstruction. The median inspiratory duty cycle of 34.4% for the 8 mm:3 mm inspiration to expiration diameter ratio echoes the duty cycle of COPD patients, which was found to be around 35% at rest [7].
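The two flow-based measures above, the inspiratory duty cycle and the inspiratory amplitude ratio, can be sketched as follows. This is a simplified illustration, not the authors' analysis code: it assumes a uniformly sampled flow trace spanning whole breaths, with positive flow denoting inspiration as in Fig. 2, and the synthetic square-wave breath is purely hypothetical.

```python
def inspiratory_duty_cycle(flow):
    """Percentage of breathing time spent in inspiration (TI/TTOT x 100)."""
    inspiratory_samples = sum(1 for value in flow if value > 0)
    breathing_samples = sum(1 for value in flow if value != 0)
    if breathing_samples == 0:
        raise ValueError("flow trace contains no airflow")
    return 100.0 * inspiratory_samples / breathing_samples


def inspiratory_amplitude_ratio(flow):
    """Peak inspiratory flow divided by peak expiratory flow."""
    peak_inspiratory = max(flow)
    peak_expiratory = -min(flow)
    if peak_inspiratory <= 0 or peak_expiratory <= 0:
        raise ValueError("trace must contain both inspiration and expiration")
    return peak_inspiratory / peak_expiratory


# Hypothetical breath at 20 Hz: 1.4 s of inspiration, then 2.6 s of slower expiration
synthetic_breath = [0.5] * 28 + [-0.25] * 52
duty_cycle = inspiratory_duty_cycle(synthetic_breath)
amplitude_ratio = inspiratory_amplitude_ratio(synthetic_breath)
```

A real recording would need breath segmentation and noise handling around the zero crossings, which this sketch omits.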
Observe also a gradual decrease in sample entropy with increased obstruction to expiration, shown in Fig. 2(e). Sample entropy is a measure of the structural complexity of a signal, and thus it is natural that sample entropy would decrease with increased obstruction, as obstruction decreases the degrees of freedom for breathing and in turn makes breathing patterns more predictable. Similar reductions in sample entropy have been demonstrated in the breathing patterns of patients with COPD, with sample entropy decreasing as COPD severity increases [21]. Observe that in cases of duty cycle, inspiratory amplitude ratio and sample entropy, the interquartile range of the 5 mm expiratory tube did not overlap with the median of the 6 mm tube. However, the only significant differences as determined by ANOVA were between the 3 mm expiratory tube and the other groups. The one exception was a statistically significant difference between the 5 mm and 8 mm expiratory tubes in the case of the inspiratory amplitude ratio. The lack of statistical significance between other groups here is likely due to the small sample size of 10 subjects.
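For readers unfamiliar with sample entropy, a minimal pure-Python sketch is given below. It is illustrative only: the embedding dimension `m = 2` is an assumption (the text states only the scale and the tolerance of 0.2), the tolerance is taken as a fraction of the signal's standard deviation, and the two test signals are synthetic.

```python
import math
import random


def sample_entropy(signal, m=2, tolerance=0.2):
    """Sample entropy: -ln(A/B), where B and A count template matches of
    length m and m + 1 within r = tolerance * standard deviation."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    r = tolerance * std

    def count_matches(length):
        # The same n - m template start points are used for both lengths
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(signal[i + k] - signal[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    matches_m = count_matches(m)
    matches_m_plus_1 = count_matches(m + 1)
    if matches_m == 0 or matches_m_plus_1 == 0:
        raise ValueError("no template matches; sample entropy is undefined")
    return -math.log(matches_m_plus_1 / matches_m)


# A regular oscillation should be more predictable than uniform noise
sine_wave = [math.sin(2 * math.pi * t / 20) for t in range(300)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(300)]
sine_entropy = sample_entropy(sine_wave)
noise_entropy = sample_entropy(noise)
```

The quadratic loop is adequate for short respiratory segments; longer recordings would benefit from a vectorised implementation.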
The broad range of obstructions achievable by the apparatus is exemplified by the volume flow loops in Fig. 3(a), which show decreased flow for a given volume with a decrease in tube diameter. This is specific to expiration due to the constant inspiratory tube diameter and varied expiratory tube diameter, resulting in substantial changes to the expiration side of the volume flow loop, with minimal changes to the inspiration side of the volume flow loop. It should be noted that the volume flow loops in Fig. 3(a) also illuminate two important limitations of the apparatus. Firstly, whilst flow values are expected to decrease, overall recorded expiratory volumes should not decrease with tube diameter, given that this would not affect lung volume. The apparent reduction in recorded volume shown in the volume flow loops is due to leakage of the system at higher pressures, and could be rectified straightforwardly with more robust materials and with the valves and joiners printed to more precise specifications. The second limitation is that the tube apparatus impedes breathing with a constant level of obstruction for a given tube diameter, whereas in reality as we exhale the airways continue to narrow in proportion to a decrease in lung volume. Obstruction in patients with COPD therefore increases further with continued expiration, resulting in a concave inflection in real-world volume flow loops that is not captured by the apparatus.
Fig. 3.
Exemplar plots of maximally forced breathing for a single subject across 5 different expiratory tube diameters, ranging from 25 mm to 3 mm, with a fixed inspiratory tube diameter of 25 mm providing low obstruction to inspiration. (a) Flow-volume loops for different expiratory tube diameters, showing a decreased flow for a given volume with a decrease in expiratory tube diameter. (b) Simultaneously recorded ear-photoplethysmography waveforms during maximally forced breathing with each tube diameter, showing an increase in both PPG intensity and duration with a decrease in tube diameter.
Larger photoplethysmogram (PPG) intensities are generated over a longer period of time with a decrease in expiratory tube diameter, as shown in Fig. 3(b). Thoracic pressure increases as we exhale to push air out of the lungs and this in turn decreases venous return to the heart and leads to the filling of peripheral venous beds at the site of the PPG probe. Increased PPG intensity thus reflects increased thoracic pressure and, as expected, thoracic pressure over time increases in proportion to increased obstruction simulated by smaller tube diameters, as increased pressure is required to force air through a smaller tube. Through measuring PPG, it is observed that the apparatus can simulate different thoracic pressure profiles within the human body, based on changes in external obstruction.
The apparatus was able to achieve a wide range of FEV1/FVC ratios across all subjects, with an example of the varied ratios in 5 subjects across 6 different expiratory tube diameters shown in Fig. 4. The maximum achieved FEV1/FVC ratio was 0.98 with the 25 mm diameter tube, and the minimum was 0.12 through the 3 mm diameter tube. Importantly, artificially induced FEV1/FVC ratios were able to cover the full range of obstruction ratios across mild, moderate, severe and very severe. This indicates the promise of the tube-based apparatus for simulating a full range of disease severities in each individual, and in turn vastly expanding the quantity of obstructive breathing disorder data available. It is important to mention that whilst we have demonstrated resistance values and FEV1 values for several representative pairings of tubes, the proposed COPD simulator offers the flexibility for future research based on any combination of tube diameters, guided by the range we have demonstrated.
Fig. 4.
Calculated FEV1/FVC ratios across 6 expiratory tube diameters from 25 mm to 3 mm with an inspiratory tube diameter fixed at 25 mm, plotted for 5 different subjects. Highlighted as shaded colours are the 4 different obstruction severities at the corresponding FEV1/FVC ratio, with blue indicating mild obstruction, green indicating moderate obstruction, orange indicating severe obstruction and red indicating very severe obstruction.
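The FEV1/FVC ratios reported above are obtained by integrating flow over time. A minimal sketch of that calculation is shown below, assuming a uniformly sampled trace of expired flow magnitudes (in L/s) that starts at the onset of forced expiration and ends when expiration is complete; the 20 Hz default matches the sampling rate stated in the Design section, while the function name, trapezoidal integration and synthetic exponential-decay traces are assumptions for illustration.

```python
import math


def fev1_fvc_ratio(expiratory_flow, sampling_hz=20):
    """Ratio of the volume expired in the first second (FEV1) to the total
    expired volume (FVC), using trapezoidal integration of flow."""
    samples_per_second = int(round(sampling_hz))
    if len(expiratory_flow) <= samples_per_second:
        raise ValueError("need more than one second of forced expiration")
    dt = 1.0 / sampling_hz
    volume = 0.0
    fev1 = 0.0
    for index in range(1, len(expiratory_flow)):
        volume += 0.5 * (expiratory_flow[index - 1] + expiratory_flow[index]) * dt
        if index == samples_per_second:
            fev1 = volume  # volume accumulated after exactly one second
    if volume <= 0:
        raise ValueError("no expired volume recorded")
    return fev1 / volume


def synthetic_expiration(time_constant_s, duration_s, peak_flow=8.0, sampling_hz=20):
    """Hypothetical forced expiration modelled as an exponential decay in flow."""
    n_samples = int(duration_s * sampling_hz) + 1
    return [peak_flow * math.exp(-(i / sampling_hz) / time_constant_s)
            for i in range(n_samples)]


unobstructed_ratio = fev1_fvc_ratio(synthetic_expiration(0.4, 6))
obstructed_ratio = fev1_fvc_ratio(synthetic_expiration(4.0, 20))
```

A slower decay in flow, as produced by a narrower expiratory tube, leaves a smaller fraction of the total volume in the first second and therefore lowers the ratio.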
IV. Deep Learning Verification of Surrogate Data
To validate the proposed apparatus for the generation of surrogate COPD data, we recorded wearable in-ear photoplethysmography (PPG) during tidal breathing with the apparatus and labelled it as chronic obstructive pulmonary disease for the purposes of training a convolutional neural network (CNN) to detect COPD. The photoplethysmography was recorded with the MAX30101 digital PPG chip by Maxim Integrated (San Jose, CA, USA), using the 880 nm infrared wavelength. After training on healthy PPG respiratory waveforms, and on artificially obstructed PPG respiratory waveforms that were labelled as COPD, the model was tested on a combination of unseen healthy data and real-world COPD data. Training a CNN on PPG waveforms allowed us to test whether our simulator correctly mimics COPD through changes in peripheral blood volume, and not just through flow waveforms. Detecting chronic obstructive pulmonary disease during tidal breathing from wearable PPG waveforms is a difficult task, considering that the transfer function from airflow in the lungs to PPG is a low-pass filter and this vastly reduces the observable difference in features such as inspiratory duty cycle and skewness [16]. Nevertheless, these timing and amplitude differences were present both in data generated by our apparatus and in real-world data recorded from COPD patients.
A. Preprocessing and Model Parameters
Wearable photoplethysmography (PPG) was recorded simultaneously alongside tidal breathing with the apparatus, with specific obstruction ratios of 8 mm to 3 mm and 8 mm to 5 mm. To form the surrogate data, 2 out of the 10 healthy subjects generated data with both the 8 mm/3 mm and 8 mm/5 mm combinations of resistances, and 8 out of the 10 subjects used data exclusively from the 8 mm/5 mm combination. The 8 mm/5 mm combination of inspiratory and expiratory tubes creates resistances that are the most similar to those found by Paredi et al. [17]. Respiratory waveforms were extracted from the photoplethysmography using multivariate empirical mode decomposition [22] with the three major PPG respiratory modes as inputs [16]. Artefact-free respiratory segments were selected for each case, resulting in a total of 217 healthy segments, 81 artificially obstructed segments and 62 COPD segments. The 62 COPD segments were split across 4 COPD patients, 2 of whom were female and 1 of whom had darker skin. Three of the COPD patients exhibited moderate COPD, whereas one exhibited severe COPD. Each segment was standardised to a maximum absolute amplitude of 1, and scaled to 250 samples per individual cycle (defined as a single breath in followed by a single breath out). In this way, differences in amplitude resulting from sensor contact, tissue perfusion and positioning were removed, as well as individual differences in respiratory rate. This preprocessing was implemented to better facilitate the learning of the structural differences between healthy and diseased PPG by the one-dimensional CNN, and not individual differences that are unrelated to COPD.
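The amplitude and rate normalisation described above could be sketched as follows. This is an assumption-laden illustration rather than the authors' pipeline: it presumes the respiratory cycles have already been delimited, uses linear interpolation for resampling (the interpolation method is not stated in the text), and applies amplitude scaling after resampling; the function names are hypothetical.

```python
import math


def resample_linear(samples, target_length):
    """Resample a sequence to target_length points by linear interpolation."""
    if len(samples) < 2 or target_length < 2:
        raise ValueError("need at least two input and two output samples")
    step = (len(samples) - 1) / (target_length - 1)
    resampled = []
    for i in range(target_length):
        position = i * step
        left = min(int(position), len(samples) - 2)
        fraction = position - left
        resampled.append(samples[left] * (1.0 - fraction) + samples[left + 1] * fraction)
    return resampled


def preprocess_segment(cycles, samples_per_cycle=250):
    """Resample each respiratory cycle to a fixed length, concatenate the
    cycles, and scale the segment to a maximum absolute amplitude of 1."""
    segment = []
    for cycle in cycles:
        segment.extend(resample_linear(cycle, samples_per_cycle))
    peak = max(abs(value) for value in segment)
    if peak == 0:
        raise ValueError("segment is flat; cannot normalise amplitude")
    return [value / peak for value in segment]


# Two hypothetical breaths of different duration and amplitude
breath_a = [3.0 * math.sin(2 * math.pi * i / 60) for i in range(60)]
breath_b = [2.0 * math.sin(2 * math.pi * i / 90) for i in range(90)]
two_cycle_input = preprocess_segment([breath_a, breath_b])
```

After this step, every cycle occupies the same number of samples regardless of respiratory rate, so a two-cycle input always has 500 samples.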
A simple two-layer one-dimensional convolutional neural network was implemented in PyTorch [23], consisting of 5 kernels of size 175 in layer one, and 3 kernels of size 50 in layer two. These atypically large kernel sizes were necessary given the nature of the PPG-derived waveform, which consists of gradual changes in amplitude. The input to the model is the PPG-derived respiratory waveform, and the output is a probability of COPD. A full overview of the model architecture is provided in Fig. 5(a). When evaluating each train-test configuration, the whole cross-validation was performed 100 times with different random seeds, to form a distribution of area under curve values.
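To give a feel for the scale of this network, the sketch below works through the layer output lengths and convolutional weight counts implied by the stated kernel sizes. It is shape arithmetic only, not the model itself, and it rests on assumptions the text does not confirm: stride 1, no padding, no pooling between layers, a single input channel and a bias term per kernel.

```python
SAMPLES_PER_CYCLE = 250  # from the preprocessing described in the text


def conv1d_output_length(input_length, kernel_size, stride=1, padding=0):
    """Output length of a one-dimensional convolution (no dilation)."""
    return (input_length + 2 * padding - kernel_size) // stride + 1


def conv1d_parameter_count(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable weights (and optional biases) in a 1-D conv layer."""
    weights = out_channels * in_channels * kernel_size
    return weights + (out_channels if bias else 0)


for n_cycles in (1, 2, 3):
    input_length = n_cycles * SAMPLES_PER_CYCLE
    layer_one_length = conv1d_output_length(input_length, kernel_size=175)
    layer_two_length = conv1d_output_length(layer_one_length, kernel_size=50)
    dense_inputs = 3 * layer_two_length  # three kernels feed the dense layer
    print(f"{n_cycles} cycle(s): {input_length} -> {layer_one_length} "
          f"-> {layer_two_length}, dense inputs = {dense_inputs}")

layer_one_parameters = conv1d_parameter_count(1, 5, 175)
layer_two_parameters = conv1d_parameter_count(5, 3, 50)
```

Under these assumptions the convolutional layers hold well under two thousand parameters in total, which suits the small number of training segments available.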
Fig. 5.
Employed deep learning (DL) architecture and a summary of results from training the model exclusively on surrogate data. This was achieved for the task of classifying chronic obstructive pulmonary disease (COPD) from wearable photoplethysmography derived respiratory waveforms. (a) The DL model architecture with an example input photoplethysmography respiratory waveform, two one-dimensional convolutional layers, with the associated number of kernels and number of weights per kernel in curly brackets, and a dense (linear) layer, with the weighted output giving a probability of COPD. (b) Receiver operator characteristic (ROC) curves and area under curve (AUC) values for training and testing with 1, 2 and 3 respiratory cycles, respectively. The black diagonal line represents random chance. (c) Bar graph showing the proportion of segments classified as COPD in real-world test data to form an overall probability of COPD, for all healthy test segments and each individual COPD test patient. This result uses 2 cycles as an input and a probability threshold for classification per segment of 0.37. (d) Boxplots showing the inspiratory time as a proportion of total respiratory time (duty cycle) for all test segments, categorised into correct and incorrect classification for each of the COPD and healthy classes.
B. Results and Discussion
The model was successful at detecting COPD in wearable PPG derived respiratory waveforms by training exclusively on apparatus-generated surrogate data and testing exclusively on real-world COPD data. Training on different numbers of respiratory cycles resulted in slightly different results, with 1, 2 and 3 cycles giving area under the curve (AUC) values of 0.72, 0.75 and 0.73 respectively. The 2-cycle model performed significantly better than the 1-cycle and 3-cycle models when evaluated through ANOVA (P = , P = 0.013). The full receiver operator characteristic curves are plotted for each case in Fig. 5(b). With a threshold of 0.37 and only 2 PPG respiratory cycles as an input, the model was able to correctly identify COPD in unseen real-world segments with an accuracy ranging from 40% to 88% across patients. The model incorrectly classified only 14% of healthy segments as COPD. These results are displayed in Fig. 5(c). The accuracy from training on surrogate data alone is comparable to results of previous leave-one-subject-out classification of COPD with PPG via a random forest [16]. This successful result of classifying real-world COPD, by having only ever seen surrogate data, demonstrates that this simple proof of concept apparatus approximates principal aspects of COPD that are relevant for classification, despite previously mentioned limitations such as the constant obstruction during expiration. When it comes to breathing patterns, not all breaths of a subject with COPD will display the typical timing, namely a longer time to exhale, associated with the disease. Similarly, there may be occasions where a healthy individual takes longer to breathe out than to breathe in. A longer expiration phase results in a proportionally shorter inspiratory phase, that is, a lower inspiratory duty cycle. An inspiratory duty cycle of less than 50% is indicative of a longer time to exhale, common among patients with COPD. It is shown in Fig. 5(d) that correctly classified COPD waveforms had this characteristic, whereas COPD waveforms that were classified as healthy shared a similar inspiratory duty cycle to typical healthy breathing patterns. Importantly, this indicates that through training exclusively on surrogate data generated via the proposed apparatus, this deep learning model has learned a correct representation of the features of COPD; this explains why it most often fails when a subject with COPD has respiratory timing that resembles that of a healthy subject.
With this in mind, a potential application of this technology could be through the wearable monitoring of trends in the proportion of respiratory cycles classified as disease over periods of days, weeks and months.
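The evaluation metrics used in this section can be illustrated with a short sketch. The AUC is computed here through its rank interpretation (the probability that a randomly chosen COPD segment receives a higher score than a randomly chosen healthy segment), and the per-subject summary is the proportion of segments whose score reaches the 0.37 threshold. The function names and example scores are hypothetical, and whether the original threshold comparison is inclusive is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores, with ties counted as half a win."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for positive_score in positives:
        for negative_score in negatives:
            if positive_score > negative_score:
                wins += 1.0
            elif positive_score == negative_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def proportion_classified_copd(segment_scores, threshold=0.37):
    """Fraction of a subject's segments classified as COPD at the threshold."""
    if not segment_scores:
        raise ValueError("no segments provided")
    flagged = sum(1 for score in segment_scores if score >= threshold)
    return flagged / len(segment_scores)


# Hypothetical scores: label 1 = COPD segment, label 0 = healthy segment
example_auc = roc_auc([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80])
example_proportion = proportion_classified_copd([0.20, 0.40, 0.50, 0.10])
```

Tracking `proportion_classified_copd` over successive recording windows is one way the trend-monitoring application suggested above might be realised.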
Whilst it is clear that the duty cycle is of high importance as a feature for determining COPD with PPG, as was the case in previous work [16], it was necessary to test if the exclusively surrogate-trained CNN learned any COPD related features that were different to duty cycle. To this end, we examined the outputs of the network at each stage with different test data examples. The network consisted of two convolutional layers and a linear layer, as shown in Fig. 5(a). The output of the convolutional layers corresponds to the extracted features, and the output of the linear layer corresponds to the network's interpretation of these features. In this case, a negative output upon multiplication with the linear layer contributes to a healthy classification and a positive output contributes to classification as COPD. The first example considered was a real world COPD segment that was correctly classified as COPD and had a duty cycle and skewness typical of COPD. The second was another real-world waveform, correctly classified as COPD, which had duty cycle and skewness values which were typical of healthy waveforms. The third was a healthy waveform, correctly classified as healthy. The outputs of both the second convolutional layer and element-wise product with the linear layer, for all three second layer convolutional kernels, are shown for each of the three test examples in Fig. 6. Observe that both of the waveforms which were correctly classified as COPD share a common feature and also contain features that differ, all of which contribute to the individual classifications as COPD. This is evidence that the surrogates generated via the proposed COPD simulator also contain relevant information that goes beyond just the timing of the waveforms.
Fig. 6.
Outputs from two stages of the convolutional neural network after training exclusively on surrogate data. The top row shows outputs from the second convolutional layer kernels, corresponding to a feature map. The bottom row shows outputs from the element-wise product with the linear layer weights, corresponding to the network's interpretation of the features. Shown are three different examples: a COPD waveform classified as COPD with typical skewness and duty cycle values (red), a COPD waveform classified as COPD with skewness and duty cycle values associated with healthy waveforms (purple), and a healthy waveform correctly classified as healthy (blue). In the legend, “P” corresponds to the probability of COPD assigned by the classifier, “Skew” corresponds to the skewness of the waveform and “Duty” corresponds to the duty cycle.
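The element-wise product shown in the bottom row of Fig. 6 can be sketched as follows. This is a minimal illustration which assumes that the feature map from the second convolutional layer and the linear layer weights have the same shape, and that the summed contributions are passed through a sigmoid to give a COPD probability; the shapes, the values, the bias and the sigmoid output are all assumptions, as these details are not specified in the text.

```python
import math
from typing import List

Matrix = List[List[float]]


def linear_contributions(features: Matrix, weights: Matrix) -> Matrix:
    """Element-wise product of a feature map with linear-layer weights.

    Positive entries push towards COPD, negative entries towards healthy.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights need the same number of rows")
    out = []
    for f_row, w_row in zip(features, weights):
        if len(f_row) != len(w_row):
            raise ValueError("rows must have matching lengths")
        out.append([f * w for f, w in zip(f_row, w_row)])
    return out


def copd_probability(features: Matrix, weights: Matrix, bias: float = 0.0) -> float:
    """Sum all contributions and squash with a sigmoid (assumed output stage)."""
    contrib = linear_contributions(features, weights)
    logit = sum(sum(row) for row in contrib) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical feature map: three second-layer kernels, four time steps each.
features = [[0.5, 1.0, 0.0, 0.2],
            [0.0, 0.3, 0.8, 0.1],
            [1.0, 0.0, 0.0, 0.4]]
# Hypothetical linear-layer weights of the same shape.
weights = [[1.0, 0.5, -1.0, 0.0],
           [-0.5, -1.0, 0.5, 1.0],
           [0.2, 0.0, 0.3, -0.5]]

contrib = linear_contributions(features, weights)
per_kernel = [sum(row) for row in contrib]  # net push of each kernel
prob = copd_probability(features, weights)
```

Inspecting `contrib` per kernel and per time step is one way to see which parts of a waveform drive a segment towards a COPD or a healthy classification.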
To fully examine the suitability of our surrogate data generator, it was important to test how performance is influenced by training on both our apparatus-generated surrogate data and real-world COPD data. To ensure a fair test, COPD data were isolated for each individual patient so that training and testing were never performed on the same patient in a single realisation. In each case, the COPD data was added to the surrogate data for training, and the volume of surrogate data remained the same. Moreover, as was the case when training solely on surrogate data, the ROC curves were evaluated only on test data and not on training data. All possible combinations of patients for training and testing were explored, with each combination also repeated 100 times with different random seeds. Fig. 7 shows that when the apparatus data is accompanied by real-world COPD data (labelled as “Surrogate + 25% COPD”, “Surrogate + 50% COPD” and “Surrogate + 75% COPD”), the model performance increased slightly from an AUC of 0.75 to AUCs of 0.76 and 0.78 when 50% and 75% of the COPD data, respectively, were included in the training pool. The increases in accuracy with both 50% and 75% of the COPD data added were statistically significant (P = 0.04, P =
). There was no significant increase in model accuracy when only 25% of the real-world COPD data was added to the training data. This indicates that the apparatus-generated surrogate data contains most of the information relevant to the classification of COPD that is present in real-world data, or in other words that adding real-world COPD data only slightly enriched the training data.
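The patient-wise separation of training and testing data can be sketched as below. The sketch assumes four COPD patients, which is inferred from the description of training on three COPD subjects and testing on one, so that 25%, 50% and 75% correspond to one, two and three training patients. The patient labels and the function name are hypothetical, and testing on all remaining patients in each split is also an assumption.

```python
from itertools import combinations
from typing import Iterator, Sequence, Tuple


def patient_splits(
    patients: Sequence[str], n_train: int
) -> Iterator[Tuple[Tuple[str, ...], Tuple[str, ...]]]:
    """Yield every (train, test) split with n_train training patients.

    No patient ever appears on both sides of a split.
    """
    for train in combinations(patients, n_train):
        test = tuple(p for p in patients if p not in train)
        yield train, test


patients = ["P1", "P2", "P3", "P4"]  # hypothetical patient labels
seeds = range(100)                   # one repetition per random seed

# Map each training scenario to its list of patient-wise splits.
scenarios = {
    "Surrogate + 25% COPD": list(patient_splits(patients, 1)),
    "Surrogate + 50% COPD": list(patient_splits(patients, 2)),
    "Surrogate + 75% COPD": list(patient_splits(patients, 3)),
}
```

Each split would then be trained and evaluated once per entry in `seeds`, with the surrogate data volume held fixed across scenarios.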
Fig. 7.
Receiver operating characteristic (ROC) curves and area under curve (AUC) values of test data classification for different training scenarios, namely training on the apparatus data alone, training on a combination of apparatus data and COPD data, and training with COPD data alone. The black diagonal line represents random chance.
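For reference, the AUC values reported in Fig. 7 summarise how often a COPD segment receives a higher score than a healthy segment. A minimal sketch of this pairwise (Mann-Whitney) formulation is given below; it is equivalent to the area under the ROC curve, but it is not necessarily how the AUC was computed in this work, and the labels and scores are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test segments: 1 = COPD, 0 = healthy.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.3, 0.5, 0.1]
auc = roc_auc(labels, scores)

# A classifier that gives every segment the same score sits on the diagonal.
chance = roc_auc(labels, [0.5] * len(labels))
```

The second call illustrates the black diagonal in Fig. 7: when scores carry no information about the label, the AUC is 0.5.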
Finally, training was also examined with COPD data alone. The COPD data was duplicated during training in each case to match the relative proportion of simulated data to healthy data. Training on the COPD data alone performed poorly when compared to training with the apparatus-generated surrogate data, as displayed in Fig. 7. In the best case, training on three COPD subjects and testing on one resulted in an AUC of 0.63, which was significantly lower than the 0.75 achieved by training exclusively on the surrogate data (P =
). The main reason for this is likely the lack of COPD data for the data-hungry CNN, which highlights the importance of physically meaningful surrogate data in scenarios where data is scarce.
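The duplication of COPD data described above can be sketched as a simple oversampling by repetition. The exact duplication scheme is not given in the text, so cycling through the available segments until a target count is reached is an assumption, as are the function name and the example sizes.

```python
from itertools import cycle, islice
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def duplicate_to_size(segments: Sequence[T], target_size: int) -> List[T]:
    """Repeat the available segments in order until target_size is reached."""
    if not segments:
        raise ValueError("need at least one segment to duplicate")
    if target_size < 0:
        raise ValueError("target_size must be non-negative")
    return list(islice(cycle(segments), target_size))


# Hypothetical sizes: three real COPD segments, and a surrogate set of
# eight segments whose size the duplicated COPD set should match.
copd_segments = ["copd_a", "copd_b", "copd_c"]
n_surrogate = 8
balanced_copd = duplicate_to_size(copd_segments, n_surrogate)
```

Duplication keeps the class proportions comparable between training scenarios, but it adds no new information, which is consistent with the poor performance observed when training on COPD data alone.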
V. Conclusion
We have demonstrated a simple yet effective method of simulating obstructive respiratory waveforms through healthy subjects by means of a novel tube-based apparatus. Independent control over both inspiratory and expiratory resistances has allowed for the simulation of respiratory waveforms corresponding to obstructive breathing disorders with a wide range of FEV1/FVC ratios, from healthy values through to values seen in very severe chronic obstructive pulmonary disease. Notably, this has made it possible to investigate obstructive breathing disorders at a range of severities in the same individual, so that waveform differences due to different tube resistances can be examined whilst individual physiological differences are kept constant. Importantly, the proposed apparatus has provided us with a physically meaningful way to generate surrogate breathing disorder waveforms, a prerequisite for the use of machine learning models for the classification of breathing disorders. The output surrogate data has been validated through the training of a convolutional neural network based model exclusively on our apparatus-generated surrogate data, and the resulting successful classification of real-world COPD data. The proposed COPD-simulator is easy to replicate and opens new avenues for the design of models to screen for COPD with wearable technology, without the need for expensive and time-consuming data collection from patients.
Funding Statement
This work was supported in part by the Racing Foundation under Grants 285/2018, MURI/EPSRC, and EP/P008461, and in part by the Dementia Research Institute at Imperial College London.
Contributor Information
Harry J. Davies, Email: harry.davies14@imperial.ac.uk.
Ghena Hammour, Email: ghena.hammour17@imperial.ac.uk.
Hongjian Xiao, Email: hongjian.xiao18@imperial.ac.uk.
Patrik Bachtiger, Email: p.bachtiger@imperial.ac.uk.
Alexander Larionov, Email: alexander.larionov15@imperial.ac.uk.
Philip L. Molyneaux, Email: p.molyneaux@imperial.ac.uk.
Nicholas S. Peters, Email: n.peters@imperial.ac.uk.
Danilo P. Mandic, Email: d.mandic@imperial.ac.uk.
References
- [1] Xie M., Liu X., Cao X., Guo M., and Li X., “Trends in prevalence and incidence of chronic respiratory diseases from 1990 to 2017,” Respir. Res., vol. 21, no. 1, Feb. 2020, Art. no. 49.
- [2] Viegi G., Pistelli F., Sherrill D. L., Maio S., Baldacci S., and Carrozzi L., “Definition, epidemiology and natural history of COPD,” Eur. Respir. Soc., vol. 30, no. 5, pp. 993–1013, Nov. 2007.
- [3] Thurlbeck W. M. and Müller N. L., “Emphysema: Definition, imaging, and quantification,” Amer. J. Roentgenol., vol. 163, no. 5, pp. 1017–1025, 1994.
- [4] Heard B. E., Khatchatourov V., Otto H., Putov N. V., and Sobin L., “The morphology of emphysema, chronic bronchitis, and bronchiectasis: Definition, nomenclature, and classification,” J. Clin. Pathol., vol. 32, no. 9, 1979, Art. no. 882.
- [5] Roman-Rodriguez M. and Kaplan A., “GOLD 2021 strategy report: Implications for asthma–COPD overlap,” Int. J. Chronic Obstructive Pulmonary Dis., vol. 16, 2021, Art. no. 1709.
- [6] Patel A. R., Patel A. R., Singh S., Singh S., and Khawaja I., “Global initiative for chronic obstructive lung disease: The changes made,” Cureus, vol. 11, no. 6, Jun. 2019, Art. no. e4985.
- [7] Tobin M. J., Chadha T. S., Jenouri G., Birch S. J., Gazeroglu H. B., and Sackner M. A., “Breathing patterns: 2. Diseased subjects,” Chest, vol. 84, no. 3, pp. 286–294, 1983.
- [8] Wilkens H. et al., “Breathing pattern and chest wall volumes during exercise in patients with cystic fibrosis, pulmonary fibrosis and COPD before and after lung transplantation,” Thorax, vol. 65, no. 9, pp. 808–814, 2010.
- [9] King T. E., Pardo A., and Selman M., “Idiopathic pulmonary fibrosis,” Lancet, vol. 378, no. 9807, pp. 1949–1961, Dec. 2011.
- [10] Daniels J., “Portable respiratory gas collection equipment,” J. Appl. Physiol., vol. 31, no. 1, pp. 164–167, Jul. 1971.
- [11] Dressendorfer R. H., Wade C. E., and Bernauer E. M., “Combined effects of breathing resistance and hyperoxia on aerobic work tolerance,” J. Appl. Physiol., vol. 42, no. 3, pp. 444–448, 1977.
- [12] Love R. G., Muir D. C., Sweetland K. F., Bentley R. A., and Griffin O. G., “Acceptable levels for the breathing resistance of respiratory apparatus: Results for men over the age of 45,” Occup. Environ. Med., vol. 34, no. 2, pp. 126–129, May 1977.
- [13] Kido S. et al., “Effects of combined training with breathing resistance and sustained physical exertion to improve endurance capacity and respiratory muscle function in healthy young adults,” J. Phys. Ther. Sci., vol. 25, no. 5, pp. 605–610, May 2013.
- [14] Meredith D. J., Clifton D., Charlton P., Brooks J., Pugh C. W., and Tarassenko L., “Photoplethysmographic derivation of respiratory rate: A review of relevant physiology,” J. Med. Eng. Technol., vol. 36, no. 1, pp. 1–7, Mar. 2012.
- [15] Nilsson L., Johansson A., and Kalman S., “Respiratory variations in the reflection mode photoplethysmographic signal. Relationships to peripheral venous pressure,” Med. Biol. Eng. Comput., vol. 41, no. 3, pp. 249–254, May 2003.
- [16] Davies H. J., Bachtiger P., Williams I., Molyneaux P. L., Peters N. S., and Mandic D., “Wearable in-ear PPG: Detailed respiratory variations enable classification of COPD,” IEEE Trans. Biomed. Eng., vol. 69, no. 7, pp. 2390–2400, Jul. 2022.
- [17] Paredi P. et al., “Comparison of inspiratory and expiratory resistance and reactance in patients with asthma and chronic obstructive pulmonary disease,” Thorax, vol. 65, no. 3, pp. 263–267, 2010.
- [18] Mead J., Turner J. M., Macklem P. T., and Little J. B., “Significance of the relationship between lung recoil and maximum expiratory flow,” J. Appl. Physiol., vol. 22, no. 1, pp. 95–108, Jan. 1967.
- [19] Davies H. J., Williams I., Peters N. S., and Mandic D. P., “In-ear SpO2: A tool for wearable, unobtrusive monitoring of core blood oxygen saturation,” Sensors, vol. 20, no. 17, 2020, Art. no. 4879.
- [20] Davies H. J., Mandic D. P., and Peters N. S., “Detection and monitoring of respiratory conditions with photoplethysmography (PPG),” GB Patent WO/2024/023518, Feb. 1, 2024. [Online]. Available: https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2024023518&_cid=P10-LS3239-97277-1
- [21] Dames K. K., Lopes A. J., and Melo P. L. D., “Airflow pattern complexity during resting breathing in patients with COPD: Effect of airway obstruction,” Respir. Physiol. Neurobiol., vol. 192, no. 1, pp. 39–47, Feb. 2014.
- [22] Rehman N. and Mandic D. P., “Multivariate empirical mode decomposition,” in Proc. Roy. Soc. A, Math. Phys. Eng. Sci., vol. 466, no. 2117, pp. 1291–1302, May 2010.
- [23] Paszke A. et al., “PyTorch: An imperative style, high-performance deep learning library,” in Proc. Int. Conf. Neural Inf. Process. Syst., 2019, pp. 8024–8035.