Chronic Obstructive Pulmonary Diseases: Journal of the COPD Foundation. 2015 Jul 8;2(4):321–342. doi: 10.15326/jcopdf.2.4.2014.0157

Systematic Review of the Association Between Laboratory- and Field-Based Exercise Tests and Lung Function in Patients with Chronic Obstructive Pulmonary Disease

Martin Bell 1, Iain Fotheringham 1, Yogesh Suresh Punekar 2, John H Riley 3, Sarah Cockle 2, Sally J Singh 3
PMCID: PMC5556828  PMID: 28848854

Abstract

Introduction: Typical symptoms of chronic obstructive pulmonary disease (COPD) include breathlessness and reduced exercise capacity. Several laboratory- and field-based exercise tests are used to assess the exercise capacity of patients with COPD. It is unclear whether these exercise tests reflect the spirometric measures recommended for diagnosis of COPD. We therefore aimed to systematically assess the correlation between these exercise tests and common measures of lung function.

Methods: A search of Embase™, MEDLINE® and The Cochrane Library identified primary publications in English that reported data on the correlations (Pearson’s r or Spearman’s rho) between the outcomes of exercise tests and the physiological measures of interest: forced expiratory volume in 1 second (FEV1), forced vital capacity, inspiratory capacity and arterial oxygen saturation. We included studies reporting on the following exercise tests: 6- and 12-minute walk tests (6MWT and 12MWT), incremental and endurance shuttle walk tests, incremental and endurance cycle ergometer tests, and treadmill tests.

Results: Of 1781 articles screened, 45 were ultimately deemed eligible for inclusion in this review. The most commonly reported lung function variable was FEV1 (reported by 39 studies); the most commonly reported exercise test was the 6-minute walk test (reported by 24 studies). FEV1 appears to correlate moderately to strongly with the 6MWT and 12MWT, and moderately to very strongly with incremental cycle ergometer tests (ICET); evidence for other exercise tests was limited.

Conclusion: There is evidence that 6MWT, 12MWT and ICET correlate with FEV1 to some degree; evidence for associations of other exercise tests with measures of lung function in patients with COPD is limited. Clinicians must consider this when deciding to use these tests. Further comparisons of these tests must be made in order to assess which physiological and hemodynamic characteristics they reflect in patients with COPD.

Keywords: copd, chronic obstructive pulmonary disease, systematic review, forced expiratory volume in 1 second, FEV1, exercise capacity

Introduction

Supplemental Material

This article contains supplemental material.

Chronic obstructive pulmonary disease (COPD) is a leading cause of death worldwide and its global prevalence is projected to increase.1-3 COPD is characterized by breathlessness, episodes of exacerbations and reduced exercise capacity.4 COPD can lead to a progressive loss of daily activities and increased sedentary behavior, further exacerbating exercise capacity impairment.5,6 Even in patients with mild COPD, physical activity7 and exercise performance8 are compromised, and exercise tolerance is increasingly attenuated with disease progression.7 The mechanisms underlying reduced exercise capacity in patients with COPD are varied, but include increased metabolic costs of breathing9; deficits in gas exchange and ventilatory mechanics6; and peripheral muscle dysfunction.8,10

Spirometry is recommended by the Global initiative for chronic Obstructive Lung Disease (GOLD) for the diagnosis of COPD.4 However, spirometry alone is a poor predictor of disability and quality of life in patients with COPD11,12 and correlates only weakly with dyspnea and health status.12,13,14 In contrast, exercise test outcomes have been shown to have good prognostic capabilities in patients with COPD.15-21 Guidelines published by the National Institute for Clinical Excellence and the American Thoracic Society (ATS)/European Respiratory Society on the diagnosis and treatment of COPD now indicate that prognosis and assessment of disease severity are improved by using functional criteria such as exercise capacity.22,23 Furthermore, the European Medicines Agency also supports the assertion that exercise testing in the clinical setting is a useful tool in COPD prognosis and in monitoring the effectiveness of therapeutic intervention.24

Several test modalities are available for the assessment of exercise capacity in patients with limited exercise tolerance; the most common include the 6- and 12-minute walk tests (6MWT and 12MWT), the incremental and endurance shuttle walk tests (ISWT and ESWT), incremental and endurance cycle ergometer tests (ICET and ECET), and incremental treadmill tests (TT). All are well established for clinical use in areas such as cardiovascular disease.25 However, it is currently unknown which of these tests best represents the physiological constraints of the disease. The relationship between exercise test performance and the spirometric measurement forced expiratory volume in 1 second (FEV1)26,27,28 has been established29; however, relationships with other key parameters such as forced vital capacity (FVC)30 and inspiratory capacity (IC),31,32 as well as downstream manifestations of impaired lung function such as reduced partial pressure of arterial oxygen (PaO2),33 have not.

This systematic review was conducted to assess the correlation between the main outcomes of exercise tests and the most commonly reported physiological and systemic measures of impaired lung function (FEV1, FVC, IC and PaO2) in patients with COPD.

Methods

Search Strategy

Literature searches were conducted using Ovid® (Ovid Technologies Inc., New York, New York), incorporating Ovid MEDLINE® (U.S. National Library of Medicine, Bethesda, Maryland) for the period from 1948 to January 22, 2013, Ovid Embase™ (Elsevier Inc., Philadelphia, Pennsylvania) from 1974 to January 22, 2013, and The Cochrane Library (John Wiley & Sons Ltd, Hoboken, New Jersey) from 1962 to January 22, 2013. Search strings were constructed to identify studies reporting primary data on the outcomes of the following exercise tests in patients with COPD (including emphysema- and bronchitis-specific studies): 6MWT, 12MWT, ISWT, ESWT, ICET, ECET and TT. The full search strings used have been published previously.34 An example Embase search string is given in the online supplement (Figure S1).

Study Selection

Study selection followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for performing a systematic literature review.35 One researcher screened each reference for inclusion based on title and abstract, and a second researcher performed a full quality-control check. A third researcher resolved any disputes. All publications that met entry criteria for the review were obtained as full articles and reassessed against the review criteria. Data from the selected studies were subsequently used to populate predefined summary tables. All data were fully checked by a second analyst. The review criteria are shown in Table 1. Publications were initially screened based on titles and abstracts, and full articles were reviewed when their relevance was unclear from the abstract. Publications were excluded if they were review articles, were not in English, studied patients with confounding comorbidities (e.g., cancers or diabetes), were unclear on the precise variables used for regression analysis or examined an inappropriate intervention (e.g., non-bronchodilatory pharmacotherapy or homeopathy). Studies were subsequently included for assessment only if they reported data on the correlations (Pearson’s r [r] and/or Spearman’s rho [ρ]) between the outcomes of any of the pre-specified exercise tests and the physiological measures of interest: FEV1, FVC, IC and PaO2.

Data Abstraction

Data were primarily abstracted by a single author (M.B.) and reviewed by all co-authors. A randomly generated selection of 30% of all articles was reviewed by a second author (I.F.) for quality control purposes. Extracted study characteristics were:

1) Study objectives (prospective/retrospective)
2) Study inclusion/exclusion criteria
3) Study population size
4) Population baseline characteristics (age, gender, body mass index [BMI], disease severity [staging method and score] and pulmonary function)
5) Methodological information (ECM [protocol, period] and univariate analysis)
6) Results, pre-test physiological measures: PaO2, arterial oxygen saturation (SaO2) (%), FEV1 (%), FVC (%), FEV1 (L), FVC (L), IC (L), IC (% pred), functional residual capacity (FRC) (% pred), total lung capacity (TLC) (L), TLC (% pred), residual volume (RV) (L), RV (% pred), IC/TLC (%), measure of strength, strength, measure of physical activity, physical activity
7) Results, peak physiological measures during test: oxygen saturation by pulse oximetry (SpO2), oxygen consumption (VO2), VO2/kg, heart rate (HR)
8) Results, patient-reported outcomes: exertion (Borg scale), measure of dyspnea, dyspnea, measure of health-related quality of life (HRQoL), HRQoL
9) Results, demographics: age, sex, height, weight, BMI
10) Multivariate analyses to explain variance in ECM (parameters, analysis, r2)
11) Discussion (conclusions, limitation, comment)

The following outcomes of exercise tests were recorded: distance or stages achieved for the 6MWT, 12MWT, ISWT; duration of exercise for the ESWT and ECET; and the highest recorded volume of oxygen consumption (peak VO2) and maximum workload (Wmax) for the TT and ICET. Publications involving studies assessing multivariate regressions were not included owing to the multifactorial nature of the statistical approach and the unsuitability of the output for aggregation.

Statistical Analysis

Pearson’s and Spearman’s correlations between lung function test results and the most commonly reported exercise test outcomes are presented. Pearson’s correlations are often used to describe the linear association between 2 continuous variables. Spearman’s correlations are commonly used to describe the monotonic association between 2 variables on the basis of ranked (ordinal) data. Correlations are presented as the range of significant values reported in the study publications reviewed. Only those correlations deemed to have achieved significance by the authors of the original articles (i.e., p<0.05) were included in our descriptive data analysis. However, all correlation statistics generated, regardless of significance, were extracted when available; many studies did not provide r values for non-significant correlations. The strength of (significant) correlations is classified according to British Medical Journal guidelines, which regard significant correlation coefficients of 0.0–0.19 as very weak, 0.20–0.39 as weak, 0.40–0.59 as moderate, 0.60–0.79 as strong and 0.80–1.00 as very strong correlations.
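The two coefficients and the strength bands described above can be illustrated with a minimal Python sketch. It is not the analysis code used in this review, which extracted published coefficients rather than computing them. The function names, the example values in `fev1_l` and `walk_m`, and the choice to treat each band's upper limit as exclusive (so that a value such as 0.195 is classed as very weak) are all assumptions made for illustration. The magnitude of the coefficient is used, so a negative coefficient is classed by its absolute value.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: linear association between two continuous variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of the data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Upper limits (exclusive, an assumption) of the bands quoted in the text.
BANDS = [(0.20, "very weak"), (0.40, "weak"), (0.60, "moderate"), (0.80, "strong")]


def classify_strength(coefficient):
    """Map a correlation coefficient to the strength label of its band."""
    magnitude = abs(coefficient)
    if magnitude > 1:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    for upper, label in BANDS:
        if magnitude < upper:
            return label
    return "very strong"


# Hypothetical data for six patients (not taken from any included study).
fev1_l = [0.8, 1.1, 1.3, 1.6, 2.0, 2.4]   # FEV1 in liters
walk_m = [210, 300, 280, 390, 410, 520]   # 6MWT distance in meters

r = pearson_r(fev1_l, walk_m)
rho = spearman_rho(fev1_l, walk_m)
print(f"Pearson r = {r:.2f} ({classify_strength(r)})")
print(f"Spearman rho = {rho:.2f} ({classify_strength(rho)})")
```

Applied to a reported range, the same function gives the descriptive labels used in the Results: for example, `classify_strength(0.23)` returns `"weak"` and `classify_strength(0.62)` returns `"strong"`.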

Inclusion/Exclusion Criteria

Owing to a lack of high-quality evidence for associations between these tests and our stated measures of lung function, we included observational studies in our final analysis in addition to randomized controlled trials. Within the results of this search, we reviewed articles to identify those presenting Pearson’s and Spearman’s correlations between FEV1, FVC, IC and PaO2, and the most commonly reported exercise test outcomes (described above). Studies reporting lung function variables only as a percentage of age-, sex- and BMI-predicted values were excluded from these results.

Results

Overview of Identified Studies

The PRISMA-compliant search methodology used to identify relevant articles is summarized in Figure 1. Of 1781 articles screened, 45 studies were ultimately deemed eligible for inclusion in this review. Table 2 provides a summary of included studies.

Correlations Between Exercise Test Outcomes and FEV1

A total of 39 studies16,36-73 reported significant Pearson’s correlations between an exercise test outcome and baseline FEV1 (Table 3). The ranges of correlations reported by studies are presented in Figure 2.

FEV1 and 6MWT

The most commonly reported test was the 6MWT; of 17 studies16,36,37,39-41,43,44,46,47,52,54,56,62,65,66,72 assessing Pearson’s correlations, 12 studies16,36,39-41,43,47,52,62,65,66,72 showed significant correlations (weak to strong; r = 0.23–0.62) and 5 reported no statistically significant Pearson’s correlation between the 6MWT and FEV1.37,44,46,54,56 Additionally, 2 studies42,63 out of 3 studies42,63,73 assessing Spearman’s correlations between FEV1 and 6MWT reported significant correlations (moderate; ρ = 0.41–0.44), with the remaining study73 reporting no correlation.

FEV1 and 12MWT

The 12MWT was also reported frequently, with 5 studies38,48,58,60,68 out of 8 studies38,48,58-61,64,68 reporting significant Pearson’s correlations between distance achieved and baseline FEV1 (very weak to strong; r = 0.15–0.62); the remaining 3 studies reported no significant Pearson’s correlations.59,61,64

FEV1 and ICET

The strongest relationship between FEV1 and exercise tests was in studies reporting ICET correlations; all 7 studies45,47,49,54,59,62,71 assessing Pearson’s correlations between FEV1 and peak VO2 achieved during ICET reported significant correlations (moderate to very strong; r = 0.42–0.83); in 5 studies in which Wmax was assessed as the ICET outcome, correlations were also significant (weak to very strong; r = 0.34–0.81).41,55,57,71,72 These observations were supported by 1 further study, which reported positive Spearman’s correlations for peak VO2 and Wmax during the ICET (ρ = 0.37 and 0.55, respectively).63

FEV1 and ISWT and TT

Limited evidence was available for other test outcomes and baseline FEV1; 1 study each reported significant Pearson’s correlations between FEV1 and ISWT (weak to moderate; r = 0.22–0.51)69 and TT (moderate; r = 0.47, p<0.05)70 (Table 3; Figure 2a).

Correlations Between Exercise Test Outcomes and FVC

Baseline FVC and exercise test outcomes were reported by 18 studies (Table 3; Figure 2b).38,39,41-46,49,50,53,58-62,70

FVC and 6MWT

Again, the most commonly reported Pearson’s correlations were between the 6MWT and FVC; 4 studies39,41,43,62 out of 6 studies39,41,43,44,46,62 assessing Pearson’s correlations reported significant correlations (weak to moderate; r = 0.32–0.59), with the remaining 2 studies44,46 reporting no significant Pearson’s correlation. One further study reported a significant Spearman’s correlation (moderate; ρ = 0.54).42

FVC and 12MWT

The 12MWT was also assessed frequently, with 4 studies38,58-60 of 6 studies38,58-61,64 reporting significant Pearson’s correlations (very weak to moderate; r = −0.16–0.41), with 1 study38 reporting a negative correlation. The remaining 2 studies reported that Pearson’s correlations were not significant.61,64

FVC and ICET, TT and VO2

Of 4 studies45,49,59,62 assessing associations between FVC and peak VO2 obtained during ICET, 3 studies45,49,59 presented significant Pearson’s correlations (moderate to strong; r = 0.54–0.67); the remaining study reported that Pearson’s correlations were not significant.62 One further study reported a significant Pearson’s correlation between FVC and Wmax achieved during ICET (moderate; r = 0.58).41 One other study reported a significant Pearson’s correlation between FVC and peak VO2 during TT (strong; r = 0.63).70

Correlations Between Exercise Test Outcomes and IC

Baseline IC was reported in 6 studies (Table 3; Figure 2c).41,42,50,52,73,74

IC and 6MWT

Of these, all 3 studies41,52,74 assessing Pearson’s correlations for IC and the 6MWT found significant relationships (weak to strong; r = 0.38–0.62); a further 2 studies also reported significant Spearman’s values for this relationship (moderate; ρ = 0.51 and 0.57).42,73

IC and ISWT, ICET and TT

Significant moderate correlations were also reported between IC and ISWT (ρ = 0.50), and Wmax during ICET (r = 0.59) and TT (ρ = 0.48).50

Correlations Between Exercise Test Outcomes and Exercise-induced Changes in PaO2

A total of 14 studies assessed correlations between change in PaO2 and exercise outcomes (Table 3; Figure 2d).

PaO2 and 6MWT

Three44,47,75 of 5 studies39,40,44,47,75 assessing Pearson’s correlations between 6MWT and PaO2 reported significant correlations (very weak to weak; r = 0.15–0.35), with the 2 remaining studies39,40 reporting no significant correlation. Of 3 studies63,76,77 reporting Spearman’s correlations between 6MWT and PaO2, 2 studies63,76 found no significant association and 1 study77 found a significant correlation (moderate; ρ = 0.42).

PaO2 and 12MWT

Pearson’s correlations for 12MWT were assessed in 4 studies48,58,68,78: 2 studies48,68 reported moderate correlations (r = 0.42–0.44), with the remaining 2 studies58,78 reporting no correlation.

PaO2 and ISWT and ICET

Of 3 studies reporting Spearman’s correlations for the ISWT,50,76,77 2 studies50,77 found moderate correlations (ρ = 0.42–0.53), with 1 study76 reporting no correlation. Of the 3 studies assessing ICET (peak VO2),47,63,80 2 studies47,80 reported only weak Pearson’s correlations (r = 0.21–0.28), with 1 study63 reporting no Spearman’s correlation.

Correlations Between Exercise Test Outcomes and TLC

Additionally, it was anticipated that associations between exercise and TLC would be included in the review. However, too few studies were found, and these associations were therefore not included in the final results.

Discussion

This study has shown that there are limited data supporting strong correlations between exercise test outcomes and commonly used assessments of lung function. FEV1 appears to correlate well with the outcomes of the ICET (both VO2 and Wmax). The association between the most commonly used field-based tests of exercise capacity, the 6MWT and 12MWT, and FEV1 is unclear.

FEV1 is used as the main diagnostic criterion for COPD,4,22,23 and the European Medicines Agency also suggests that pre- and post-bronchodilator FEV1, both at baseline and repeatedly during follow-up, be used to demonstrate the efficacy of therapeutic interventions in clinical trials.24 FEV1 correlates better with laboratory-based tests such as the ICET (primarily moderate to very strong correlations) than with field-based tests such as the 6MWT and the 12MWT (although these do consistently demonstrate weak to moderate correlations). The ICET also appeared to have the closest relationship to FVC and IC, albeit with very limited evidence. Conversely, ICET correlated only weakly with PaO2 in studies reporting this relationship.

Some caution is warranted in placing too much emphasis on lung function alone as a gold standard assessment in COPD relative to exercise tests. While individual lung function measurements, such as FEV1, are used in diagnosing the severity of COPD and predicting mortality,27 it should be remembered that COPD patients have systemic disease manifestations that are not necessarily reflected by a single lung function result. Patients with similar FEV1 may nevertheless have significantly different functional defects not captured by this test. Exercise tests, in measuring whole lung functionality, may be expected to correlate imprecisely with individual lung function parameters. Furthermore, their design may capture the systemic aspects of COPD lung dysfunction more effectively and thus provide additional prognostic information. In fact, several prospective studies have shown that 6MWT is a better predictor of mortality than FEV1 in patients with severe COPD,16,81 and coupling 6MWT output to individual lung function parameters like FEV1 and PaO2 has proven utility82,83 and underpins the rationale for the multidimensional grading system for COPD, the BMI, airflow Obstruction, Dyspnea and Exercise capacity (BODE) index.84

The observation that the 6MWT and the 12MWT are the most often reported in conjunction with measures of lung function is unsurprising as they are well established, require little equipment, training or preparation, and (for the 6MWT at least) reference values for the minimal clinically important difference are available. Of the laboratory-based tests, the ICET is by far the most widely used. However, this serves to highlight the paucity of data reporting the relationship between other exercise tests and measures of lung function. When reported, for example, the ISWT and TT exhibited mostly moderate to good correlations with the 4 physiological parameters assessed in this review. However, it is difficult to draw definitive conclusions about the applicability of these tests when the data are so rarely reported. Correlations between FEV1, FVC, IC and PaO2 and the ESWT and ECET have so seldom been reported that no meaningful interpretation of these relationships can be made.

Exercise tests, such as those reviewed here, are used to assess the exercise capacity of patients with COPD. These tests are important because the systemic consequences of COPD include reduced exercise capacity and ensuing decreases in physical activity. However, the findings of this systematic review suggest that the relationships between exercise and FEV1, FVC, IC and PaO2 are under-reported for most tests, and even for the most commonly reported tests, these associations are often equivocal. This suggests that although the information obtained from these tests may be of use in assessing exercise tolerance, caution should be used before applying the results of these tests to make assessments of physiological effects of COPD. Exercise capacity appears to be such a multi-factorial outcome that it is difficult to conclusively link test performance to any of the physiological variables reviewed. This supports a recent systematic review that qualitatively compared patients’ performance in these exercise tests and found no discernible advantage of any particular test.34

Limitations of this review include the wide range of study designs and patient cohorts involved. Furthermore, the associations between lung function and exercise performance are most probably not adjusted for important confounders such as age, gender, height, comorbidities and weight. It is also possible that study results are confounded by limited patient numbers. Using the most commonly reported association as an example, the 12 studies reporting a significant Pearson’s correlation between 6MWT and FEV1 had a median n of 108 (range: 38–1218); the 5 studies that reported no significant association between these parameters had a median n of 39 (range: 20–88). It is therefore possible that significant associations could be underreported owing to type II statistical error. On the other hand, where r/ρ are reported for correlations between the same 2 parameters, there is a tendency for larger populations to have a lower value and associated lower significance, suggesting that smaller populations can over-emphasize a genuine relationship. Both of these factors must be considered when designing studies assessing these exercise tests as well as the ability of pharmacological interventions to affect their output. Other confounding variables that are difficult to control for in such a review include whether or not the guidelines from the ATS were strictly adhered to in all the tests. This is particularly important regarding the technical aspects of tests such as 6MWT where even small deviations in methodology can influence output.85 Finally, the inclusion criteria and COPD severity are often not clearly stated by the studies included in this review. Therefore, there is a risk that the patients in the included studies are not broadly homogeneous. However, in this case the weakness lies in the reporting of studies, and we recommend that future studies clearly state inclusion criteria and clinical rationale for diagnosis whenever possible.
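The dependence of statistical significance on sample size can be made concrete with a rough calculation. The Python sketch below estimates the smallest correlation magnitude that would reach two-sided p<0.05 for a given n, using Fisher's z transformation as a large-sample approximation. This is an illustrative assumption: the included studies did not necessarily test significance this way, the exact t-based threshold differs slightly, and the function name `approx_critical_r` is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def approx_critical_r(n, alpha=0.05):
    """Approximate smallest |r| reaching two-sided significance at level alpha.

    Uses Fisher's z transformation: under the null hypothesis, atanh(r) is
    approximately normal with standard error 1 / sqrt(n - 3). This is an
    approximation, not the exact t-distribution test.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3 for this approximation")
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return tanh(z_crit / sqrt(n - 3))


# Median sample sizes quoted in the text for the 6MWT versus FEV1 studies.
for label, n in [("non-significant studies", 39), ("significant studies", 108)]:
    print(f"{label}: n = {n}, approximate critical |r| = {approx_critical_r(n):.2f}")
```

Under this approximation, a study with n = 39 needs a correlation of roughly 0.32 to reach significance, whereas a study with n = 108 needs only about 0.19. A true weak correlation could therefore be reported as non-significant in the smaller studies, which is consistent with the type II error concern raised above.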

Recent guidelines on the diagnosis and treatment of COPD indicate that assessment of disease severity is improved by using functional criteria such as exercise capacity.4,22,23 However, no distinction is made in these guidelines between the different exercise tests. For example, ICET and 6MWT output are thought to measure different physio-biochemical variables,86 and it has been argued that the latter is a better reflection of a patient’s ability to carry out daily activities.87 The current findings suggest that clinicians or investigators wishing to assess exercise capacity in patients with COPD must carefully consider the physiological consequences of COPD when interpreting the results of these tests. In particular, based on our review of the available data, it is important not to choose an exercise test based solely on patient lung function. Rather, tests should be chosen according to the effect that a pharmaceutical agent is anticipated to have, and hence its ability to influence the test outcome. For example, the appropriate test for an agent that primarily affects the lungs and subsequently improves measures of lung function, volume and breathlessness may well differ from that for an agent anticipated to have more systemic consequences.

Abbreviations

chronic obstructive pulmonary disease, COPD; forced expiratory volume in 1 second, FEV1; 6-minute walk test, 6MWT; 12-minute walk test, 12MWT; incremental cycle ergometer test, ICET; Global initiative for chronic Obstructive Lung Disease, GOLD; American Thoracic Society, ATS; incremental shuttle walk test, ISWT; endurance shuttle walk test, ESWT; endurance cycle ergometer test, ECET; treadmill test, TT; forced vital capacity, FVC; inspiratory capacity, IC; partial pressure of arterial oxygen, PaO2; Preferred Reporting Items for Systematic Reviews and Meta-Analyses, PRISMA; body mass index, BMI; arterial oxygen saturation, SaO2; liters, L; functional residual capacity, FRC; total lung capacity, TLC; residual volume, RV; oxygen saturation (pulse oximetry), SpO2; oxygen consumption, VO2; heart rate, HR; health-related quality of life, HRQoL; BMI, airflow Obstruction, Dyspnea & Exercise Capacity index, BODE; highest volume of oxygen consumption achieved, peak VO2; highest workload achieved, Wmax; National Institute for Health Research, NIHR; Collaboration for Leadership in Applied Health Research and Care East Midlands, CLAHRC EM; National Health Service, NHS; British Thoracic Society, BTS; European Respiratory Society, ERS; interquartile range, IQR; inspiratory slow vital capacity, IVC; kilopascal, kPa; Medical Research Council, MRC; maximal voluntary ventilation, MVV; not reported, NR; partial pressure of arterial carbon dioxide, PaCO2; respiratory exchange ratio, RE; standard deviation, SD; vital capacity, VC; correlation, Corr; studies reporting no significant correlation, NS; Spearman’s rank correlation coefficient, ρ; Pearson’s correlation coefficient, r

Funding Statement

The study was funded by GlaxoSmithKline, Uxbridge, United Kingdom.
