Abstract
Incremental exercise consists of three domains of exercise intensity demarcated by two thresholds. The first of these thresholds, derived from gas exchange measurements, is defined as the metabolic threshold (V̇O2θ) above which lactate accumulates. Correctly and reliably identified, V̇O2θ is a non-invasive, sub-maximal marker of aerobic function with practical value. This investigation compared variability in selection of V̇O2θ among interpreters with different levels of experience as well as from auto-detection algorithms employed by a commercially available metabolic cart (MC). Ten healthy young men performed three replicates of incremental cycle exercise during which gas exchange measurements were collected breath-by-breath. Two experienced interpreters (E) and four novice interpreters (N) determined V̇O2θ from plots of specific response variables. Interpreters noted methods used and confidence in their selections. V̇O2θ was automatically determined by the MC. Interclass correlations indicated that E agreed with each other (mean difference, 21 mL·min-1) and with the MC (23 mL·min-1), but not with N (-664 to 364 mL·min-1); N did not agree among themselves. Despite good overall agreement between E and MC, differences >500 mL·min-1 were seen in 20% of individual cases. N expressed unduly higher confidence and used different V̇O2θ selection strategies compared with E. Experience and use of a systematic approach are essential for correctly identifying V̇O2θ. Current guidelines for exercise testing and interpretation do not include recommendations for such an approach. Data from this study suggest that this may be a serious shortcoming. Until an alternative schema for V̇O2θ detection is developed prospectively, strategies based on the present study will give practitioners a systematic and consistent approach to threshold detection.
Key points.
Experience and use of a systematic approach are essential for correctly identifying the metabolic threshold (V̇O2θ).
Current guidelines for exercise testing and interpretation do not include recommendations for such an approach.
Until an alternative schema for V̇O2θ detection is developed prospectively, strategies based on the present study will give practitioners a systematic and consistent approach to threshold detection.
Key words: Incremental exercise testing, oxygen uptake, metabolic threshold, lactate threshold
Introduction
During incremental exercise there exist three domains of exercise intensity separated by two thresholds (Tschakert and Hofmann, 2013). The lower intensity domain represents aerobic muscle metabolism where the increase in carbon dioxide output (V̇CO2) matches the increase in oxygen uptake (V̇O2). Within this domain, there are small increases in blood lactate but its level remains stable. Within the intermediate intensity domain, a sustained increase in blood lactate occurs, and CO2 produced during cellular respiration (“metabolic CO2”) is supplemented by “additional CO2” derived from bicarbonate buffering of lactic acid. This results in a disproportionate increase in V̇CO2 relative to V̇O2 (Cooper et al., 1992). We refer to this threshold between the lower and intermediate domains as the metabolic threshold (V̇O2θ) due to the metabolic shift that defines this transition. The same physiological transition has been described as the lactate or anaerobic threshold (Wasserman and McIlroy, 1964). Finally, the higher intensity domain is defined by further lactate accumulation and a disproportionate increase in expired ventilation (V̇E) relative to V̇CO2 (Cooper and Storer, 2001). Notwithstanding mechanistic and semantic arguments about the nature of the transition, identification of a threshold between the lower and intermediate exercise intensity domains has known practical application and value (Myers and Ashley, 1997). Specifically, V̇O2θ serves as a key parameter of aerobic function (Cooper and Storer, 2001).
As the point of demarcation between the low and intermediate exercise intensity domains, V̇O2θ is a non-invasive, sub-maximal, and effort-independent marker that has proved useful in clinical cardiopulmonary exercise testing. Equations exist for reference values and their lower 95% confidence limits for V̇O2θ as detected by gas exchange measures (Davis et al., 1997). Since V̇O2θ is reduced in cardiovascular disease (Wasserman and Whipp, 1975), chronic pulmonary disease (Cooper, 1995), end-stage renal disease (Mayer et al., 1988), and various forms of myopathy (Inbar et al., 2001; Tirdel et al., 1998), it may be used to detect an abnormal response to exercise when patient responses are compared to these reference values. V̇O2θ also provides an objective means by which to categorize relative exercise intensity. This is of value in prescribing training intensity for endurance exercise training or rehabilitation programs and for evaluating the efficacy of these programs (Coplan et al., 1986; Gibbons, 1987; Hughson and MacFarlane, 1981; Nieuwland et al., 2002). Furthermore, V̇O2θ has been utilized when prioritizing patients for heart transplantation as it is an effective predictor of mortality and morbidity associated with surgery (Older et al., 1999; Older et al., 1993) and a better predictor of 6-month mortality in patients with chronic heart failure (CHF) than V̇O2 max (Gitt et al., 2002).
Consequently, the utility of V̇O2θ hinges upon its precise identification. There is a scarcity of data, however, with which to estimate the ability of laboratory personnel or computer auto-detection routines to reproducibly identify V̇O2θ. Furthermore, there may be a perception that a rudimentary exposure to the concepts surrounding the detection of V̇O2θ is adequate for its reliable measurement. Differing methodologies used for threshold detection contribute to the confusion as well (Svedahl and MacIntosh, 2003). We are unaware of any data that compare levels of training and expertise of interpreters in the somewhat subjective identification of V̇O2θ. Most notably, there does not seem to be a systematic approach to threshold detection wherein specific data displays, procedural steps, and decision trees are available to aid the practitioner in reliably measuring this value.
Recommended criteria to be used in selecting V̇O2θ (the first threshold) are available in published reports including use of the “dual criteria” method (systematic rise in the ventilatory equivalent for oxygen, V̇E/V̇O2, while the ventilatory equivalent for carbon dioxide, V̇E/V̇CO2, does not increase) (Caiozzo et al., 1982). The “V-slope” (Beaver et al., 1986) and modified “V-slope” (Sue et al., 1988) methods that identify V̇O2θ from a plot of V̇CO2 versus V̇O2 (Figure 1) were not included in the analysis by Caiozzo et al. but have subsequently been shown to be superior to alternative methods of detection (Beaver et al., 1986; Sue et al., 1988). Current recommendations suggest use of a constellation of variables in selecting V̇O2θ. Boulay et al. (1984) studied different methods of threshold detection and developed a method whereby interpreters assigned a numerical value representing the confidence of each reader’s choice of V̇O2θ. Notably absent, however, is any systematic evaluation of the ability of laboratory personnel to correctly apply these recommendations in the reliable selection of V̇O2θ, specifically with respect to their expertise and experience in performing this function.
Clinical exercise laboratories not only use laboratory personnel to identify V̇O2θ, but often rely upon auto-detection of the value using commercially manufactured metabolic carts (MCs) that use unknown, proprietary algorithms. Such systems are often heavily depended upon to yield a useable value for V̇O2θ, but are rarely systematically validated by comparison with expert human interpreters and could potentially give misleading results and conclusions. The purpose of this investigation, therefore, was to compare the variability in V̇O2θ selected by interpreters of different levels of experience as well as by auto-detection algorithms employed by a commercially available metabolic cart. Implicit in this report is the expertise of the “experienced” interpreters as the standard against which detection of V̇O2θ by novice interpreters and auto detection by a metabolic cart is compared. As a result of this study, we present a systematic approach to V̇O2θ detection as an aid to its reliable identification.
Methods
Subjects
Ten healthy, non-smoking, recreationally active men volunteered as subjects and performed the ramp cycle ergometer exercise tests described below. Six other people interpreted the exercise data and were designated as interpreters. Two were considered experienced (E1 and E2) in detection of the metabolic threshold (V̇O2θ) on the basis of their previous training, research, publications, and teaching. The remaining four interpreters (N1-N4) were pulmonary fellows at various stages of training and although familiar with supervision of exercise tests and the concepts surrounding V̇O2θ, were considered novices in its correct detection. The MC used to acquire the gas exchange data (2900; SensorMedics Corporation, Yorba Linda, CA) has the capability of auto-detecting V̇O2θ using proprietary algorithms. The subjects gave informed consent for their participation in the study which was previously approved by the university’s institutional review board.
Exercise tests
The ten exercising subjects completed three maximal leg cycling tests using a 20-watt-per-minute ramp protocol administered on non-consecutive days over a 5-day period. A calibrated electrically braked ergometer (Type 800; Ergoline, Bitz, Germany) was used for all tests. The MC measured breath-by-breath pulmonary ventilation and gas exchange throughout the warm-up and exercise phases of each test. The subjects breathed through a one-way valve with V̇E and concentrations of O2 and CO2 measured downstream by a mass flow transducer and paramagnetic and near-infrared gas analyzers, respectively (Markovitz et al., 2004). The expired flow and gas concentration measurements were time aligned and corrected to standard conditions allowing the breath-by-breath calculation of oxygen uptake and carbon dioxide output.
Determination of V̇O2θ
All the interpreters were presented with randomized and coded data sets from the 30 (10 subjects x 3 trials) maximal exercise tests. These data sets included plots of the V̇CO2 versus V̇O2 relationship (Figure 1), as well as plots for the ventilatory equivalents for oxygen (V̇E/V̇O2) and carbon dioxide (V̇E/V̇CO2) versus V̇O2 (Figure 3A), and the end-tidal partial pressures for oxygen (PETO2) and carbon dioxide (PETCO2) versus V̇O2 (Figure 3B). Unlike some investigations, which allowed interpreters the freedom to choose their own method for detecting V̇O2θ (Gladden et al., 1985), we provided each interpreter with an outline of suggested approaches for its selection. No further information was given. The interpreters were allowed unlimited time for choosing V̇O2θ for each of the 30 plots and worked in isolation.
Briefly, it was suggested that interpreters first ascertain V̇O2θ using the V̇CO2 versus V̇O2 relationship alone (Figure 1), assigning a level of confidence (1 = high, 2 = medium, 3 = low) to the chosen V̇O2θ. So as not to confuse V̇O2θ with the ventilatory threshold (V̇CO2θ) interpreters were instructed to review Figure 2, the V̇E versus V̇CO2 relationship to identify the V̇CO2θ: the point at which V̇E increases out of proportion to the increase in V̇CO2 (Cooper and Storer, 2001). If confidence in the selection of V̇O2θ from the V̇CO2 versus V̇O2 relationship was 2 or 3, interpreters were asked to refer to Figures 3A and 3B which displayed V̇E/V̇O2, V̇E/V̇CO2, PETO2, and PETCO2 respectively. For each method used, interpreters were requested to provide V̇O2θ to the nearest 10-50 mL/min, the criteria used to select V̇O2θ, and the overall confidence level with which V̇O2θ was determined. Interpreters noted whether V̇O2θ was determined primarily from the V̇CO2 versus V̇O2 relationship, from data contained in Figures 3A and 3B, or from both.
An option on the automated MC system was selected to provide automated detection of V̇O2θ for each test through use of proprietary algorithms that utilize the V̇CO2 versus V̇O2 relationship alone. None of the interpreters had access to these automated determinations of V̇O2θ.
Statistical analysis
Computed values for V̇O2 and V̇CO2 were obtained breath-by-breath and then smoothed using a 9-breath rolling average technique. The data were presented in tabular and graphical format for interpretation. Descriptive statistics provided means and standard deviations for V̇O2θ, frequency distribution for the method used to detect V̇O2θ, and subjective levels of confidence in its detection. Agreement between E, differences between E and N, and differences between E and the MC’s auto-detection of V̇O2θ were determined by interclass correlation coefficients, Bland-Altman plots, and the 95% agreement limit. Linear regression was used to illustrate individual comparisons of V̇O2θ determined by the six interpreters and the automated MC. The first comparison was E1 versus E2. Due to their close agreement and since we did not have blood lactate measurements, it was posited a priori that V̇O2θ determined by the experienced interpreters would be the gold standard. Whilst we agree that this is not ideal, it does represent a realistic situation faced when interpreting incremental exercise tests. The mean of E1 and E2 was then used in comparison with N1-N4 and with V̇O2θ determined by the MC. We fit the data using Huber’s robust regression method and calculated confidence intervals (Fox, 1997). Bland-Altman plots (Bland and Altman, 1986) were used to illustrate individual comparisons of V̇O2θ determined by the six interpreters and the automated MC by plotting pair-wise differences between interpreters against their mean. Systematic bias was identified if the slope of the pair-wise comparison was significantly different from zero. Differences in confidence level as well as methods for detecting V̇O2θ were analyzed by Friedman’s repeated-measures ANOVA. Post-hoc analysis was performed using Dunn’s analysis and a P-value less than 0.05 was considered statistically significant.
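The 9-breath rolling average mentioned above can be sketched as follows. This is a minimal illustration only: the text does not state whether the window was trailing or centered, so a trailing window that is shorter for the first eight breaths is assumed here, and the function name `rolling_mean` is hypothetical.

```python
from collections import deque


def rolling_mean(values, window=9):
    """Trailing rolling mean over the last `window` breaths.

    Assumption: the window is trailing and simply shorter at the start
    of the record; the source does not specify either detail.
    """
    buf = deque(maxlen=window)  # keeps only the most recent `window` values
    smoothed = []
    for v in values:
        buf.append(v)
        smoothed.append(sum(buf) / len(buf))
    return smoothed


# Made-up breath-by-breath values, for illustration only.
breaths = list(range(1, 11))
smoothed = rolling_mean(breaths)
print(smoothed[8], smoothed[9])
```

A centered window would shift each smoothed value by four breaths, which matters when a threshold is read off against V̇O2.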
Results
All 10 exercising subjects completed the study; their mean (SD) age was 25 (5) years, body weight 77 (8) kg, and height 1.78 (0.09) m. Each interpreter was able to identify a V̇O2θ on each of the 30 exercise tests. Subject means (SD) for V̇O2θ determined by each interpreter, as well as the automated MC, are presented in Table 1. Interclass correlation statistics revealed that E1 and E2 had uniform responses in selecting V̇O2θ for the 10 subjects; therefore, the mean of E1 and E2 served as the gold standard to which all other comparisons were made. Linear regressions of V̇O2θ selected by E1 versus E2 (Panel A), the mean of E versus each N (Panels B-E), and the mean of E versus the MC (Panel F) are illustrated in Figure 4. The coefficient of determination (r2) for the E1 versus E2 relationship was 0.97, indicating a high correlation between V̇O2θ selected by E1 and E2. The remaining relationships shown in Figure 4B-F have substantially poorer correlations, as noted by smaller r2 values. Additionally, comparing the individual regression lines with the line of identity (x = y), clear overestimation of V̇O2θ by N1 and N2, and underestimation by N3, is readily apparent. Both N4 and MC estimates of V̇O2θ are high when V̇O2θ is lower and low when V̇O2θ is higher.
Table 1.
Subjects | E1 (a) | E2 (b) | N1 (c) | N2 (d) | N3 (e) | N4 (f) | MC (g) |
---|---|---|---|---|---|---|---|
S1 | 1550(98) | 1547(57) | 2253(590) | 1740(151) | 1520(61) | 1933(153) | 1665(159) |
S2 | 2217(250) | 2258(311) | 2823(91) | 2593(74) | 1793(397) | 2440(373) | 2279(105) |
S3 | 2440(233) | 2488(201) | 2857(423) | 2400(40) | 1763(336) | 220(132) | 1828(345) |
S4 | 1813(297) | 1715(332) | 2570(157) | 2493(290) | 1317(215) | 1970(281) | 2038(38) |
S5 | 1390(69) | 1460(46) | 2070(60) | 1773(306) | 1117(99) | 1660(331) | 1338(195) |
S6 | 1873(410) | 1871(431) | 2630(226) | 2130(279) | 1413(289) | 1967(162) | 1789(199) |
S7 | 2053(223) | 2100(256) | 2987(310) | 2823(222) | 2033(320) | 2700(350) | 2773(138) |
S8 | 2170(384) | 2239(316) | 2960(306) | 2320(72) | 1877(361) | 1873(329) | 1787(171) |
S9 | 1877(261) | 1890(282) | 2527(254) | 2113(390) | 1293(250) | 1817(325) | 1791(369) |
S10 | 1687(318) | 1717(320) | 2030(884) | 1920(442) | 1303(215) | 1767(347) | 1664(344) |
Mean | 1907(254) *d,e; #c | 1928(255) †e; #c | 2571(330) *e; #e,f,g | 2230(227) *g; #e | 1543(254) *g; #f | 2032(278) #c,e | 1895(206) *e; #c |
Superscript letters denote significant (* p < 0.05; † p < 0.01; # p < 0.001) differences between the columns. Values for each subject represent means (SD) from triplicate trials. Overall means (SD) are shown in the last row and represent averages and variability between subjects.
Bland-Altman plots (Figure 5) illustrate the degree of agreement between E1 and E2 (Panel A), the mean of E and N (Panels B, C, D, and E), and the mean of E and the MC (Panel F). Figure 5A indicates that the mean difference between E1 and E2 in selecting V̇O2θ was -21 mL·min-1. The 95% confidence band spanned a range of -217 mL·min-1 to 174 mL·min-1. Thus, on average, E agreed within 21 mL·min-1 across a measurement range of 1200 mL·min-1 to 2800 mL·min-1 and, 95 times out of 100, they agreed within ±195 mL·min-1. There was no evidence of systematic bias across the range of measurements, as indicated by a regression slope of 0.04 (p = 0.46). Figures 5B-E reveal large mean differences and wide scatter in individual comparisons of E and N. The 95% confidence intervals surrounding the mean differences for the four comparisons illustrated in Panels B-E of Figure 5 were similar to one another (±790, ±676, ±786, and ±775 mL·min-1, respectively) and nearly four times greater than the ±195 mL·min-1 confidence band for mean V̇O2θ selected by E.
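As a rough illustration of how a Bland-Altman mean difference and its 95% band are obtained, the sketch below applies the usual convention (mean difference ± 1.96 × SD of the differences, assumed here) to the ten subject means for E1 and E2 in Table 1. Because it uses subject means rather than the 30 individual tests, the mean difference comes out close to the reported value while the band is narrower than the ±195 mL·min-1 quoted above. The function name `bland_altman` is hypothetical.

```python
import statistics

# Subject means (mL/min) copied from Table 1, columns E1 and E2.
e1 = [1550, 2217, 2440, 1813, 1390, 1873, 2053, 2170, 1877, 1687]
e2 = [1547, 2258, 2488, 1715, 1460, 1871, 2100, 2239, 1890, 1717]


def bland_altman(a, b):
    """Return (mean difference, lower limit, upper limit) for paired readings.

    Assumption: limits are mean difference +/- 1.96 * sample SD of differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return bias, bias - half_width, bias + half_width


bias, lower, upper = bland_altman(e1, e2)
print(f"mean difference = {bias:.1f} mL/min")
print(f"95% limits = {lower:.0f} to {upper:.0f} mL/min")
```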
Slopes of the regression lines in Figure 5B-E were not significantly different from zero, suggesting no systematic bias when comparing E and N. Although the mean difference for the comparison between E and the MC (Figure 5F) was small (-23 mL·min-1) and not significantly different from zero, the dispersion of agreement was large, with a 95% confidence interval of ±828 mL·min-1.
The confidence level with which interpreters selected V̇O2θ and the method used in its selection are summarized in Table 2. Overall, differences between interpreters in confidence in identifying V̇O2θ were significant (p = 0.002). Post hoc analysis revealed, however, that the only significant difference in reported level of confidence was between N1, who reported high confidence in only 5 out of 30 cases, and N2, who reported high confidence in 21 out of 30 cases. Notably, the N who disagreed with the E had an unduly high level of confidence in their selection. Additionally, E and N tended to use different strategies to identify V̇O2θ (p < 0.0001). The multiple comparisons test indicated that, on average, E1 and E2 did not differ in their selection method. E tended to rely on both the V̇CO2 versus V̇O2 plot and the dual criteria plots while N tended to select V̇O2θ primarily from the V̇CO2 versus V̇O2 relationship. No significant difference in V̇O2θ selection method was observed among N, although N1 tended to rely more on dual criteria plots for selection of V̇O2θ.
Table 2.
Confidence | E1 | E2 | N1 | N2 | N3 | N4 | Mean E | Mean N |
---|---|---|---|---|---|---|---|---|
High | 40% | 43% | 17%* | 70%* | 57% | 57% | 42% | 50% |
Medium | 50% | 37% | 60% | 23% | 33% | 30% | 44% | 37% |
Low | 10% | 20% | 23% | 7% | 10% | 13% | 15% | 13% |
Method | | | | | | | | |
V̇CO2 vs. V̇O2 only | 30% | 13%* | 17% | 70%* | 67%* | 60%* | 22% | 53% |
Dual Criteria only | 3% | 23% | 47% | 3% | 10% | 10% | 13% | 18% |
Both | 67% | 63% | 37% | 27% | 23% | 30% | 65% | 29% |
Discussion
The metabolic threshold (V̇O2θ), or the first threshold identified during incremental exercise testing, is regarded as one of the four important parameters of aerobic performance (Whipp et al., 1981). Previous reports have identified several variables with specific response patterns that identify the physiological transitions between the light, moderate, and heavy domains of exercise intensity (Beaver et al., 1986; Caiozzo et al., 1982; Henson et al., 1989; Sue et al., 1988; Wasserman and McIlroy, 1964).
We have used the classical approach to threshold determination as described by Wasserman and McIlroy (1964) and acknowledged three phases of the exercise response. The first phase represents aerobic muscle metabolism where the evolution of exhaled carbon dioxide is proportional to oxygen uptake. This is best seen on the graphical plot of V̇CO2 versus V̇O2. During the second phase, additional CO2 is derived from bicarbonate buffering of lactic acid and, whilst a change in blood pH is minimized, there is an increase in the ventilatory equivalent for oxygen and the end-tidal oxygen partial pressure. At the same time, ventilation remains tightly coupled to V̇CO2 and the ventilatory equivalent for CO2 remains constant, as does the end-tidal partial pressure of CO2. This traditional threshold detection phenomenon of the “dual criteria” is clearly shown by the first vertical line in Figure 3. The second vertical red line in Figure 3 indicates the point where accumulation of lactic acid overwhelms the bicarbonate buffering capacity of the blood. When this happens, blood pH falls and unbuffered hydrogen ion stimulates the carotid bodies, causing true hyperventilation in response to metabolic acidosis. As a result, ventilation becomes uncoupled from V̇CO2 and this is reflected in an increase in the ventilatory equivalent for CO2 and a decrease in end-tidal CO2, as shown in Panels A and B of Figure 3, respectively.
There is general agreement that determination of V̇O2θ can be helpful in the interpretation of both clinical and performance exercise tests (Cooper and Storer, 2001), as an important variable in exercise prescription (Coplan et al., 1986; Gibbons, 1987; Hughson and MacFarlane, 1981; Nieuwland et al., 2002), and as a predictor of mortality or morbidity in surgery or CHF (Bechard and Wetstein, 1987; Older et al., 1993). However, if V̇O2θ is to be viewed as a reliable parameter of the exercise response and successfully used for the important purposes mentioned above, laboratory personnel responsible for its identification must be adequately trained and experienced in applying systematic methods for its detection. Furthermore, auto-detection of V̇O2θ by automated MCs should be viewed cautiously.
The principal findings from this study suggest that 1) individuals experienced and practiced with the detection of V̇O2θ who use a systematic approach agree with each other; 2) novices, even when given specific guidelines, do not agree with the experienced users or among themselves; and 3) automated V̇O2θ detection algorithms agree, on average, with experienced interpreters, but with notable exceptions that suggest the need for human oversight to avoid erroneous interpretations that could have substantial clinical implications. Previous reports on the ability of interpreters to reach acceptable levels of agreement in their detection of V̇O2θ have reached mixed conclusions. Additionally, reader expertise in identifying V̇O2θ is rarely reported and, in studies that have done so, absolute differences in choosing V̇O2θ have ranged between 20 mL·min-1 (Hansen et al., 2004) and 530 mL·min-1 (Gladden et al., 1985). If disagreement is the result of inadequate training, experience, or lack of a systematic approach to V̇O2θ selection, we stand to lose a valuable parameter of aerobic performance that might influence clinical decisions about cardiopulmonary status, surgical risk, disease prognosis, prescription of exercise training intensity, and evaluation of the efficacy of rehabilitation and training programs.
In the present study, the E, on average, agreed within 21 mL·min-1 (1%) in selecting V̇O2θ from a total of 30 incremental exercise tests; in absolute measures, agreement was 79 mL·min-1 (4%). N differed from the experienced interpreters, on average, over a range of -664 mL·min-1 to 364 mL·min-1. In aggregate, the mean absolute difference between E and N was 369 mL·min-1. These data are generally consistent with other reports using experienced interpreters. Shimizu et al. (1991) evaluated the effects of ergometer and work rate increment, response variables, and reviewers in selecting V̇O2θ in healthy men and men with heart disease. In that study, the intraclass correlation coefficient among the three reviewers was 0.60, and the greatest absolute difference among the reviewers across 138 tests was 70 mL·min-1 (7%). Data from the present study are similar with respect to agreement among E who, on average, did not differ by more than 21 mL/min (1%). Simonton et al. (1988) reported somewhat larger inter-reader variability in detecting V̇O2θ in healthy individuals and patients with CHF. Differences among the three experienced interpreters were 4.6% to 12% and 6% to 8.4% in their patients depending on the exercise protocol selected.
Davis et al. (1976) indicated good agreement among their investigators in identifying V̇O2θ. Differences were reported to be 15-30 s in tests that used cycle ergometer work rate increments of approximately 30 watts per minute. Assuming a normal V̇O2-work rate (Ẇ) slope of 10 mL·min-1·watt-1 (Hansen et al., 1988), 15-30 s would represent an inter-evaluator difference of about 75-150 mL·min-1. Selections were reported to rarely differ by more than 60 s (about 300 mL·min-1). These levels of agreement are similar to those reported in the present study between E but not among N. Selection of V̇O2θ in the study by Davis et al. was based on the average time at which non-linear increases in V̇E and V̇CO2 were observed relative to V̇O2 or Ẇ. V̇O2θ was then discerned from a regression of this average time against V̇O2. This methodology is considerably different from that used in the present study and other studies in which selection of V̇O2θ is of primary interest.
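The conversion from a disagreement in time to a disagreement in V̇O2 used above is simple arithmetic: elapsed time × ramp rate × V̇O2-work rate slope. The following sketch reproduces it under the stated assumptions (30 watts per minute ramp, slope of 10 mL·min-1·watt-1); the function name and parameter names are hypothetical.

```python
def time_gap_to_vo2_gap(seconds, ramp_w_per_min=30.0, slope_ml_per_min_per_w=10.0):
    """Convert a time difference on a ramp test into a VO2 difference (mL/min).

    Assumes a linear ramp and a constant VO2-work rate slope, as in the text.
    """
    watts = ramp_w_per_min * seconds / 60.0  # work rate gained over the interval
    return watts * slope_ml_per_min_per_w


for s in (15, 30, 60):
    print(f"{s} s -> {time_gap_to_vo2_gap(s):.0f} mL/min")
# prints 75, 150, and 300 mL/min, matching the figures quoted above
```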
In contrast to the above studies, where the interpreters appeared to have substantial experience in V̇O2θ detection, Gladden et al. (1985) studied interpreters from nine separate laboratories who ranged from inexperienced to very experienced. They were given blinded plots of gas exchange and ventilation data, but no instructions for choosing V̇O2θ. Like the interpreters in the present study, all were familiar with physiological responses to exercise. In eight of the nine evaluators, V̇O2θ ranged greater than 1.75 min in half of all the tests. Considering the 30-watt-per-minute work rate increment and the same V̇O2-work rate slope of 10 mL·min-1·watt-1, the 1.75 min range would be equivalent to a V̇O2 range of about 525 mL·min-1. Correlations among evaluators for V̇O2θ ranged between 0.30 and 0.93 (median = 0.74). Correlations among the three most experienced evaluators ranged from 0.79 to 0.88. Correlations, however, identify relationships and not agreement. Gladden et al. emphasized this point by reporting that different evaluators did not choose identical V̇O2θ. Mean absolute differences for V̇O2θ between any two evaluators ranged from a low of 100 mL·min-1 to a high of 530 mL·min-1. In the present study, the mean difference among E was 21 mL·min-1; the mean absolute difference for the E was 79 mL·min-1; 77% of the time absolute differences were less than 100 mL·min-1 (data not shown). The Bland-Altman plots (Figure 5) further emphasize that E chose different values for V̇O2θ than N. These graphical representations of agreement among E as well as between E and N underscore the value of practice and use of a methodical approach in V̇O2θ selection. Although none of the comparisons exhibited significant bias, the mean differences and individual case differences were large when E were compared to N and the MC.
Observations on the agreement in selecting V̇O2θ within patient groups have also been inconsistent. Belman et al. (1992) reported wide variability in determining V̇O2θ in patients with COPD with only modest intra- and inter-reader agreement between two experienced interpreters. It is possible that the irregular breathing patterns accompanying COPD may have made detection of V̇O2θ by gas exchange difficult in these patients despite the observation of thresholds determined by blood sampling methods. Hansen et al. (2004), however, studied 42 patients with pulmonary arterial hypertension who performed repeated incremental exercise tests, in duplicate, over a period of 15 months. Generally, the two experienced interpreters agreed with each other better than the four less experienced interpreters. These data tend to support the findings of the present study; however, the selected group of patients with pulmonary hypertension and the variable time intervals between repeated tests (up to 15 months) may not allow these findings to be generalized to the many other situations in which exercise testing is performed.
Modern, commercially available MCs often include algorithms for automated detection of V̇O2θ. Systematic evaluation of the accuracy of this determination has been limited. The MC used in the present study showed good agreement, on average, with E in detecting V̇O2θ. However, as shown in Panel F in both Figures 4 and 5, the magnitude of disagreement was large and variable. The mean absolute difference of 284 mL·min-1 was less than that for any N, but 20% of the 30 cases differed by 500 mL·min-1 or more (data not shown). These data make a good case for routine human oversight of V̇O2θ determined by any automated MC.
Differences in reader confidence and method in selecting V̇O2θ
In general, confidence levels in selecting V̇O2θ were higher in N although these differences did not reach statistical significance. N exhibited levels of confidence that were essentially equal to those of E which suggests misperception in abilities given the significant lack of agreement in V̇O2θ selection between these two classes of interpreters. This implies a lack of understanding of response patterns and/or haste in identifying V̇O2θ. One value of using a systematic approach is to avoid hasty decisions based on what might appear to be an “obvious” choice for V̇O2θ. In our experience, this obvious choice is often selection of the more abrupt increase in V̇CO2 near the end of the exercise test.
N used the V̇CO2 versus V̇O2 relationship alone to select V̇O2θ more than twice as frequently as E. E appeared to have been somewhat more thorough by also examining the dual criteria plots shown in Figure 3. While use of the V̇CO2 versus V̇O2 relationship is thought to be the first approach to identifying V̇O2θ because it is relatively insensitive to irregular breathing patterns, use of additional variables such as the dual criteria would be confirmatory and may therefore improve confidence in V̇O2θ identification. A systematic approach should include a thorough examination of an appropriate constellation of variables to ensure accurate and reproducible selection of V̇O2θ. Furthermore, it should carefully eliminate V̇CO2θ as a candidate for V̇O2θ by using the plot of minute ventilation against carbon dioxide output. Examining this graph, therefore, should be considered one of the first steps in the systematic detection of V̇O2θ.
If V̇O2θ is to be interpreted reliably and used, for example, in exercise prescription, pre-operative risk assessment, and the identification of cardiovascular disease or myopathies, then a standardized approach is required. Current guidelines for exercise testing and interpretation do not emphasize such an approach to threshold detection, and we believe this is a serious shortcoming. This paper exposes the lack of consistency in methods of V̇O2θ determination and the likely lack of reliability of reported values. Additionally, our study has provided useful insights into the common pitfalls of gas exchange threshold detection and allowed us to refine our standardized instructions now presented in Table 3. Further research is needed to prospectively evaluate such a schema for detecting V̇O2θ in order to establish recognized standards that will improve the physiological and clinical value of this important measurement. Until then, use of the strategy outlined in Table 3 may give practitioners a systematic and consistent approach desperately needed within the field.
Table 3.
1. | Draw the line of identity on the V̇CO2 versus V̇O2 plot (Figure 1) and identify whether an inflection occurred. |
2. | Identify the ventilatory threshold from the inflection on the V̇E versus V̇CO2 plot (Figure 2) and eliminate this point as a threshold candidate on the V̇CO2 versus V̇O2 plot. |
3. | Mark the lower slope (S1) on the V̇CO2 versus V̇O2 plot which should lie below the line of identity. Note that some subjects might hyperventilate during this stage of the study whereupon the data will lie above the line of identity. These data should be discounted for the purposes of identifying the threshold. |
4. | Mark the upper slope (S2) on the V̇CO2 versus V̇O2 plot and identify the point of intersection between S1 and S2 (this should lie below the line of identity). |
5. | Drop a perpendicular line to the x-axis of the V̇CO2 versus V̇O2 plot and identify V̇O2θ (the metabolic threshold) to the nearest 10-50 mL·min-1. |
6. | Assign a level of confidence to the threshold detection from this plot (1=confident; 2=fairly confident; 3=poorly confident). |
7. | If confidence in choosing V̇O2θ from the V̇CO2 versus V̇O2 plot was 2 or 3, verify V̇O2θ through use of the ventilatory equivalents versus V̇O2 plot (Figure 3A) and the end tidal gas tensions versus V̇O2 plot (Figure 3B). |
8. | For the ventilatory equivalents plot, identify the first systematic upward inflection of V̇E/V̇O2 without a simultaneous increase in V̇E/V̇CO2. From this inflection point, drop a perpendicular line to the x-axis of this plot (V̇O2) and identify V̇O2θ to the nearest 10-50 mL·min-1. For the plot of end tidal gas tensions, identify the first systematic increase in PETO2 without a simultaneous decrease in PETCO2. From this inflection point, drop a perpendicular line to the x-axis of this plot (V̇O2) and identify V̇O2θ to the nearest 10-50 mL·min-1. |
9. | Where the plots of ventilatory equivalents and end-tidal gas tensions can be plotted one above the other with identical x-axes, use a vertical cursor to select the threshold that best represents upward inflections on both plots (as described above). |
10. | Assign confidence levels (1=confident; 2=fairly confident; 3=poorly confident) to selection of V̇O2θ from these plots. |
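Steps 1, 3, 4 and 5 of Table 3 can be approximated numerically. The Python sketch below is an illustration only, not the procedure used by the interpreters or the metabolic cart in this study. It assumes that S1 and S2 can be estimated by a brute-force two-segment least-squares fit to the V̇CO2 versus V̇O2 data, and that data above V̇CO2θ (step 2) and any early hyperventilation (step 3) have already been excluded by the interpreter. The function names (`fit_line`, `v_slope_threshold`, `round_to_step`) and the `min_points` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares line: returns (slope, intercept, sum of squared residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("segment has no spread in VO2; cannot fit a line")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, sse


def v_slope_threshold(
    vo2: Sequence[float], vco2: Sequence[float], min_points: int = 4
) -> Tuple[float, float, float, bool]:
    """Estimate the S1/S2 intersection on the VCO2 versus VO2 plot.

    Returns (threshold_vo2, s1, s2, below_identity). Units follow the input
    (mL/min in the usage example). `min_points` is an assumed lower limit on
    the number of points per segment, not a value taken from the paper.
    """
    if len(vo2) != len(vco2) or len(vo2) < 2 * min_points:
        raise ValueError("need paired data with at least 2 * min_points samples")
    pairs = sorted(zip(vo2, vco2))
    x: List[float] = [p[0] for p in pairs]
    y: List[float] = [p[1] for p in pairs]

    best = None
    # Try every admissible split; keep the one with the smallest total residual.
    for split in range(min_points, len(x) - min_points + 1):
        s1, b1, e1 = fit_line(x[:split], y[:split])
        s2, b2, e2 = fit_line(x[split:], y[split:])
        if s2 <= s1:
            continue  # the upper slope must be steeper than the lower slope
        if best is None or e1 + e2 < best[0]:
            best = (e1 + e2, s1, b1, s2, b2)
    if best is None:
        raise ValueError("no split with S2 steeper than S1 was found")

    _, s1, b1, s2, b2 = best
    threshold = (b1 - b2) / (s2 - s1)       # x-coordinate where S1 meets S2
    vco2_at_threshold = s1 * threshold + b1
    below_identity = vco2_at_threshold < threshold  # check noted in step 4
    return threshold, s1, s2, below_identity


def round_to_step(value: float, step: int = 10) -> int:
    """Round to the nearest `step` (Table 3 suggests 10-50 mL/min)."""
    return round(value / step) * step


if __name__ == "__main__":
    # Synthetic, noise-free data for illustration only (not study data).
    vo2 = [1000 + 100 * i for i in range(21)]
    vco2 = [0.9 * v if v <= 2000 else 1800 + 1.3 * (v - 2000) for v in vo2]
    threshold, s1, s2, below = v_slope_threshold(vo2, vco2)
    print(round_to_step(threshold), round(s1, 2), round(s2, 2), below)
```

A result from such a fit would still need a confidence rating (step 6) and, where confidence is 2 or 3, verification against the ventilatory equivalents and end tidal gas tension plots (steps 7 to 10). The sketch does not replace that judgment.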
Conclusion
Experience and use of a systematic approach are essential for correctly identifying the metabolic threshold (V̇O2θ), the point during incremental exercise at which a sustained increase in lactate begins. Current guidelines for exercise testing and interpretation do not include recommendations for such an approach. This paper presents a systematic and practical approach to threshold detection which will aid training of interpreters of incremental cardiopulmonary exercise tests and lead to greater consistency in their conclusions.
Acknowledgements
The experiments comply with the current laws of the country in which they were performed. The authors have no conflict of interest to declare.
Biographies
Brett A. DOLEZAL
Employment
Associate Director, Exercise Physiology Research Laboratory, Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles
Degree
PhD
Research interest
Exercise physiology
E-mail: bdolezal@mednet.ucla.edu
Thomas W. STORER
Employment
Retired
Degree
PhD
Research interest
Exercise physiology
E-mail: tstorer@partners.org
Eric V. NEUFELD
Employment
Research Associate, Exercise Physiology Research Laboratory, Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles
Degree
MS
Research interest
Sports medicine, exercise physiology, and pulmonology
E-mail: eneufeld8@ucla.edu
Stephanie SMOOKE
Employment
David Geffen School of Medicine, UCLA
Degree
MD
Research interest
Endocrinology
E-mail: ssmooke@mednet.ucla.edu
Chi-Hong TSENG
Employment
David Geffen School of Medicine, UCLA
Degree
PhD
Research interest
Biostatistics
E-mail: CTseng@mednet.ucla.edu
Christopher B. COOPER
Employment
Professor Emeritus, Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles
Degree
MD
Research interest
Exercise physiology, COPD
E-mail: ccooper@mednet.ucla.edu
References
- Beaver W.L., Wasserman K., Whipp B.J. (1986) A new method for detecting anaerobic threshold by gas exchange. Journal of Applied Physiology 60, 2020-2027. [DOI] [PubMed] [Google Scholar]
- Bechard D., Wetstein L. (1987) Assessment of exercise oxygen consumption as preoperative criterion for lung resection. The Annals of Thoracic Surgery 44, 344-349. [DOI] [PubMed] [Google Scholar]
- Belman M.J., Epstein L.J., Doornbos D., Elashoff J.D., Koerner S.K., Mohsenifar Z. (1992) Noninvasive determinations of the anaerobic threshold. Reliability and validity in patients with COPD. Chest 102, 1028-1034. [DOI] [PubMed] [Google Scholar]
- Bland J.M., Altman D.G. (1986) Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet 1, 307-310. [PubMed] [Google Scholar]
- Boulay M.R., Hamel P., Simoneau J.A., Lortie G., Prud'homme D., Bouchard C. (1984) A test of aerobic capacity: description and reliability. Canadian Journal of Applied Sport Sciences 9, 122-126. [PubMed] [Google Scholar]
- Caiozzo V.J., Davis J.A., Ellis J.F., Azus J.L., Vandagriff R., Prietto C.A., McMaster W.C. (1982) A comparison of gas exchange indices used to detect the anaerobic threshold. Journal of Applied Physiology 53, 1184-1189. [DOI] [PubMed] [Google Scholar]
- Cohen-Solal A., Zannad F., Kayanakis J.G., Gueret P., Aupetit J.F., Kolsky H. (1991) Multicentre study of the determination of peak oxygen uptake and ventilatory threshold during bicycle exercise in chronic heart failure. Comparison of graphical methods, interobserver variability and influence of the exercise protocol. The VO2 French Study Group. European Heart Journal 12, 1055-1063. [DOI] [PubMed] [Google Scholar]
- Cooper C.B. (1995) Determining the role of exercise in patients with chronic pulmonary disease. Medicine and Science in Sports and Exercise 27, 147-157. [PubMed] [Google Scholar]
- Cooper C.B., Beaver W.L., Cooper D.M., Wasserman K. (1992) Factors affecting the components of the alveolar CO2 output-O2 uptake relationship during incremental exercise in man. Experimental Physiology 77, 51-64. [DOI] [PubMed] [Google Scholar]
- Cooper C.B., Storer T.W. (2001) Exercise testing and interpretation: a practical approach. New York: Cambridge University Press. [Google Scholar]
- Coplan N.L., Gleim G.W., Nicholas J.A. (1986) Using exercise respiratory measurements to compare methods of exercise prescription. American Journal of Cardiology 58, 832-836. [DOI] [PubMed] [Google Scholar]
- Davis J.A., Storer T.W., Caiozzo V.J. (1997) Prediction of normal values for lactate threshold estimated by gas exchange in men and women. European Journal of Applied Physiology and Occupational Physiology 76, 157-164. [DOI] [PubMed] [Google Scholar]
- Davis J.A., Vodak P., Wilmore J.H., Vodak J., Kurtz P. (1976) Anaerobic threshold and maximal aerobic power for three modes of exercise. Journal of Applied Physiology 41, 544-550. [DOI] [PubMed] [Google Scholar]
- Fox J. (1997) Applied regression analysis, linear models, and related methods. Thousand Oaks, Calif.: Sage Publications. [Google Scholar]
- Gibbons E.S. (1987) The significance of anaerobic threshold in exercise prescription. The Journal of Sports Medicine and Physical Fitness 27, 357-361. [PubMed] [Google Scholar]
- Gitt A.K., Wasserman K., Kilkowski C., Kleemann T., Kilkowski A., Bangert M., Schneider S., Schwarz A., Senges J. (2002) Exercise anaerobic threshold and ventilatory efficiency identify heart failure patients for high risk of early death. Circulation 106, 3079-3084. [DOI] [PubMed] [Google Scholar]
- Gladden L.B., Yates J.W., Stremel R.W., Stamford B.A. (1985) Gas exchange and lactate anaerobic thresholds: inter- and intraevaluator agreement. Journal of Applied Physiology 58, 2082-2089. [DOI] [PubMed] [Google Scholar]
- Hansen J.E., Casaburi R., Cooper D.M., Wasserman K. (1988) Oxygen uptake as related to work rate increment during cycle ergometer exercise. European Journal of Applied Physiology and Occupational Physiology 57, 140-145. [DOI] [PubMed] [Google Scholar]
- Hansen J.E., Sun X.G., Yasunobu Y., Garafano R.P., Gates G., Barst R.J., Wasserman K. (2004) Reproducibility of cardiopulmonary exercise measurements in patients with pulmonary arterial hypertension. Chest 126, 816-824. [DOI] [PubMed] [Google Scholar]
- Henson L.C., Poole D.C., Whipp B.J. (1989) Fitness as a determinant of oxygen uptake response to constant-load exercise. European Journal of Applied Physiology and Occupational Physiology 59, 21-28. [DOI] [PubMed] [Google Scholar]
- Hughson R.L., MacFarlane B.J. (1981) Effect of oral propranolol on the anerobic threshold and maximum exercise performance in normal man. Canadian Journal of Physiology and Pharmacology 59, 567-573. [DOI] [PubMed] [Google Scholar]
- Inbar O., Dlin R., Rotstein A., Whipp B.J. (2001) Physiological responses to incremental exercise in patients with chronic fatigue syndrome. Medicine and Science in Sports and Exercise 33, 1463-1470. [DOI] [PubMed] [Google Scholar]
- Markovitz G.H., Sayre J.W., Storer T.W., Cooper C.B. (2004) On issues of confidence in determining the time constant for oxygen uptake kinetics. British Journal of Sports Medicine 38, 553-560. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mayer G., Thum J., Cada E.M., Stummvoll H.K., Graf H. (1988) Working capacity is increased following recombinant human erythropoietin treatment. Kidney International 34, 525-528. [DOI] [PubMed] [Google Scholar]
- Myers J., Ashley E. (1997) Dangerous curves. A perspective on exercise, lactate, and the anaerobic threshold. Chest 111, 787-795. [DOI] [PubMed] [Google Scholar]
- Nieuwland W., Berkhuysen M.A., Van Veldhuisen D.J., Rispens P. (2002) Individual assessment of intensity-level for exercise training in patients with coronary artery disease is necessary. International Journal of Cardiology 84, 15-20. [DOI] [PubMed] [Google Scholar]
- Older P., Hall A., Hader R. (1999) Cardiopulmonary exercise testing as a screening test for perioperative management of major surgery in the elderly. Chest 116, 355-362. [DOI] [PubMed] [Google Scholar]
- Older P., Smith R., Courtney P., Hone R. (1993) Preoperative evaluation of cardiac failure and ischemia in elderly patients by cardiopulmonary exercise testing. Chest 104, 701-704. [DOI] [PubMed] [Google Scholar]
- Shimizu M., Myers J., Buchanan N., Walsh D., Kraemer M., McAuley P., Froelicher V.F. (1991) The ventilatory threshold: method, protocol, and evaluator agreement. American Heart Journal 122, 509-516. [DOI] [PubMed] [Google Scholar]
- Simonton C.A., Higginbotham M.B., Cobb F.R. (1988) The ventilatory threshold: quantitative analysis of reproducibility and relation to arterial lactate concentration in normal subjects and in patients with chronic congestive heart failure. American Journal of Cardiology 62, 100-107. [DOI] [PubMed] [Google Scholar]
- Sue D.Y., Wasserman K., Moricca R.B., Casaburi R. (1988) Metabolic acidosis during exercise in patients with chronic obstructive pulmonary disease. Use of the V-slope method for anaerobic threshold determination. Chest 94, 931-938. [DOI] [PubMed] [Google Scholar]
- Svedahl K., MacIntosh B.R. (2003) Anaerobic threshold: the concept and methods of measurement. Canadian Journal of Applied Physiology 28, 299-323. [DOI] [PubMed] [Google Scholar]
- Tirdel G.B., Girgis R., Fishman R.S., Theodore J. (1998) Metabolic myopathy as a cause of the exercise limitation in lung transplant recipients. The Journal of Heart and Lung Transplantation 17, 1231-1237. [PubMed] [Google Scholar]
- Tschakert G., Hofmann P. (2013) High-intensity intermittent exercise: methodological and physiological aspects. International Journal of Sports Physiology and Performance 8, 600-610. [DOI] [PubMed] [Google Scholar]
- Wasserman K., McIlroy M.B. (1964) Detecting the Threshold of Anaerobic Metabolism in Cardiac Patients during Exercise. American Journal of Cardiology 14, 844-852. [DOI] [PubMed] [Google Scholar]
- Wasserman K., Whipp B.J. (1975) Exercise physiology in health and disease. The American Review of Respiratory Disease 112, 219-249. [DOI] [PubMed] [Google Scholar]
- Whipp B.J., Davis J.A., Torres F., Wasserman K. (1981) A test to determine parameters of aerobic function during exercise. Journal of Applied Physiology 50, 217-221. [DOI] [PubMed] [Google Scholar]