PLOS One. 2021 Feb 17;16(2):e0247057. doi: 10.1371/journal.pone.0247057

Is a verification phase useful for confirming maximal oxygen uptake in apparently healthy adults? A systematic review and meta-analysis

Victor A B Costa 1,2, Adrian W Midgley 3, Sean Carroll 4,#, Todd A Astorino 5,#, Tainah de Paula 6,#, Paulo Farinatti 1,2,#, Felipe A Cunha 1,2,*
Editor: Laurent Mourot7
PMCID: PMC7888616  PMID: 33596256

Abstract

Background

The ‘verification phase’ has emerged as a supplementary procedure to traditional maximal oxygen uptake (VO2max) criteria to confirm that the highest possible VO2 has been attained during a cardiopulmonary exercise test (CPET).

Objective

To compare the highest VO2 responses observed in different verification phase procedures with their preceding CPET for confirmation that VO2max was likely attained.

Methods

MEDLINE (accessed through PubMed), Web of Science, SPORTDiscus, and Cochrane (accessed through Wiley) were searched for relevant studies that involved apparently healthy adults, VO2max determination by indirect calorimetry, and a CPET on a cycle ergometer or treadmill that incorporated an appended verification phase. RevMan 5.3 software was used to analyze the pooled effect of the CPET and verification phase on the highest mean VO2. Meta-analysis effect size calculations incorporated random-effects assumptions due to the diversity of experimental protocols employed. I2 was calculated to determine the heterogeneity of VO2 responses, and a funnel plot was used to check the risk of bias, within the mean VO2 responses from the primary studies. Subgroup analyses were used to test the moderator effects of sex, cardiorespiratory fitness, exercise modality, CPET protocol, and verification phase protocol.

Results

Eighty studies were included in the systematic review (total sample of 1,680 participants; 473 women; age 19–68 yr.; VO2max 3.3 ± 1.4 L/min or 46.9 ± 12.1 mL·kg-1·min-1). The highest mean VO2 values attained in the CPET and verification phase were similar in the 54 studies that were meta-analyzed (mean difference = 0.03 [95% CI = -0.01 to 0.06] L/min, P = 0.15). Furthermore, the difference between the CPET and verification phase was not affected by any of the potential moderators such as verification phase intensity (P = 0.11), type of recovery utilized (P = 0.36), VO2max verification criterion adoption (P = 0.29), same or alternate day verification procedure (P = 0.21), verification-phase duration (P = 0.35), or even according to sex, cardiorespiratory fitness level, exercise modality, and CPET protocol (P = 0.18 to P = 0.71). The funnel plot indicated that there was no significant publication bias.

Conclusions

The verification phase seems a robust procedure to confirm that the highest possible VO2 has been attained during a ramp or continuous step-incremented CPET. However, given the high concordance between the highest mean VO2 achieved in the CPET and verification phase, findings from the current study would question its necessity in all testing circumstances.

PROSPERO Registration ID

CRD42019123540.

Introduction

Maximal oxygen uptake (VO2max) represents the upper physiological limit of the utilization of oxygen for producing energy during strenuous exercise performed until volitional exhaustion [1, 2]. The VO2max is widely regarded as the gold standard measure of cardiorespiratory fitness and is typically determined using a cardiopulmonary exercise test (CPET) in clinical, applied physiology, and sport and exercise science settings [1, 3–6]. The VO2max is often used to diagnose cardiovascular disease [7], predict all-cause mortality [8–10], develop exercise prescriptions [3, 11, 12], and evaluate the efficacy of exercise programmes [13–15]. Consequently, the validity of VO2max values obtained during CPETs has widespread importance in clinical, sporting, and research-related contexts.

The use of indirect calorimetry for the determination of VO2max during exercise testing to volitional exhaustion on a treadmill or cycle ergometer has become common during the past few decades [16–18]. This has largely been attributed to the development of fast-responding metabolic gas analyzers allowing the time-efficient acquisition of real-time, breath-by-breath, respiratory gas exchange and flow rate data during CPET [see 19 for a review]. These technological advances have contributed to a transition from the Douglas bag method and time-consuming discontinuous step-incremented protocols to more time-efficient continuous ramp or pseudo-ramp protocols for determining VO2max [20–25]. Despite the considerable progress in the efficiency by which CPET can be conducted and evaluated, there is still much to be learned about the determination of VO2max [2, 24–30]. One particularly problematic aspect has been the challenge in identifying a lack of VO2max attainment due to inappropriate test protocols, premature fatigue, or poor participant motivation and lack of effort [31].

The concept of a VO2max originated almost 100 years ago with the seminal works of Hill and colleagues [32, 33]. They proposed the existence of an individual upper limit or ‘ceiling’ of VO2 during maximal exercise, beyond which no further increase in VO2 occurs despite increasing work rate (WR) and higher metabolic demand. The primary criterion for confirming that a VO2max has been elicited has historically been based on the occurrence of a VO2 plateau, commonly defined as a small or no increase in VO2 despite a continued increase in WR [34]. The landmark study of Taylor et al. [34] was the first to use a formal VO2 plateau criterion, which was defined as an increase in VO2 of less than 0.150 L/min (or ≤ 2.1 mL·kg-1·min-1, considering an average body mass of 72 kg from 115 male participants) in response to a specific discontinuous step-incremented protocol performed over 3–5 laboratory visits. Subsequent studies have often used the Taylor et al. [34] criterion or alternative thresholds to confirm the attainment of a VO2 plateau [see 29 for a review]. Since the widespread adoption of continuous short-duration and ramp-based CPET protocols, several studies have reported low incidences of the VO2 plateau [35–39]. The variability in VO2 plateau incidence has been attributed to differences in the criteria used for detecting the VO2 plateau [29, 40], VO2 sampling intervals [36, 41, 42], exercise modality [43], the warm-up prior to the CPET [44], type of CPET protocol [45–48], and various participant characteristics [49–51].
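The equivalence between the absolute and relative forms of the Taylor et al. [34] criterion follows from simple unit conversion. The sketch below is illustrative only: the function names and the two-stage comparison are assumptions introduced here, not the procedure of any reviewed study.

```python
def absolute_to_relative(vo2_l_min: float, body_mass_kg: float) -> float:
    """Convert absolute VO2 (L/min) to relative VO2 (mL/kg/min)."""
    return vo2_l_min * 1000.0 / body_mass_kg


def plateau_reached(vo2_prev_l_min: float, vo2_last_l_min: float,
                    threshold_l_min: float = 0.150) -> bool:
    """Return True when the rise in VO2 between two consecutive stages
    is smaller than the plateau threshold (hypothetical helper)."""
    return (vo2_last_l_min - vo2_prev_l_min) < threshold_l_min


# 0.150 L/min expressed per kg for the 72 kg average body mass cited in the text
taylor_relative = absolute_to_relative(0.150, 72.0)
print(round(taylor_relative, 1))  # 2.1
```

Because the relative threshold depends on body mass, the same 0.150 L/min criterion corresponds to a different mL·kg-1·min-1 value in samples with a different average body mass.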

In the absence of a VO2 plateau, secondary VO2max criteria based upon achievement of threshold values for the respiratory exchange ratio (RER), percentage of age-predicted maximal heart rate, post-exercise blood lactate concentration, and ratings of perceived exertion (RPE) have become commonly used to evaluate whether a true VO2max has been attained [29, 40]. However, this approach has been widely criticized by numerous investigators due to the individual variability in maximal physiological responses for these variables and lack of specificity in identifying individuals who did not continue the CPET to their limit of exercise tolerance. Research has shown that some individuals can satisfy some of the secondary criteria thresholds long before the highest VO2 value observed in the CPET has been attained [2, 29, 37, 39]. The maximal RER criterion, for example, can be satisfied at VO2 values 27–39% lower than the highest VO2 value achieved in the CPET [37, 39]. Like the VO2 plateau, secondary VO2max criteria are often dependent on exercise modality, test protocol, and participant characteristics [29].

A review by Midgley et al. [29] suggested a new set of standardized VO2max criteria should be developed that are independent of exercise modality, test protocol, and participant characteristics, so they can be universally applied. In 2009, Midgley and Carroll [28] provided an early narrative review of an evolving test procedure that showed promise for developing more standardized VO2max criteria, the so-called ‘verification phase’. The verification phase consists of an appended square wave bout of severe-intensity exercise (e.g. above critical power), or similar multistage exercise bout, performed until the limit of exercise tolerance [28]. It is commonly applied after a short recovery period from a CPET; however, longer recovery periods of up to 24–48 hours have also been used [52]. The verification phase is based on the premise that when the highest VO2 values in the CPET are consistent with the verification phase (typically within 2–3% in accordance with the test-retest reliability of VO2max), this provides substantial empirical support that the highest possible VO2 has been elicited. Poole and Jones [2] recently stated that to confirm the attainment of VO2max a verification phase should be performed at a higher WR than the last load attained in the CPET (i.e. > WRpeak) in all future studies. Conversely, Iannetta et al. [25] recommended WRs within the upper limit of the severe exercise intensity domain to allow the verification phase to be maintained long enough for VO2max attainment. According to their recent findings, verification phases performed at 110% of the WRpeak attained during CPETs with increment rates of 25 and 30 W/min resulted in exercise durations that were too short to allow VO2 to reach the highest VO2 recorded at the end of the preceding ramp CPETs [25]. Along with exercise intensity and duration, it is also unclear whether other factors affect the utility of the verification phase, such as exercise modality, differences in the type and duration of the recovery period between the verification phase and CPET, whether a verification criterion threshold is adopted, and participant characteristics such as sex and cardiorespiratory fitness levels.
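The premise of the verification phase, that the two highest VO2 values should agree within roughly 2–3%, can be expressed as a simple check. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the percentage difference is taken relative to the CPET value, and the comparison is symmetric, whereas individual studies define their criteria differently.

```python
def percent_difference(vo2_cpet: float, vo2_vp: float) -> float:
    """Absolute difference between the verification phase (VP) and CPET
    values, expressed as a percentage of the CPET value (assumed convention)."""
    return abs(vo2_vp - vo2_cpet) / vo2_cpet * 100.0


def vo2max_verified(vo2_cpet: float, vo2_vp: float,
                    threshold_pct: float = 3.0) -> bool:
    """Return True when the CPET and VP values agree within the threshold."""
    return percent_difference(vo2_cpet, vo2_vp) <= threshold_pct


# Hypothetical participant values in L/min
print(vo2max_verified(3.50, 3.42))  # True: difference is about 2.3%
print(vo2max_verified(3.50, 3.20))  # False: difference is about 8.6%
```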

Given the considerable uncertainty regarding the application of the verification phase, a systematic review and meta-analysis is needed to comprehensively summarize the evidence, improve our understanding of the strengths and weaknesses of the many different verification procedures that have been utilized, and clarify their impact on the attainment of VO2max. Thus, the aim of the present study was to systematically review and provide a meta-analysis on the application of the verification phase for confirming whether the highest possible VO2 has been attained during ramp or step-incremented CPETs in apparently healthy adults.

Methods

Protocol and registration

The systematic review was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A completed PRISMA checklist is shown in S1 Checklist. The protocol for this study was recorded at http://www.crd.york.ac.uk/PROSPERO (CRD42019123540). The main questions addressed by the present study were: To what extent does the highest VO2 attained in the CPET differ from that attained in the verification phase? Secondly, are the highest VO2 values in the CPET and verification phase affected by the verification-phase characteristics (e.g. intensity, adoption of a criterion threshold, and aspects of the recovery period between the CPET and the verification phase), or even with respect to particular subgroups (e.g. sex, cardiorespiratory fitness levels, exercise test modality, and CPET protocol design) in apparently healthy adults?

Search strategy

MEDLINE (accessed through PubMed), Web of Science, SPORTDiscus, and Cochrane (accessed through Wiley) were searched for peer-reviewed literature using a combination of medical subject heading (MeSH) descriptors, with a time frame that spanned the inception of each database until the search date (September 30th, 2020). The search strategy was developed based on the PICO method [i.e. Participants: apparently healthy humans; Interventions: any intervention involving exercise; Comparisons: incremental CPET and an appended square-wave or multistage verification phase; and Outcome: VO2max confirmation]. The electronic search strategies for all databases are provided in S1 Text.

The terms were adapted for use with other bibliographic databases. Reference lists and citations of eligible articles were also hand searched for additional relevant studies. The search was performed in a standardized manner by two independent researchers (VABC and TP). Only English language studies were eligible for inclusion and only if they satisfied three a priori criteria: (1) involved apparently healthy participants who were ≥ 18 years of age; (2) determined VO2max using expired gas analysis indirect calorimetry; and (3) the CPET was carried out using bipedal cycle ergometer or bipedal treadmill running or walking. Studies were excluded if they involved: (1) participants who had taken dietary supplements or drugs that could affect body mass, metabolic profile, or exercise performance; or (2) the use of non-maximal test protocols.

Study selection

Potential studies were screened for inclusion using three methods: (1) title only; (2) title and abstract; and (3) full-text review. Two investigators independently searched and selected articles, and coauthors subsequently confirmed articles to be included in the analysis. Disagreements were resolved by consensus. Agreement between investigators with respect to inclusion and/or exclusion of potential trials was ratified in 252 randomly selected abstracts by means of Cohen’s kappa (κ = 0.811, P < 0.05). Fig 1 summarizes the screening and selection process.
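Cohen’s kappa corrects the observed agreement between two raters for the agreement expected by chance. The sketch below shows the calculation for a two-category (include/exclude) decision. The counts are hypothetical, chosen only to sum to 252 abstracts; they are not the screening data of this review and are not intended to reproduce the reported κ.

```python
def cohens_kappa(both_include: int, only_a: int,
                 only_b: int, both_exclude: int) -> float:
    """Cohen's kappa for two raters making include/exclude decisions."""
    n = both_include + only_a + only_b + both_exclude
    p_observed = (both_include + both_exclude) / n
    # Marginal proportion of 'include' decisions for each rater
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    # Agreement expected by chance from the marginal proportions
    p_chance = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical 2x2 agreement table for 252 abstracts
kappa = cohens_kappa(both_include=40, only_a=5, only_b=5, both_exclude=202)
print(round(kappa, 2))
```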

Fig 1. Flowchart of the systematic review and meta-analysis according to the PRISMA guidelines.

VO2max: maximal oxygen uptake.

Data extraction and management

Two independent reviewers extracted data using a standardized form. The following data were summarized: (1) characteristics of study participants (total sample number, sex, age, body mass index [BMI], and cardiorespiratory fitness); (2) type of intervention (CPET and verification-phase duration, exercise modality, and exercise test protocol used); and (3) outcome measures (mean ± standard deviation [SD] for group VO2max and protocol duration during the CPET and verification phase). Disagreements were resolved by consensus. When the relevant quantitative data were not reported, authors of the original studies were contacted to request the data.

Quality assessment

The risk of bias for all eligible studies was not assessed because it does not apply to the characteristics of the present review. For example, randomization sequence generation and treatment allocation concealment were not applied, since there were no comparison groups and each individual acted as their own control. It is also worth noting the absence of blinding of both the participants undergoing testing and the evaluators who administered the CPET and verification phases, because procedurally all exercise protocols were performed in a fixed order (i.e. CPET followed by the verification phase). Given that VO2max is the evaluation of an objective numerical variable, the blinding of the evaluator does not generate a different interpretation of the VO2max values obtained in a CPET and verification phase. Finally, the assessment of incomplete outcome data (sample loss) and selective reporting of outcomes also does not apply, because it is a cross-sectional study with a single outcome of interest.

Statistical analysis

All meta-analyses were performed using Review Manager (RevMan) software version 5.3 (Copenhagen, The Nordic Cochrane Centre, The Cochrane Collaboration, 2014). Data are presented as the mean ± SD unless otherwise stated. The outcome was the mean difference (95% confidence interval [CI]) between the CPET and verification phase for the highest absolute VO2 (L/min). Given that absolute VO2 values are continuous data, the weighted mean difference (WMD) method was used for combining study effect size estimates. With the WMD method, the pooled effect estimate represents a weighted mean of all included study group comparisons. The weighting assigned to each individual study group (i.e. the comparison of the CPET and verification phase results) in the analysis is inversely proportional to the variance of the absolute VO2 (L/min). This method typically assigns more weight in the meta-analysis to studies with the highest precision (inverse variance), which are usually those with larger sample sizes. The WMDs were calculated using random-effects models given the study group differences in CPET modalities and protocols, types of recovery, and verification phase protocols.
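The inverse-variance weighting and random-effects pooling described above can be sketched as follows. This is an illustration rather than the RevMan implementation: it assumes the DerSimonian-Laird estimator for the between-study variance, a normal-approximation 95% CI, and hypothetical input values.

```python
import math


def pool_random_effects(effects, variances):
    """Pool study mean differences with inverse-variance weights and a
    DerSimonian-Laird estimate of between-study variance (tau^2)."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect estimate, needed to compute the heterogeneity statistic Q
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's own variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical mean differences (L/min) and their variances
pooled, ci, tau2 = pool_random_effects([0.02, 0.05, -0.01], [0.004, 0.002, 0.006])
print(round(pooled, 3), tuple(round(x, 3) for x in ci), round(tau2, 4))
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect inverse-variance weights, so the two models give the same pooled estimate.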

Heterogeneity of net study group changes in VO2max (L/min) was examined using the Q statistic. Cochran’s Q statistic is computed by summing the squared deviations of each trial’s estimate from the overall meta-analytic estimate and weighting each trial’s contribution in the same manner as in the meta-analysis. P-values were obtained by comparing the statistic with a χ2 distribution with k-1 degrees of freedom (where k is the number of trials). A P-value of < 0.10 was adopted since the Q statistic tends to suffer from low differential power. The formal Q statistic was used in conjunction with the methods for assessing heterogeneity. The I2 statistic measures the extent of inconsistency among the results of the primary study groups, interpreted approximately as the proportion of total variation in point estimates that is due to heterogeneity rather than sampling error. Effect sizes with a corresponding I2 value of ≤ 50% were considered to have low heterogeneity. The publication bias of the articles was assessed using a funnel plot.
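The relationship between Cochran’s Q and I2 can be illustrated with a short calculation. The sketch below is a minimal illustration with hypothetical inputs. It uses the common definition I2 = 100% × (Q − df)/Q, truncated at zero, and omits the χ2 P-value, which would require a statistical library.

```python
def cochran_q(effects, variances):
    """Weighted sum of squared deviations of each study estimate from the
    inverse-variance pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q: float, k: int) -> float:
    """Percentage of total variation attributed to heterogeneity,
    given Q and the number of study groups k (df = k - 1)."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical mean differences (L/min) and their variances
effects = [0.02, 0.05, -0.01, 0.04]
variances = [0.004, 0.002, 0.006, 0.003]
q = cochran_q(effects, variances)
print(round(q, 3), round(i_squared(q, len(effects)), 1))
```

Whenever Q is smaller than its degrees of freedom, I2 is reported as 0%, which is why small or homogeneous sets of study groups often show no measurable inconsistency.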

Subgroup analyses were defined a priori to investigate the magnitude of differences between CPETs and verification phases due to variations in sex, cardiorespiratory fitness level, exercise modality, CPET protocol design, or how the verification phase was performed. Forest plots were constructed to display values at the 95% confidence level. Effect sizes were calculated by subtracting the highest mean values for VO2 (L/min) observed in the CPET from the verification phase values, on the basis of grouping studies with selected verification-phase characteristics for intensity (i.e. sub vs. supra WRpeak) and type of recovery between the CPET and verification phase (i.e. active vs. passive). The studies were also classified according to whether a criterion threshold for VO2max was used for the verification phase (i.e. yes vs. no), whether the verification phase was performed in the same testing session as the CPET or on a different day, and the duration of the verification phase (i.e. ≤ 80 s, 81–120 s, and > 120 s). Stratified analyses were also conducted according to particular subgroups such as sex (i.e. male and female), cardiorespiratory fitness level using the cut-off points proposed by Astorino et al. [53] (i.e. low: < 40 mL·kg-1·min-1; moderate: 40–50 mL·kg-1·min-1; high: > 50 mL·kg-1·min-1), exercise test modality (i.e. cycling and running), and CPET protocol design (i.e. discontinuous step-incremented, continuous step-incremented, and ramp protocols).
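The subgroup cut-offs above translate directly into classification rules. The sketch below is illustrative: the function names and label strings are assumptions, values of exactly 40 or 50 mL·kg-1·min-1 are placed in the moderate group, and the example values are study means taken from Table 1.

```python
def classify_crf(vo2max_ml_kg_min: float) -> str:
    """Cardiorespiratory fitness subgroup using the cut-offs cited in the text."""
    if vo2max_ml_kg_min < 40:
        return "low"
    if vo2max_ml_kg_min <= 50:
        return "moderate"
    return "high"


def classify_vp_duration(seconds: float) -> str:
    """Verification-phase duration subgroup."""
    if seconds <= 80:
        return "<= 80 s"
    if seconds <= 120:
        return "81-120 s"
    return "> 120 s"


# Study mean VO2max values (mL/kg/min) from Table 1
print(classify_crf(30.0))   # low (Arad et al. [55], men)
print(classify_crf(43.8))   # moderate (Astorino and White [57], men)
print(classify_crf(68.6))   # high (Hogg et al. [82])
```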

Results

The literature search identified 371 potential articles, with 334 obtained from electronic database searches and 37 from the wider inspection of reference lists and electronic citations of these articles. Eighty studies published between 1980 and 2020 met the eligibility criteria and were included in the systematic review (see Fig 1).

Participants

The total number of participants recruited across all included studies was 1,680 (1,077 men, 473 women, and the sex of 130 participants was not specified). Included studies had a median (interquartile range [IQR]) sample size of 13 [10] participants. Participants were aged between 19 and 68 yr, all apparently healthy, and with a physical activity status ranging from sedentary to highly-trained endurance athletes. Thirty-six studies included only men, two included only women, 41 included both men and women, and one study did not specify the sex of the participants (see Table 1). On average, participants had a BMI within the normal range (mean ± SD [range]: 24.4 ± 2.5 [19.4–32.0] kg/m2) and a moderate level of cardiorespiratory fitness (VO2max mean ± SD [range]: 46.9 ± 12.1 [23.9–68.6] mL·kg-1·min-1).

Table 1. Sample characteristics for studies that incorporated a cardiopulmonary exercise test (CPET) (k = 80).

Columns: Study; Population; Sex (M/F); N; Age (years); BMI (kg/m2); VO2max (mL·kg-1·min-1). Study mean values are shown.
Alexander and Mier [54] Soccer players M/F 5/6 21.3 22.7 57.7
Arad et al. [55] Sedentary M 19 33.4 25.8 30.0
F 16 26.8 26.6 27.1
Astorino and DeRevere [56] Recreationally trained M/F 19/11 26 NS 47.2
M/F 41/38 23.3 NS 40.5
Astorino and White [57] Physically active M 13 23.5 24.3 43.8
F 17 22.9 22.0 40.7
Astorino et al. [53] Low CRF M/F 5/5 25.7 22.7 36.2
Moderate CRF M/F 5/5 26.3 24.1 46.4
High CRF M/F 9/1 26 23.7 57.9
Astorino et al. [58] Active adults (HIIT-Baseline) M/F 3/11 27 22 38.0
Active adults (HIIT—Week 3) 40.4
Active adults (Control—Baseline) M/F 8/6 23 24 40.2
Active adults (Control—Week 3) 40.5
Astorino et al. [59] Active adults M/F 14 27 22.5 38.0
Astorino et al. [60] Sedentary M/F 6/9 22.4 24.5 32.7
Sedentary M/F 1/8 21.8 22.9 42.1
Beltrami et al. [61] Runners or cross-country skiers M/F 23/3 29 23.5 61.3
Beltz et al. [62] Recreationally trained M 16 23.6 26.6 47.4
Bisi et al. [63] Healthy adults M 11 23.5 22.6 35.0
Chidnok et al. [64] Active adults M 7 20 24.8 57.7
Clark et al. [65] Adults of various fitness levels M/F 3/12 22 22.0 NS
Colakoglu et al. [66] Athletes M 9 24.2 23.0 59.7
Colakoglu et al. [67] Well-trained athletes M 9 23.6 23.1 60.2
Colakoglu et al. [68] Athletes M 9 23.6 23.1 60.2
Dalleck et al. [69] Healthy adults M/F 9/9 59.7 27.8 27.7
Day et al. [35] Healthy adults M 38 19–61 NS NS
Del Giudice et al. [70] Healthy adults M 14 21.5 22.8 60.2
Dexheimer et al. [71] Active adults M 12 29 31.4 50.6
F 5 25.6 24.4 43.7
Dicks et al. [72] Firefighters M 30 34.5 28.7 41.0
Dogra et al. [73] Older adults (trained) F 7 62.7 23.4 37.8
Older adults (untrained) F 10 68.8 26.1 24.1
Ducrocq et al. [74] Recreationally trained M/F 9/4 21.2 22.5 56.0
Elliott et al. [75] Cyclists M 8 40.5 25.2 53.7
Faulkner et al. [76] Recreationally trained M 13 25.5 24.5 63.9
Foster et al. [77] Physically active non-athletes (cycling) M 16 31.5 24.0 51.7
F 4 28 21.6
Competitive runners (treadmill running) M 12 21.6 22.9 56.3
F 8 21 20.5
Freeberg et al. [78] Healthy adults M/F 17/13 21.7 23.7 49.9
Goodall et al. [79] Cyclists M 9 28.1 23.1 61.1
Hanson et al. [80] Recreationally trained M/F 8/5 24 24.7 56.2
Hawkins et al. [81] Distance runners M/F 36/16 NS NS 63.3
Hogg et al. [82] Highly trained M 14 28 23.2 68.6
Iannetta et al. [25] Recreationally trained M/F 6/5 28 21.8 52.6
James et al. [83] Squash players M/F 6/2 20.3 22.1 48.8
Jamnick et al. [84] Trained cyclists M 17 36.2 24.1 62.1
Jamnick et al. [85] Active adults M 31 29 25.2 48.6
F 26 27 23.4 39.8
Johnson et al. [86] Recreationally trained runners and cyclists M/F 6/5 22 24.1 46.9
Keiller and Gordon [87] Recreationally trained M/F 9/2 22.4 24.4 51.6
Kirkeberg et al. [88] Recreational-trained men M 12 29 27.5 49.2
Knaier et al. [89] Athletes M 10 27.5 23.1 61.1
F 7 28.4 22.5 54.3
Knaier et al. [90] High cardiorespiratory fitness M 8 27.4 22.8 62.8
F 5 27.6 22.7 55.2
Kramer et al. [91] Soccer players M 15 23.1 23.0 50.5
Mann et al. [92] Runners M 20 30 24.2 60.2
F 12 28 21.7 51.9
Mann et al. [93] Runners M 8 36 24.1 57.9
F 2 32 24.9 49.9
Mauger et al. [94] Well-trained runners M 14 22.7 23.4 64.4
McGawley [95] Recreational runners M/F 5/5 32 NS 59.8
McKay et al. [96] Healthy adults M 12 25 NS 44.5
Midgley et al. [39] Runners M 10 39.3 23.6 53.6
Cyclists M 10 36.0 23.2 57.7
Midgley et al. [97] Middle- and long-distance runners M 16 38.7 23.0 57.1
Midgley et al. [98] Distance runners M 9 38.2 24.6 55.0
Mier et al. [99] College athletes M/F 8/27 20 23.5 55.5
Murias et al. [100] Younger adults M 30 25 24.9 49.4
Older adults M 31 68 25.8 33.0
Murias et al. [101] Older adults F 6 69 27.0 23.9
Younger adults F 8 25 23.8 41.2
Murias et al. [102] Older adults M 8 68 26.0 28.3
Younger adults M 8 23 25.2 48
Nalcakan [103] Healthy adults M 15 21.7 25.0 40.3
Niemela et al. [104] Healthy adults M 16 25–35 23.3 42.5
Niemeyer et al. [105] Physically active M 24 26.2 24.2 49.8
Niemeyer et al. [106] Recreationally trained M 46 25.6 24.0 50.8
Nolan et al. [107] Active adults M/F 6/6 23 22.7 57.5
Poole et al. [37] Healthy adults M 8 27 NS 50.8
Possamai et al. [108] Recreationally trained cyclists M 19 23 25.3 48.0
Riboli et al. [109] Soccer players M 16 22.5 22.4 59.2
Rossiter et al. [38] Healthy adults M 7 26 25.1 51.5
Sabino-Carvalho et al. [110] Runners M 14 22.3 21.2 67.0
F 4 24 20.4 60.1
Scharhag-Rosenberger et al. [111] Healthy adults M/F 20/20 24 23.0 50.0
Scheadler and Devor [112] Experienced runners NS 13 25 22.5 64.9
Sedgeman et al. [113] Recreationally trained M/F 6/7 29 23.9 50.1
Stachenfeld et al. [114] Healthy adults M/F 33/18 30.6 NS 49.2
Straub et al. [115] Trained cyclists M 12 33 24.8 56.5
F 4 38 22.1
Strom et al. [116] Healthy adults M/F 21/29 30.3 24.0 47.3
Taylor et al. [117] Runners and triathlon athletes M 11 28.5 22.6 63.7
F 8 26.3 21.8 52.3
Tucker et al. [118] Nonexercise-trained youth M 17 27 25.6 41.6
Vogiatzis et al. [119] Cyclists M 11 38 22.1 62.0
Weatherwax et al. [120] Sedentary adults M 5 53.6 32.0 32.3
F 11 52.2 29.4 24.8
Weatherwax et al. [15] Sedentary adults (standardized—baseline) M/F 4/16 51.2 29.6 24.3
Sedentary adults (standardized—week 4) M/F 4/16 51.2 29.7 25.0
Sedentary adults (standardized—week 8) M/F 4/16 51.2 29.6 26.3
Sedentary adults (standardized—week 12) M/F 4/16 51.2 29.6 26.3
Sedentary adults (individualized—baseline) M/F 5/14 44.9 27.2 29.5
Sedentary adults (individualized—week 4) M/F 5/14 44.9 27.2 31.1
Sedentary adults (individualized—week 8) M/F 5/14 44.9 27.1 31.3
Sedentary adults (individualized—week 12) M/F 5/14 44.9 27.0 32.8
Weatherwax et al. [121] Sedentary adults (control—baseline) M/F 2/6 45.6 25.5 28.4
Sedentary adults (control—week 12) 25.5 27.7
Sedentary adults (standardized—baseline) M/F 4/16 51.2 29.6 24.3
Sedentary adults (standardized—week 12) 29.6 26.0
Sedentary adults (individualized—baseline) M/F 5/14 44.9 27.1 29.5
Sedentary adults (individualized—week 12) 26.8 32.8
Weatherwax et al. [122] Elite endurance-trained M 18 21.9 19.8 62.8
F 6 20.2 19.4 51.7
Wilhelm et al. [123] Healthy adults M 9 25 25.1 41.0
Williams et al. [124] Healthy adults M 8 27 NS 43.0
M 5 23 NS 48.0
Wingo et al. [125] Healthy adults M 9 25 22.4 61.2
Yeh et al. [126] Healthy adults M/F 14/1 23.3 21.9 48.9

Abbreviations: BMI = body mass index; CRF = cardiorespiratory fitness level; F = female; HIIT = high-intensity interval training; M = male; NS = not stated; VO2max = maximal oxygen uptake. Note: Whenever possible, authors were contacted to provide unpublished data.

Characteristics of studies regarding the CPET and verification phase protocols to evaluate VO2max

Table 2 summarizes the characteristics of the CPET and verification phase protocols of the 80 studies included in this systematic review. Forty-three studies (54%) performed the CPET on a cycle ergometer, 35 (44%) on a treadmill, and two studies (3%) used both modalities. Seventy-three studies (91%) used continuous step-incremented or ramp/pseudo-ramp CPET protocols. Three (4%) used only discontinuous step-incremented protocols. Two studies (3%) used both discontinuous and continuous step-incremented protocols and another two studies (3%) applied self-paced protocols. Thirty-three (41%) of the 80 studies included in the review used one or more VO2 plateau or secondary VO2max criteria to confirm the attainment of VO2max. Thirty studies used the VO2 plateau, 21 used the heart rate plateau or a criterion based on age-predicted maximal heart rate, 18 used the maximal RER attained in the CPET (RERmax), and 8 used the post-CPET blood lactate concentration.

Table 2. Characteristics of the cardiopulmonary exercise test (CPET) and verification phase protocols used in the reviewed studies (k = 80).

Columns: Study; VO2 data sampling method; Traditional VO2max criteria adopted; Exercise modality; CPET protocol; Recovery phase protocol; Verification phase (VP) protocol; Verification criteria threshold.
Alexander and Mier [54] 30-s time average VO2 plateau of 2.1 mL·kg-1·min-1 TR CSI 10-min walking 1st min: ↑ WR until matching the final stage of CPET; then ↑ slope to 2.5% and encouraged to running for 2-min NS
DisCSI
Arad et al. [55] 20-s rolling mean VO2 plateau (linear portion of the VO2-WR relationship); RERmax ≥ 1.10; ≥ 95% APMHR CYC Ramp 10-min active and 2–3 min passive 100% WRpeak NS
Astorino and DeRevere [56] 2×15-s NS CYC Ramp 8-min active 105% WRpeak CPET vs. VP: VO2max difference ≤ 3.0% and 3.3% and HRmax ≤ 4 bpm
10-min active 110% WRpeak
Astorino and White [57] 15-s time average NS CYC CSI 10-min active one stage > CPET-Stagefinal CPET vs. VP: VO2max difference ≤ 3% and HRmax ≤ 4 bpm
Astorino et al. [53] 2×15-s NS CYC Ramp 8-min active 2-min at 40–45% WRpeak and then 105% WRpeak CPET vs. VP: VO2max difference < 2 mL·kg-1·min-1
Astorino et al. [58] 30-s time average NS CYC Ramp 10-min active 105% WRpeak NS
Astorino et al. [59] 30-s time average NS CYC CSI 10-min active 105% WRpeak VO2max identified as the average of CPET and VP values
Astorino et al. [60] 2×15-s NS CYC Ramp ≥ 24h 105%WRpeak reached in the CPET NS
30-s time average 1–1.5h 115%WRpeak reached in the CPET
Beltrami et al. [61] 30-s intervals VO2 plateau (difference between modelled and actual value >50% of the regression slope for the linear portion of the VO2-WR relationship—an average of 1.7 mL·kg-1·min-1) TR CSI (control) 15-min active or passive (self-choose: walk, jog or rest) 1st min at 10 km/h (5% slope) and then ↑ 1 km/h > CPET-Speedpeak CPET vs. VP: VO2max difference ≤ 123 ± 18 mL/min (or 1.7 mL·kg-1·min-1)
CSI (reverse)
Beltz et al. [62] 2×15-s NS TR SPV 20-min passive 2-min at 30% CPET-WRpeak, 1-min at 40–45% CPET-WRpeak and then until exhaustion at 105% CPET-WRpeak CPET vs. VP: VO2max difference ≤ 3%
Ramp
Bisi et al. [63] 25-s moving-average VO2 plateau (increase < than 3% or 2.1 mL·kg-1·min-1 between 2 steps of increment); RERmax ≥ 1.08 or 1.15; HRmax within 10 bpm of APMHR CYC CSI 6-min active at least 3 min of cycling at 105% of the WRpeak CPET vs. VP: VO2max difference ≤ 3%
Chidnok et al. [64] 30-s rolling-mean NS CYC Ramp Different day See the formula for a proper reporting 3-min of ’all-out’ cycling NS
Clark et al. [65] 15-s time average NS CYC CSI 3-min active WRpeak minus 2 stages NS
Colakoglu et al. [66] 30-s average VO2 plateau of 150 mL/min; RERmax ≥ 1.10; ≥ 90% APMHR; CYC Ramp Different day 100% WRpeak NS
Colakoglu et al. [67] 30-s average VO2 plateau of 150 mL/min; RERmax ≥ 1.10; HRmax within 10 bpm of APMHR; RPE ≥? CYC CSI Different day 100% WRpeak VO2 plateau of 150 mL/min; RERmax ≥ 1.10; HRmax within 10 bpm of APMHR; RPE ≥?
Colakoglu et al. [68] 30-s average VO2 plateau of 150 mL/min; RERmax ≥ 1.10; ≥ 90% APMHR; RPE ≥ 19–20 CYC CSI Different day 100%, 105%, and 110% WRpeak to attain the highest VO2peak value VO2 plateau of 150 mL/min; RERmax ≥ 1.10; ≥ 90% APMHR; RPE ≥ 19–20
Dalleck et al. [69] 2×15-s NS CYC CSI 60-min passive 2-min at 50 Watts; then increased 105% WRpeak CPET vs. VP: VO2max difference ≤ 3% and HRmax ≤ 4 bpm
Day et al. [35] 30-s time average NS CYC CSI Different day 90% WRpeak reached in the CPET NS
Del Giudice et al. [70] 30-s time average NS TR CSI 10-min passive 0.8 km/h > CPET-Speedpeak NS
Dexheimer et al. [71] 2×15-s NS TR Pseudo-ramp protocol 5-10-min active 105% WRpeak CPET vs. VP: VO2max difference ≤ 3%
Dicks et al. [72] 15-s time average NS TR Pseudo-ramp protocol 3-min active WRpeak minus 2 stages NS
Dogra et al. [73] every 20 ms NS CYC Ramp Different day 85%WRpeak reached in the CPET NS
Ducrocq et al. [74] breath-by-breath NS TR CSI 5-min passive 105% WRpeak NS
Elliott et al. [75] 10-s epochs NS CYC CSI 60-min 110%WRpeak reached in the CPET NS
Faulkner et al. [76] 20-s time average VO2 plateau of 2 mL·kg-1·min-1; RERmax ≥ 1.10; RPE ≥ 17; HRmax within 10 bpm of APMHR; Lamax ≥ 8 mmol TR CSI 15-min passive ↑ speed over a 30-second period up to a 1 km/h > CPET-Speedpeak NS
Foster et al. [77] 30-s time average rate of increase in VO2 during the last min < 50% when compared to the mid portion of the test CYC CSI 1-min active 25 Watts > CPET-WRpeak NS
TR CSI 3-min active 1.6 km/h > CPET-Speedpeak or 0.8 km/h if in the non-athlete group
Freeberg et al. [78] 2×15-s NS TR Incline-based protocol 10-min active 110% WRpeak CPET vs. VP: VO2max difference ≤ 3%
Goodall et al. [79] 30-s mean NS CYC CSI 5-min passive as described by [38]; however, the intensity was not stated (i.e. 95 or 105%WRpeak reached in the CPET) NS
Hanson et al. [80] 15-breath moving average VO2 plateau of 2 mL·kg-1·min-1; RERmax ≥ 1.10 TR CSI 10-min active one stage > CPET-WRpeak CPET vs. VP: VO2max difference ≤ 50 mL/min
Hawkins et al. [81] 40-s Douglas bag collection NS TR CSI Different day 130% WRpeak NS
Hogg et al. [82] 30-s time average VO2 plateau (difference between modelled and actual value > 50% of the regression slope for the linear portion of the VO2-WR relationship); RERmax ≥ 1.10; RPE ≥ 17; HRmax within 10 bpm of APMHR TR CSI 10-min active (walking around the laboratory and stretching) ↑ speed over a 30-second period up to a speed stage > CPET-Stagefinal CPET vs. VP: VO2max difference ≤ 3%
Incline-based SPV speed halfway between speedpeak from the SPVincline vs. predicted verification-stage speed of the CSI protocol
Speed-based SPV speed halfway between speedpeak from the SPVspeed vs. predicted stage speed of the CSI protocol
Iannetta et al. [25] 20-s rolling mean VO2 plateau (linear portion of the VO2-WR relationship) CYC Ramp 10-min 110% WRpeak CPET vs. VP: VO2max difference ≤ 0.1 L/min
James et al. [83] 10-s average VO2 plateau of 2 mL·kg-1·min-1 TR CSI 5-min active ↑ 1% > CPET-Slope VO2 plateau of 2 mL·kg-1·min-1
Jamnick et al. [84] 20-s average NS CYC CSI1 (1-min stage length) 5-min passive 90% WRpeak—CSI1 NS
CSI3 (3-min stage length)
CSI5 (5-min stage length)
CSI7 (7-min stage length)
CSI10 (10-min stage length)
Jamnick et al. [85] 15-s time average NS CYC CSI 3-min active mean WRpeak minus 2 stages CPET vs. VP: VO2max difference ≤ 1.5 mL·kg-1·min-1 (or 3% CV)
Johnson et al. [86] 15-s intervals NS CYC CSI 3-min active (50%WRpeak) WRpeak minus 2 stages CPET vs. VP: VO2max difference ≤ 3%
Keiller and Gordon [87] 30-s intervals VO2 plateau (increase < than 50 or 100 mL/min) and HR plateau (increase < than 2 or 4 bpm) over the final two consecutive 30 s sampling periods TR CSI (Trials 1 and 2) 6-min passive 10 (female) and 9 (male) km/h and the ↑ 1% > CPET-Slope CPET vs. VP: HRmax difference ≤ 2 or ≤ 4 bpm
Kirkeberg et al. [88] 30-s time average NS TR CSI (short-term) 3-min active CPET-Speedend minus 2 stages, where stages were derived using specific equation NS
CSI (middle-term)
CSI (large-term)
Knaier et al. [89] 30-s time average RERmax ≥ 1.10; ≥ 95% APMHR; RPE ≥ 19; Lamax ≥ 8 mmol CYC CSI 10-min active 2 min at 50% WRpeak, 1 min at 70% WRpeak, and then 1 stage > CPET-WRpeak CPET vs. VP: VO2max difference ≤ 3%
Knaier et al. [90] 30-s time average RERmax ≥1.05, 1.10 and 1.15; 90, 95 and 100% APMHR; RPE ≥ 19 and = 20; Lamax ≥ 8 and 10 mmol CYC CSI 10-min active 2 min at 50% WRpeak, 1 min at 70% WRpeak, and then 1 stage > CPET-WRpeak CPET vs. VP: VO2max difference ≤ 3%
Kramer et al. [91] 30-s intervals NS TR CSI 3-min active 2 stages < CPET-WRpeak CPET vs. VP: VO2max difference ≤ 3%
Mann et al. [92] 15-s NS TR CSI 8-10-min 0.5 km/h > CPET-Speedpeak NS
Mann et al. [93] 15-s NS TR CSI 8-10-min 0.5 km/h > CPET-Speedpeak NS
Mauger et al. [94] 5-s time average VO2 plateau (increase < than 1.8 mL·kg-1·min-1 between 2 steps of increment); RERmax ≥ 1.10; HRmax within 10 bpm of APMHR; RPE ≥ 17; Lamax ≥ 8 mmol TR CSI 10-min active one stage > the last completed stage of the CPET CPET vs. VP: VO2max difference ≤ 1.8 mL·kg-1·min-1
McGawley [95] 30-s time average VO2 plateau (increase < than 3% or 2 mL·kg-1·min-1 between 2 steps of increment); RERmax ≥ 1.15; HRmax within 10 bpm of APMHR; Lamax ≥ 8 mmol TR CSI 9-min passive 105% at CPET-WRpeak (Trials 1 to 5) CPET vs. VP: VO2max difference ≤ 3%
McKay et al. [96] 15-s time average NS CYC Ramp 5-min active 105%WRpeak reached in the CPET NS
Midgley et al. [39] 30-s time average VO2 plateau (difference between modelled and actual value > 50% of the regression slope for the linear portion of the VO2-WR relationship) CYC CSI 10-min passive 2 min at 50% WRpeak, 1 min at 70% WRpeak, and then 1 stage > CPET-WRpeak, 2 min at 50% WRpeak, 1 min at 70% WRpeak, and then 1 stage > CPET-WRpeak CPET vs. VP: modelled and verification VO2 difference > 50% of the regression slope of the individual VO2-WR relationship; HRmax ≤ 4 bpm
TR
Midgley et al. [97] 30-s time average absolute plateau in VO2; RERmax ≥ 1.10; HRmax within 10 bpm of APMHR TR CSI 10-min active 0.5 km/h > CPET-Speedpeak CPET vs. VP: VO2max difference ≤ 2% and HRmax ≤ 2 bpm
Midgley et al. [98] 15 and 30-s time average NS TR CSI 1-min stages 5-min passive one stage > CPET NS
DisCSI 2-min stages
DisCSI 3-min stages
Mier et al. [99] 30-s VO2 plateau (2 mL·kg-1·min-1 and ≤ SD of the expected increase); RERmax ≥ 1.05, 1.10 and 1.15; ≥ 85% APMHR and HRmax within 10 bpm of APMHR TR CSI 10-min active (walking at slow pace) intensity gradually increased over 2-min until match CPET-WRpeak; after 1 min, the slope was increased 2.5% to running for 2-min CPET vs. VP: VO2max difference ≤ 2.2 mL·kg-1·min-1
Murias et al. [100] 20-s average time NS CYC Ramp 5-min active 85% WRpeak CPET vs. VP: VO2max difference ≤ 2.0 mL·kg-1·min-1
105% WRpeak
Murias et al. [101] 20-s NS CYC Ramp 5-min active 85%WRpeak reached in the CPET NS
Murias et al. [102] 20-s NS CYC Ramp 5-min active 85%WRpeak reached in the CPET NS
Nalcakan [103] 30-s VO2 plateau; RERmax ≥ 1.20; ≥ 90% APMHR CYC CSI Different day 100% WRpeak NS
Niemela et al. [104] every min VO2 plateau (≤60 mL/min for men and ≤50 mL/min for women); adequacy of a subjective criterion for establishing the end point; RERmax ≥ 1.15; HRmax within 10 bpm of APMHR CYC CSI I Different day 1 or 2 sub peak WRs, then 100% of the highest VO2max reached from two CPET ≤5% difference between the ramp test and VP
CSI II
Niemeyer et al. [105] 30-s time average < half of expected increase in VO2 (i.e. <4.5 mL·kg-1·min-1) CYC Ramp 10-min active 90% WRpeak CPET vs. VP: VO2max difference ≤ 5%
Niemeyer et al. [106] 30-s time average VO2 plateau (difference between modelled and actual value > 50% of the regression slope for the linear portion of the VO2-WR relationship) CYC Ramp Different day 90% WRpeak CPET vs. VP: VO2max difference ≤ 5%
Nolan et al. [107] 2×15-s NS TR CSI 20-min passive 105% WRpeak CPET vs. VP: VO2max difference ≤ 3%
115% WRpeak
60-min passive 105% WRpeak
115% WRpeak
Poole et al. [37] 20 s VO2 plateau of regarding the mL/min; RERmax ≥ 1.10, 1.15; HRmax within 10 bpm of APMHR; Lamax ≥ 8 mmol CYC Ramp Different day 105%WRpeak reached in the CPET NS
Possamai et al. [108] 30-s intervals plateau in VO2 and HR (i.e. ≤ 50 mL/min or ≤ 2 bpm) over the final two consecutive 30 s sampling periods; HRmax within 10 bpm of APMHR CYC CSI 15-min passive 5-min warm-up at the first stage of the CPET; 3-min of passive recovery; 2-min at 20 Watts; then increased 100% WRpeak CPET vs. VP: VO2max difference ≤ 3%
Riboli et al. [109] 30-s intervals VO2 plateau of 2.1 mL·kg-1·min-1 TR CSI with 1 min stages 5-min passive if the CPET did not show a VO2 plateau, a verification bout was performed as described by [38]; however, the intensity was not stated (i.e. 95 or 105%WRpeak reached in the CPET) NS
CSI with 2 min stages
DisCSI
Rossiter et al. [38] 15-s average VO2 plateau (linear least squares fitting technique) CYC Ramp 5-min active 105%WRpeak reached in the CPET NS
95%WRpeak reached in the CPET
Sabino-Carvalho et al. [110] 20-s average NS TR DisCSI 3-min passive (standing on treadmill) and 7-min active (walking at 5 km/h) 2-min at 60% WRpeak and then ↑ 0.5 km/h > CPET-Speedpeak CPET vs. VP: VO2max difference ≤ 2%
Scharhag-Rosenberger et al. [111] 3×10-s average VO2 plateau (increase < than one-third of the oxygen requirement of a stage change ~ 150 mL/min); RERmax ≥ 1.10; ± 10 bpm APMHR; Lamax > 8 mmol TR DisCSI 10-min passive (VerifDay1) 1 min at 60% CPET-Speedpeak and then continued at 110% (or 115% if necessary, a second VF bout in VerifDay1) CPET-Speedpeak CPET vs. VP: VO2max difference ≤ 5.5%
Different day (VerifDay2)
Scheadler and Devor [112] 30-s NS TR CSI Different day 8% slope/ individualized speed for a WR greater than CPET (mean estimated 10.2% WRpeak) CPET vs. VP: VO2max difference ≤ 50 mL/min
Sedgeman et al. [113] 15-s time average VO2 plateau of 2.1 mL·kg-1·min-1 during the last two 15-s average samples CYC CSI 3-min active WRpeak minus 2-stages CPET vs. VP: VO2max difference ≤ 3%
105%WRpeak
Stachenfeld et al. [114] 20-s averaging VO2 plateau of 150 mL/min; RERmax ≥ 1.10, 1.15; ≥ 85% APMHR; Lamax ≥ 8 mmol CYC CSI Different day 115% WRpeak reached in the CPET or 125% if the plateau has not been attained VO2 plateau of 150 mL/min
Straub et al. [115] 15-s time average NS CYC Ramp 10-min passive 1st min: 60% WRpeak and then 110% WRpeak NS
Strom et al. [116] 30-s time average NS TR CSI 3-min active (walking pace of 67 m/min) 2 stages < CPET-WRpeak CPET vs. VP: VO2max difference ≤ 3%
Taylor et al. [117] 15-breath average NS TR CSI 15-min active or passive 1st min at 10 km/h (5% slope) and then ↑ 1 km/h > CPET-Speedpeak NS
Tucker et al. [118] 2×15-s NS CYC CSI 5–10 min active 100%WRpeak NS
Vogiatzis et al. [119] NS NS CYC CSI 20-min passive 110% WRpeak NS
Weatherwax et al. [120] 2×15-s NS TR Pseudo-ramp protocol 20-min passive 105% WRpeak (Trials 1 and 2) CPET vs. VP: VO2max difference ≤ 3%
Weatherwax et al. [15] 2×15-s NS TR Pseudo-ramp protocol 20-min passive 105% WRpeak CPET vs. VP: VO2max difference ≤ 3%
Weatherwax et al. [121] 2×15-s NS TR Pseudo-ramp protocol 20-min passive 105% WRpeak CPET vs. VP: VO2max difference ≤ 3%
Weatherwax et al. [122] 2×15-s NS TR DisCSI 20-min passive 3 min at 4.82 km/h and then ↑ 0.64 km/h > CPET-Speedpeak (males) CPET vs. VP: VO2max difference ≤ 3%
3 min at 4.82 km/h and then ↑ 0.48 km/h > CPET-Speedpeak (females)
Wilhelm et al. [123] 10-s moving average NS CYC CSI 5-min passive 105%WRpeak NS
Williams et al. [124] 20-s NS CYC Ramp 5-min active 105%WRpeak NS
Wingo et al. [125] 2×30-s VO2 plateau of 135 mL/min; HR within 5 bpm of that on the control test was obtained CYC CSI control 20-min passive 100% WRpeak (if <1-min was completed during the last stage of the CPET) or 25 Watts > CPET-WRpeak (if ≥1-min was completed during the last stage of the CPET) VO2 plateau of 135 mL/min
CSI post-15 min
CSI post-45 min
Yeh et al. [126] NS NS TR CSI 10-min passive 1 km/h > CPET-Speedpeak or 5% slope every minute until exhaustion NS

Abbreviations: APMHR = age-predicted maximal heart rate; CPET = cardiopulmonary exercise test; CSI = continuous step-incremented; CV = coefficient of variation; CYC = cycling; DisCSI = discontinuous step-incremented; HR = heart rate; HRmax = maximal heart rate; Lamax = maximal blood lactate concentration; NS = not stated; RERmax = maximal respiratory exchange ratio; RPE = rating of perceived exertion; SD = standard deviation; SPV = self-paced maximal oxygen uptake; TR = treadmill; VO2 = oxygen uptake; VO2max = maximal oxygen uptake; VP = verification phase; WR = work rate; WRpeak = peak work rate. Note: whenever possible, authors were contacted to provide unpublished data.

In terms of processing respiratory VO2 data at volitional exhaustion, the most common approach was based on time averages. Thirty-eight studies (48%) reported stationary time averages of 5- to 30-s, whereas 29 (36%) used VO2 data points at fixed intervals of 15- to 30-s, two studies (3%) used 15-breath averages, two studies (3%) used 10-25-s moving averages, one (1%) used 10-s epochs, two (3%) used 20-s rolling averages, one (1%) used 30-s rolling means, and one study (1%) used Douglas bag collections. Four studies (5%) did not detail which VO2 data processing method was applied.
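The difference between stationary (fixed-bin) and rolling time averages described above can be illustrated with a minimal sketch. The function names, the 5-s sampling grid, and the VO2 values below are hypothetical and are not taken from any of the included studies; timestamps are assumed to mark the end of each sampling interval.

```python
from math import ceil
from statistics import mean

def stationary_averages(times_s, vo2, window_s=30.0):
    """Average VO2 in consecutive, non-overlapping time bins."""
    bins = {}
    for t, v in zip(times_s, vo2):
        # A sample ending at exactly 30 s belongs to the first 30-s bin.
        bins.setdefault(ceil(t / window_s) - 1, []).append(v)
    return [mean(bins[k]) for k in sorted(bins)]

def rolling_averages(times_s, vo2, window_s=30.0):
    """Average VO2 over a window that slides forward one sample at a time."""
    out = []
    for t_end in times_s:
        if t_end < window_s:  # skip until a full window of data exists
            continue
        window = [v for t, v in zip(times_s, vo2)
                  if t_end - window_s < t <= t_end]
        out.append(mean(window))
    return out

# Hypothetical end-of-test data sampled every 5 s (L/min).
times = list(range(5, 65, 5))
vo2 = [3.0, 3.1, 3.2, 3.4, 3.6, 3.7, 3.8, 3.8, 3.7, 3.5, 3.4, 3.3]

stationary = stationary_averages(times, vo2)
rolling = rolling_averages(times, vo2)
print(max(stationary), max(rolling))
```

With these made-up data the highest rolling average exceeds the highest stationary average, because the sliding window can centre on the peak while fixed bins may split it. This is one reason the averaging method should be reported alongside the highest VO2 value.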

Regarding the period between the CPET and verification phase procedure, 34 studies (43%) used a short-term active recovery (e.g. pedaling at light intensity, walking at a slow pace, or stretching) of 1, 3, 5, 6, 8, 10, or 5–10 min, while 26 studies (33%) employed passive recovery of 5, 6, 9, 10, 15, 20, 60, or 60–90 min. Two studies (3%) employed a combination of passive and active recovery and another (1%) used a self-paced approach where participants were permitted to choose their own WR. Three studies (4%) employed short-term recovery (e.g. 8–10 min) without stating whether it was active or passive. Fifteen studies (19%) carried out the verification phase on a different day from the CPET.

Sixty studies (75%) used square-wave verification phase protocols, while 20 studies (25%) used multistage verification protocols characterized by an initial warm-up stage. Overall, 53 studies (66%) adopted “supra WRpeak” verification phases based upon the WRpeak achieved during the CPET (e.g. one treadmill or cycle ergometer WR stage higher than that completed in the CPET, or 105–130% of the WRpeak achieved in the previous CPET). Seven studies (9%) used only 100% of WRpeak, while two other studies (3%) used both WRpeak and supra WRpeak verification phases. Three studies (4%) examined both sub and supra WRpeak within the same study and one study (1%) used a predicted WR based on the following formula to elicit the participant’s limit of tolerance within 180 s: power output = (finite work capacity ÷ 180 s) + critical power. Fourteen studies (18%) used only sub WRpeak verification phases ranging from 85%-95% WRpeak (typically two stages below the WRpeak achieved during the CPET) (see Table 2).
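The predicted-WR formula quoted above, power output = (finite work capacity ÷ 180 s) + critical power, can be written as a small function. This is only a sketch: the function name, the units (joules and watts), and the example values are assumptions made for illustration and are not reported in the review.

```python
def predicted_verification_power(finite_work_capacity_j, critical_power_w,
                                 target_time_s=180.0):
    """Power output (W) predicted to bring the participant to the limit
    of tolerance within target_time_s seconds."""
    if target_time_s <= 0:
        raise ValueError("target_time_s must be positive")
    return finite_work_capacity_j / target_time_s + critical_power_w

# Hypothetical participant: finite work capacity 18 kJ, critical power 250 W.
power_w = predicted_verification_power(18_000, 250)
print(power_w)
```

For this hypothetical participant the work rate is 100 W above critical power, so the finite work capacity would be used up in 180 s.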

Forty-two studies (53%) employed cut-off points to analyze differences between the highest VO2 values obtained during the CPET and verification phase to confirm that VO2max was likely attained. Criteria for VO2max verification were frequently based on the intra-subject coefficient of variation acquired from the researchers’ laboratories or from published literature, including a VO2 difference ≤ 2%, ≤ 3%, ≤ 5.0–5.5%, ≤ 1.5–2.2 mL·kg-1·min-1, ≤ 50–150 mL/min, or alternative methods.
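A cut-off of this kind can be applied as in the sketch below. It assumes that the criterion is evaluated on the absolute difference between the two highest VO2 values, and that a relative threshold is expressed as a percentage of the CPET value. Individual studies may define the comparison differently, and the function name and example values are hypothetical.

```python
def vo2max_verified(cpet_vo2, vp_vo2, threshold, relative=True):
    """Return True when the CPET and verification-phase values agree
    within the chosen threshold.

    relative=True  -> threshold is a percentage of the CPET value (e.g. 3).
    relative=False -> threshold is in the same unit as the inputs
                      (e.g. 0.05 L/min for a 50 mL/min criterion).
    """
    difference = abs(vp_vo2 - cpet_vo2)
    if relative:
        return difference / cpet_vo2 * 100 <= threshold
    return difference <= threshold

# Hypothetical participant: 3.50 L/min in the CPET, 3.58 L/min in the VP.
within_3_percent = vo2max_verified(3.50, 3.58, threshold=3)
within_50_ml = vo2max_verified(3.50, 3.58, threshold=0.05, relative=False)
print(within_3_percent, within_50_ml)
```

The same pair of values can satisfy one criterion and fail another, which illustrates why the choice of cut-off matters when studies are compared.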

Quantitative data synthesis: Differences between the highest VO2 attained in the CPET and verification phase

Table 3 shows comparisons between the highest VO2 values elicited in the CPET and verification phase for each study. Fig 2 displays the forest plots of effect sizes and 95% CIs for the highest VO2 values (54 studies) based on the random effects meta-analysis results. Notably, the mean highest VO2 values were similar between the CPET and verification phase (mean difference = 0.03 [95% CI = -0.01 to 0.06] L/min, P = 0.15). Pooled data for VO2max following the CPET and verification phase showed no significant heterogeneity among the studies overall (see Fig 2). Except for one of the included studies judged to have a high risk of bias [68], the meta-analyzed studies were judged to have a low risk of bias as shown by the funnel plot (Fig 3).
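The per-study mean differences and an inverse-variance random-effects pooled estimate can be sketched as follows. This is an illustration, not the RevMan implementation: it assumes the CPET and verification phase are treated as independent samples when computing the standard error, and it uses the DerSimonian-Laird estimator for the between-study variance. Function names are hypothetical, and only the first three rows of Table 3 are used as input.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Mean difference (CPET minus VP), its standard error and 95% CI."""
    md = m1 - m2
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)  # assumes independent samples
    return md, se, (md - Z95 * se, md + Z95 * se)

def pool_random_effects(effects, ses):
    """DerSimonian-Laird pooled estimate, 95% CI, tau^2 and I^2 (%)."""
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (se**2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = sqrt(1 / sum(w_star))
    ci = (pooled - Z95 * se_pooled, pooled + Z95 * se_pooled)
    return pooled, ci, tau2, i2

# (CPET mean, SD, n, VP mean, SD, n) in L/min, first three rows of Table 3.
rows = [
    (3.79, 0.39, 11, 3.80, 0.49, 11),  # Alexander and Mier [54], CSI
    (3.94, 0.40, 11, 3.84, 0.45, 11),  # Alexander and Mier [54], DisCSI
    (2.18, 0.61, 35, 2.26, 0.65, 35),  # Arad et al. [55]
]
results = [mean_difference(*row) for row in rows]
pooled, ci, tau2, i2 = pool_random_effects(
    [r[0] for r in results], [r[1] for r in results]
)
print(results[0], pooled, ci, tau2, i2)
```

Under the independent-samples assumption, the interval for the first row rounds to the values shown in Table 3 (-0.38 to 0.36 L/min). The pooled value from three rows is shown only to illustrate the calculation and is not comparable to the 54-study estimate.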

Table 3. Overall comparisons in the meta-analyzed studies for the highest VO2 values attained in the cardiopulmonary exercise test (CPET) and verification phase (VP) (k = 54).

Study Specific Experimental Condition CPET VP % Weight Mean Difference
Mean [L/min] SD [L/min] Total Mean [L/min] SD [L/min] Total IV, Random, 95%CI [L/min]
Alexander and Mier [54] CPET protocol (CSI) 3.79 0.39 11 3.80 0.49 11 1.00% -0.01 [-0.38, 0.36]
CPET protocol (DisCSI) 3.94 0.40 11 3.84 0.45 11 1.00% 0.10 [-0.25, 0.46]
Arad et al. [55] N/A 2.18 0.61 35 2.26 0.65 35 1.40% -0.08 [-0.38, 0.22]
Astorino and DeRevere [56] CPET-VP recovery (8 min) VP intensity (105% WRpeak) 3.35 1.01 30 3.32 1.00 30 0.50% 0.03 [-0.48, 0.54]
CPET-VP recovery (10 min) VP intensity (110% WRpeak) 2.82 0.62 79 2.78 0.59 79 3.70% 0.04 [-0.15, 0.23]
Astorino and White [57] N/A 3.00 0.45 30 3.00 0.45 30 2.50% 0.00 [-0.23, 0.23]
Astorino et al. [53] Experimental groups (low CRF) 2.35 0.37 10 2.36 0.33 10 1.40% -0.01 [-0.32, 0.30]
Experimental groups (moderate CRF) 3.32 0.58 10 3.28 0.60 10 0.50% 0.04 [-0.48, 0.56]
Experimental groups (high CRF) 4.38 0.70 10 4.29 0.74 10 0.30% 0.09 [-0.54, 0.72]
Astorino et al. [58] Training effect (HIIT-Baseline) 2.51 0.62 14 2.50 0.61 14 0.60% 0.01 [-0.45, 0.47]
Training effect (HIIT—Week 3) 2.66 0.67 14 2.60 0.64 14 0.60% 0.06 [-0.43, 0.55]
Training effect (Control—Baseline) 2.94 0.72 14 2.87 0.71 14 0.50% 0.07 [-0.46, 0.60]
Training effect (Control—Week 3) 2.97 0.74 14 2.84 0.69 14 0.50% 0.13 [-0.40, 0.66]
Astorino et al. [59] N/A 2.55 0.62 14 2.57 0.61 14 0.60% -0.02 [-0.47, 0.43]
Astorino et al. [60] CPET-VP recovery (at least 24h) 2.37 0.69 15 2.31 0.75 15 0.50% 0.06 [-0.45, 0.58]
CPET-VP recovery (60 to 90 min) 2.72 0.65 9 2.73 0.72 9 0.30% -0.01 [-0.64, 0.62]
Beltrami et al. [61] Experimental groups (control group) 4.50 0.58 13 4.43 0.46 13 0.80% 0.07 [-0.33, 0.47]
Experimental groups (reverse group) 4.52 0.36 13 4.54 0.33 13 1.90% -0.02 [-0.28, 0.24]
Beltz et al. [62] CPET protocol (SPV) 3.84 0.28 16 3.74 0.50 16 1.70% 0.10 [-0.18, 0.38]
CPET protocol (Ramp) 3.86 0.28 16 3.77 0.50 16 1.70% 0.09 [-0.19, 0.37]
Bisi et al. [63] N/A 2.41 0.13 11 2.56 0.36 11 2.60% -0.15 [-0.38, 0.08]
Chidnok et al. [64] N/A 4.32 0.61 7 4.32 0.69 7 0.30% 0.00 [-0.68, 0.68]
Colakoglu et al. [68] N/A 4.11 0.69 9 4.56 0.60 9 0.40% -0.45 [-1.05, 0.15]
Dalleck et al. [69] N/A 2.33 0.76 18 2.31 0.76 18 0.50% 0.02 [-0.48, 0.52]
Day et al. [35] N/A 3.64 0.70 38 3.64 0.70 38 1.30% 0.00 [-0.31, 0.31]
Dicks et al. [72] N/A 3.84 0.65 28 3.72 0.60 28 1.20% 0.12 [-0.21, 0.45]
Ducrocq et al. [74] N/A 3.73 0.47 13 3.76 0.45 13 1.10% -0.03 [-0.39, 0.32]
Elliott et al. [75] N/A 4.26 0.61 8 4.26 0.70 8 0.30% 0.00 [-0.64, 0.64]
Foster et al. [77] VP exercise modality (TR) 4.09 0.97 20 4.03 1.16 20 0.30% 0.06 [-0.60, 0.72]
VP exercise modality (CYC) 3.95 0.75 20 4.06 0.75 20 0.60% -0.11 [-0.57, 0.35]
Freeberg et al. [78] N/A 3.49 0.85 30 3.49 0.85 30 0.70% 0.00 [-0.43, 0.43]
Goodall et al. [79] N/A 4.11 0.56 9 3.82 0.71 9 0.40% 0.29 [-0.30, 0.88]
Hogg et al. [82] N/A 4.87 0.43 14 4.82 0.48 14 1.20% 0.05 [-0.29, 0.39]
Iannetta et al. [25] WRpeak 5 W/min 1st VP at 110% WRpeak (25 W/min) 3.35 0.68 11 3.30 0.65 11 0.40% 0.05 [-0.51, 0.61]
2nd VP at 110% WRpeak (5 W/min) 3.35 0.68 11 3.45 0.68 11 0.40% -0.10 [-0.67, 0.47]
WRpeak 10 W/min 1st VP at 110% WRpeak (25 W/min) 3.44 0.67 11 3.33 0.62 11 0.40% 0.11 [-0.43, 0.65]
2nd VP at 110% WRpeak (10 W/min) 3.44 0.67 11 3.47 0.70 11 0.40% -0.03 [-0.60, 0.54]
WRpeak 15 W/min 1st VP at 110% WRpeak (25 W/min) 3.44 0.69 11 3.30 0.68 11 0.40% 0.14 [-0.43, 0.71]
2nd VP at 110% WRpeak (15 W/min) 3.44 0.69 11 3.39 0.64 11 0.40% 0.05 [-0.51, 0.61]
WRpeak 25 W/min 1st VP at 110% WRpeak (25 W/min) 3.44 0.74 11 3.28 0.67 11 0.40% 0.16 [-0.43, 0.75]
2nd VP at 110% WRpeak (25 W/min) 3.44 0.74 11 3.29 0.66 11 0.40% 0.15 [-0.44, 0.74]
WRpeak 30 W/min 1st VP at 110% WRpeak (25 W/min) 3.44 0.72 11 3.31 0.67 11 0.40% 0.13 [-0.45, 0.71]
2nd VP at 110% WRpeak (30 W/min) 3.44 0.72 11 3.28 0.65 11 0.40% 0.16 [-0.41, 0.73]
Jamnick et al. [84] CPET protocol (CSI1: 1-min stage length) 4.72 0.41 17 4.65 0.45 17 1.60% 0.07 [-0.22, 0.36]
CPET protocol (CSI3: 3-min stage length) 4.62 0.42 17 4.56 0.46 17 1.50% 0.06 [-0.23, 0.36]
CPET protocol (CSI5: 5-min stage length) 4.55 0.46 17 4.55 0.47 17 1.30% 0.00 [-0.31, 0.31]
CPET protocol (CSI7: 7-min stage length) 4.44 0.42 17 4.37 0.46 17 1.50% 0.07 [-0.22, 0.36]
CPET protocol (CSI10: 10-min stage length) 4.35 0.43 17 4.23 0.51 17 1.30% 0.12 [-0.20, 0.43]
Jamnick et al. [85] N/A 3.24 0.57 57 3.25 0.57 57 3.00% -0.02 [-0.23, 0.19]
Johnson et al. [86] N/A 3.31 0.76 11 3.34 0.82 11 0.30% -0.03 [-0.69, 0.63]
Keiller and Gordon [87] N/A 3.65 0.71 11 3.50 0.58 11 0.50% 0.15 [-0.39, 0.69]
Kirkeberg et al. [88] CPET protocol (short-term CSI) 4.43 0.48 12 4.41 0.54 12 0.80% 0.03 [-0.38, 0.43]
CPET protocol (middle-term CSI) 4.40 0.46 12 4.27 0.40 12 1.00% 0.13 [-0.21, 0.47]
CPET protocol (large-term CSI) 4.42 0.42 12 4.36 0.45 12 1.00% 0.06 [-0.29, 0.41]
Kramer et al. [91] N/A 3.45 0.29 15 3.42 0.25 15 3.50% 0.03 [-0.16, 0.22]
Mann et al. [93] N/A 4.11 0.78 10 4.13 0.85 10 0.30% -0.02 [-0.74, 0.70]
Mann et al. [92] N/A 3.80 0.87 32 3.78 0.92 32 0.70% 0.03 [-0.41, 0.46]
Mauger et al. [94] N/A 4.66 0.55 14 4.65 0.59 14 0.70% 0.01 [-0.42, 0.43]
McGawley [95] N/A 4.08 0.47 10 4.01 0.46 10 0.80% 0.08 [-0.33, 0.48]
Midgley et al. [39] VP exercise modality (CYC) 3.86 0.39 10 3.92 0.47 10 0.90% -0.05 [-0.43, 0.33]
VP exercise modality (TR) 4.05 0.47 10 3.96 0.38 10 0.90% 0.10 [-0.28, 0.47]
Midgley et al. [98] CPET protocol (CSI 1-min stages) 4.09 0.54 9 4.07 0.53 9 0.50% 0.03 [-0.47, 0.52]
CPET protocol (DisCSI 2-min stages) 4.10 0.52 9 4.08 0.52 9 0.60% 0.02 [-0.46, 0.50]
CPET protocol (DisCSI 3-min stages) 3.98 0.49 9 4.07 0.53 9 0.60% -0.09 [-0.56, 0.38]
Midgley et al. [97] N/A 4.03 0.42 16 4.01 0.44 16 1.50% 0.01 [-0.28, 0.31]
Mier et al. [99] N/A 3.64 0.38 10 3.77 0.38 10 1.20% -0.13 [-0.46, 0.20]
Murias et al. [100] VP intensity (younger: 85% WRpeak) 3.73 0.51 8 3.76 0.48 8 0.60% -0.03 [-0.52, 0.45]
VP intensity (younger: 105% WRpeak) 3.90 0.65 22 3.89 0.64 22 0.90% 0.02 [-0.36, 0.40]
VP intensity (older: 85% WRpeak) 2.18 0.55 8 2.18 0.55 8 0.50% 0.00 [-0.54, 0.54]
VP intensity (older: 105% WRpeak) 2.52 0.54 23 2.57 0.51 23 1.40% -0.05 [-0.36, 0.25]
Niemela et al. [104] N/A 3.05 0.55 16 3.05 0.49 16 1.00% 0.00 [-0.36, 0.35]
Niemeyer et al. [105] N/A 4.06 0.43 24 4.06 0.46 24 2.10% 0.00 [-0.25, 0.24]
Niemeyer et al. [106] N/A 4.01 0.47 46 3.95 0.51 46 3.30% 0.06 [-0.14, 0.26]
Nolan et al. [107] CPET-VP recovery (20 min) VP intensity (105% WRpeak) 3.64 0.61 12 3.66 0.58 12 0.60% -0.02 [-0.50, 0.46]
CPET-VP recovery (20 min) VP intensity (115% WRpeak) 3.68 0.59 12 3.64 0.61 12 0.60% 0.04 [-0.44, 0.52]
CPET-VP recovery (60 min) VP intensity (105% WRpeak) 3.60 0.58 12 3.60 0.58 12 0.60% 0.00 [-0.46, 0.46]
CPET-VP recovery (60 min) VP intensity (115% WRpeak) 3.65 0.54 12 3.58 0.60 12 0.60% 0.07 [-0.38, 0.52]
Poole et al. [37] N/A 4.03 0.28 7 3.95 0.29 7 1.50% 0.08 [-0.22, 0.38]
Possamai et al. [108] N/A 3.83 0.41 19 3.72 0.42 19 1.90% 0.11 [-0.15, 0.37]
Rossiter et al. [38] VP intensity (105%WRpeak) 4.15 0.50 5 4.09 0.45 5 0.40% 0.06 [-0.53, 0.65]
VP intensity (95%WRpeak) 4.11 0.48 5 4.12 0.53 5 0.30% -0.01 [-0.64, 0.61]
Sabino-Carvalho et al. [110] Pre-CPET intervention (IPC) 4.24 0.46 16 4.23 0.40 16 1.50% 0.01 [-0.29, 0.31]
Pre-CPET intervention (Sham) 4.23 0.48 16 4.23 0.43 16 1.30% 0.01 [-0.31, 0.32]
Pre-CPET intervention (Control) 4.23 0.38 16 4.15 0.32 16 2.20% 0.08 [-0.17, 0.32]
Scharhag-Rosenberger et al. [111] CPET-VP recovery (same day after 10 min) 3.82 0.99 34 3.72 0.99 34 0.60% 0.10 [-0.37, 0.57]
CPET-VP recovery (different day) 3.82 0.99 34 3.75 1.00 34 0.60% 0.07 [-0.40, 0.54]
Sedgeman et al. [113] VP intensity (WRpeak minus 2-stages) 3.69 0.41 13 3.70 0.49 13 1.10% -0.01 [-0.36, 0.34]
VP intensity (105%WRpeak) 3.71 0.51 13 3.64 0.50 13 0.90% 0.07 [-0.31, 0.46]
Straub et al. [115] N/A 3.86 0.73 16 3.84 0.68 16 0.60% 0.02 [-0.47, 0.51]
Taylor et al. [117] N/A 4.03 0.53 19 3.83 0.52 19 1.20% 0.21 [-0.13, 0.54]
Weatherwax et al. [120] N/A 2.29 0.73 16 2.29 0.73 16 0.50% 0.00 [-0.50, 0.51]
Weatherwax et al. [15] Training effect (standardized—baseline) 2.03 0.62 20 2.03 0.60 20 0.90% 0.00 [-0.38, 0.38]
Training effect (standardized—week 12) 2.17 0.62 20 2.18 0.63 20 0.90% -0.01 [-0.40, 0.38]
Training effect (individualized—baseline) 2.37 0.79 19 2.37 0.77 19 0.50% 0.00 [-0.50, 0.50]
Training effect (individualized—week 12) 2.63 0.89 19 2.65 0.89 19 0.40% -0.02 [-0.59, 0.55]
Weatherwax et al. [121] Training effect (control—baseline) 2.18 0.74 8 2.16 0.73 8 0.30% 0.02 [-0.70, 0.74]
Training effect (control—week 12) 2.11 0.73 8 2.10 0.69 8 0.30% 0.01 [-0.69, 0.71]
Training effect (standardized—baseline) 2.03 0.62 20 2.03 0.60 20 0.90% 0.00 [-0.38, 0.38]
Training effect (standardized—week 12) 2.17 0.62 20 2.18 0.63 20 0.90% -0.01 [-0.40, 0.38]
Training effect (individualized—baseline) 2.37 0.79 19 2.37 0.77 19 0.50% 0.00 [-0.50, 0.50]
Training effect (individualized—week 12) 2.63 0.89 19 2.65 0.89 19 0.40% -0.02 [-0.59, 0.55]
Weatherwax et al. [122] Experimental groups (males) 3.98 0.36 18 3.94 0.32 18 2.60% 0.04 [-0.19, 0.26]
Experimental groups (females) 2.68 0.13 6 2.67 0.10 6 8.00% 0.01 [-0.12, 0.14]

Abbreviations: CI = confidence interval; CPET = cardiopulmonary exercise test; CRF = cardiorespiratory fitness level; CSI = continuous step-incremented; CYC = cycling; DisCSI = discontinuous step-incremented; HIIT = high-intensity interval training; IPC = ischemic preconditioning; N/A = not applicable; TR = treadmill; SD = standard deviation; SPV = self-paced maximal oxygen uptake; VO2 = oxygen uptake; VP = verification phase; WRpeak = peak work rate; W/min = incremental phase based on watts per minute. Note: whenever possible, authors were contacted to provide unpublished data. %Weight = weight attributed to each study due to its statistical power.

Fig 2. Forest plot of all studies included in the meta-analysis (k = 54) for the highest VO2 responses attained in the cardiopulmonary exercise test and verification phase using random effects analyses.

Data are reported as mean differences (MD) adjusted for control data (95% CIs).

Fig 3. Funnel plot assessment of publication bias for the studies investigating the highest VO2 responses attained in the cardiopulmonary exercise test and verification phase.

Results of subgroup analyses according to the characteristics of the verification phase protocol are summarized in Fig 4. There were no significant differences between the CPET and verification phase for the highest VO2 values attained after stratifying studies for verification-phase intensity (mean difference = 0.03 [95% CI = -0.01 to 0.07] L/min, P = 0.11), type of recovery utilized (mean difference = 0.02 [95%CI = -0.02 to 0.07] L/min, P = 0.36), VO2max verification criterion adoption (mean difference = 0.02 [95% CI = -0.02 to 0.06] L/min, P = 0.29), verification procedure with regards to whether or not it was performed on the same day as the CPET (mean difference = 0.03 [95%CI -0.01 to 0.06] L/min, P = 0.21), or verification-phase duration (i.e. no longer than 80 s, from 81 to 120 s and longer than 120 s) (mean difference = 0.03 [95%CI -0.03 to 0.09] L/min, P = 0.35).

Fig 4. Mean differences (95% CIs) between the highest VO2 responses in the cardiopulmonary exercise test (CPET) and verification phase according to the verification-phase characteristics for intensity (i.e. sub WRpeak vs. supra WRpeak), recovery (i.e. active vs. passive), adoption of criterion threshold (i.e. yes vs. no), timing (performed on the same day vs. a different day to the CPET), and duration (i.e. no longer than 80 s, from 81 to 120 s and longer than 120 s).

Subgroup analyses regarding sex, cardiorespiratory fitness level, exercise modality, and CPET protocol are summarized in Table 4. The median time to exhaustion was 665 s (IQR, 600 s) for the CPET and 148 s (IQR, 110 s) for the verification phase. Considering all sub-analyses presented in Table 4, there were no significant differences between the CPET and verification phase for VO2max (P = 0.18 to P = 0.71).

Table 4. Subgroup analyses for the cardiopulmonary exercise test (CPET) and verification phase (VP).

| Subgroup | Time to exhaustion: N | Time to exhaustion (s): CPET Mean ± SD | Time to exhaustion (s): VP Mean ± SD | VO2max: N | VO2max (L/min): CPET Mean ± SD | VO2max (L/min): VP Mean ± SD | Effect Size (95% CI) | P-value |
|---|---|---|---|---|---|---|---|---|
| Sex | | | | | | | | |
| Male | 146 | 734 ± 90 | 244 ± 43 | 630 | 3.95 ± 0.48 | 3.93 ± 0.50 | 0.02 (-0.02 to 0.08) | 0.25 |
| Female | 23 | 659 ± 119 | 152 ± 46 | 68 | 2.63 ± 0.39 | 2.58 ± 0.40 | 0.05 (-0.08 to 0.12) | 0.71 |
| Both | 677 | 765 ± 140 | 146 ± 28 | 941 | 3.24 ± 0.67 | 3.21 ± 0.67 | 0.03 (-0.04 to 0.08) | 0.50 |
| Cardiorespiratory fitness level | | | | | | | | |
| Low | 170 | 617 ± 111 | 150 ± 36 | 322 | 2.30 ± 0.65 | 2.32 ± 0.65 | 0.02 (-0.07 to 0.11) | 0.63 |
| Moderate | 362 | 790 ± 101 | 200 ± 40 | 565 | 3.49 ± 0.61 | 3.45 ± 0.63 | 0.04 (-0.02 to 0.11) | 0.21 |
| High | 346 | 792 ± 149 | 161 ± 27 | 716 | 3.94 ± 0.55 | 3.90 ± 0.55 | 0.04 (-0.02 to 0.08) | 0.18 |
| Exercise modality | | | | | | | | |
| CYC | 477 | 823 ± 143 | 155 ± 29 | 916 | 3.47 ± 0.59 | 3.45 ± 0.59 | 0.02 (-0.03 to 0.07) | 0.43 |
| TR | 386 | 688 ± 110 | 189 ± 34 | 771 | 3.59 ± 0.58 | 3.56 ± 0.58 | 0.03 (-0.02 to 0.08) | 0.22 |
| CPET protocol | | | | | | | | |
| DisCSI | 92 | 876 ± 120 | 156 ± 28 | 169 | 3.90 ± 0.52 | 3.87 ± 0.51 | 0.03 (-0.05 to 0.11) | 0.49 |
| CSI | 472 | 696 ± 105 | 209 ± 40 | 924 | 3.71 ± 0.56 | 3.69 ± 0.58 | 0.02 (-0.03 to 0.07) | 0.38 |
| Ramp | 284 | 848 ± 171 | 121 ± 23 | 578 | 3.16 ± 0.63 | 3.13 ± 0.62 | 0.03 (-0.04 to 0.09) | 0.44 |

Group weighted mean differences in maximal oxygen uptake (VO2max) according to sex, cardiorespiratory fitness level, exercise testing modality, and CPET protocol.

Abbreviations: CI = confidence interval; CPET = cardiopulmonary exercise test; CSI = continuous step-incremented; CYC = cycling; DisCSI = discontinuous step-incremented; TR = treadmill; SD = standard deviation; VP = verification phase.
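As a rough check on the size of the subgroup differences, the relative difference between the CPET and verification phase means can be recomputed from the Table 4 values for the cardiorespiratory fitness groups. The sketch below expresses the higher mean as a percentage above the lower one. The helper name is hypothetical, and the calculation uses the rounded table means, so small rounding discrepancies are expected.

```python
def percent_higher(higher, lower):
    """Percentage by which `higher` exceeds `lower`."""
    return (higher - lower) / lower * 100

# Mean VO2max (L/min) from Table 4: (CPET, VP) per fitness group.
table4_fitness = {
    "Low": (2.30, 2.32),
    "Moderate": (3.49, 3.45),
    "High": (3.94, 3.90),
}

relative_differences = {}
for group, (cpet, vp) in table4_fitness.items():
    if vp > cpet:
        relative_differences[group] = ("VP higher", percent_higher(vp, cpet))
    else:
        relative_differences[group] = ("CPET higher", percent_higher(cpet, vp))

for group, (direction, pct) in relative_differences.items():
    print(f"{group}: {direction} by {pct:.1f}%")
```

On the rounded table means, the difference is about 1% in every fitness group, with the verification phase slightly higher only in the low-fitness group.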

Discussion

A growing number of studies have included the verification phase procedure to increase confidence that the highest possible VO2 has been elicited by apparently healthy adults during a CPET. To the best of our knowledge, this is the first systematic review and meta-analysis of these studies, 90% of which have been published since 2009. The major findings were: (a) in general, the verification phase protocols elicited similar highest VO2 values to those obtained in the preceding CPET protocols; and (b) concordance between the highest VO2 values in the CPETs and verification phases was not affected by sex, cardiorespiratory fitness level, exercise modality, CPET protocol, or verification phase protocol.

The present systematic review and meta-analysis shows that the highest mean VO2 values elicited by verification phase bouts were similar to those elicited in continuous ramp or pseudo-ramp CPET protocols in the majority of studies. In fact, the mean absolute difference of 0.03 L/min for the 54 studies included in the meta-analysis represents a relative difference of only 0.85% between the highest VO2 values attained in the CPET and verification phase. This is within the most commonly adopted measures of test variability of 2–3% [57, 97]. The present findings also provide evidence that the similarity between the highest VO2 values attained during the CPETs and verification phases is not affected by sex, cardiorespiratory fitness, exercise modality, CPET protocol design, or how the verification phase was performed (see Table 4 and Fig 4). This contrasts with traditional VO2max criteria, which are test-protocol dependent and vary according to the individual’s physical characteristics [28, 29]. Day et al. [35], for example, observed that participants with lower cardiorespiratory fitness had a lower tendency to exhibit a deceleration in the VO2 response at the end of a CPET compared to those with higher cardiorespiratory fitness and, therefore, are less likely to exhibit a VO2 plateau.

Six of the 54 meta-analyzed studies reported significant mean differences between the highest VO2 values observed in the CPET and verification phase [25, 55, 56, 68, 87, 95]. Astorino and DeRevere [56], for example, observed mean VO2max values that were significantly higher, by 0.03 and 0.04 L/min, during the CPET than in the verification phase for two samples of participants heterogeneous for cardiorespiratory fitness. However, sub-group analyses revealed that while maximal VO2 in the CPET was higher than that attained in the verification phase for participants with moderate and high cardiorespiratory fitness, the opposite was true for those with lower cardiorespiratory fitness. Similar findings have been reported by Arad et al. [55], indicating that cardiorespiratory fitness level may be a key moderator of the differences between the highest VO2 values attained in the CPET and verification phase. A plausible explanation is that individuals with low cardiorespiratory fitness are more susceptible to stopping early during the CPET due to fatigue-associated symptoms [29], which would tend to result in lower VO2 values. In the present meta-analyses, the mean VO2max in the verification phase was approximately 0.9% higher than in the CPET in the low cardiorespiratory fitness group, but approximately 1.2% and 1.0% higher in the CPET than in the verification phase in the moderate and high cardiorespiratory fitness groups, respectively (see Table 4). The lack of statistical significance, however, highlights the uncertainty regarding the effects of cardiorespiratory fitness on the differences between the highest VO2 values in the CPET and verification phase.

Regarding verification-phase duration, Keiller and Gordon [87] observed significantly higher VO2 values during the incremental treadmill CPETs versus the verification phase with a mean duration of approximately 2 min. This is consistent with the findings of McGawley [95] for 10 recreational runners who performed five consecutive treadmill CPET trials, plus an appended verification phase with a mean duration of < 2 min. Iannetta et al. [25] analyzed the VO2 responses to ramp-incremented cycling CPETs with WR increments of 5, 10, 15, 25, and 30 W/min, each followed by two verification phases performed at different WRs. The verification phase bouts performed at 110% of the WRpeak from ramp protocols with ramp rates of 25 and 30 W/min (i.e. short verification phase bouts of ~ 80 s) yielded VO2 values significantly lower than those attained in the CPETs. In contrast, the highest VO2 values attained during verification phase bouts based on slower WR increments of 5, 10, and 15 W/min, which allowed sufficient time for VO2max attainment (i.e. 162, 122 and 103 s, respectively) were not different to those achieved in the preceding CPETs. Although the aforementioned studies suggest that verification phase duration is a key moderator for the mean differences between the highest VO2 observed in the CPET and verification phase, our sub-analysis found no difference for verification-phase durations of ≤ 80 s, ranging from 81 to 120 s, and > 120 s (see Fig 4). Notably, however, only three studies reported short durations of 80 s or less [25, 79, 113] and the lack of statistical significance may be due to the paucity of data.

In contrast to the aforementioned studies [25, 87, 95], Colakoglu et al. [68] observed significantly lower VO2 values in the CPET versus the verification phase in nine cycling and track and field athletes. According to Midgley et al. [97], if the mean highest VO2 attained in the verification phase is significantly higher than in the CPET, the investigator should consider that the CPET protocol was inadequate in eliciting the highest possible VO2 response in all or some of the participants. In the study by Colakoglu et al. [68], participants performed a prolonged step-incremented CPET consisting of one 4-min, three 2-min, and then 1-min increments until volitional exhaustion after 1 h of recovery from a submaximal CPET of at least four 5-min stages. It is feasible that the procedures performed before the maximal CPET may have led to poor participant motivation, lack of effort and premature fatigue in the following test. Additionally, the four verification phase bouts at 100%, 105%, 110%, and 115% of the WRpeak attained in the CPET were performed on four different days to the CPET without any preceding maximal exercise. This may also have favored the significantly higher mean VO2 values in the verification phase compared to the CPET and contrasts with the same-day verification phase used by Keiller and Gordon [87], McGawley [95], and Iannetta et al. [25].

An aim of the present systematic review was to suggest best practices for the application of verification phase protocols. The subgroup analyses revealed no systematic bias between the highest VO2 values observed in the CPET and verification phase according to the verification-phase intensity (i.e. sub WRpeak vs. supra WRpeak), type of recovery between the CPET and verification phase (i.e. active vs. passive), whether a VO2max criterion threshold was used for the CPET (i.e. yes vs. no), whether the verification phase was performed in the same testing session or on a different day, and the verification-phase duration (see Fig 4). Considering that differences in the verification phase procedure do not appear to influence its effectiveness, a specific verification procedure currently cannot be recommended. However, some caution must be exercised to avoid an inappropriately high verification-phase WR that results in a short test duration and insufficient time for the highest possible VO2 to be elicited [25], especially in untrained individuals characterized by slow VO2 kinetics [127]. Midgley et al. [97] stated that this is a plausible rationale for the early recommendations of Thoden [128], that individuals who do not reach 3 min in a supra WRpeak verification phase should undertake a subsequent verification phase at the same WR or one stage lower than the last completed WR stage in the CPET. Poole and Jones [2] suggested that researchers should select a WR that is sufficiently higher than the WRpeak attained in the CPET, such as ~110% WRpeak, to give the VO2 signal for the higher WR the opportunity to emerge from the extant noise. If the subsequent verification phase produces a VO2 plateau signifying VO2max, this signal would be lower than expected for the WR based on the previous VO2-WR slope. Conversely, Iannetta et al. [25] advocated a verification-phase WR lower than the WRpeak attained in the CPET in order to allow VO2max to be elicited, since WRs above critical power should elicit VO2max if the time to exhaustion is sufficiently long. Midgley et al. [39] proposed an alternative approach based on a multistage verification phase protocol that combines WRs below and above WRpeak to obtain a protocol that incorporates a supra WRpeak intensity with a relatively prolonged verification-phase duration. This approach has since been adopted in other studies [39, 53, 54, 61, 62, 64, 69, 76, 82, 87, 89, 90, 99, 104, 108, 110, 111, 115, 117, 122]. Notably, the only study to observe a statistically significant influence of verification phase intensity employed a multistage verification phase protocol incorporating 2 min at 50% of WRpeak, increasing to 70% for an additional minute, and then 105 or 115% until volitional exhaustion [107]. Based on their findings, the authors recommended the use of 105% of the WRpeak attained in the CPET rather than 115% WRpeak. The confounding results and various recommended approaches regarding the verification phase intensity indicate that more research is required before an evidence-based recommendation can be made.
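As an illustration of how a multistage verification protocol translates into concrete work rates, the sketch below builds the stage sequence described for the study cited as [107]: 2 min at 50% of WRpeak, 1 min at 70%, then a supra WRpeak stage to volitional exhaustion. The function name, the dictionary layout, and the 300 W example are hypothetical, and rounding to whole watts is an assumption made for readability.

```python
def multistage_verification(wr_peak_w, final_pct=105.0):
    """Build a three-stage verification protocol from the CPET peak work rate.

    Stage durations and percentages follow the multistage protocol described
    in the text: 2 min at 50% WRpeak, 1 min at 70%, then a supra-WRpeak
    stage held until volitional exhaustion (duration_s is None).
    """
    if wr_peak_w <= 0:
        raise ValueError("WRpeak must be positive")
    if final_pct <= 100:
        raise ValueError("final stage is expected to be above 100% of WRpeak")
    plan = [(120, 50.0), (60, 70.0), (None, final_pct)]
    return [
        # ASSUMPTION: work rates are rounded to the nearest whole watt.
        {"duration_s": duration, "work_rate_w": round(wr_peak_w * pct / 100.0)}
        for duration, pct in plan
    ]


# HYPOTHETICAL participant with a WRpeak of 300 W in the CPET.
stages = multistage_verification(300, final_pct=105)
for stage in stages:
    print(stage)
```

Passing `final_pct=115` gives the alternative final stage compared in that study; the two lower-intensity stages are unchanged.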

Regarding the recovery time between the CPET and verification phase, intervals of 10–20 min have been commonly used, although a wide range of intervals, from 1–3 min [65, 77, 88, 113] to 90 min [41], has been reported. The present meta-analysis found no significant effect of recovery time on the difference between the mean VO2 elicited in the CPET and verification phase. An alternative method is to perform the verification phase on a separate day, although the additional visit to the laboratory and the day-to-day variability in VO2max [129] might considerably reduce the utility and robustness of this approach. Scharhag-Rosenberger et al. [111] specifically investigated this issue by comparing a 10-min recovery to a verification phase performed on a separate day. No significant difference was observed between the two verification protocols, even though the time to exhaustion was significantly longer when the verification phase was performed on a separate day (2:06 ± 0:22 min vs. 2:42 ± 0:38 min). These findings suggest no advantage in performing the CPET and verification phase on separate days.

Inadequate data processing may negatively impact the utility of the verification phase procedure. Myers et al. [36] suggested that small sampling intervals such as 5 and 10 s result in unacceptable variability in VO2 data, whereas large intervals such as 60 s may not be sufficiently sensitive to accurately track rapid changes in VO2 such as those observed in ramp and pseudo-ramp CPET protocols. Midgley et al. [130] observed that the reproducibility of VO2max during continuous step-incremented treadmill CPETs is not affected by the length of the VO2 time-average interval within the range of 10 to 60 s; however, the actual VO2max values were significantly different between time averages. The authors suggested that a 30-s stationary time-average for CPETs provides a good compromise between removing noise while maintaining the underlying trend in the VO2 data. However, no study to date has addressed the effect of the VO2 sampling interval on the verification phase.
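To make the effect of the averaging interval concrete, the following sketch applies a stationary (non-overlapping) time average to a VO2 series and reports the highest averaged value for two window lengths. It is an assumption-laden illustration rather than the processing used in any reviewed study: the function names are hypothetical and the data are synthetic.

```python
def stationary_time_average(times_s, vo2_l_min, window_s=30.0):
    """Average VO2 samples in consecutive, non-overlapping windows.

    Returns one mean per window that contains at least one sample,
    ordered by time.
    """
    if len(times_s) != len(vo2_l_min):
        raise ValueError("times and VO2 values must have the same length")
    if window_s <= 0:
        raise ValueError("window must be positive")
    bins = {}
    for t, v in zip(times_s, vo2_l_min):
        bins.setdefault(int(t // window_s), []).append(v)
    return [sum(vals) / len(vals) for _, vals in sorted(bins.items())]


def highest_vo2(times_s, vo2_l_min, window_s=30.0):
    """Highest window-averaged VO2 for the chosen averaging interval."""
    return max(stationary_time_average(times_s, vo2_l_min, window_s))


# HYPOTHETICAL data: one sample every 5 s for 60 s, noisy with a single spike.
times = list(range(0, 60, 5))
vo2 = [3.0, 3.4, 3.1, 3.5, 3.2, 3.6, 3.3, 4.1, 3.4, 3.8, 3.5, 3.5]

peak_10s = highest_vo2(times, vo2, window_s=10)
peak_30s = highest_vo2(times, vo2, window_s=30)
print(f"10-s average peak: {peak_10s:.2f} L/min")
print(f"30-s average peak: {peak_30s:.2f} L/min")
```

In this synthetic series the shorter window returns the higher peak because a single noisy sample carries more weight, which is one way the same test can yield different VO2max values under different time averages.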

A final issue to be addressed refers to appropriate criteria to accept that the highest possible VO2 has been achieved. The most common criterion used in the reviewed studies is that the highest VO2 observed in the verification phase should not exceed the highest VO2 obtained in the CPET by more than 3%. This threshold can be justified by the technical error of measurement and intra-individual biological variation associated with the determination of VO2max [15, 56, 57, 62, 63, 69, 71, 78, 82, 86, 89–91, 95, 107, 108, 113, 116, 120–122]. The more restrictive value of ≤ 2% [97, 110] and the less restrictive values of ≤ 5–5.5% [104–106, 111] may also be appropriate for single or different-day variability. Further research is required before a verification-phase threshold can be recommended that provides a high degree of confidence that the difference between the highest VO2 values observed in the CPET and verification phase is beyond the technical error of measurement and intra-individual biological variation.
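The threshold criterion described above can be written as a one-line check. The sketch below is a minimal illustration, assuming the criterion is applied as a percentage excess of the verification-phase VO2 over the CPET VO2; the function name and the example values are hypothetical.

```python
def verification_confirms_vo2max(cpet_vo2, verification_vo2, threshold_pct=3.0):
    """Return True if the verification-phase VO2 does not exceed the CPET VO2
    by more than threshold_pct percent (both values in the same units)."""
    if cpet_vo2 <= 0:
        raise ValueError("CPET VO2 must be positive")
    excess_pct = 100.0 * (verification_vo2 - cpet_vo2) / cpet_vo2
    return excess_pct <= threshold_pct


# HYPOTHETICAL participant: CPET 3.50 L/min, verification phase 3.64 L/min (+4%).
cpet, verification = 3.50, 3.64
for threshold in (2.0, 3.0, 5.0):
    ok = verification_confirms_vo2max(cpet, verification, threshold)
    print(f"threshold {threshold:.0f}%: confirmed = {ok}")
```

The same test result is rejected at the 2% and 3% thresholds but accepted at 5%, which shows why the choice of threshold matters for individual participants.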

Some limitations of the present review need to be acknowledged. First, the meta-analysis only included 79% of the participants that underwent CPET with verification phase protocols in the 80 studies included in the systematic review. This issue was due to unsuccessful attempts to acquire the required unpublished information from some authors. Second, the meta-analysis was based on comparison of the highest VO2 responses in the CPET and verification phase averaged across study participants. Noakes [131] criticized this approach, stating that the CPET is performed on individuals and not groups and, therefore, the group average approach does not identify individuals who may not have attained VO2max. A meta-analysis using individual participant data is therefore required. Finally, the present systematic review and meta-analysis comprised only apparently healthy adults and it is still unclear to what extent the use of the verification phase procedure is applicable to special or clinical populations. A growing number of studies have included special or clinical populations such as obese adults [132, 133], breast and prostate cancer survivors [134], wheelchair athletes [135], individuals with spinal-cord injuries [136], patients with heart failure [137] or cystic fibrosis [138–140], and pediatric populations [141–147], including children with spina bifida in an outpatient condition [148], and adolescents with cystic fibrosis [149].
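The limitation raised by the group-average approach can be shown with a small numerical sketch: a group mean difference near zero can coexist with an individual whose verification-phase VO2 clearly exceeds the CPET value. The data, function name, and 3% threshold below are hypothetical and serve only to illustrate the argument.

```python
def individual_exceedances(cpet, verification, threshold_pct=3.0):
    """Indices of participants whose verification VO2 exceeds their CPET VO2
    by more than threshold_pct percent."""
    return [
        i
        for i, (c, v) in enumerate(zip(cpet, verification))
        if 100.0 * (v - c) / c > threshold_pct
    ]


# HYPOTHETICAL highest VO2 values (L/min) for five participants.
cpet = [3.50, 4.20, 2.80, 3.90, 3.10]
verification = [3.45, 4.15, 3.00, 3.85, 3.05]

# Group-level view: mean of (CPET - verification) across participants.
mean_diff = sum(c - v for c, v in zip(cpet, verification)) / len(cpet)
# Individual-level view: who exceeds the threshold in the verification phase?
flagged = individual_exceedances(cpet, verification)

print(f"group mean difference: {mean_diff:.3f} L/min")
print(f"participants flagged individually: {flagged}")
```

Here the group mean difference is essentially zero, yet the third participant (index 2) exceeds the CPET value by about 7%, which is the kind of case an individual participant data analysis would detect.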

Conclusions

The present meta-analysis showed that the effect sizes calculated from the highest mean VO2 in apparently healthy adults were similar between CPETs and verification phases performed on a cycle ergometer or treadmill. Furthermore, mean differences between the highest VO2 values elicited in the CPETs and verification phases were not affected by participant characteristics, exercise modality, or the CPET and verification protocol design. Our findings indicate that from a practical perspective, different procedures may be applied to establish similar highest mean VO2 responses during the verification phase as compared to the ramp or continuous step-incremented CPETs. It is worth mentioning, however, that some caution must be exercised concerning the selection of sub or supra WRpeak verification phases, since any exercise above the critical power must be of sufficient duration to allow the achievement of the highest possible VO2 response in the verification phase. Our data reinforce the notion that a verification phase applied after ramp or continuous step-incremented CPETs may provide additional and unbiased evidence that the highest possible VO2 has been achieved. On the other hand, the invalidation of the highest VO2 obtained in CPETs by subsequent verification phases was less likely on a group basis. The mean differences in highest VO2 responses were typically within the test-retest variability of the experimental protocols employed. Accordingly, our findings support the usefulness of the verification phase to confirm the likely attainment of VO2max on incremental CPET. However, the necessity or mandatory application of the verification phase, especially constant supra WRpeak verification bouts, in all CPET situations remains open to question.

Supporting information

S1 Checklist. PRISMA 2009 checklist.

(DOCX)

S1 Text. Search strategy.

(DOCX)

Acknowledgments

We would sincerely like to thank the following authors for kindly providing additional data: Dr. Fernando Beltrami, Dr. Nicholas Beltz, Dr. Maria Cristina Bisi, Dr. Nathan Dicks, Dr. Kaitlin Freeberg, Dr. Stuart Goodall, Dr. Gianni Gnudi, Dr. Nicholas Jamnick, Dr. Theresa Mann, Dr. Lex Mauger, Dr. Max Niemeyer, Dr. Jeann Sabino-Carvalho, Dr. Brandon Sawyer, Dr. Bruno Silva, Dr. Rita Stagni, Dr. Katie Taylor, Dr. Chantal A. Vella, Dr. Ryan Weatherwax, and Dr. Eurico Wilhelm.

Data Availability

All relevant data are within the paper and its Supporting information files.

Funding Statement

The present systematic review and meta-analysis derived from a research project involving cardiorespiratory fitness assessment in healthy and clinical populations, which received only material support from the Carlos Chagas Filho Foundation for the Research Support in Rio de Janeiro (FAPERJ, E-26/202.70 /2019, recipient FC; E-26/202.880/2017, recipient PF) and Brazilian Council for Technological and Scientific Development (CNPq, 303629/2019-3, recipient PF). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors have not received a salary from any of the funders.

References

  • 1.Fletcher GF, Ades PA, Kligfield P, Arena R, Balady GJ, Bittner VA et al. Exercise standards for testing and training: a scientific statement from the American Heart Association. Circulation. 2013;128(8):873–934. 10.1161/CIR.0b013e31829b5b44 [DOI] [PubMed] [Google Scholar]
  • 2.Poole DC, Jones AM. Measurement of the maximum oxygen uptake VO2max: VO2peak is no longer acceptable. J Appl Physiol. 2017;122(4):997–1002. 10.1152/japplphysiol.01063.2016 [DOI] [PubMed] [Google Scholar]
  • 3.Garber CE, Blissmer B, Deschenes MR, Franklin BA, Lamonte MJ, Lee IM et al. American College of Sports Medicine position stand. Quantity and quality of exercise for developing and maintaining cardiorespiratory, musculoskeletal, and neuromotor fitness in apparently healthy adults: guidance for prescribing exercise. Med Sci Sports Exerc. 2011;43(7):1334–59. 10.1249/MSS.0b013e318213fefb [DOI] [PubMed] [Google Scholar]
  • 4.Franklin BA. Fitness: the ultimate marker for risk stratification and health outcomes? Prev Cardiol. 2007;10(1):42–5. 10.1111/j.1520-037x.2007.05759.x [DOI] [PubMed] [Google Scholar]
  • 5.Vanhees L, Lefevre J, Philippaerts R, Martens M, Huygens W, Troosters T et al. How to assess physical activity? How to assess physical fitness? Eur J Cardiovasc Prev Rehabil. 2005;12(2):102–14. 10.1097/01.hjr.0000161551.73095.9c [DOI] [PubMed] [Google Scholar]
  • 6.di Prampero PE. Factors limiting maximal performance in humans. Eur J Appl Physiol. 2003;90(3–4):420–9. 10.1007/s00421-003-0926-z [DOI] [PubMed] [Google Scholar]
  • 7.McMurray RG, Ainsworth BE, Harrell JS, Griggs TR, Williams OD. Is physical activity or aerobic power more influential on reducing cardiovascular disease risk factors? Med Sci Sports Exerc. 1998;30(10):1521–9. 10.1097/00005768-199810000-00009 [DOI] [PubMed] [Google Scholar]
  • 8.Blair SN, Kohl HW 3rd, Barlow CE, Paffenbarger RS Jr., Gibbons LW, Macera CA. Changes in physical fitness and all-cause mortality. A prospective study of healthy and unhealthy men. JAMA. 1995;273(14):1093–8. [PubMed] [Google Scholar]
  • 9.Myers J, Prakash M, Froelicher V, Do D, Partington S, Atwood JE. Exercise capacity and mortality among men referred for exercise testing. N Engl J Med. 2002;346(11):793–801. 10.1056/NEJMoa011858 [DOI] [PubMed] [Google Scholar]
  • 10.Strasser B, Burtscher M. Survival of the fittest: VO2max, a key predictor of longevity? Front Biosci (Landmark Ed). 2018;23:1505–16. [DOI] [PubMed] [Google Scholar]
  • 11.ACSM. ACSM’s guidelines for exercise testing and prescription. 10th ed. Philadelphia: Wolters Kluwer; 2018. [Google Scholar]
  • 12.da Cunha FA, Farinatti Pde T, Midgley AW. Methodological and practical application issues in exercise prescription using the heart rate reserve and oxygen uptake reserve methods. J Sci Med Sport. 2011;14(1):46–57. 10.1016/j.jsams.2010.07.008 [DOI] [PubMed] [Google Scholar]
  • 13.Astorino TA, Schubert MM, Palumbo E, Stirling D, McMillan DW, Cooper C et al. Magnitude and time course of changes in maximal oxygen uptake in response to distinct regimens of chronic interval training in sedentary women. Eur J Appl Physiol. 2013;113(9):2361–9. 10.1007/s00421-013-2672-1 [DOI] [PubMed] [Google Scholar]
  • 14.Scharhag-Rosenberger F, Meyer T, Walitzek S, Kindermann W. Time course of changes in endurance capacity: a 1-yr training study. Med Sci Sports Exerc. 2009;41(5):1130–7. 10.1249/MSS.0b013e3181935a11 [DOI] [PubMed] [Google Scholar]
  • 15.Weatherwax R, Harris N, Kilding AE, Dalleck L. Time course changes in confirmed ‘true’ VO2max after individualized and standardized training. Sports Med Int Open. 2019;3(02):E32–E9. 10.1055/a-0867-9415 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Herdy AH, Ritt LEF, Stein R, Araújo CGSd, Milani M, Meneghelo RS et al. Cardiopulmonary exercise test: Background, applicability and interpretation. Arq Bras Cardiol. 2016;107:467–81. 10.5935/abc.20160171 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Mezzani A. Cardiopulmonary exercise testing: Basics of methodology and measurements. Ann Am Thorac Soc. 2017;14(Supplement_1):S3–S11. 10.1513/AnnalsATS.201612-997FR [DOI] [PubMed] [Google Scholar]
  • 18.Albouaini K, Egred M, Alahmar A, Wright DJ. Cardiopulmonary exercise testing and its application. Postgrad Med J. 2007;83(985):675–82. 10.1136/hrt.2007.121558 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Macfarlane DJ. Open-circuit respirometry: a historical review of portable gas analysis systems. Eur J Appl Physiol. 2017;117(12):2369–86. 10.1007/s00421-017-3716-8 [DOI] [PubMed] [Google Scholar]
  • 20.Buchfuhrer MJ, Hansen JE, Robinson TE, Sue DY, Wasserman K, Whipp BJ. Optimizing the exercise protocol for cardiopulmonary assessment. J Appl Physiol 1983;55(5):1558–64. 10.1152/jappl.1983.55.5.1558 [DOI] [PubMed] [Google Scholar]
  • 21.Davis JA, Whipp BJ, Lamarra N, Huntsman DJ, Frank MH, Wasserman K. Effect of ramp slope on determination of aerobic parameters from the ramp exercise test. Med Sci Sports Exerc. 1982;14(5):339–43. [PubMed] [Google Scholar]
  • 22.Myers J, Buchanan N, Walsh D, Kraemer M, McAuley P, Hamilton-Wessler M et al. Comparison of the ramp versus standard exercise protocols. J Am Coll Cardiol. 1991;17(6):1334–42. 10.1016/s0735-1097(10)80144-5 [DOI] [PubMed] [Google Scholar]
  • 23.Whipp BJ, Davis JA, Torres F, Wasserman K. A test to determine parameters of aerobic function during exercise. J Appl Physiol Respir Environ Exerc Physiol. 1981;50(1):217–21. 10.1152/jappl.1981.50.1.217 [DOI] [PubMed] [Google Scholar]
  • 24.Keir DA, Paterson DH, Kowalchuk JM, Murias JM. Using ramp-incremental VO2 responses for constant-intensity exercise selection. Appl Physiol Nutr Metab. 2018;43(9):882–92. 10.1139/apnm-2017-0826 [DOI] [PubMed] [Google Scholar]
  • 25.Iannetta D, de Almeida Azevedo R, Ingram CP, Keir DA, Murias JM. Evaluating the suitability of supra-POpeak verification trials after ramp-incremental exercise to confirm the attainment of maximum O2 uptake. Am J Physiol Regul Integr Comp Physiol. 2020;319(3):R315–R22. 10.1152/ajpregu.00126.2020 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Bassett DR Jr., Howley ET. Maximal oxygen uptake: "classical" versus "contemporary" viewpoints. Med Sci Sports Exerc. 1997;29(5):591–603. 10.1097/00005768-199705000-00002 [DOI] [PubMed] [Google Scholar]
  • 27.Bergh U, Ekblom B, Astrand PO. Maximal oxygen uptake "classical" versus "contemporary" viewpoints. Med Sci Sports Exerc. 2000;32(1):85–8. 10.1097/00005768-200001000-00013 [DOI] [PubMed] [Google Scholar]
  • 28.Midgley AW, Carroll S. Emergence of the verification phase procedure for confirming ’true’ VO2max. Scand J Med Sci Sports. 2009;19(3):313–22. 10.1111/j.1600-0838.2009.00898.x [DOI] [PubMed] [Google Scholar]
  • 29.Midgley AW, McNaughton LR, Polman R, Marchant D. Criteria for determination of maximal oxygen uptake: a brief critique and recommendations for future research. Sports Med. 2007;37(12):1019–28. 10.2165/00007256-200737120-00002 [DOI] [PubMed] [Google Scholar]
  • 30.Noakes TD. Maximal oxygen uptake: "classical" versus "contemporary" viewpoints: a rebuttal. Med Sci Sports Exerc. 1998;30(9):1381–98. 10.1097/00005768-199809000-00007 [DOI] [PubMed] [Google Scholar]
  • 31.Midgley AW, Marchant DC, Levy AR. A call to action towards an evidence-based approach to using verbal encouragement during maximal exercise testing. Clin Physiol Funct Imaging. 2018;38(4):547–53. 10.1111/cpf.12454 [DOI] [PubMed] [Google Scholar]
  • 32.Hill A, Lupton H. Muscular exercise, lactic acid, and the supply and utilization of oxygen. QJM: An International Journal of Medicine. 1923(62):135–71. [Google Scholar]
  • 33.Hill AV, Long CNH, Lupton H. Muscular exercise, lactic acid and the supply and utilisation of oxygen. Parts VII-VIII. Proceedings of the Royal Society of London Series B, Containing Papers of a Biological Character. 1924;97(682):155–76. 10.1098/rspb.1924.0048 [DOI] [Google Scholar]
  • 34.Taylor HL, Buskirk E, Henschel A. Maximal oxygen intake as an objective measure of cardio-respiratory performance. J Appl Physiol 1955;8(1):73–80. 10.1152/jappl.1955.8.1.73 [DOI] [PubMed] [Google Scholar]
  • 35.Day JR, Rossiter HB, Coats EM, Skasick A, Whipp BJ. The maximally attainable VO2 during exercise in humans: the peak vs. maximum issue. J Appl Physiol 2003;95(5):1901–7. 10.1152/japplphysiol.00024.2003 [DOI] [PubMed] [Google Scholar]
  • 36.Myers J, Walsh D, Sullivan M, Froelicher V. Effect of sampling on variability and plateau in oxygen uptake. J Appl Physiol. 1990;68(1):404–10. 10.1152/jappl.1990.68.1.404 [DOI] [PubMed] [Google Scholar]
  • 37.Poole DC, Wilkerson DP, Jones AM. Validity of criteria for establishing maximal O2 uptake during ramp exercise tests. Eur J Appl Physiol. 2008;102(4):403–10. 10.1007/s00421-007-0596-3 [DOI] [PubMed] [Google Scholar]
  • 38.Rossiter HB, Kowalchuk JM, Whipp BJ. A test to establish maximum O2 uptake despite no plateau in the O2 uptake response to ramp incremental exercise. J Appl Physiol 2006;100(3):764–70. 10.1152/japplphysiol.00932.2005 [DOI] [PubMed] [Google Scholar]
  • 39.Midgley AW, Carroll S, Marchant D, McNaughton LR, Siegler J. Evaluation of true maximal oxygen uptake based on a novel set of standardized criteria. Appl Physiol Nutr Metab. 2009;34(2):115–23. 10.1139/H08-146 [DOI] [PubMed] [Google Scholar]
  • 40.Howley ET, Bassett DR, Welch HG. Criteria for maximal oxygen uptake: review and commentary. Med Sci Sports Exerc. 1995;27:1292-. [PubMed] [Google Scholar]
  • 41.Astorino TA. Alterations in VO2max and the VO2 plateau with manipulation of sampling interval. Clin Physiol Funct Imaging 2009;29(1):60–7. 10.1111/j.1475-097X.2008.00835.x [DOI] [PubMed] [Google Scholar]
  • 42.Astorino TA, Willey J, Kinnahan J, Larsson SM, Welch H, Dalleck LC. Elucidating determinants of the plateau in oxygen consumption at VO2max. Br J Sports Med. 2005;39(9):655–60; discussion 60. 10.1136/bjsm.2004.016550 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Gordon D, Mehter M, Gernigon M, Caddy O, Keiller D, Barnes R. The effects of exercise modality on the incidence of plateau at VO2max. Clin Physiol Funct Imaging. 2012;32(5):394–9. 10.1111/j.1475-097X.2012.01142.x [DOI] [PubMed] [Google Scholar]
  • 44.Gordon D, Schaitel K, Pennefather A, Gernigon M, Keiller D, Barnes R. The incidence of plateau at VO2max is affected by a bout of prior-priming exercise. Clin Physiol Funct Imaging. 2012;32(1):39–44. 10.1111/j.1475-097X.2011.01052.x [DOI] [PubMed] [Google Scholar]
  • 45.Duncan GE, Howley ET, Johnson BN. Applicability of VO2max criteria: discontinuous versus continuous protocols. Med Sci Sports Exerc. 1997;29(2):273–8. 10.1097/00005768-199702000-00017 [DOI] [PubMed] [Google Scholar]
  • 46.Froelicher VF Jr., Brammell H, Davis G, Noguera I, Stewart A, Lancaster MC. A comparison of three maximal treadmill exercise protocols. J Appl Physiol. 1974;36(6):720–5. 10.1152/jappl.1974.36.6.720 [DOI] [PubMed] [Google Scholar]
  • 47.McArdle WD, Katch FI, Pechar GS. Comparison of continuous and discontinuous treadmill and bicycle tests for max VO2. Med Sci Sports. 1973;5(3):156–60. [PubMed] [Google Scholar]
  • 48.Stamford BA. Step increment versus constant load tests for determination of maximal oxygen uptake. Eur J Appl Physiol Occup Physiol. 1976;35(2):89–93. 10.1007/BF02333798 [DOI] [PubMed] [Google Scholar]
  • 49.Cumming GR, Friesen W. Bicycle ergometer measurement of maximal oxygen uptake in children. Can J Physiol Pharmacol. 1967;45(6):937–46. 10.1139/y67-111 [DOI] [PubMed] [Google Scholar]
  • 50.Sidney KH, Shephard RJ. Maximum and submaximum exercise tests in men and women in the seventh, eighth, and ninth decades of life. J Appl Physiol Respir Environ Exerc Physiol. 1977;43(2):280–7. 10.1152/jappl.1977.43.2.280 [DOI] [PubMed] [Google Scholar]
  • 51.Edvardsen E, Hem E, Anderssen SA. End criteria for reaching maximal oxygen uptake must be strict and adjusted to sex and age: a cross-sectional study. PLoS One. 2014;9(1):e85276 10.1371/journal.pone.0085276 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Schaun GZ. The maximal oxygen uptake verification phase: A light at the end of the tunnel? Sports Med Open. 2017;3(1):44 10.1186/s40798-017-0112-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Astorino TA, McMillan DW, Edmunds RM, Sanchez E. Increased cardiac output elicits higher VO2max in response to self-paced exercise. Appl Physiol Nutr Metab. 2015;40(3):223–9. 10.1139/apnm-2014-0305 [DOI] [PubMed] [Google Scholar]
  • 54.Alexander RP, Mier CM. Intermittent vs continuous graded exercise test for VO2max in college soccer athletes. International Journal of Exercise Science. 2011;4(3):3. [Google Scholar]
  • 55.Arad AD, Bishop K, Adimoolam D, Albu JB, DiMenna FJ. Severe-intensity constant-work-rate cycling indicates that ramp incremental cycling underestimates VO2max in a heterogeneous cohort of sedentary individuals. PLoS One. 2020;15(7):e0235567. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Astorino TA, DeRevere J. Efficacy of constant load verification testing to confirm VO2max attainment. Clin Physiol Funct Imaging. 2018;38(4):703–9. 10.1111/cpf.12474 [DOI] [PubMed] [Google Scholar]
  • 57.Astorino TA, White AC. Assessment of anaerobic power to verify VO2max attainment. Clin Physiol Funct Imaging. 2010;30(4):294–300. [DOI] [PubMed] [Google Scholar]
  • 58.Astorino TA, deRevere J, Anderson T, Kellogg E, Holstrom P, Ring S et al. Change in VO2max and time trial performance in response to high-intensity interval training prescribed using ventilatory threshold. Eur J Appl Physiol. 2018;118(9):1811–20. 10.1007/s00421-018-3910-3 [DOI] [PubMed] [Google Scholar]
  • 59.Astorino TA, DeRevere JL, Anderson T, Kellogg E, Holstrom P, Ring S et al. Blood lactate concentration is not related to the increase in cardiorespiratory fitness induced by high intensity interval training. Int J Environ Res Public Health. 2019;16(16):2845. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Astorino T, White A, Dalleck L. Supramaximal testing to confirm attainment of VO2max in sedentary men and women. Int J Sports Med. 2009;30(04):279–84. [DOI] [PubMed] [Google Scholar]
  • 61.Beltrami FG, Froyd C, Mauger AR, Metcalfe AJ, Marino F, Noakes TD. Conventional testing methods produce submaximal values of maximum oxygen consumption. Br J Sports Med. 2012;46(1):23–9. 10.1136/bjsports-2011-090306 [DOI] [PubMed] [Google Scholar]
  • 62.Beltz NM, Amorim FT, Gibson AL, Janot JM, Kravitz L, Mermier CM et al. Hemodynamic and metabolic responses to self-paced and ramp-graded exercise testing protocols. Appl Physiol Nutr Metab. 2018;43(6):609–16. 10.1139/apnm-2017-0608 [DOI] [PubMed] [Google Scholar]
  • 63.Bisi MC, Stagni R, Gnudi G. Automatic detection of maximal oxygen uptake and ventilatory threshold. Comput Biol Med. 2011;41(1):18–23. 10.1016/j.compbiomed.2010.11.001 [DOI] [PubMed] [Google Scholar]
  • 64.Chidnok W, Dimenna FJ, Bailey SJ, Burnley M, Wilkerson DP, Vanhatalo A et al. VO2max is not altered by self-pacing during incremental exercise. Eur J Appl Physiol. 2013;113(2):529–39. 10.1007/s00421-012-2478-6 [DOI] [PubMed] [Google Scholar]
  • 65.Clark IE, Murray SR, Pettitt RW. Alternative procedures for the three-minute all-out exercise test. J Strength Cond Res. 2013;27(8):2104–12. 10.1519/JSC.0b013e3182785041 [DOI] [PubMed] [Google Scholar]
  • 66.Colakoglu M, Ozkaya O, Balci GA, Yapicioglu B. Shorter intervals at peak SV vs. VO2max may yield high SV with less physiological stress. Eur J Sport Sci. 2015;15(7):623–30. 10.1080/17461391.2014.966762 [DOI] [PubMed] [Google Scholar]
  • 67.Colakoglu M, Ozkaya O, Balci GA, Yapicioglu B. Re-evaluation of old findings on stroke volume responses to exercise and recovery by nitrous-oxide rebreathing. J Hum Kinet. 2016;53:73–9. 10.1515/hukin-2016-0011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Colakoglu M, Ozkaya O, Balci GA, Yapicioglu B. Stroke volume responses may be related to the gap between peak and maximal O2 consumption. Isokinet Exerc Sci. 2016;24(2):133–9. [Google Scholar]
  • 69.Dalleck LC, Astorino TA, Erickson RM, McCarthy CM, Beadell AA, Botten BH. Suitability of verification testing to confirm attainment of VO2max in middle-aged and older adults. Res Sports Med. 2012;20(2):118–28. 10.1080/15438627.2012.660825 [DOI] [PubMed] [Google Scholar]
  • 70.Del Giudice M, Bonafiglia JT, Islam H, Preobrazenski N, Amato A, Gurd BJ. Investigating the reproducibility of maximal oxygen uptake responses to high-intensity interval training. J Sci Med Sport. 2020;23(1):94–9. 10.1016/j.jsams.2019.09.007 [DOI] [PubMed] [Google Scholar]
  • 71.Dexheimer JD, Schroeder ET, Sawyer BJ, Pettitt RW, Aguinaldo AL, Torrence WA. Physiological performance measures as indicators of crossfit® performance. Sports. 2019;7(4):93 10.3390/sports7040093 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Dicks ND, Joe TV, Hackney KJ, Pettitt RW. Validity of critical velocity concept for weighted sprinting performance. Int J Exerc Sci. 2018;11(4):900–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Dogra S, Spencer MD, Paterson DH. Higher cardiorespiratory fitness in older trained women is due to preserved stroke volume. J Sports Sci Med. 2012;11(4):745–50. [PMC free article] [PubMed] [Google Scholar]
  • 74.Ducrocq GP, Hureau TJ, Meste O, Blain GM. Similar cardioventilatory but greater neuromuscular stimuli with interval drop jump than with interval running. Int J Sports Physiol Perform. 2019:1–10. 10.1123/ijspp.2019-0031 [DOI] [PubMed] [Google Scholar]
  • 75.Elliott AD, Skowno J, Prabhu M, Noakes TD, Ansley L. Evidence of cardiac functional reserve upon exhaustion during incremental exercise to determine VO2max. Br J Sports Med. 2015;49(2):128–32. 10.1136/bjsports-2012-091752 [DOI] [PubMed] [Google Scholar]
  • 76.Faulkner J, Mauger AR, Woolley B, Lambrick D. The efficacy of a self-paced VO2max test during motorized treadmill exercise. Int J Sports Physiol Perform. 2015;10(1):99–105. 10.1123/ijspp.2014-0052 [DOI] [PubMed] [Google Scholar]
  • 77.Foster C, Kuffel E, Bradley N, Battista RA, Wright G, Porcari JP et al. VO2max during successive maximal efforts. Eur J Appl Physiol. 2007;102(1):67–72. 10.1007/s00421-007-0565-x [DOI] [PubMed] [Google Scholar]
  • 78.Freeberg KA, Baughman BR, Vickey T, Sullivan JA, Sawyer BJ. Assessing the ability of the Fitbit Charge 2 to accurately predict VO2max. Mhealth. 2019;5:39 10.21037/mhealth.2019.09.07 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Goodall S, Gonzalez-Alonso J, Ali L, Ross EZ, Romer LM. Supraspinal fatigue after normoxic and hypoxic exercise in humans. J Physiol. 2012;590(11):2767–82. 10.1113/jphysiol.2012.228890 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Hanson NJ, Scheadler CM, Lee TL, Neuenfeldt NC, Michael TJ, Miller MG. Modality determines VO2max achieved in self-paced exercise tests: validation with the Bruce protocol. Eur J Appl Physiol. 2016;116(7):1313–9. 10.1007/s00421-016-3384-0 [DOI] [PubMed] [Google Scholar]
  • 81.Hawkins MN, Raven PB, Snell PG, Stray-Gundersen J, Levine BD. Maximal oxygen uptake as a parametric measure of cardiorespiratory capacity. Med Sci Sports Exerc. 2007;39(1):103–7. 10.1249/01.mss.0000241641.75101.64 [DOI] [PubMed] [Google Scholar]
  • 82.Hogg JS, Hopker JG, Mauger AR. The self-paced VO2max test to assess maximal oxygen uptake in highly trained runners. Int J Sports Physiol Perform. 2015;10(2):172–7. 10.1123/ijspp.2014-0041 [DOI] [PubMed] [Google Scholar]
  • 83.James C, Tenllado Vallejo F, Kantebeen M, Farra S. Validity and reliability of an on-court fitness test for assessing and monitoring aerobic fitness in squash. J Strength Cond Res. 2019;33(5):1400–7. 10.1519/JSC.0000000000002465 [DOI] [PubMed] [Google Scholar]
  • 84.Jamnick NA, Botella J, Pyne DB, Bishop DJ. Manipulating graded exercise test variables affects the validity of the lactate threshold and VO2peak. PLoS One. 2018;13(7):e0199794. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Jamnick NA, By S, Pettitt CD, Pettitt RW. Comparison of the YMCA and a custom submaximal exercise test for determining VO2max. Med Sci Sports Exerc. 2016;48(2):254–9. 10.1249/MSS.0000000000000763 [DOI] [PubMed] [Google Scholar]
  • 86.Johnson TM, Sexton PJ, Placek AM, Murray SR, Pettitt RW. Reliability analysis of the 3-min all-out exercise test for cycle ergometry. Med Sci Sports Exerc. 2011;43(12):2375–80. 10.1249/MSS.0b013e318224cb0f [DOI] [PubMed] [Google Scholar]
  • 87.Keiller D, Gordon D. Confirming maximal oxygen uptake: Is heart rate the answer? Int J Sports Med. 2018;39(3):198–203. 10.1055/s-0043-121148 [DOI] [PubMed] [Google Scholar]
  • 88.Kirkeberg JM, Dalleck LC, Kamphoff CS, Pettitt RW. Validity of 3 protocols for verifying VO2max. Int J Sports Med. 2011;32(4):266–70. 10.1055/s-0030-1269914 [DOI] [PubMed] [Google Scholar]
  • 89.Knaier R, Infanger D, Niemeyer M, Cajochen C, Schmidt-Trucksass A. In athletes, the diurnal variations in maximum oxygen uptake are more than twice as large as the day-to-day variations. Front Physiol. 2019;10:219 10.3389/fphys.2019.00219 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Knaier R, Niemeyer M, Wagner J, Infanger D, Hinrichs T, Klenk C et al. Which cutoffs for secondary VO2max criteria are robust to diurnal variations? Med Sci Sports Exerc. 2019;51(5):1006–13. 10.1249/MSS.0000000000001869 [DOI] [PubMed] [Google Scholar]
  • 91.Kramer M, Du Randt R, Watson M, Pettitt RW. Oxygen uptake kinetics and speed-time correlates of modified 3-minute all-out shuttle running in soccer players. PLoS One. 2018;13(8):e0201389 10.1371/journal.pone.0201389 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Mann TN, Webster C, Lamberts RP, Lambert MI. Effect of exercise intensity on post-exercise oxygen consumption and heart rate recovery. Eur J Appl Physiol. 2014;114(9):1809–20. 10.1007/s00421-014-2907-9 [DOI] [PubMed] [Google Scholar]
  • 93.Mann TN, Platt CE, Lamberts RP, Lambert MI. Faster heart rate recovery with increased RPE: Paradoxical responses after an 87-km ultramarathon. J Strength Cond Res. 2015;29(12):3343–52. 10.1519/JSC.0000000000001004 [DOI] [PubMed] [Google Scholar]
  • 94.Mauger AR, Metcalfe AJ, Taylor L, Castle PC. The efficacy of the self-paced VO2max test to measure maximal oxygen uptake in treadmill running. Appl Physiol Nutr Metab. 2013;38(12):1211–6. 10.1139/apnm-2012-0384 [DOI] [PubMed] [Google Scholar]
  • 95.McGawley K. The reliability and validity of a four-minute running time-trial in assessing VO2max and performance. Front Physiol. 2017;8:270 10.3389/fphys.2017.00270 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.McKay BR, Paterson DH, Kowalchuk JM. Effect of short-term high-intensity interval training vs. continuous training on O2 uptake kinetics, muscle deoxygenation, and exercise performance. J Appl Physiol. 2009;107(1):128–38. 10.1152/japplphysiol.90828.2008 [DOI] [PubMed] [Google Scholar]
  • 97.Midgley AW, McNaughton LR, Carroll S. Verification phase as a useful tool in the determination of the maximal oxygen uptake of distance runners. Appl Physiol Nutr Metab. 2006;31(5):541–8. 10.1139/h06-023 [DOI] [PubMed] [Google Scholar]
  • 98.Midgley AW, McNaughton LR, Carroll S. Time at VO2max during intermittent treadmill running: test protocol dependent or methodological artefact? Int J Sports Med. 2007;28(11):934–9. 10.1055/s-2007-964972 [DOI] [PubMed] [Google Scholar]
  • 99.Mier CM, Alexander RP, Mageean AL. Achievement of VO2max criteria during a continuous graded exercise test and a verification stage performed by college athletes. J Strength Cond Res. 2012;26(10):2648–54. 10.1519/JSC.0b013e31823f8de9 [DOI] [PubMed] [Google Scholar]
  • 100.Murias JM, Pogliaghi S, Paterson DH. Measurement of a true VO2max during a ramp incremental test is not confirmed by a verification phase. Front Physiol. 2018;9:143 10.3389/fphys.2018.00143 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Murias JM, Kowalchuk JM, Paterson DH. Mechanisms for increases in VO2max with endurance training in older and young women. Med Sci Sports Exerc. 2010;42(10):1891–8. 10.1249/MSS.0b013e3181dd0bba [DOI] [PubMed] [Google Scholar]
  • 102.Murias JM, Kowalchuk JM, Paterson DH. Time course and mechanisms of adaptations in cardiorespiratory fitness with endurance training in older and young men. J Appl Physiol. 2010;108(3):621–7. 10.1152/japplphysiol.01152.2009 [DOI] [PubMed] [Google Scholar]
  • 103.Nalcakan GR. The effects of sprint interval vs. continuous endurance training on physiological and metabolic adaptations in young healthy adults. J Hum Kinet. 2014;44:97–109. 10.2478/hukin-2014-0115 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Niemela K, Palatsi I, Linnaluoto M, Takkunen J. Criteria for maximum oxygen uptake in progressive bicycle tests. Eur J Appl Physiol Occup Physiol. 1980;44(1):51–9. 10.1007/BF00421763 [DOI] [PubMed] [Google Scholar]
  • 105.Niemeyer M, Leithaeuser R, Beneke R. Oxygen uptake plateau occurrence depends on oxygen kinetics and oxygen deficit accumulation. Scand J Med Sci Sports. 2019. 10.1111/sms.13493 [DOI] [PubMed] [Google Scholar]
  • 106.Niemeyer M, Bergmann TGJ, Beneke R. Oxygen uptake plateau: calculation artifact or physiological reality? Eur J Appl Physiol. 2020;120(1):231–42. 10.1007/s00421-019-04267-7 [DOI] [PubMed] [Google Scholar]
  • 107.Nolan PB, Beaven ML, Dalleck L. Comparison of intensities and rest periods for VO2max verification testing procedures. Int J Sports Med. 2014;35(12):1024–9. 10.1055/s-0034-1367065 [DOI] [PubMed] [Google Scholar]
  • 108.Possamai LT, Campos FS, Salvador P, Aguiar RA, Guglielmo LGA, De Lucas RD et al. Similar VO2max assessment from a step cycling incremental test and verification tests on the same or different day. Appl Physiol Nutr Metab. 2019. 10.1139/apnm-2019-0405 [DOI] [PubMed] [Google Scholar]
  • 109.Riboli A, Rampichini S, Ce E, Limonta E, Coratella G, Esposito F. Effect of ramp slope on different methods to determine lactate threshold in semi-professional soccer players. Res Sports Med. 2019;27(3):326–38. 10.1080/15438627.2018.1523790 [DOI] [PubMed] [Google Scholar]
  • 110.Sabino-Carvalho JL, Lopes TR, Obeid-Freitas T, Ferreira TN, Succi JE, Silva AC et al. Effect of ischemic preconditioning on endurance performance does not surpass placebo. Med Sci Sports Exerc. 2017;49(1):124–32. 10.1249/MSS.0000000000001088 [DOI] [PubMed] [Google Scholar]
  • 111.Scharhag-Rosenberger F, Carlsohn A, Cassel M, Mayer F, Scharhag J. How to test maximal oxygen uptake: a study on timing and testing procedure of a supramaximal verification test. Appl Physiol Nutr Metab. 2011;36(1):153–60. 10.1139/H10-099 [DOI] [PubMed] [Google Scholar]
  • 112.Scheadler CM, Devor ST. VO2max measured with a self-selected work rate protocol on an automated treadmill. Med Sci Sports Exerc. 2015;47(10):2158–65. 10.1249/MSS.0000000000000647 [DOI] [PubMed] [Google Scholar]
  • 113.Sedgeman D, Dalleck L, Clark IE, Jamnick N, Pettitt RW. Analysis of square-wave bouts to verify VO2max. Int J Sports Med. 2013;34(12):1058–62. 10.1055/s-0033-1341436 [DOI] [PubMed] [Google Scholar]
  • 114.Stachenfeld NS, Eskenazi M, Gleim GW, Coplan NL, Nicholas JA. Predictive accuracy of criteria used to assess maximal oxygen consumption. Am Heart J. 1992;123(4 Pt 1):922–5. 10.1016/0002-8703(92)90697-t [DOI] [PubMed] [Google Scholar]
  • 115.Straub AM, Midgley AW, Zavorsky GS, Hillman AR. Ramp-incremented and RPE-clamped test protocols elicit similar VO2max values in trained cyclists. Eur J Appl Physiol. 2014;114(8):1581–90. 10.1007/s00421-014-2891-0 [DOI] [PubMed] [Google Scholar]
  • 116.Strom CJ, Pettitt RW, Krynski LM, Jamnick NA, Hein CJ, Pettitt CD. Validity of a customized submaximal treadmill protocol for determining VO2max. Eur J Appl Physiol. 2018;118(9):1781–7. 10.1007/s00421-018-3908-x [DOI] [PubMed] [Google Scholar]
  • 117.Taylor K, Seegmiller J, Vella CA. The decremental protocol as an alternative protocol to measure maximal oxygen consumption in athletes. Int J Sports Physiol Perform. 2016;11(8):1094–9. 10.1123/ijspp.2015-0488 [DOI] [PubMed] [Google Scholar]
  • 118.Tucker WJ, Sawyer BJ, Jarrett CL, Bhammar DM, Ryder JR, Angadi SS et al. High-intensity interval exercise attenuates but does not eliminate endothelial dysfunction after a fast food meal. Am J Physiol Heart Circ Physiol. 2018;314(2):H188–H94. 10.1152/ajpheart.00384.2017 [DOI] [PubMed] [Google Scholar]
  • 119.Vogiatzis I, Louvaris Z, Habazettl H, Athanasopoulos D, Andrianopoulos V, Cherouveim E et al. Frontal cerebral cortex blood flow, oxygen delivery and oxygenation during normoxic and hypoxic exercise in athletes. J Physiol. 2011;589(Pt 16):4027–39. 10.1113/jphysiol.2011.210880 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120.Weatherwax RM, Harris NK, Kilding AE, Dalleck LC. Using a site-specific technical error to establish training responsiveness: a preliminary explorative study. Open Access J Sports Med. 2018;9:47–53. 10.2147/OAJSM.S155440 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121.Weatherwax RM, Harris NK, Kilding AE, Dalleck LC. Incidence of VO2max responders to personalized versus standardized exercise prescription. Med Sci Sports Exerc. 2019;51(4):681–91. 10.1249/MSS.0000000000001842 [DOI] [PubMed] [Google Scholar]
  • 122.Weatherwax RM, Richardson TB, Beltz NM, Nolan PB, Dalleck L. Verification testing to confirm VO2max in altitude-residing, endurance-trained runners. Int J Sports Med. 2016;37(7):525–30. 10.1055/s-0035-1569346 [DOI] [PubMed] [Google Scholar]
  • 123.Wilhelm EN, Gonzalez-Alonso J, Parris C, Rakobowchuk M. Exercise intensity modulates the appearance of circulating microvesicles with proangiogenic potential upon endothelial cells. Am J Physiol Heart Circ Physiol. 2016;311(5):H1297–H310. 10.1152/ajpheart.00516.2016 [DOI] [PubMed] [Google Scholar]
  • 124.Williams AM, Paterson DH, Kowalchuk JM. High-intensity interval training speeds the adjustment of pulmonary O2 uptake, but not muscle deoxygenation, during moderate-intensity exercise transitions initiated from low and elevated baseline metabolic rates. J Appl Physiol. 2013;114(11):1550–62. 10.1152/japplphysiol.00575.2012 [DOI] [PubMed] [Google Scholar]
  • 125.Wingo JE, Lafrenz AJ, Ganio MS, Edwards GL, Cureton KJ. Cardiovascular drift is related to reduced maximal oxygen uptake during heat stress. Med Sci Sports Exerc. 2005;37(2):248–55. 10.1249/01.mss.0000152731.33450.95 [DOI] [PubMed] [Google Scholar]
  • 126.Yeh YJ, Law LY, Lim CL. Gastrointestinal response and endotoxemia during intense exercise in hot and cool environments. Eur J Appl Physiol. 2013;113(6):1575–83. 10.1007/s00421-013-2587-x [DOI] [PubMed] [Google Scholar]
  • 127.Caputo F, Mello MT, Denadai BS. Oxygen uptake kinetics and time to exhaustion in cycling and running: a comparison between trained and untrained subjects. Arch Physiol Biochem. 2003;111(5):461–6. 10.3109/13813450312331342337 [DOI] [PubMed] [Google Scholar]
  • 128.Thoden J. Evaluation of the aerobic power In: MacDougall JD, Wenger HA, Green HJ, editors. Physiological testing of the high-performance athlete. Champaign: Human Kinetics; 1991. [Google Scholar]
  • 129.Katch VL, Sady SS, Freedson P. Biological variability in maximum aerobic power. Med Sci Sports Exerc. 1982;14(1):21–5. 10.1249/00005768-198201000-00004 [DOI] [PubMed] [Google Scholar]
  • 130.Midgley AW, McNaughton LR, Carroll S. Effect of the VO2 time-averaging interval on the reproducibility of VO2max in healthy athletic subjects. Clin Physiol Funct Imaging. 2007;27(2):122–5. 10.1111/j.1475-097X.2007.00725.x [DOI] [PubMed] [Google Scholar]
  • 131.Noakes TD. Maximal oxygen uptake as a parametric measure of cardiorespiratory capacity: comment. Med Sci Sports Exerc. 2008;40(3):585; author reply 6. 10.1249/MSS.0b013e3181617350 [DOI] [PubMed] [Google Scholar]
  • 132.Wood RE, Hills AP, Hunter GR, King NA, Byrne NM. VO2max in overweight and obese adults: do they meet the threshold criteria? Med Sci Sports Exerc. 2010;42(3):470–7. 10.1249/MSS.0b013e3181b666ad [DOI] [PubMed] [Google Scholar]
  • 133.Sawyer BJ, Tucker WJ, Bhammar DM, Gaesser GA. Using a verification test for determination of VO2max in sedentary adults with obesity. J Strength Cond Res. 2015;29(12):3432–8. 10.1519/JSC.0000000000001199 [DOI] [PubMed] [Google Scholar]
  • 134.Schneider J, Schluter K, Wiskemann J, Rosenberger F. Do we underestimate maximal oxygen uptake in cancer survivors? Findings from a supramaximal verification test. Appl Physiol Nutr Metab. 2020;45(5):486–92. 10.1139/apnm-2019-0560 [DOI] [PubMed] [Google Scholar]
  • 135.Leicht CA, Tolfrey K, Lenton JP, Bishop NC, Goosey-Tolfrey VL. The verification phase and reliability of physiological parameters in peak testing of elite wheelchair athletes. Eur J Appl Physiol. 2013;113(2):337–45. 10.1007/s00421-012-2441-6 [DOI] [PubMed] [Google Scholar]
  • 136.Astorino TA, Bediamol N, Cotoia S, Ines K, Koeu N, Menard N et al. Verification testing to confirm VO2max attainment in persons with spinal cord injury. J Spinal Cord Med. 2019;42(4):494–501. 10.1080/10790268.2017.1422890 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137.Bowen TS, Cannon DT, Begg G, Baliga V, Witte KK, Rossiter HB. A novel cardiopulmonary exercise test protocol and criterion to determine maximal oxygen uptake in chronic heart failure. J Appl Physiol (1985). 2012;113(3):451–8. 10.1152/japplphysiol.01416.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138.Saynor ZL, Barker AR, Oades PJ, Williams CA. Reproducibility of maximal cardiopulmonary exercise testing for young cystic fibrosis patients. J Cyst Fibros. 2013;12(6):644–50. 10.1016/j.jcf.2013.04.012 [DOI] [PubMed] [Google Scholar]
  • 139.Saynor ZL, Barker AR, Oades PJ, Williams CA. A protocol to determine valid VO2max in young cystic fibrosis patients. J Sci Med Sport. 2013;16(6):539–44. 10.1016/j.jsams.2013.01.010 [DOI] [PubMed] [Google Scholar]
  • 140.Causer AJ, Shute JK, Cummings MH, Shepherd AI, Bright V, Connett G et al. Cardiopulmonary exercise testing with supramaximal verification produces a safe and valid assessment of Vo2max in people with cystic fibrosis: a retrospective analysis. J Appl Physiol (1985). 2018;125(4):1277–83. 10.1152/japplphysiol.00454.2018 [DOI] [PubMed] [Google Scholar]
  • 141.Barker AR, Jones AM, Armstrong N. The influence of priming exercise on oxygen uptake, cardiac output, and muscle oxygenation kinetics during very heavy-intensity exercise in 9- to 13-yr-old boys. J Appl Physiol (1985). 2010;109(2):491–500. 10.1152/japplphysiol.00139.2010 [DOI] [PubMed] [Google Scholar]
  • 142.Barker AR, Williams CA, Jones AM, Armstrong N. Establishing maximal oxygen uptake in young people during a ramp cycle test to exhaustion. Br J Sports Med. 2011;45(6):498–503. 10.1136/bjsm.2009.063180 [DOI] [PubMed] [Google Scholar]
  • 143.Robben KE, Poole DC, Harms CA. Maximal oxygen uptake validation in children with expiratory flow limitation. Pediatr Exerc Sci. 2013;25(1):84–100. 10.1123/pes.25.1.84 [DOI] [PubMed] [Google Scholar]
  • 144.Barker AR, Trebilcock E, Breese B, Jones AM, Armstrong N. The effect of priming exercise on O2 uptake kinetics, muscle O2 delivery and utilization, muscle activity, and exercise tolerance in boys. Appl Physiol Nutr Metab. 2014;39(3):308–17. 10.1139/apnm-2013-0174 [DOI] [PubMed] [Google Scholar]
  • 145.Bhammar DM, Stickford JL, Bernhardt V, Babb TG. Verification of maximal oxygen uptake in obese and nonobese children. Med Sci Sports Exerc. 2017;49(4):702–10. 10.1249/MSS.0000000000001170 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 146.Lambrick D, Jakeman J, Grigg R, Kaufmann S, Faulkner J. The efficacy of a discontinuous graded exercise test in measuring peak oxygen uptake in children aged 8 to 10 years. Biol Sport. 2017;34(1):57–61. 10.5114/biolsport.2017.63734 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147.Sansum KM, Weston ME, Bond B, Cockcroft EJ, O’Connor A, Tomlinson OW et al. Validity of the supramaximal test to verify maximal oxygen uptake in children and adolescents. Pediatr Exerc Sci. 2019;31(2):213–22. 10.1123/pes.2018-0129 [DOI] [PubMed] [Google Scholar]
  • 148.de Groot JF, Takken T, de Graaff S, Gooskens RH, Helders PJ, Vanhees L. Treadmill testing of children who have spina bifida and are ambulatory: does peak oxygen uptake reflect maximum oxygen uptake? Phys Ther. 2009;89(7):679–87. 10.2522/ptj.20080328 [DOI] [PubMed] [Google Scholar]
  • 149.Werkman MS, Hulzebos HJ, van de Weert-van Leeuwen PB, Arets HG, Helders PJ, Takken T. Supramaximal verification of peak oxygen uptake in adolescents with cystic fibrosis. Pediatr Phys Ther. 2011;23(1):15–21. 10.1097/PEP.0b013e318208ca9e [DOI] [PubMed] [Google Scholar]

Decision Letter 0

Laurent Mourot

8 Sep 2020

PONE-D-20-25408

‘Verification phase’ for confirming ‘true’ maximal oxygen uptake in apparently healthy adults: Systematic review, meta-analysis, and recommendations for best practice

PLOS ONE

Dear Dr. Cunha,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Both reviewers underlined conceptual limitations that should be addressed. The manuscript should also be shortened to help the reader grasp the main aim of this review, i.e., the use of verification phases.

Please submit your revised manuscript by Oct 23 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Laurent Mourot

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please clarify if and how you assessed for publication bias. Please provide graphs supporting your analysis of publication bias. Please confirm whether unpublished studies/grey literature were searched.

3. Please confirm whether the quality of studies was assessed by more than one person and whether there was a consensus procedure for disagreements.

4. Thank you for stating the following in the Funding Section of your manuscript:

[This study was partially supported by grants from the Carlos Chagas Filho Foundation for the Research Support in Rio de Janeiro (FAPERJ, E-26/202.705/2019, recipient FC; E-26/202.880/2017, recipient PF) and Brazilian Council for Technological and Scientific Development (CNPq, 248023/2012-8 and 303629/2019-3, recipient PF). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.]

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

 [The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.]

5. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Please see attached file for full comments as my feedback exceeded the allowed character count. I include below the introductory paragraph to my review.

This meta-analysis evaluated the validity of a verification phase to confirm the achievement of VO2max. Although the authors will notice that I am very critical of the model for reasons that I hope are clear and evident in my comments, I have to recognize that this is a very detailed analysis and that the data have the potential to make a meaningful contribution to the literature. However, I believe that the authors need to make some major adjustments in the structure of this manuscript so that the correct message is delivered. I believe that this could be solved relatively easily, but this would imply a shift in the interpretation of the physiological bases for supporting the idea of the verification phase as a valid approach to satisfy the plateau criterion for achievement of VO2max. In fact, towards the end of the manuscript, I realized that the authors fully understand the limitations of the model. However, for reasons that are difficult for me to understand at this point, they still present the model as valid to test something that it cannot test. I have provided extensive (and admittedly repetitive) feedback on this aspect throughout my review. I apologize for this, but I was trying to be as clear as possible with my words so that the chances of misunderstanding my position are minimized. I truly hope that the authors are willing to make some important changes, as I consider the data very strong and I fully respect and value the amount of work put into this analysis.

Reviewer #2: Comments to the Authors

General Comments

This study aimed to provide a systematic review and meta-analysis of the validity of deriving VO2max and HRmax from a verification phase, with a cardiopulmonary exercise test (CPET) considered the gold standard. It is a thorough investigation of the studies published within this field to date, and the data appears to have been presented objectively and in a clear manner. The research question is relevant and the analysis necessary.

One fundamental problem I have with studies discussing the attainment of a “true” VO2max (or maximal anything) via CPET and VP (or equivalent) is that we actually don’t know in any scenario whether a “true” maximal value really has been attained. This almost philosophical point (although I find it to be quite obvious and, in this case, physiological) is almost always overlooked and it has been once again in this paper. I think it would be judicious of the authors to acknowledge – and discuss in some detail – this fact (rather than just skim over it at the end of the discussion, in relation to Noakes’ [150] critique).

Another fairly major concern I have is that the analyses are limited to comparing maximal VO2 measures derived from a CPET versus a verification phase at a group level (also highlighted as an issue by Noakes [150]), which is not the same as measuring validity (i.e., through the agreement of two measures). Two measures can be similar (or not significantly different) at a group level, but the agreement can still be very poor. Without an analysis of agreement, how can validity be inferred?
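
The distinction between group-level similarity and individual agreement can be illustrated with a minimal Python sketch. The VO2 values below are hypothetical (not data from any included study), and the function name bland_altman is an assumption made for illustration. The 95% limits of agreement are computed as the mean difference ± 1.96 SD of the paired differences, following the usual Bland-Altman approach.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower_loa, upper_loa) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)   # group-level mean difference
    sd = stdev(diffs)    # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical highest VO2 values (L/min) for six participants.
cpet = [3.0, 3.4, 3.8, 4.2, 4.6, 5.0]
verification = [3.4, 3.0, 4.2, 3.8, 5.0, 4.6]

bias, lower, upper = bland_altman(cpet, verification)
print(f"mean difference: {abs(bias):.2f} L/min")
print(f"95% limits of agreement: {lower:.2f} to {upper:.2f} L/min")
```

With these made-up values the mean difference is zero, yet every participant differs by 0.4 L/min between the two tests and the limits of agreement span roughly ±0.86 L/min. This is the situation described above: no difference at the group level, but poor agreement between individual measurements.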

Specific Comments

Title

I’m not sure why single quotation marks are needed around the selected terms. Also, “recommendations for best practice” were not scientifically investigated, but are (a small) part of the discussion/conclusion (as is the case in many papers). So I suggest removing these components of the title. This would give something more succinct and specific, like: “The verification phase for confirming maximal oxygen uptake in apparently healthy adults: A systematic review and meta-analysis”.

Abstract

On analyzing the title and aim of the study, I don’t feel this abstract is necessarily a clear summary of the most relevant findings. The results are rather heavily focused on the HR data, which is not a main focus of the study. Please reflect and re-consider.

L37: CPET on “a” cycle… (the article seems to be missing).

L41: The punctuation suggests that this is the age range (and even VO2max) of the women only. Please clarify.

L42, 44: n = 52/36… presumably these are the number of studies? Please clarify.

L44: Can (should) bpm be expressed to decimal places? Also, 5 d.p. on the P value seems quite excessive (as is the case throughout the manuscript for very small P values – is there any reason for this?).

L44-47: It is unclear what the comparisons (3 bpm greater HR) relate to for the three P values. What are they being compared to? Please clarify. That said, I’m not sure why so much focus is given to HR in the abstract, when the study is really about verifying VO2max.

L50: Why would concordance (agreement) “put [this] into question”? This seems contradictory. Please clarify.

Introduction

The Introduction is long and there is rather a lot of discussion around the general topic of VO2max testing, but sparse detail relating to the actual topic of the study (i.e., the use of verification phases in all their various make-ups). The sentence at L126-129 to me is the crux of the problem and the study, and this is what the introduction should focus on more exclusively. The study referenced in L130-131 ([53]) requires more discussion/explanation, so that there is clearer context for the current study and justification for the sentence at L131-134. I have no issue with the importance of this work, but a bit more clarity around the actual problem (and existing literature) is required.

L55-58: This is a very long opening sentence. I suggest breaking it in two (if all content is to remain).

L61: Should the refs be listed in numerical order (e.g. [1, 3-6])? (There are other examples throughout the manuscript as well.)

L69-70: A transition from that stated, but to what? Continuous fast ramp tests? It feels like the end of the sentence is missing – maybe combine (and condense?) it with some of the text from the following sentence (L70-72).

L74: Should this be “limitations of VO2max” (i.e., the method of measurement)? The limitations “to” VO2max is a different topic, as I see it, more related to training, genetics, etc.

L83-86: There seem to be two contrasting definitions of the Taylor et al. VO2 plateau criterion here. Please clarify.

L87: Should the ref ([32]) not be included with Taylor et al. even here?

L94: Typo on VO2max (the second time).

L97-99: Check and change the grammar/punctuation. Something is going on around “investigators; however, due to”, which makes the sentence incoherent (to me).

L77-104: This is a very long paragraph. I suggest splitting it in two – the first relating to VO2max criteria, the second to the secondary criteria. Or write the content more concisely to produce one shorter paragraph.

L105-139: These two paragraphs are most important in justifying the current study, so I think a more in-depth discussion of this literature is required (instead of the level of detail presented in the three preceding paragraphs, which could be significantly condensed).

Methods

L162: “ergometer or treadmill” – was this limited to bi-pedal running on a treadmill? Please specify, as many other modes of exercise are possible on a treadmill (e.g., cycling, hand-cycling, wheelchair running, inline skating, roller skiing, etc.). It would be useful to make this clarification through a clear definition somewhere in the paper, that by “treadmill” (see for example Figure 1, “Only treadmill”) you actually mean “treadmill running” (if that’s the case).

L171-172: “In the final review, we provided…” – is this referring to what is presented in the current manuscript? Please clarify, as “final review” and “we provided” is a bit unclear to me.

L196: You have previously written abbreviations within round brackets in square brackets: (95% confidence interval [CI]).

L197: Out of interest, what did you do in cases where VO2max was reported in mL/kg/min?

L210: P-values “were” obtained…?

L215: Is this less than 50%, or less than or equal to? The symbol looks unclear to me.

L223: The studies were also…

L225: Stratified analyses were also…

Results

The main issue for me throughout this section is the long lists of references accompanying each result. This is not common in other studies of this type that I’m familiar with, and to me it makes deciphering the interesting information nigh on impossible. I would recommend removing these long number strings.

L241: (interquartile range [IQR]) – or consistent with previous presentation.

L240-245: You write that “the sex of 130 participants was not specified”, and then that “one study did not specify the sex of the participants (see Table 1)”. In that study (Scheadler and Devor [92]) n = 13, so I don’t understand the mismatch between 130 and 13. Please clarify.

L246: BMI should presumably be defined after the words (body mass index) and doesn’t then need to be included in the brackets.

L246-247: The square and round brackets seem to have switched places in this sentence, any reason?

L247: Writing “(VO2max normalized to body mass)” seems superfluous when you have the unit as mL/kg/min. Consider removing.

L251: “Characteristics of studies using CPET…”?

L253, 255: “on a cycle/treadmill” (again, the article is missing).

L253-307: These long strings of references make the results very unreadable. I suggest removing them all, as finding the actual interesting numbers (i.e., the results) through the long lists is so difficult.

L272-274: Could you re-phrase to fix the grammar and clarity on: “whereas 29 (37%) used fixed intervals of 15- to 30-s (or 2 × 15-s), both averaged and fixed times (1%) [61]… etc.”. I guess the 29 (37%) relates specifically to the 15/30-s fixed interval data, so the sentence needs to be re-structured and improved to clarify this in relation to the other methods listed.

L279, 281: I think you need to include the “min” unit after 5, 6, 6, 9 and 15 – or not if you were to remove all the references in brackets (another example of how difficult the interesting numbers and results are to decipher from the long [and unnecessary?] lists of references).

L285-286: Why are two different %ages presented (19 and 19.7)?

L291: Suggest removing “i.e.”? Not included elsewhere.

L295: Should this be “the” maximal-intensity work rate, rather than “a” (presumably it was specific to that study and the preceding VO2max test).

L297: Could you briefly describe in this sentence what the formula was based on?

L300: “Forty-two studies (54%)” – consistent reporting.

L300: obtained “during” (rather than “at”)?

L366-onwards: Is there a reason for changing the presentation (order of using) round and square brackets again? And see my point above (in the Abstract) about the number of decimal places on the P values < 0.001. Is there any statistical reason/need for this?

L383: (performed on the same day as vs. a different day from the CPET) – suggestion.

L387-388: Could you include the P value here for this no sig diff, as it is a key result.

Discussion

At times I struggle to follow the logic of the arguments in this (very long) discussion, so I think the interpretations can be written more clearly and concisely in places. In addition, there is a lot of discussion of previously published studies and concepts, without reference to the findings from the current results. This seems inappropriate for a systematic review/meta-analysis, so I would encourage the authors to focus more on their own findings in light of previous work, rather than merely presenting a review of the existing literature.

L422: Reconsider “over” in this sentence. Maybe “rather than”?

L433-435: This study did not analyse children or clinical groups, so where is the “current evidence suggest[ing] that the verification phase is a safe and well-tolerated procedure to confirm attainment of true VO2max” in these groups? This particular study can surely only make this claim about the apparently healthy adults who were analysed, or am I missing something?

L445: “of a ramp-incremented…” (missing article).

L455-456: Are 17% and 33% comparable in this sentence? If so, please use the same unit (either CPETs, or participants) in order to compare like with like (e.g., “17% of participants (2 of the 12) during a cycling ramp-incremented CPET, while 33%…”.

L477-478: Is this statement true? If so then I’m missing something. Re-reading ref #100 (McGawley 2017) it is stated that: “There was a significant effect of test type on VO2max, with higher values recorded during STEP compared with VER (P = 0.013)”. Can you clarify how you’ve come to this conclusion (three studies to-date) and how you conducted this analysis/check?

L494: Is there a typo here: “4 of the 7 participants (9%)”?

L496: “11 participants (9 men; age…)” – it appears as if you are only reporting the descriptives for the men, is this the case? Please clarify.

L499-503: From your results and Figure 2 this looks like an outlier. Has there been (or should there be) any accounting for outliers in your analyses?

L477-516: This is a very long paragraph (> 1 page). Please consider shortening. I don’t think all the detail of the three specified studies is required (L478-503) – this could be condensed and written more concisely.

L508-516: This seems to explain this result as an outlier. What happens to your findings (CPET vs verification phase VO2max) if this study is removed from your overall analyses?

L517-519: I’m not sure I agree with this statement (or maybe I misunderstand what you mean). If CPET = VP or if CPET > VP then is that confirmation of a “true” VO2max during the CPET? Can both tests not elicit a VO2max that is lower than an individual’s “true” VO2max in this scenario?

L520-521: I don’t quite follow the logic of this follow-up sentence. Are you saying that there would need to be a difference in order for the statement in the first sentence to be true? Why? Please clarify.

L526: Why “only” 25 (i.e. 27%)? To me this 27% of the studies is important in demonstrating that the CPET doesn’t always do its job properly (i.e., in eliciting a “true” VO2max). This is where the analysis of agreement is important too – what is the similarity (or dissimilarity) in VO2max values derived from a CPET vs VP “within” individuals? Please comment.

L517-528: I actually don’t follow the logic or point being made in this section. Could you please try to clarify?

L542: “who” underwent?

L544: was similar “to the”?

L547: At this point I’m really struggling to follow the logic and arguments presented over the last few pages. Are you saying that CPET should be higher than VP in order to accept that a true VO2max has been attained in the CPET? Why? What is the problem with CPET = VP? A more fundamental question, in my opinion: Why is it not acknowledged/discussed that individuals can very easily underperform on BOTH tests, and that we really don’t have any idea as to whether we have attained a “true” VO2max at all. Please comment.

L552-557: The “different” methods described previously for study #94 are also relevant here, as is my comment above (i.e., that it is always possible that neither test was truly maximal and elicited a “true” VO2max).

L557: What is meant by “this put into question” in this context?

L561-562: This is the first time this endeavor has been mentioned (except in the title). Please re-consider the phrasing here (and in the title!) – especially given the conclusion of this sentence (L567-569), i.e., that no best practice can actually be recommended.

L570-571: This list of 6 references does not seem complete, or to reflect “most studies”. Please clarify.

L570-578: What new insight does this paragraph add, from the current results, which was not already known? Please embellish with additional information, or remove.

L584-585: I don’t think decimals are needed on these %ages.

L579-596: Again, how do the current results relate to the previous literature? This is not a review article, so as I see it the discussion section should be used to present the results of the current study in the context of previous results. The information presented here (that 105% was different from 115% according to Nolan et al.) is not supported by your results, as I understand, since you saw no significant effect of VP intensity. This is what ought to be discussed, in my opinion.

L597-610: Again, this is a review of the existing literature. Please discuss the results of the present study.

L613: Remove the extra space(s) between Small and sampling.

L614: rapid changes

L611-621: Same issue again - this is a review of the existing literature without reference to the current study results. Please reconsider.

L624: should not exceed

L636: on the duration of

L642: when “what” are short? The VPs? Please clarify.

L649-653: I don’t think the second sentence is a good enough “get-out” given the significance of the criticism stated by Noakes. This underpins the entire concept of “validity”. Do not overlook or underplay the fact that your final sentence, which you say you are not doing (“rather than the question of whether an individual has elicited a ‘true’ VO2max.”), is exactly what you say you are doing in the title (as I interpret things)!

L655: Have effect sizes been presented anywhere?

L656-657: “in cycle ergometry and treadmill running”?

L660: “compromising their ability” (plural)

L670: I don’t understand this, in the context of the previous sentences: “The mandatory application of the verification phase in all situations may be therefore questioned”. Why questioned?

L671: settings?

Tables & Figures

The studies appear to have been ordered chronologically and then alphabetically in Table 1, with this ordering system then continued throughout the later tables & figures. This seems arbitrary (chronologically then alphabetically) and makes it difficult to locate any specific study in the later tables/figures. Could you order the studies entirely alphabetically or according to the reference numbering from the outset?

Is there any reason for presenting the subgroup analyses according to the characteristics of the verification phase protocol in a “figure”, while the subgroup analyses regarding sex, cardiorespiratory fitness level, exercise modality, and CPET protocol are presented in a “table”? Could this method of presentation be standardized? Also, I’m not familiar with the presentation used in Figure 4. Can you provide more information about how to read it (top line with green box, middle line with green box and black diamond), as it won’t be clear to all readers.

Table 1: The heading “mean values” should probably be aligned over the final three columns to the right, as sex and N are not means. Also, ranges should be differentiated in this heading, if that’s what those are and if they can’t be expressed as means (e.g., 25-35 and 19-61). And can/should the number of decimal places be standardized in the data? Any reason that some terms (e.g., Sedentary, Cyclists, Runners, Athletes) are capitalized, but others aren’t?

Tables 3/4: Can Total be clarified (presumably it’s the number of participants, but this is not stated anywhere). The %Weight is hard to comprehend – I have no experience of this measure or its calculation, but the statistical power seems to bear no relation to N, which seems odd to me. Can you explain?
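As background to the question about %Weight, the sketch below shows how study weights are commonly derived in an inverse-variance random-effects meta-analysis. It assumes a DerSimonian-Laird estimate of between-study variance and the independent-groups standard error of a mean difference; whether the manuscript's analysis used exactly these settings is not stated here, and the two studies and their numbers are hypothetical. It illustrates why weight follows the precision of each estimate (SD and N together) rather than N alone.

```python
from math import sqrt


def se_mean_difference(sd1, n1, sd2, n2):
    """Standard error of a mean difference, assuming independent groups."""
    return sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def random_effects_weights(effects, std_errors):
    """Return (tau2, percent_weights, pooled) under a DerSimonian-Laird model."""
    # Fixed-effect inverse-variance weights are needed first to compute Cochran's Q.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each study's own variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total = sum(w_star)
    percent = [100.0 * wi / total for wi in w_star]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / total
    return tau2, percent, pooled


# Hypothetical data: study A is small but precise, study B is larger but more variable.
se_a = se_mean_difference(0.2, 10, 0.2, 10)
se_b = se_mean_difference(0.8, 40, 0.8, 40)
tau2, percent, pooled = random_effects_weights([0.03, 0.05], [se_a, se_b])
print(tau2, percent, pooled)
```

In this made-up example the smaller study receives the larger weight because its standard error is smaller, which is one way a weight column can appear unrelated to N.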

Table 5: Can horizontal lines be used to clarify where each category (TTE, VO2max, HR) starts and ends (i.e., to the right of each N)?

Figure 1: Can you clarify (even if just to me) why the 1 full article excluded in the Eligibility stage due to “Non-maximal exercise test protocols…” had not already been excluded for the same reason in the Screening stage?

Figure 2: The data suggests to me a tendency for CPET to be higher than VP. Is there any accounting for potential outliers (e.g., Colakoglu et al.)? What happens if this study is removed from the analyses (if there is good reason to do so, which reading the discussion there might be)?
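The question of what happens when a single study is removed is usually answered with a leave-one-out sensitivity analysis. The sketch below is a minimal illustration that uses a fixed-effect inverse-variance pooled estimate for brevity (the manuscript reports random-effects models); the study labels, mean differences, and standard errors are hypothetical and are not taken from the included studies.

```python
def pooled_mean_difference(effects, std_errors):
    """Inverse-variance (fixed-effect) pooled mean difference."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(labels, effects, std_errors):
    """Pooled estimate recomputed with each study omitted in turn."""
    effects = list(effects)
    std_errors = list(std_errors)
    results = {}
    for i, label in enumerate(labels):
        remaining_effects = effects[:i] + effects[i + 1:]
        remaining_se = std_errors[:i] + std_errors[i + 1:]
        results[label] = pooled_mean_difference(remaining_effects, remaining_se)
    return results


# Hypothetical mean differences (L/min); "Study C" plays the role of a suspected outlier.
labels = ["Study A", "Study B", "Study C"]
effects = [0.02, 0.03, 0.40]
std_errors = [0.05, 0.05, 0.10]

overall = pooled_mean_difference(effects, std_errors)
omitted = leave_one_out(labels, effects, std_errors)
for label in labels:
    print(label, "omitted:", round(omitted[label], 4), "vs overall", round(overall, 4))
```

A large shift in the pooled estimate when one study is omitted indicates that the overall result depends heavily on that study.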

Figures 2-4: The quality of these figures is poor (due to the high level of detail). Can they be presented at a higher resolution?

Reference list

L684: (1985) should presumably be removed from the JAP title?

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Kerry McGawley, Ph.D.

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Comments to the authors.pdf

PLoS One. 2021 Feb 17;16(2):e0247057. doi: 10.1371/journal.pone.0247057.r002

Author response to Decision Letter 0


7 Nov 2020

- Title: Verification phase for confirming maximal oxygen uptake in apparently healthy adults: A systematic review and meta-analysis

- Corresponding Author: Felipe A. Cunha

- E-mail: felipeac@globo.com

- Manuscript ID: PONE-D-20-25408

Dear Editor,

Please find below our responses to the reviewers’ comments concerning the article PONE-D-20-25408, titled “Verification phase for confirming maximal oxygen uptake in apparently healthy adults: A systematic review and meta-analysis”. The manuscript has been revised according to the reviewers’ suggestions and an itemized, point-by-point response to each of the reviewers’ comments has been provided.

Yours Sincerely,

Felipe A. Cunha

Review comments:

Reviewer: 1

This meta-analysis evaluated the validity of a verification phase to confirm the achievement of VO2max. Although the authors will notice that I am very critical of the model for reasons that I hope are clear and evident in my comments, I have to recognize that this is a very detailed analysis and that the data have the potential to make a meaningful contribution to the literature. However, I believe that the authors need to make some major adjustments in the structure of this manuscript so that the correct message is delivered. I believe that this could be solved relatively easily, but this would imply a shift in the interpretation of physiological bases for supporting the idea of the verification phase as a valid approach to satisfy the plateau criterion for achievement of VO2max. In fact, towards the end of the manuscript, I realized that the authors fully understand the limitations of the model. However, for reasons that are difficult for me to understand at this point, they still present the model as valid to test something that it cannot test. I have provided extensive (and admittedly repetitive) feedback on this aspect throughout my review. I apologize for this, but I was trying to be as clear as possible with my words so that the chances of misunderstanding my position are minimized. I truly hope that the authors are willing to make some important changes, as I consider the data very strong and I fully respect and value the amount of work put into this analysis.

Introduction

Lines 55-58: This sentence is too long and difficult to read as presented. For example, I do not think that the word “neuromuscular” is needed here. It is obvious that neuromuscular function is important for performance and it will impact vascular function to a given extent. However, VO2max is not defined by neuromuscular performance in my view. Regardless, the authors should shorten this sentence.

ANSWER: Thank you for the observation. We have reviewed the paragraph according to the reviewer’s suggestion and it now reads as follows: “Maximal oxygen uptake (VO2max) represents the upper physiological limit of the utilization of oxygen for producing energy during strenuous exercise performed until volitional exhaustion”.

Line 59: It is interesting that the definition of VO2max included cardiovascular, pulmonary, neuromuscular, and metabolic responses, but the authors refer to the test as cardiopulmonary (CPET). I never understood this term, as it excludes the vascular component, which is a key component of O2 distribution. I do not think that I am asking for this to be changed, but I wanted to highlight that I do not like it.

ANSWER: Since the definition of VO2max has been revised to address your previous comment, we believe the revision also addresses this comment.

Lines 70-72: It is surprising that the work of Iannetta et al. (Am J Physiol Regul Integr Comp Physiol. 2020 Jul 22. doi: 10.1152/ajpregu.00126.2020; J Appl Physiol. 2019 Dec 1;127(6):1519-1527) and Keir et al. (Appl Physiol Nutr Metab. 2018 Sep;43(9):882-892) is not mentioned here, as it highlights important aspects of ramp incremental tests that are often ignored and that have resulted in misinterpretations of: 1) the link between ramp incremental and constant-load VO2 and work rates; 2) the lack of validity in the idea of using a supra peak work rate intensity for the verification of the VO2max response (more on this will certainly arrive in later comments).

ANSWER: Apologies for missing this. We have read all the articles to address the reviewer’s suggestions. The aforementioned studies are now included in the revised manuscript and also meta-analyzed (e.g. Iannetta et al. Am J Physiol Regul Integr Comp Physiol. 2020 Jul 22. doi: 10.1152/ajpregu.00126.2020).

Line 74: It remains surprising that the authors discuss that more needs to be learned about using information from ramp incremental tests and have not even mentioned the manuscripts indicated above (I understand that one of them might be too recent, but I find the other omissions surprising).

ANSWER: Please see our response to your last comment.

Lines 75-76: This is true. For example, the supra peak work rate intensities that are often used represent a good example of what could be a wrong protocol. Additionally, if motivation is an issue, then the verification ride would not fix this problem!

ANSWER: We are in agreement with the reviewer’s opinion. Thank you for your comment.

Lines 77-104: I think the authors do a good job in this paragraph identifying important issues. However, this paragraph is a bit redundant and convoluted. This section could be shortened by ~30% to streamline the message.

ANSWER: This paragraph (now starting on line 72) has been shortened according to the reviewer’s suggestion.

Line 94: Second VO2max is missing the letter “x”.

ANSWER: The correction has been made.

Lines 105-129: Well, there are several conceptual limitations in this section. First, I will start by saying that a wrong concept repeated by many people remains a wrong concept (even when some of the people supporting the concept are dominant figures in the field). Let’s start by saying that, by definition, there is no such thing as supramaximal. In the end, the work rate obtained at the end of a ramp incremental test is far from maximal. It is simply a peak value that will be greater or smaller depending on the characteristics of the ramp. In fact, it has been shown that whereas VO2max (or peak if you prefer) remains constant across a wide range of slopes during a ramp incremental test, the power output (PO) varies widely depending on the slope of the ramp. For example, Iannetta et al. (J Appl Physiol. 2019 Dec 1;127(6):1519-1527) showed that whereas steeper ramps elicit the greatest peak POs, less steep ramps result in progressively lower peak POs. This demonstrates that the so-called “supramaximal” work rates are just an illusion. “Supramaximal” in relation to what? This has been further indicated in a study that showed that performing sub peak work rate exercise to exhaustion is more likely to result in the achievement of higher VO2 values than performing supra peak work rate exercise (Am J Physiol Regul Integr Comp Physiol. 2020 Jul 22. doi: 10.1152/ajpregu.00126.2020). Although this might sound counterintuitive to the naïve reader, it should be physiologically expected by anyone who understands how ramp incremental testing affects the development of the slow component of VO2. See Appl Physiol Nutr Metab. 2018 Sep;43(9):882-892 and Med Sci Sports Exerc. 2020 Mar 20. doi: 10.1249/MSS.0000000000002343 for more details.

ANSWER: We have added these references to the paper and the so-called “supramaximal” term has been changed to “supra peak” work rate throughout the manuscript in accordance with the reviewer’s comments (the revised paragraph starts on line 98).

Second, and based on this comment, I will add that: 1) the statement that “a continuous CPET followed by an appended supramaximal verification phase is conceptually similar to the discontinuous tests most commonly used from the 1920s to the 1970s, but with the notable advantage of requiring only a single visit to the laboratory” is simply wrong. I would highly recommend that the authors read the papers indicated above to fully appreciate the idea that there is no such thing as a “supramaximal” work rate; 2) it is surprising that experts in VO2 kinetics such as Poole and Jones have made the mistake of recommending that “supramaximal” intensities should be used for the verification rides. As indicated in the papers highlighted above, 110% of a typical ramp test (or even 1-min step) might not allow for enough time for the VO2 response to be fully expressed, thus resulting in VO2 values that are lower than those obtained during the ramp incremental test, and creating the false idea that VO2max has been achieved when this cannot be demonstrated.

ANSWER: The statement cited above by the reviewer in double inverted commas has been removed from the revised manuscript. We have addressed the issue of using the term ‘supramaximal’ in a previous comment.

Lines 134-137: I would argue that a recent paper (Am J Physiol Regul Integr Comp Physiol. 2020 Jul 22. doi: 10.1152/ajpregu.00126.2020) has demonstrated that the verification phase is a flawed approach to confirm that VO2max has been achieved. In fact, this paper has indicated what type of verification ride would be most adequate in order to achieve the highest possible VO2 (which most likely represents VO2max, as derived from the ramp incremental test!)

ANSWER: This part of the “Introduction” section has been rewritten to address the reviewer’s suggestions (please see text in red).

Lines 137-139: Unfortunately, this meta-analysis cannot achieve this goal. All the authors can do is to provide information on what others have obtained from a ramp or step test and compare those values with the verification rides. Given some of the comments that I made before, the authors should understand that the idea of a “supramaximal” work rate to determine the plateau criterion is flawed. This is not something that is up to debate. It is just a fact based on published data. Then, given the limitation of the model and the inability to establish a plateau response as originally proposed, the verification ride is simply an extra ride that might make people feel more confident with the idea that the highest possible VO2 has been achieved. From a personal perspective, I have no issues with people doing this. However, the vast majority of the data indicate no significant differences between the highest VO2 from the ramp incremental test and the verification ride. When differences exist, those typically fall within the measurement error and on both sides of the probability (i.e., VO2 could be higher in either the ramp or the verification).

ANSWER: Thank you for the observation and we agree with the reviewer. The aim of the study has been rewritten to address the reviewer’s comment (please see text in red starting from line 126).

As a summary of this section, I think that this meta-analysis, as proposed, is set to fail as it cannot answer the question that it is proposed to answer. My recommendation would be that the authors simply try to answer whether there are differences in the highest VO2 values obtained from ramp or step tests compared to a verification ride. Even if differences existed, there is no way that the authors could claim that VO2max has been verified as the “supramaximal” intensities are rather arbitrary and often ignore the most basic physiological responses for VO2 adjustments. In my view, the introduction should describe the limitations with the idea that VO2max can be verified (i.e., the “supramaximal” concept is flawed and “submaximal” verification phases have been shown to work as well), while highlighting that there might still be value in trying to confirm that the highest possible VO2 was achieved, which can be done in a variety of ways (as demonstrated later in this study). However, I find it critical that the erroneous idea that VO2max is verified because a “supramaximal” intensity was added to the evaluation is not perpetuated.

ANSWER: The research questions and study aim at the end of the introduction section have been revised to address the reviewer’s concerns. Thank you for your comment.

Methods

General comment: Although I have participated in meta-analysis studies and reviewed some of them, I have to admit that statistical analysis is not my area of expertise. Thus, even though the information seems correct, I will rely on others for proper evaluation of this section.

ANSWER: Comment acknowledged.

Results

Line 291: Here and throughout the text, I think it would be more appropriate to refer to the intensity as “supra peak work rate” or something like that, so that it is clear that maximal work rate has not been determined in any of those studies. The same idea would apply to the term “submaximal”.

ANSWER: The manuscript has been revised throughout in accordance with the reviewer’s recommendation.

Line 298: I am happy to see that the authors used the right term here and referred to “peak work rate”.

ANSWER: Comment acknowledged.

Line 301: By design, this so-called verification phase cannot confirm the attainment of VO2max. As discussed earlier, the VO2max response remains a mystery. From my perspective, I would argue that VO2max was most likely achieved. However, given that the model used in many studies does not even allow for the VO2 response to be fully expressed, then this method risks falsely accepting the attainment of VO2max simply because the “supramaximal” intensity was too high to allow for the duration of exercise to be long enough for the VO2 response to be fully expressed (sorry for the extra-long sentence!). I would strongly recommend that the authors stay away from the idea of confirming that VO2max was attained as this is an untestable hypothesis. However, the authors can try to verify whether the VO2 associated with the “verification” ride was lower/higher than, or similar to, the VO2 observed during the incremental test.

ANSWER: We agree with the reviewer’s opinion and any statements about the verification phase “confirming that VO2max was attained” have been removed from the revised manuscript.

Line 326: I like the wording here, as the authors refer to the highest VO2. There might be value in knowing whether the verification phase results in a greater VO2 compared to the incremental test. However, as I indicated earlier, it is important not to confuse that with determination of VO2max.

ANSWER: Thank you for your comment.

General comment: I think that a further analysis that could be added would include the duration of the verification phases. I guess it will not change much as the mean data will hide the potential effect of duration. However, it is likely worth exploring whether shorter verification phases resulted in comparatively lower values.

ANSWER: We have included the further analysis according to the reviewer’s suggestion.

Discussion

Line 417: As I indicated already, the verification phase cannot confirm the achievement of VO2max as the use of this approach is physiologically flawed. Thus, this meta-analysis should re-focus its aim and indicate a testable objective (i.e., identifying whether the VO2 during the verification phase is different from that obtained during the incremental test).

ANSWER: The aim stated in the introduction section and the discussion section have been revised to address the reviewer’s suggestions (please see text in red).

Lines 428-432: This section seems almost out of place or, at least, unnecessarily long. I understand that the authors might want to highlight the safety of the procedure in other populations. However, I would argue that the point of safety can be made in relation to the current data, and then the authors could add that several studies in clinical populations also confirm the safety of the procedure.

ANSWER: The reviewer is correct, thank you for the comment. The relevant sentence has been moved to the last paragraph of the revised discussion section.

Lines 434-435: Well, I guess my position on this wording is already clear to the authors. However, I will repeat it. This statement is unwarranted as no confirmation of VO2max has been obtained. Such confirmation requires a different experimental model as the intensities used in most verification phases are: 1) often too high for allowing a full development of the VO2 response; 2) below or at the peak PO, which invalidates the premise of “supramaximal” intensity to achieve the plateau criterion. In the end, the model helps you feel more comfortable with the idea that the performance was maximal. However, the same limitations that applied to not feeling comfortable with a maximal effort during the incremental test also apply during the verification phase. Considering this and the idea that a plateau response is just an illusion (i.e., constant work rate loads become “supramaximal” much earlier than the peak PO observed during incremental testing), then the wording needs to be changed. Thus, the authors could say that “the verification phase is a safe and well-tolerated procedure to evaluate if a higher VO2 compared to that observed during the incremental test was obtained”, or something in line with that.

ANSWER: We have revised the text in accordance with the reviewer’s suggestion. The revised text is located at the end of the second paragraph of the discussion section.

Lines 441-443: These lines include some information that is correct, and some that sounds wrong (at least as presented). It is true that incremental tests do not allow for the VO2 slow component to be fully expressed (with this being more noticeable with steeper ramps), which dissociates the relationship between VO2 and PO when comparing incremental and constant work rate exercise. However, the authors mentioned that the VO2 response is accelerated towards the end of the incremental test. This is incorrect. In fact, during the most used ramps (i.e., 20-30 W/min), the increase in VO2 for a given increase in PO is reduced towards the end of the ramp. This is reflective of the inability of the VO2 response to adjust fast enough to the rapidly increasing ramp. In fact, what the authors mentioned is observed during slow ramps (i.e., 5-10 W/min). This can be seen in Iannetta et al. (J Appl Physiol. 2019 Dec 1;127(6):1519-1527). Interestingly, that paper showed consistent VO2max responses with a wide range of peak POs, and a follow-up paper showed that POs do not need to be “supramaximal” to confirm that the highest VO2 has been obtained (Am J Physiol Regul Integr Comp Physiol. 2020 Jul 22. doi: 10.1152/ajpregu.00126.2020). In fact, “supramaximal” intensities are discouraged. Importantly, the current data confirm that the incremental test is sufficient to obtain the highest possible VO2, which demonstrates that a verification phase does not offer much additional value.

ANSWER: We agree with the reviewer’s comment and the sentence has been rewritten to address this concern.

Lines 444-449: This is perfectly in line with what I had just indicated. Given the wide range of ages and fitness levels, and the variety of ramps (i.e., 15, 20, and 25 W/min), the phenomenon that I just described is to be expected. In other words, slow ramps are far more likely to result in a plateau in the VO2 response as the slow component can be expressed to a larger extent, even within a narrower range of POs, as compared to fast ramps.

ANSWER: Thank you for your comment.

Lines 449-450: Exactly! If you think about it, the same ramp would be a relatively faster ramp for a less fit person compared to a fitter one (i.e., less time for the VO2 response to fully adjust).

ANSWER: Thank you for your comment.

Lines 465-476: This section could be just deleted as, aside from the use of HR responses, no other secondary criteria were considered in this analysis. Given that the discussion is excessively long, I would remove this section.

ANSWER: The section has been deleted in accordance with the reviewer’s suggestion.

Lines 491-497: In fact, this information is in line with a recent paper from DiMenna’s group (Arad et al., PLoS One. 2020 Jul 6;15(7):e0235567. doi: 10.1371/journal.pone.0235567). In this study (which should also be added to the analysis in my view), they showed that the verification phase resulted in a significantly (albeit minimally) greater VO2 compared to the incremental response. Interestingly, the authors commented that they selected 100% of peak PO to make sure that the duration of the verification phase was not too short. Based on the definition by Poole and Jones, the authors say that they cannot claim that VO2max was achieved because the intensity was not “supramaximal”. However, they know that this idea of “supramaximal” is flawed anyway. Regardless, it is likely that a verification phase can help achieve a higher VO2 compared to the ramp test in some very specific circumstances. However, there is still no proof that VO2max has been achieved!

ANSWER: We have added this reference to the manuscript and also meta-analyzed their data in response to the reviewer’s comments.

Lines 499-516: Well, this section is difficult to interpret as the results from Colakoglu’s work are suspicious. How can you reconcile the results presented in Figure 2 in that study? I would be the first to admit (and I have made this point already) that 110% of the peak PO might result in a short duration for the verification ride, which might not allow for the VO2 response to be fully expressed. However, a near 20% drop in the VO2 response during the 110% verification ride is unthinkable! In fact, I went to the original study as the difference between the incremental test VO2 and the 100% verification ride VO2 was extremely high. After seeing the even greater difference at 110% of peak PO, I simply cannot trust those data (which is likely the reason why the paper is published in a very low impact journal). Regardless, the point is that the data from this study should probably not be considered. However, we are not the judges of science and I think it is fair to include the published results. That being said, a discussion on the surprising ~30% difference between the highest and lowest VO2 should be presented.

ANSWER: Although we agree with the reviewer’s concerns, we believe that it would not be appropriate to omit these data from our analyses and that readers should be allowed to draw their own conclusions about the cited study. Notably, after omitting the data from Colakoglu’s study, there was no change in the main results of the present meta-analysis.

Lines 517-519: This should be reworded. What effectiveness has been demonstrated? In the end, the verification phase did not produce any significant differences in the results. Thus, rather than effective, I would consider this a waste of time. Based on your data, the recommendation should be that a verification phase is not included as it does not add any value to what was already informed by the incremental test. At best, you can say that “the verification phase procedure did not result in a higher VO2 response than that observed in continuous ramp or pseudo-ramp CPET protocols”. Perhaps, you might want to feel that this adds validity to the procedure, and I can accept that as a personal preference. However, by no means can the authors say that VO2max was confirmed.

ANSWER: The sentence has been changed and now reads as follows: “To-date, this systematic review has demonstrated that the majority of studies have shown that the highest mean VO2 values elicited by verification phase bouts were similar (or not statistically or meaningfully different) to the attained VO2 values in continuous ramp or pseudo-ramp CPET protocols [104, 114, 36, 125, 97, 37, 77, 81, 38, 60, 96, 39, 57, 102, 101, 63, 86, 88, 119, 61, 69, 73, 79, 99, 64, 65, 94, 113, 124, 126, 103, 107, 115, 53, 66, 75, 76, 112, 67, 80, 85, 117, 123, 56, 72, 84, 87, 91, 116, 118, 120, 59, 71, 74, 78, 83, 89, 90, 105, 108, 109, 15, 121, 70, 106].” (Page 35, lines 358-363).

Lines 520-521: Why do the authors in theory agree with this premise? The premise of a “supramaximal” verification phase as a means of establishing the plateau criterion is physiologically flawed. I hope that the authors can appreciate that now. In fact, this pre-conception sounds like a bias that pushes the authors to try to accept a perceived reality, even though the data that they presented showed the opposite. I honestly think that the authors are to be commended for the detailed analysis that they did, and I truly think that the data are useful. I just cannot understand the origin of the authors’ convictions on this topic.

ANSWER: The ‘Discussion’ section has been rewritten in order to avoid the premise of a “supramaximal” verification phase as a means of establishing the plateau criterion.

Line 529: Here, the authors refer to “the utility of the verification phase”. I believe this wording, although vague, is the closest you can get to reality. Anything that links the verification phase to a confirmation of VO2max is not scientifically verifiable using this model. Thus, this goes in line with what I have been saying from the beginning of my comments.

ANSWER: We have changed this sentence to keep coherence with the revised manuscript and now it reads as follows: “In addition, the present findings also provide consistent and unbiased confirmatory evidence that the reproducibility of the highest VO2 value during CPET and verification phase does not appear to be affected by sex, cardiorespiratory fitness, exercise modality, CPET protocol design, or even how the verification phase was performed (see Table 4 and Figure 4).”

Lines 532-534: This is simply incorrect. I mean, that your results coincide with those of Poole and Jones is not. What is incorrect is that the verification phase provides evidence that VO2max was achieved.

ANSWER: This sentence has been deleted to address the reviewer’s concerns.

Lines 535-540: As the authors might suspect, I do not agree with this overall statement. Any reference to this approach as a confirmation of VO2max should be avoided. It can be said that this was the goal of the approach, but that this goal cannot be achieved with this model. Referring to it as the “gold-standard” approach is simply ridiculous.

ANSWER: This sentence has been deleted to address the reviewer’s concern.

Lines 545-547: I would argue that the authors in that study should not limit their interpretation to older, less experienced or unfit participants. I mean, if the need for a verification phase is often purported to be connected to the idea that some participants might not be willing to complete a maximal effort during the incremental test, the same argument could be used as a reason to suspect that a maximal effort would not be performed during the verification phase. In other words, why would someone who is unwilling to push hard during the incremental test suddenly become willing to push hard during the verification phase? In my view, this is another reason why thinking that the verification phase confirms VO2max is erroneous. First, the idea of a “supramaximal” intensity is flawed. Second, a verification phase cannot ensure that a supposedly absent maximal effort during the incremental test is now performed during the verification phase. In fact, if a suboptimal performance occurs during the verification phase, many would erroneously interpret that as proof that VO2max was achieved!

ANSWER: This sentence has been deleted to address both of the reviewer’s suggestions, focusing more on our own findings.

Line 551: This is correct and, as indicated above, the results from that study are highly irregular. I would attribute the differences to poor measurement rather than to physiological variability. A ~30% difference between the two extreme conditions is simply unacceptable.

ANSWER: Thank you for your comment.

Lines 551-557: Perhaps neither “a)” nor “b)” is the correct interpretation. What your data clearly show (and I commend the authors for doing impressive work with this) is that the highest VO2 value during the incremental test is not different from the highest VO2 value during the verification phase. In neither case can achievement of VO2max be confirmed. If you ask for my personal opinion, I would argue that VO2max was achieved, and that the verification phase helped demonstrate that a supra critical intensity bout to exhaustion during the verification phase resulted in the same highest VO2 value as seen during the incremental test. In that sense, I am confident that the verification phase, although unnecessary in this population, adds confidence to the measure. However, given the reasons explained before, the plateau criterion based on a “supramaximal” intensity is flawed, especially when the intensity for the verification phase is too high for the VO2 response to be fully expressed (which would lead to the wrong interpretation that VO2max has been confirmed).

ANSWER: This sentence has been deleted to address the reviewer’s concern.

Lines 558-560: I would argue that this research has been done (Am J Physiol Regul Integr Comp Physiol. 2020 Jul 22. doi: 10.1152/ajpregu.00126.2020) and that it should be incorporated into this manuscript.

ANSWER: The reviewer is right, thank you for the observation. The suggested reference has been incorporated into the revised manuscript.

Lines 563-564: As I mentioned earlier, the terms “supramaximal” and “submaximal” should be changed. In fact, defining the terminology would be important in relation to some of the concepts that I discussed before.

ANSWER: The terms “supramaximal” and “submaximal” have been changed to “supra peak WR” and “sub peak WR”, respectively, throughout the manuscript.

Lines 561-569: I like this paragraph because it highlights that any effort to exhaustion that is high enough above the critical intensity of exercise will result in the achievement of VO2max. In fact, if someone’s goal was just to measure VO2max, any “aggressive” constant work rate to exhaustion should be good enough (however, the goal of the test is often more ambitious than just evaluating VO2max). What I do not like though is the final sentence. I do not think that the authors can recommend that “procedures that are within the scope of the reviewed studies” should be used. As I mentioned, it sounds to me as if anything above the critical intensity would do it. The problem is that people do not normally do a verification phase at 75% of peak PO from a 30 W/min ramp! Thus, the recommendation is not easy to justify.

ANSWER: The sentence has been revised and now reads as follows: “Considering that differences in the verification procedure itself do not appear to influence the utility of the procedure, a specific verification procedure cannot be currently recommended. However, some caution must be exercised in the application of WR for the verification phase, to avoid an insufficient protocol duration that does not allow the highest possible VO2 achievement as compared to CPET.” (Page 39, Lines 491-494)

Lines 570-576: I guess I do not need to say again why this is not correct.

ANSWER: This sentence has been changed and now reads as follows: “In this sense, most studies reviewed used verification phase protocols incorporating WRs above 100% of the peak WR achieved in the CPET [114, 97, 77, 81, 98, 38, 96, 39, 57, 54, 63, 111, 119, 61, 69, 99, 94, 124, 126, 92, 107, 115, 53, 75, 76, 82, 93, 112, 80, 117, 122, 123, 95, 110, 56, 58, 62, 87, 120, 59, 71, 74, 78, 83, 89, 90, 15, 121, 70, 25]. However, peak WR utilised in verification phases has varied between 85% [100] and 130% peak WR [81]. According to Poole and Jones [2], researchers must select a WR that is sufficiently higher than that attained on the CPET to give the VO2 signal for the higher WR the opportunity to emerge from the extant noise. In the event that the subsequent verification phase produces a VO2 plateau signifying VO2max, this signal would be lower than expected for the WR based on the previous VO2-WR slope. Hence, Poole and Jones [2] recommended that the verification phase should apply ~110% of the WRpeak attained in the CPET. The authors recognized that this WR may not be ideal for testing all participants, groups or circumstances. On the other hand, Iannetta et al. [25] advocated the adoption of sub peak WR verification bouts in order to allow the VO2max attainment, since WR above the critical power should result in VO2max, as long as the time to exhaustion is sufficiently prolonged.”

Lines 576-578: This is great. Then, almost everything written in this manuscript and most of my justifications were unnecessary! If the authors accept this fact, and thus accept that “supramaximal” efforts are not necessary, then they should have presented this idea from the beginning (even in the introduction), as it simply demonstrates that the concept of the “supramaximal” effort satisfying the plateau criterion is nonsense!

ANSWER: Thank you for your comment.

Lines 585- 589: Brilliant! I agree with this 100%. It is disappointing though that the authors are saving the valuable physiological information for last! Seriously, I think that, in the last two paragraphs, the authors have debunked the “supramaximal” work rate theory. I think that this meta-analysis should not only present data, but also discuss the topic in a more physiologically relevant manner. This important section comes just towards the end of an excessively long discussion.

ANSWER: Thank you for your comment.

Lines 622-632: I think this is all good for discussion. However, in line with previous comments, these different criteria would be useful, in this model, to determine whether the highest VO2 from the incremental test is different from the verification phase beyond normal variability and/or measurement error. Once again, the verification phase approach cannot determine that VO2max has been achieved as its premise is flawed.

ANSWER: Thank you for your comment. The correction has been made and now reads as follows: “A final issue to be addressed refers to appropriate criteria to accept that the highest possible VO2 has been achieved. The most commonly used criterion in the presently reviewed studies stated that the highest VO2 observed in the verification phase should not exceed 3% of the highest VO2 obtained in the CPET. This threshold can be justified by the technical error of measurement and intra-individual biological variation observed in the VO2max attainment [57, 63, 86, 69, 113, 107, 82, 122, 95, 56, 62, 91, 116, 120, 71, 78, 89, 90, 108, 15, 121]. The more restrictive value of ≤ 2% [97, 110] and the less restrictive values of ≤ 5-5.5% [104, 111, 105, 106] may also be appropriate for single or different day variability. In this context, for example, some studies investigated the test-retest reliability of VO2max attainment applying two [97, 87, 120] or even five [95] trials with the same CPET and verification protocols, reporting a coefficient of variation of less than 5% between the highest VO2 values observed in the CPET and verification phase. However, further research is required before recommendations can be made to determine whether the highest VO2 from the ramp or continuous-incremented CPET is different from the verification phase beyond the technical error of measurement and intra-individual biological variation.”
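The percentage criterion quoted above can be illustrated with a short sketch. It assumes the criterion is read as "the verification-phase VO2 should not exceed the CPET VO2 by more than a chosen percentage" (3% by default, with 2% and 5.5% as the alternatives mentioned). The function name and the example values are hypothetical and are not taken from the reviewed studies.

```python
def verification_exceeds_threshold(cpet_vo2, verification_vo2, threshold_pct=3.0):
    """Return True when the verification-phase VO2 is higher than the
    CPET VO2 by more than threshold_pct percent of the CPET value."""
    diff_pct = (verification_vo2 - cpet_vo2) / cpet_vo2 * 100.0
    return diff_pct > threshold_pct


# Hypothetical participant values in L/min (illustration only).
print(verification_exceeds_threshold(3.30, 3.35))        # about 1.5% higher
print(verification_exceeds_threshold(3.30, 3.50))        # about 6.1% higher
print(verification_exceeds_threshold(3.30, 3.45, 5.5))   # about 4.5% higher, looser threshold
```

Under this reading, a result of False only indicates that no meaningfully higher VO2 was observed in the verification phase; it does not by itself show that VO2max was attained.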

Line 633: Please change wording. The verification phase does not verify VO2max.

ANSWER: The sentence has been deleted.

Lines 633-642: In general, this paragraph is weak. The authors presented the HR responses, but the discussion is very limited. For example, the idea that the kinetics of HR is slower than the kinetics of VO2 is not as evident as the authors make it sound (there is more information on this than the single reference that the authors presented). This could explain the differences in some studies, but likely not in the majority of them. I would argue that a priming effect with already improved blood flow availability before the onset of the verification phase might also play a role (although this is just speculation). Additionally, given that having a lower HR during the verification phase did not affect the VO2 responses, then the sex, intensity, etc. effects become irrelevant in my view. To be honest, I feel that given that this study is about the verification phase for determining whether or not a higher VO2 value can be achieved during this process, I would simply delete the HR analysis as it does not add anything to the story. In fact, it makes it longer and less focused.

ANSWER: The heart rate data analysis has been omitted to focus solely on VO2 data and therefore the mentioned paragraph has been deleted.

Lines 655-672: The conclusion in general is too long, as it reads more like a summary than a conclusion.

ANSWER: The conclusion section has been revised to address both reviewers’ concerns.

Line 660: This is incorrect. Your analysis did show that different procedures can be applied to establish the same VO2 response during the verification phase as compared to the incremental test. However, by no means can you say that this procedure contributed to establishing a true VO2max response (not even if using quotation marks for the word “true”).

ANSWER: We have revised the text and it now reads as follows: “From a practical perspective, our findings indicate that different procedures may be applied to establish similar highest mean VO2 responses during the verification phase as compared to the ramp or continuous step-incremented CPETs.”

Lines 660-662: As I indicated before, what this meta-analysis highlighted is that any hard-enough intensity of exercise (i.e., “respectably” above the critical intensity) performed for long-enough will result in the highest possible VO2 response. Then, the recommendation that the verification phase should be constrained to the ones seen in the current analysis is unfair to me (or at least unnecessary). By doing this, the authors provide some level of validity to the proposed procedures and, indirectly, take validity away from other options that might be equally effective.

ANSWER: The reviewer is correct, thank you for the comment. The following information has been included: “Even then, it is worth mentioning that some caution must be exercised concerning the selection of sub or supra peak WRs since any exercise above the critical power must also be sustainable for sufficient duration to allow the achievement of the highest possible VO2 response in the verification phase.”
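The trade-off between work rate and sustainable duration mentioned here can be sketched with the two-parameter hyperbolic power-duration model, in which time to exhaustion equals W' divided by the difference between the work rate and critical power. This model is brought in purely as an illustrative assumption; it is not an analysis performed in the manuscript, and the critical power, W' and peak WR values below are hypothetical.

```python
def time_to_exhaustion_s(work_rate_w, critical_power_w, w_prime_j):
    """Predicted time to exhaustion (s) from the two-parameter
    hyperbolic model: t = W' / (WR - CP). Valid only for WR > CP."""
    if work_rate_w <= critical_power_w:
        raise ValueError("work rate must be above critical power")
    return w_prime_j / (work_rate_w - critical_power_w)


# Hypothetical participant: CP = 250 W, W' = 20 kJ, peak WR from the CPET = 350 W.
peak_wr = 350.0
for pct in (85, 100, 110, 130):
    wr = peak_wr * pct / 100.0
    t = time_to_exhaustion_s(wr, 250.0, 20000.0)
    print(f"{pct}% peak WR = {wr:.1f} W -> about {t:.0f} s")
```

With these assumed values, the predicted duration shortens sharply as the work rate rises above peak WR, which is the concern about insufficient protocol duration raised in the response.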

Lines 662-666: This makes no sense to me as reaching the same HR response is not a prerequisite to reach the highest possible VO2.

ANSWER: This sentence has been deleted.

Line 668: No, this verification phase has no validity as proof that VO2max has been achieved. It simply demonstrates that a higher VO2 value cannot be achieved during the verification phase. I insist on the idea that I would argue that this was, in fact, VO2max. However, that was already known from the incremental test (which is clearly shown in your data). What I oppose is the wrong idea that VO2max can be confirmed because the plateau criterion has been met. This is clearly not the case as the model is inappropriate to show that. Some people in the scientific community seem to lack an understanding of the differences between constant work rate and incremental exercise responses, which has caused this misinterpretation of the verification phase as a tool to satisfy the plateau criterion.

ANSWER: The correction has been made.

Lines 668-670: This is correct. The verification phase did not add anything but some level of confidence that people pushed hard enough. I am not against that thought (as long as people do not confuse this with the idea that confirmation of VO2max has been achieved).

ANSWER: Thank you for your comment.

Lines 670-672: This is not a conclusion from the present study, but rather an opinion. I would delete this.

ANSWER: This sentence has been deleted.

Reviewer #2:

General Comments

This study aimed to provide a systematic review and meta-analysis of the validity of deriving VO2max and HRmax from a verification phase, with a cardiopulmonary exercise test (CPET) considered the gold standard. It is a thorough investigation of the studies published within this field to date, and the data appears to have been presented objectively and in a clear manner. The research question is relevant and the analysis necessary.

ANSWER: Thank you for your comment.

One fundamental problem I have with studies discussing the attainment of a “true” VO2max (or maximal anything) via CPET and VP (or equivalent) is that we actually don’t know in any scenario whether a “true” maximal value really has been attained. This almost philosophical point (although I find it to be quite obvious and, in this case, physiological) is almost always overlooked and it has been once again in this paper. I think it would be judicious of the authors to acknowledge – and discuss in some detail – this fact (rather than just skim over it at the end of the discussion, in relation to Noakes’ [150] critique).

ANSWER: The reviewer is correct, thank you for the comment. The manuscript has been revised in order to focus on the comparison between the highest VO2 values during the CPET vs. verification phase. We agree with both reviewers that it remains unclear whether a ‘true’ maximal value really has been attained. Therefore, terms and sentences related to the confirmation of a ‘true’ VO2max have been discarded in the revised manuscript.

Another fairly major concern I have is that the analyses are limited to comparing maximal VO2 measures derived from a CPET versus a verification phase at a group level (also highlighted as an issue by Noakes [150]), which is not the same as measuring validity (i.e., through the agreement of two measures). Two measures can be similar (or not significantly different) at a group level, but the agreement can still be very poor. Without an analysis of agreement, how can validity be inferred?

ANSWER: We agree with the reviewer that this is an important issue of the present study. Unfortunately, after countless failed attempts, we concluded that it would only be possible to compare VO2 data derived from a CPET vs. verification phase at a group level. Even so, 26 studies were not meta-analyzed because the authors did not answer our emails. Upon reflection, we have now clearly stated that the aim of the study was “to systematically review and provide a meta-analysis on the application of the verification phase for confirming whether the highest possible VO2 has been attained during ramp or step-incremented CPETs in apparently healthy adults” and the term “validity” has been excluded throughout the manuscript.
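The reviewer's distinction between group-level similarity and individual agreement can be made concrete with a minimal Bland-Altman style sketch. The paired values below are invented for illustration and do not come from any reviewed study. They are chosen so that the mean difference is essentially zero while the 95% limits of agreement remain wide.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements,
    using bias +/- 1.96 * SD of the paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical highest VO2 values (L/min) for six participants.
cpet = [3.0, 3.5, 4.0, 2.8, 3.2, 3.9]
verification = [3.3, 3.2, 4.3, 2.5, 3.5, 3.6]

bias, lower, upper = bland_altman(cpet, verification)
print(f"bias = {bias:.2f} L/min, limits of agreement = {lower:.2f} to {upper:.2f} L/min")
```

In this constructed example the two tests look identical at group level, yet each individual differs by 0.3 L/min in one direction or the other, which is the situation a comparison of group means cannot detect.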

Specific Comments

Title

I’m not sure why single quotation marks are needed around the selected terms. Also, “recommendations for best practice” were not scientifically investigated, but are (a small) part of the discussion/conclusion (as is the case in many papers). So I suggest removing these components of the title. This would give something more succinct and specific, like: “The verification phase for confirming maximal oxygen uptake in apparently healthy adults: A systematic review and meta-analysis”.

ANSWER: The title has been revised according to the reviewer’s suggestion.

Abstract

On analyzing the title and aim of the study, I don’t feel this abstract is necessarily a clear summary of the most relevant findings. The results are rather heavily focused on the HR data, which is not a main focus of the study. Please reflect and re-consider.

ANSWER: The results have been rewritten in the ‘Abstract’. To address both reviewers’ concerns, HR data have been excluded from the revised manuscript to focus solely on VO2 data.

L37: CPET on “a” cycle… (the article seems to be missing).

ANSWER: The correction has been made.

L41: The punctuation suggests that this is the age range (and even VO2max) of the women only. Please clarify.

ANSWER: Thank you for the observation. The descriptive data refer to all individuals. The comma has been changed to a semicolon.

L42, 44: n = 52/36… presumably these are the number of studies? Please clarify.

ANSWER: The reviewer is correct. The first version included 78 studies in the systematic review, but only 52 (VO2max) and 36 (HRmax) studies were meta-analyzed. The current version added 2 studies and excluded HR data and now reads as follows: “The highest VO2 in the CPET and verification phase was similar [n = 54, mean difference = 0.03 (95% CI = -0.01 to 0.06) L/min, P = 0.15] (…)”.

L44: Can (should) bpm be expressed to decimal places? Also, 5 d.p. on the P value seems quite excessive (as is the case throughout the manuscript for very small P values – is there any reason for this?).

ANSWER: The HR data have been excluded from the revised manuscript so this has resolved the issue of decimal places for HR data. P values in the revised manuscript have been given to a maximum of three decimal places.

L44-47: It is unclear what the comparisons (3 bpm greater HR) relate to for the three P values. What are they being compared to? Please clarify. That said, I’m not sure why so much focus is given to HR in the abstract, when the study is really about verifying VO2max.

ANSWER: The HR data have been omitted from the revised manuscript, so this issue has been resolved.

L50. Why would concordance (agreement) “put [this] into question”? This seems contradictory. Please clarify.

ANSWER: The conclusion has been revised to address the concerns of both reviewers, and now reads as follows: “The verification phase seems a robust procedure to establish consistent values for highest mean VO2 responses following a ramp or continuous step-incremented CPETs (…)”.

Introduction

The Introduction is long and there is rather a lot of discussion around the general topic of VO2max testing, but sparse detail relating to the actual topic of the study (i.e., the use of verification phases in all their various make-ups). The sentence at L126-129 to me is the crux of the problem and the study, and this is what the introduction should focus on more exclusively. The study referenced in L130-131 ([53]) requires more discussion/explanation, so that there is clearer context for the current study and justification for the sentence at L131-134. I have no issue with the importance of this work, but a bit more clarity around the actual problem (and existing literature) is required.

ANSWER: Thank you for the insightful comments. The introduction has been revised to address the reviewer’s suggestions (please see text in red).

L55-58: This is a very long opening sentence. I suggest breaking it in two (if all content is to remain).

ANSWER: This comment has been addressed in our response to your previous comment.

L61: Should the refs be listed in numerical order (e.g. [1, 3-6])? (There are other examples throughout the manuscript as well.)

ANSWER: The corrections have been made.

L69-70: A transition from that stated, but to what? Continuous fast ramp tests? It feels like the end of the sentence is missing – maybe combine (and condense?) it with some of the text from the following sentence (L70-72).

ANSWER: The sentence has been revised and now reads as follows: “These technological advances have contributed to a transition from the original time-consuming discontinuous step-incremented protocols to more time-efficient continuous ramp or pseudo-ramp protocols for determining VO2max [20-25].”

L74: Should this be “limitations of VO2max” (i.e., the method of measurement)? The limitations “to” VO2max is a different topic, as I see it, more related to training, genetics, etc.

ANSWER: We have now corrected this - thank you.

L83-86: There seem to be two contrasting definitions of the Taylor et al. VO2 plateau criterion here. Please clarify.

ANSWER: The reviewer is correct, thank you for the comment. In the 'Conclusion' section, Taylor et al. stated that the increase in VO2 associated with an increase of 2.5% grade (below the VO2max) was ~300 mL/min. If the VO2 at two different grades differs by less than 150 mL/min (i.e. 50% of the expected increase), it can be assumed that VO2max was attained. To avoid misunderstanding and make the text simpler and clearer, the text now reads as follows: “The landmark study of Taylor et al. [34] was the first to use a formal VO2 plateau criterion, which was defined as an increase in VO2 of less than 150 mL/min (or ≤ 2.1 mL·kg-1·min-1, considering an average body mass of 72 kg) in response to a specific discontinuous step-incremented protocol performed over 3-5 laboratory visits.”
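The arithmetic behind this criterion can be checked with a small sketch. It uses only the figures given in the response (an expected increase of about 300 mL/min per 2.5% grade, a threshold of half that increase, and an average body mass of 72 kg). The function names and the example stage values are hypothetical.

```python
EXPECTED_INCREASE_ML_MIN = 300.0                      # per 2.5% grade, as described above
THRESHOLD_ML_MIN = 0.5 * EXPECTED_INCREASE_ML_MIN     # 150 mL/min


def relative_vo2(vo2_ml_min, body_mass_kg):
    """Convert an absolute VO2 (mL/min) to a relative value (mL/kg/min)."""
    return vo2_ml_min / body_mass_kg


def plateau_reached(vo2_prev_ml_min, vo2_next_ml_min, threshold_ml_min=THRESHOLD_ML_MIN):
    """True when VO2 rises by less than the threshold between two grades."""
    return (vo2_next_ml_min - vo2_prev_ml_min) < threshold_ml_min


print(round(relative_vo2(THRESHOLD_ML_MIN, 72.0), 1))  # 2.1 mL/kg/min
print(plateau_reached(3400.0, 3480.0))                 # rise of 80 mL/min
print(plateau_reached(3400.0, 3650.0))                 # rise of 250 mL/min
```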

L87: Should the ref ([32]) not be included with Taylor et al. even here?

ANSWER: The correction has been made.

L94: Typo on VO2max (the second time).

ANSWER: The correction has been made.

L97-99: Check and change the grammar/punctuation. Something is going on around “investigators; however, due to”, which makes the sentence incoherent (to me).

ANSWER: The revised manuscript text reads as follows: “However, this approach has been widely criticized by numerous investigators due to the individual variability in maximal physiological responses for these variables and lack of specificity in identifying individuals who did not continue the CPET to their limit of exercise tolerance [40, 36, 37, 29, 38, 2].”

L77-104: This is a very long paragraph. I suggest splitting it in two – the first relating to VO2max criteria, the second to the secondary criteria. Or write the content more concisely to produce one shorter paragraph.

ANSWER: The reviewer is right, thank you for the observation. The mentioned paragraph has been split into two paragraphs: the first one introducing the VO2 plateau and the other one addressing the limitations of secondary criteria.

L105-139: These two paragraphs are most important in justifying the current study, so I think a more in-depth discussion of this literature is required (instead of the level of detail presented in the three preceding paragraphs, which could be significantly condensed).

ANSWER: The introduction has been revised to address the reviewer’s suggestions (please see text in red).

Methods

L162: “ergometer or treadmill” – was this limited to bi-pedal running on a treadmill? Please specify, as many other modes of exercise are possible on a treadmill (e.g., cycling, hand-cycling, wheelchair running, inline skating, roller skiing, etc.). It would be useful to make this clarification through a clear definition somewhere in the paper, that by “treadmill” (see for example Figure 1, “Only treadmill”) you actually mean “treadmill running” (if that’s the case).

ANSWER: The sentence has been revised as follows: “the CPET was carried out on a cycle ergometer (i.e. bipedal cycling) or treadmill (bipedal running or walking)”.

L171-172: “In the final review, we provided…” – is this referring to what is presented in the current manuscript? Please clarify, as “final review” and “we provided” is a bit unclear to me.

ANSWER: Thank you for the observation. The text has been revised and now reads as follows: “We provided a flowchart of included and excluded studies, with reasons for their exclusion.”

L196: You have previously written abbreviations within round brackets in square brackets: (95% confidence interval [CI]).

ANSWER: The correction has been made.

L197: Out of interest, what did you do in cases where VO2max was reported in mL/kg/min?

ANSWER: In the absence of absolute VO2 data, we performed the metabolic conversion. However, most of the studies presented the data in L/min or mL/min, and for the few studies that presented data only relative to body mass, we were able to 'rescue' ~90% of the data from the authors via e-mail.

L210: P-values “were” obtained…?

ANSWER: The correction has been made – thank you.

L215: Is this less than 50%, or less than or equal to? The symbol looks unclear to me.

ANSWER: Less than or equal to (≤).

L223: The studies were also…

ANSWER: The correction has been made.

L225: Stratified analyses were also…

ANSWER: The correction has been made.

Results

The main issue for me throughout this section is the long lists of references accompanying each result. This is not common in other studies of this type that I’m familiar with, and to me it makes deciphering the interesting information nigh on impossible. I would recommend removing these long number strings.

ANSWER: The correction has been made.

L241: (interquartile range [IQR]) – or consistent with previous presentation.

ANSWER: The correction has been made.

L240-245: You write that “the sex of 130 participants was not specified”, and then that “one study did not specify the sex of the participants (see Table 1)”. In that study (Scheadler and Devor [92]) n = 13, so I don’t understand the mismatch between 130 and 13. Please clarify.

ANSWER: Some studies presented data from males and females without reporting how many were males and how many were females. Scheadler and Devor recruited 13 individuals; however, their sex was not stated.

L246: BMI should presumably be defined after the words (body mass index) and doesn’t then need to be included in the brackets.

ANSWER: BMI has been previously defined in ‘Data Extraction and Management’ section (Page 7, Line 164). The parentheses and square brackets were used here to clarify what measures were adopted [i.e. mean ± standard deviation (range)].

L246-247: The square and round brackets seem to have switched places in this sentence, any reason?

ANSWER: As answered in your previous comment, there are other results within square brackets, such as mean, SD, range. This is consistent with values from those reported in global analyses (i.e. VO2max).

L247: Writing “(VO2max normalized to body mass)” seems superfluous when you have the unit as mL/kg/min. Consider removing.

ANSWER: The correction has been made.

L251: “Characteristics of studies using CPET…”?

ANSWER: The correction has been made.

L253, 255: “on a cycle/treadmill” (again, the article is missing).

ANSWER: The correction has been made.

L253-307: These long strings of references make the results very unreadable. I suggest removing them all, as finding the actual interesting numbers (i.e., the results) through the long lists is so difficult.

ANSWER: The correction has been made.

L272-274: Could you re-phrase to fix the grammar and clarity on: “whereas 29 (37%) used fixed intervals of 15- to 30-s (or 2 × 15-s), both averaged and fixed times (1%) [61]… etc.”. I guess the 29 (37%) relates specifically to the 15/30-s fixed interval data, so the sentence needs to be re-structured and improved to clarify this in relation to the other methods listed.

ANSWER: The text has been revised for grammar and clarity according to the reviewer’s recommendation.

L279, 281: I think you need to include the “min” unit after 5, 6, 6, 9 and 15 – or not if you were to remove all the references in brackets (another example of how difficult the interesting numbers and results are to decipher from the long [and unnecessary?] lists of references).

ANSWER: Thank you for the observation. In accordance with the Reviewer’s suggestion, we have added the “min” unit.

L285-286: Why are two different %ages presented (19 and 19.7)?

ANSWER: We apologize for this error. The correction has been made and now the text reads as follows: “Fifteen studies (19%) carried out the verification phase on a different day to the CPET”.

L291: Suggest removing “i.e.”? Not included elsewhere.

ANSWER: The correction has been made.

L295: Should this be “the” maximal-intensity work rate, rather than “a” (presumably it was specific to that study and the preceding VO2max test).

ANSWER: The correction has been made.

L297: Could you briefly describe in this sentence what the formula was based on?

ANSWER: The following sentence has been included in the revised manuscript to address this issue: “predicted WR based on a formula (1%) to elicit the subject’s limit of tolerance within 180 s as follows: power output = (finite work capacity ÷ 180 s) + critical power”.
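
A minimal sketch of the quoted formula follows, assuming finite work capacity is expressed in joules and critical power in watts (the response does not state the units). The function name and the example values are hypothetical and serve only to show the calculation.

```python
def verification_work_rate(finite_work_capacity_j: float,
                           critical_power_w: float,
                           target_time_s: float = 180.0) -> float:
    """Work rate (W) predicted to bring the subject to the limit of
    tolerance in target_time_s:
    power output = (finite work capacity / target time) + critical power.
    """
    if target_time_s <= 0:
        raise ValueError("target time must be positive")
    return finite_work_capacity_j / target_time_s + critical_power_w


# Hypothetical subject: finite work capacity 18 kJ, critical power 250 W
print(verification_work_rate(18_000, 250))  # 350.0
```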

L300: “Forty-two studies (54%)” – consistent reporting.

ANSWER: Thank you for your comment.

L300: obtained “during” (rather than “at”)?

ANSWER: The correction has been made.

L366-onwards: Is there a reason for changing the presentation (order of using) round and square brackets again? And see my point above (in the Abstract) about the number of decimal places on the P values < 0.001. Is there any statistical reason/need for this?

ANSWER: Usually, the main findings of the meta-analysis are described from the mean difference, 95% CI, and P-value. We have revised the number of decimal places on the P-values throughout the manuscript.

L383: (performed on the same day as vs. a different day from the CPET) – suggestion.

ANSWER: The correction has been made.

L387-388: Could you include the P value here for this no sig diff, as it is a key result.

ANSWER: The required information has been included in the revised manuscript.

Discussion

At times I struggle to follow the logic of the arguments in this (very long) discussion, so I think the interpretations can be written more clearly and concisely in places. In addition, there is a lot of discussion of previously published studies and concepts, without reference to the findings from the current results. This seems inappropriate for a systematic review/meta-analysis, so I would encourage the authors to focus more on their own findings in light of previous work, rather than merely presenting a review of the existing literature.

ANSWER: The Discussion has been revised to address both reviewers’ suggestions (please see text in red).

L422: Reconsider “over” in this sentence. Maybe “rather than”?

ANSWER: The HR data have been removed from the revised manuscript, which has resolved this issue.

L433-435: This study did not analyse children or clinical groups, so where is the “current evidence suggesting that the verification phase is a safe and well-tolerated procedure to confirm attainment of true VO2max” in these groups? This particular study can surely only make this claim about the apparently healthy adults who were analysed, or am I missing something?

ANSWER: The reviewer is correct, thank you for the comment. The relevant sentence has been moved to the last paragraph of the revised ‘Discussion’ to emphasize the need for future meta-analysis studies focusing on special populations.

L445: “of a ramp-incremented…” (missing article).

ANSWER: The correction has been made.

L455-456: Are 17% and 33% comparable in this sentence? If so, please use the same unit (either CPETs, or participants) in order to compare like with like (e.g., “17% of participants (2 of the 12) during a cycling ramp-incremented CPET, while 33%…”.

ANSWER: This information has been revised as follows: “Similarly, Rossiter et al. [37] reported the occurrence of a deceleration in the VO2 response at the limit of exercise tolerance in only 17% (2 of the 12) of cycling ramp-incremented CPETs (WR increment of 20 W/min), while 33% of the incremental tests elicited an accelerated VO2 response, and 50% demonstrated a linear VO2 response.”

L477-478: Is this statement true? If so then I’m missing something. Re-reading ref #100 (McGawley 2017) it is stated that: “There was a significant effect of test type on VO2max, with higher values recorded during STEP compared with VER (P = 0.013)”. Can you clarify how you’ve come to this conclusion (three studies to-date) and how you conducted this analysis/check?

ANSWER: Apologies for missing this. We have double-checked the results from all meta-analyzed studies and included the mentioned study as one of those that reported significant differences between the VO2 data from the CPET vs. the verification phase.

L494: Is there a typo here: “4 of the 7 participants (9%)”?

ANSWER: The correction has been made.

L496: “11 participants (9 men; age…)” – it appears as if you are only reporting the descriptives for the men, is this the case? Please clarify.

ANSWER: To address this comment, the revised text now reads as follows: “In another study with 9 men and 2 women (age: 22.4 ± 3.21 yr.; VO2max: 51.6 ± 4.47 mL·kg-1·min-1)”.

L499-503: From your results and Figure 2 this looks like an outlier. Has/should there been any accounting for outliers in your analyses?

ANSWER: We are in agreement with the reviewer’s comments about Colakoglu’s data. Notably, after omitting their data, there was no change in the main results of the present meta-analysis. It was therefore not appropriate to omit these data from our analysis and we prefer that readers draw their own conclusions about the quoted study.

L477-516: This is a very long paragraph (> 1 page). Please consider shortening. I don’t think all the detail of the three specified studies is required (L478-503) – this could be condensed and written more concisely.

ANSWER: The mentioned paragraph has been reviewed in order to focus on the main findings of the present meta-analysis.

L508-516: This seems to explain this result as an outlier. What happens to your findings (CPET vs verification phase VO2max) if this study is removed from your overall analyses?

ANSWER: As commented previously, after removing the VO2 data from Colakoglu’s study, there was no change in the present meta-analysis findings regarding the highest VO2 values from the CPET vs. the verification phase (e.g. the P-value changed from 0.15 to 0.17 after removal of this study).

L517-519: I’m not sure I agree with this statement (or maybe I misunderstand what you mean). If CPET = VP or if CPET > VP then is that confirmation of a “true” VO2max during the CPET? Can both tests not elicit a VO2max that is lower than an individual’s “true” VO2max in this scenario?

ANSWER: Within this scenario (i.e. highest VO2 from the CPET ≥ highest VO2 from the verification phase), we can only confirm that VO2max was likely attained. According to Midgley et al. [97], if the mean highest VO2 attained in the verification phase is significantly higher than in the CPET, the investigator should consider that the CPET protocol was inadequate in eliciting the highest possible VO2 response in all, or at least some, of the participants.

L520-521: I don’t quite follow the logic of this follow-up sentence. Are you saying that there would need to be a difference in order for the statement in the first sentence to be true? Why? Please clarify.

ANSWER: We are saying that our data concur with the notion that there was no significant difference between the highest VO2 values attained during the CPET and in the verification phase. In other words, a verification phase applied after ramp or continuous step-incremented CPETs may offer robust evidence that the highest possible VO2 has been achieved, and this is not affected by sample characteristics, exercise modality, or CPET and verification protocol designs. The mentioned sentence has been revised and now reads as follows: “The present meta-analysis findings in 54 CPET/verification phase studies are in good agreement with this premise. Our data did not confirm the existence of significant differences between the effect sizes of the highest mean VO2 values attained in CPET and subsequent verification phase in primary studies with relevant data available [n = 54; mean difference = 0.03 (95% CI = -0.01 to 0.06) L/min, P = 0.15] (see Figure 2).”

L526: Why “only” 25 (i.e. 27%)? To me this 27% of the studies is important in demonstrating that the CPET doesn’t always do its job properly (i.e., in eliciting a “true” VO2max). This is where the analysis of agreement is important too – what is the similarity (or dissimilarity) in VO2max values derived from a CPET vs VP “within” individuals? Please comment.

ANSWER: Considering the 103 experimental conditions that were meta-analyzed, 25 displayed average VO2 values in the verification phase that exceeded the average VO2 attained during the CPET, although without statistical significance. Only one study (i.e. Colakoglu et al. 2016) showed a significant difference in favor of the verification phase (i.e. higher ‘peak’ VO2 in the verification phase vs. the CPET).

L517-528: I actually don’t follow the logic or point being made in this section. Could you please try to clarify?

ANSWER: This paragraph has been revised as follows: “To date, this systematic review has demonstrated that the majority of studies have shown that the highest mean VO2 values elicited by verification phase bouts were similar to (or not statistically or meaningfully different from) the VO2 values attained in continuous ramp or pseudo-ramp CPET protocols [104, 114, 36, 125, 97, 37, 77, 81, 38, 60, 96, 39, 57, 102, 101, 63, 86, 88, 119, 61, 69, 73, 79, 99, 64, 65, 94, 113, 124, 126, 103, 107, 115, 53, 66, 75, 76, 112, 67, 80, 85, 117, 123, 56, 72, 84, 87, 91, 116, 118, 120, 59, 71, 74, 78, 83, 89, 90, 105, 108, 109, 15, 121, 70, 106]. The present meta-analysis findings in 54 CPET/verification phase studies are in good agreement with this premise. Our data did not confirm the existence of significant differences between the effect sizes of the highest mean VO2 values attained in CPET and subsequent verification phase in primary studies with relevant data available [n = 54; mean difference = 0.03 (95% CI = -0.01 to 0.06) L/min, P = 0.15] (see Figure 2). In fact, the mean absolute difference of 0.03 L/min represents a relative error of only 0.85% between the highest VO2 values attained in the CPET and verification phase (this is within the most commonly adopted measures of test variability at 2-3%). Comparing only the mean values of the highest VO2 responses in the CPETs and verification phase bouts (i.e. regardless of the statistical significance reported in the primary investigations), 27 (i.e. 26%) out of 103 specific experimental conditions included for overall comparisons presented average VO2 values during the CPETs that were below those attained in the verification phase [mean diff. = -0.06 (-1.6%) L/min]. In the remaining 76 conditions, the highest VO2 responses during the CPETs were similar to or above those attained in the verification phase [mean diff. = 0.06 (1.7%) L/min] (see Table 3 and Figure 2).
In addition, the present findings also provide consistent and unbiased confirmatory evidence that the reproducibility of the highest VO2 value during CPET and verification phase does not appear to be affected by sex, cardiorespiratory fitness, exercise modality, CPET protocol design, or even how the verification phase was performed (see Table 4 and Figure 4). Collectively, these findings indicate that the verification phase procedure provides some level of confidence that the highest possible VO2 has likely been elicited during a single-session CPET.”

L542: “who” underwent?

ANSWER: This paragraph has been deleted.

L544: was similar “to the”?

ANSWER: The correction has been made.

L547: At this point I’m really struggling to follow the logic and arguments presented over the last few pages. Are you saying that CPET should be higher than VP in order to accept that a true VO2max has been attained in the CPET? Why? What is the problem with CPET = VP? A more fundamental question, in my opinion: Why is it not acknowledged/discussed that individuals can very easily underperform on BOTH tests, and that we really don’t have any idea as to whether we have attained a “true” VO2max at all. Please comment.

ANSWER: This sentence has been deleted to address the reviewer’s concerns.

L552-557: The “different” methods described previously for study #94 are also relevant here, as is my comment above (i.e., that it is always possible that neither test was truly maximal and elicited a “true” VO2max).

ANSWER: The relevant paragraph has been omitted from the revised manuscript to address the reviewer’s comments.

L557: What is meant by “this put into question” in this context?

ANSWER: The mentioned expression has been deleted.

L561-562: This is the first time this endeavor has been mentioned (except in the title). Please re-consider the phrasing here (and in the title!) – especially given the conclusion of this sentence (L567-569), i.e., that no best practice can actually be recommended.

ANSWER: The correction has been made.

L570-571: This list of 6 references does not seem complete, or to reflect “most studies”. Please clarify.

ANSWER: The reviewer is right, thank you for the observation. The mentioned list of references has been updated.

L570-578: What new insight does this paragraph add, from the current results, which was not already known? Please embellish with additional information, or remove.

ANSWER: This sentence has been changed to address the first reviewer’s comments and to keep coherence with the main findings of the present meta-analysis.

L584-585: I don’t think decimals are needed on these %ages.

ANSWER: The correction has been made.

L579-596: Again, how do the current results relate to the previous literature? This is not a review article, so as I see it the discussion section should be used to present the results of the current study in the context of previous results. The information presented here (that 105% was different from 115% according to Nolan et al.) is not supported by your results, as I understand, since you saw no significant effect of VP intensity. This is what ought to be discussed, in my opinion.

ANSWER: This is a good point. Although the present meta-analysis did not detect any moderator of the difference between the highest VO2 values during the CPET and verification phase, it is worth mentioning that an inappropriately high intensity in the verification phase protocol (as expressed by the WR selected relative to that achieved in the CPET) would result in a short test duration and therefore insufficient time to reach VO2max. The question raised in a recent study by Iannetta et al. [25], “how much lower/higher than WRpeak should the WR of the verification phase be set at?”, has been discussed at length and included in the revised discussion section. Thus, we considered it important to highlight some aspects of the verification phase design even in the absence of statistical significance in our data.

L597-610: Again, this is a review of the existing literature. Please discuss the results of the present study.

ANSWER: The ‘Discussion’ section has been rewritten in order to address both reviewers’ demands.

L613: Remove the extra space(s) between Small and sampling.

ANSWER: The correction has been made.

L614: rapid changes

ANSWER: The correction has been made.

L611-621: Same issue again - this is a review of the existing literature without reference to the current study results. Please reconsider.

ANSWER: The ‘Discussion’ section has been rewritten in order to address both reviewers’ demands.

L624: should not exceed

ANSWER: The correction has been made.

L636: on the duration of

ANSWER: The mentioned paragraph has been deleted.

L642: when “what” are short? The VPs? Please clarify.

ANSWER: The mentioned paragraph has been deleted.

L649-653: I don’t think the second sentence is a good enough “get-out” given the significance of the criticism stated by Noakes. This underpins the entire concept of “validity”. Do not overlook or underplay the fact that your final sentence, which you say you are not doing (“rather than the question of whether an individual has elicited a ‘true’ VO2max.”), is exactly what you say you are doing in the title (as I interpret things)!

ANSWER: This paragraph has been revised and the ‘validity’ term has been excluded throughout the manuscript.

L655: Have effect sizes been presented anywhere?

ANSWER: These data have been presented in Table 3.

L656-657: “in cycle ergometry and treadmill running”?

ANSWER: The correction has been made.

L660: “compromising their ability” (plural)

ANSWER: The correction has been made.

L670: I don’t understand this, in the context of the previous sentences: “The mandatory application of the verification phase in all situations may be therefore questioned”. Why questioned?

ANSWER: This sentence has been deleted.

L671: settings?

ANSWER: This word has been deleted.

Tables & Figures

The studies appear to have been ordered chronologically and then alphabetically in Table 1, with this ordering system then continued throughout the later tables & figures. This seems arbitrary (chronologically then alphabetically) and makes it difficult to locate any specific study in the later tables/figures. Could you order the studies entirely alphabetically or according to the reference numbering from the outset?

ANSWER: The correction has been made and now the studies are ordered alphabetically.

Is there any reason for presenting the subgroup analyses according to the characteristics of the verification phase protocol in a “figure”, while the subgroup analyses regarding sex, cardiorespiratory fitness level, exercise modality, and CPET protocol are presented in a “table”? Could this method of presentation be standardized? Also, I’m not familiar with the presentation used in Figure 4. Can you provide more information about how to read it (top line with green box, middle line with green box and black diamond), as it won’t be clear to all readers.

ANSWER: This design is similar to that adopted in a previous meta-analysis (Cornelissen and Smart. Journal of the American Heart Association, v. 2, n. 1, p. e004473, 2013). The Table 5 shows 4 sub-analyses, as follows: according to sex, cardiorespiratory fitness level, exercise modality and CPET protocol. Each of them has 2 or 3 groups. In addition, this table presents the duration from CPET and verification. Therefore, we decided to use a table instead of a figure to report relevant outcomes, but in a smaller illustration. A similar plot was presented in that aforementioned meta-analysis. The green boxes represent the VO2max obtained in the CPET vs. verification for each subgroup. For example, there is a green box for VO2max between CPET and verification for sub peak WR and another for supra peak WR analysis. There is a green box for active and another one for passive recovery, and so on. The black diamond represents a combined effect between groups (i.e. sub peak WR and supra peak WR for intensity, active and passive for recovery, etc.). In other words, the black diamond takes into consideration the effect size of all subgroups within each analysis to provide a final result.

Table 1: The heading “mean values” should probably be aligned over the final three columns to the right, as sex and N are not means. Also, ranges should be differentiated in this heading, if that’s what those are and if they can’t be expressed as means (e.g., 25-35 and 19-61). And can/should the number of decimal places be standardized in the data? Any reason that some terms (e.g., Sedentary, Cyclists, Runners, Athletes) are capitalized, but others aren’t?

ANSWER: Table 1 has been revised accordingly.

Tables 3/4: Can Total be clarified (presumably it’s the number of participants, but this is not stated anywhere). The %Weight is hard to comprehend – I have no experience of this measure or its calculation, but the statistical power seems to bear no relation to N, which seems odd to me. Can you explain?

ANSWER: Total is the number of participants who performed the CPET and verification phase. In Review Manager (RevMan), the data are entered in the following order: Test A: mean, standard deviation and N (participants) vs. Test B: mean, standard deviation and N (participants). The %Weight is attributed to each study according to its statistical power. The %Weight is affected by N and mainly by the standard deviation. Please look at two studies: Weatherwax et al. [122] (second experimental condition) and Kramer et al. [91]. Although the sample sizes (n = 6 and 15, respectively) were not large, the standard deviation was low. The %Weight of the studies of Astorino and DeRevere [56] (second experimental condition) and Niemeyer et al. [126] exceeded 3% due to the large sample sizes (n = 79 and 46, respectively).
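
To illustrate why %Weight depends on both N and the standard deviation, the sketch below computes generic inverse-variance weights for a mean difference between two test conditions. It is a simplified illustration, not RevMan's exact output: the function name and the two example studies are hypothetical, and the between-study variance `tau2` of a random-effects model is assumed to be supplied by the caller (zero gives fixed-effect weights).

```python
from typing import List, Tuple

# One study: (sd_test_a, n_test_a, sd_test_b, n_test_b)
Study = Tuple[float, int, float, int]


def percent_weights(studies: List[Study], tau2: float = 0.0) -> List[float]:
    """Inverse-variance weights (%) for a mean difference.

    The variance of each study's mean difference is
    sd_a^2 / n_a + sd_b^2 / n_b; its weight is 1 / (variance + tau2).
    """
    raw = []
    for sd_a, n_a, sd_b, n_b in studies:
        variance = sd_a ** 2 / n_a + sd_b ** 2 / n_b
        raw.append(1.0 / (variance + tau2))
    total = sum(raw)
    return [100.0 * w / total for w in raw]


# Hypothetical data: a small study with a low SD vs. a larger study with a high SD
example = [(0.2, 6, 0.2, 6), (0.8, 40, 0.8, 40)]
print([round(w, 1) for w in percent_weights(example)])  # [70.6, 29.4]
```

In this made-up example the study with n = 6 receives the larger weight because its standard deviation is much lower, which mirrors the pattern the authors describe.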

Table 5: Can horizontal lines be used to clarify where each category (TTE, VO2max, HR) starts and ends (i.e., to the right of each N)?

ANSWER: The correction has been made.

Figure 1: Can you clarify (even if just to me) why the 1 full article excluded in the Eligibility stage due to “Non-maximal exercise test protocols…” had not already been excluded for the same reason in the Screening stage?

ANSWER: In the records screened for inclusion after removing duplicates, we have looked at the title and abstracts and during this stage it was possible to identify 26 articles to be excluded. However, only after the following stage (i.e. full articles assessed for eligibility), it was possible to detect another study to be excluded. These numbers match with our flowchart. We have also contacted the authors to check whether or not the verification phase was performed until volitional exhaustion.

Figure 2: The data suggests to me a tendency for CPET to be higher than VP. Is there any accounting for potential outliers (e.g., Colakoglu et al.)? What happens if this study is removed from the analyses (if there is good reason to do so, which reading the discussion there might be)?

ANSWER: As previously mentioned, there was no impact on our results after removing the VO2 data from the aforementioned study.

Figures 2-4: The quality of these figures is poor (due to the high level of detail). Can they be presented at a higher resolution?

ANSWER: The figures have been remade with 300 DPI to improve resolution.

Reference list

ANSWER: The correction has been made.

L684: (1985) should presumably be removed from the JAP title?

ANSWER: The correction has been made.

Attachment

Submitted filename: PlosOnes_response letter.docx

Decision Letter 1

Laurent Mourot

7 Dec 2020

PONE-D-20-25408R1

Verification phase for confirming maximal oxygen uptake in apparently healthy adults: A systematic review and meta-analysis

PLOS ONE

Dear Dr. Cunha,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Specifically, a thorough revision of the English is required, as well as a revision of the Discussion section to be clearer.

Please submit your revised manuscript by Jan 21 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Laurent Mourot

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I would like to thank the authors for their attention to my comments. I think this revised version is much better and I believe that this manuscript would be an excellent reference on the topic. From my perspective, I only have one minor suggestion that I would leave up to the authors to accept or reject. This is: As presented, the title seems to imply that a verification phase is something necessary. Would it be better to craft the title as a question? Something like “Is a verification phase for confirming maximal oxygen uptake useful in apparently healthy adults: A systematic review and meta-analysis”.

Regardless, I would like to congratulate the authors for the high quality of this meta-analysis. I often feel that this type of scientific contribution is not as meritorious as original research, but I think that with the amount of information that is currently available on this topic, this particular meta-analysis is fully warranted. Thanks!

Reviewer #2: General comments

Preparing this manuscript has clearly been a huge job, so I commend the authors on this significant undertaking. I have again reviewed this paper in thorough detail and have a number of observations and feedback points, specified below.

In general, I would firstly urge the native English speakers on the author list – or a professional proof reader – to take responsibility for thoroughly checking the language (especially the grammar) before resubmitting, as the current level of writing makes the text difficult to comprehend. There are particular issues in the new sections of the Discussion that need re-writing/correcting to improve the clarity and flow.

Secondly, and perhaps most importantly, the Discussion to me is too long. It makes it impossible for the reader to grasp the key findings and messages from this paper, as there is just so much detail of previous studies, their protocols and specific findings (data, P values, etc.). I personally feel that the paper would be far more comprehensible and impactful if the Discussion was more concise.

Abstract:

L33-35: The use of semicolons in this list seems odd, and makes it difficult to understand how the search was conducted. Should they be commas? Please re-consider.

L43: Why the comma after VO2max but not after age? Also, the -1 on the VO2max unit looks too high.

L44: Can you clarify that the VO2 values were similar in 54 of the 80 studies, because at the moment it seems like this is a result for a total of 54 studies analyzed, which I don’t think is the case. Suggestion: “The highest mean VO2 in the CPET and verification phase was similar in 54 of the 80 studies (mean difference…)”. Also, I still don’t understand the inconsistencies in the use of square/round brackets. Can this be standardized throughout the manuscript? Also, why change to L/min in this sentence, after presenting mL/kg/min in the previous sentence? Can this be standardized?

L51: “following a… CPETs” = incorrect grammar. Maybe write CPET.

L51-53: I like this idea, but to me it needs to be related specifically to “your” findings, not attributed to “some [other?] researchers”. Maybe something like: “However, given the high concordance between the highest mean VO2 achieved in the CPET and verification phase, findings from the current study would question its necessity in all testing circumstances.”

Introduction:

L70: Maybe mention the Douglas bag method here, since that is the main reason for the improved time-efficiency (i.e., changing from DBs to B-b-B systems). The type of protocol is merely a by-product.

L73-74: This sentence is not quite right. The “confirmation of VO2max attainment” is not “due to” the listed factors; the “lack of attaining” VO2max might be due to those factors. Please re-phrase.

L82: Presumably this is average “male” body mass. Please specify (there needs to be transparency when sex biases exist in data sets).

L85-89: Is this a new paragraph? It seems very short (2 sentences). Maybe it should be a continuation of the previous paragraph, since it’s the same topic (VO2 plateau)?

L93: “is attained”?

L95-96: I’m not sure what this means: “lack of specificity in identifying individuals who did not continue the CPET to their limit of exercise tolerance”. What is meant by “specificity” in this context?

L103: CP is specific to power (usually analogous to cycling) and not practically applicable to other exercise modes (such as running). Also, “i.e.” is specific (“that is”). So maybe change i.e. to e.g. (“for example”). Or add speed/velocity to power when referring to CP (so CS/CV).

L106: Suggestion (to avoid the use of “each other”): “the highest VO2 values in the CPET are consistent with the verification phase”

L108-111: Please present the “evidence” for Poole & Jones’ statement (i.e., “that a verification phase must be performed at a higher WR than attained in the ramp-based CPET protocols in all future studies”). Where/what is the evidence for this suggestion?

L111: The terminology “On the other hand” seems inappropriate, as this is a different topic compared with the previous sentence (VP WR vs CPET increments). The correct counter-argument here would presumably be the data showing that VP WR can actually be “lower” than WRpeak to still elicit a VO2peak/max.

L113-116: This is now confusing, with “In contrast” following “On the other hand”. Please clarify the arguments that you are making, and the key information you’re trying to get across. I also think WRpeak needs defining. Also, is it necessary to list average W values?

L108-120: In general, this section needs to be more clear and concise.

L120-132: I think all of these ideas can be incorporated into one far more clear and concise paragraph explicitly stating the aims of the study. I suggest moving the questions (at L120-125) down to L138-141 (Methods) and just focusing on the aims here, as required.

Methods:

L136: “is” shown in S1 Text.

L138-141: These are not completely consistent with the text in the Introduction (L120-132). Also, you are still using the terminology “valid alternative to confirm”, which I understood had been changed throughout (since this is not a study of validity). And I’m not sure the study does actually identify “the most appropriate protocol for applying the verification phase”. So I would use questions similar to those currently posed in the Introduction (L120-125) here instead.

L155: participants “who” were

L157-158: Suggestion: “…carried out using bipedal cycle ergometry or bipedal treadmill running or walking.”

L160: Remove “included” (as you already have “involved”, above). Also, importantly, “and” should presumably be changed to “or”, since any one of those three situations would surely lead to exclusion.

L166-167: Writing in the first person seems odd here (we). Are you saying that a flowchart has been included in the paper? If so, where is it? If not, why mention it?

L170-174: There is still an inconsistent use of round and square brackets. Please re-consider (round brackets are typically used in the first instance, and [square brackets] within round brackets.)

L175: Can you specify that you mean “other” authors than yourselves, so “authors of the original articles were contacted…”.

L185: Does “selective reporting” need to be written twice here?

L187: You are inconsistent with your capitalization (or not) of the sub-title words

L206: “the primary study groups results” requires an apostrophe somewhere on groups, depending on the specific meaning (one group or multiple groups) – please correct.

L207: Same here – “groups” needs an apostrophe.

L208: The less than or equal to symbol is still odd here. The lower “equal to” line should be horizontal.

L209: using “a” funnel plot, or using funnel “plots”?

L216: This new red text is unnecessarily complicated. Firstly, you have previously used the term WRpeak (although this doesn’t seem to have been defined anywhere in the manuscript). Secondly, sub and supra peak does not need to be reinforced with < or > 100%, that’s obvious by definition. Suggestion: (i.e. < 100% WRpeak vs. > 100% WRpeak)

L218-220: This sentence could be clearer, suggestion: “as the CPET or on a different day, and the duration of the verification phase (i.e. ≤ 80 s, 81–120 s, > 120 s).”

L222: Should this be “cut-off points”?

Results:

L229-230: This (“Figure 1 summarizes the screening and selection process”) seems like repetition from the Methods section (L166-167). Should the flowchart and reference to it be moved up to the Methods (L167)?

L234: Presumably your sub-section headings should use a different font/presentation from your main section headings.

L235-239: In L235 you use the term “eligible studies”, in L236 you write “included studies” and in L239 you write “primary studies”. This is confusing, because I think you are talking about the same 80 studies in all cases. Can you use consistent terminology for clarity?

L240-241: BMI has already been defined in the paper, so I suggest: “participants had a BMI within the normal range (mean ± SD [range]: 24.4…”. And once again, you need to be consistent with the use of round/square brackets.

L242: VO2max does not need to be written twice; I suggest removing the first reference to it: “cardiorespiratory fitness (VO2max mean ± SD [range]: 46.9…”.

L243: Delete “according to… [53]” as these have already been defined in the Methods (L222-223).

L252-256: I think this could be clearer, e.g.: “Thirty-three (41%) of the 80 studies included in the review adopted one or more of the traditionally reported plateau or secondary criteria to confirm the attainment of a VO2max, with 30 using a VO2 plateau criterion, 21 using the heart rate plateau or age-predicted maximal heart rate, 18 using RERmax, and 8 using post-CPET blood lactate concentration.” Line 256 is missing a full stop in any case.

L258: I don’t think the second “time averages” should be hyphenated.

L263: Should this read something like “Regarding the period between CPET and VP…”? Because you seem to be referring to the rest period after the CPET here, right?

L266: What do you mean by “self-paced approach”? A self-paced recovery period? Of any duration?

L270-271: You need to be clearer that you are now referring to the VP exercise intensity. I also think you should decide upon the terminology earlier in the paper, define it and stick to it (e.g. peak WR or WRpeak, supra or > 100%, etc.). Because at the moment there is a mix.

L272: Maybe write 105-130%, consistent with the presentation of ranges in the previous paragraph.

L273: What do you mean by “or both peak or supra peak WR (1%)”? Is this one study, a different study from the “Eight studies” previously stated? If so, please specify. Also, do you mean both peak “and” supra peak? This is unclear to me.

L275: Again, does 1% relate to one study? If so, you could maybe write (one study, 1%) for clarity/consistency. Also, you have previously used the term “participant” rather than “subject”.

L277: “85-95% of…”

L281-282: “1.5-2.2”; ”50-150” (consistent with previously stated ranges).

L294: Figure 2 (not Figures).

L322: I suggest removing the two commas, as this “middle” information is imperative to understanding the analyses.

L338-339: Please add units to the IQR, presumably s.

L339-340: This very last line of the Results section: “There were no significant differences between the CPET and verification phase for VO2max (P = 0.18 to P = 0.71)”… Can you be more specific about what the sentence, and the range of P values, relate to. For example, how does the information differ from that stated in L295-297 (i.e., “Notably, the mean highest VO2 values were similar between the CPET and verification phase [mean difference = 0.03 (95% CI = -0.01 to 0.06) L/min, P = 0.15].”)? Presumably you are referring to the sub-groups, but it’s currently unclear from what you have written.

Discussion:

In general, the writing in this section, particularly the new parts in red, needs reviewing and revising to improve the clarity of the messages being communicated. The grammar is poor in places, which makes the important information difficult to understand. Also, the Discussion to me is far too long. There are unnecessarily lengthy descriptions of studies and protocols throughout, so I suggest writing more concisely. Also, there are lengthy descriptions relating to points not supported by the current data analysis. In my mind, the authors need to make a significant overhaul of the discussion and cut it down in length, such that the important points and messages are communicated far more clearly.

L357-359: The word “repeated” appears out of place here, since the VP is not repeated (it is only carried out once). Also “compared to that observed” is unclear – what are you comparing to? The maximal effort and VO2max (in which case, plural = those)? Please review and revise this sentence for clarity.

L360: “To date” should not be hyphenated (the term also seems odd/unnecessary). Also, consider re-phrasing: “has demonstrated that… have shown that…”.

L364: “findings from 54 CPET…”. Also, I don’t really understand what this sentence adds to the previous sentence. They seem to say the same thing. Can they be combined?

L366: Please revisit the grammar and revise this sentence.

L369-370: I suggest removing this important information from the brackets and supporting the statement with a reference (i.e., that this is within the most commonly adopted measures of test variability at 2-3%).

L371: Revise the grammar.

L374-375: Revise the grammar.

L383: “for example” appears out of place here, in a new paragraph.

L383-393: This is a long account of a previous study. What point is being made?

L383-403: This is a long paragraph describing a few previous studies and their results, but there is no reference to the overall findings of current study. Please revise.

L404: A new paragraph should not begin with “On the other hand”. Also, is this point (“54 studies meta-analyzed”) now in relation to your meta-analysis? Please clarify. In general, I suggest you revise your paragraph structures and arguments to arrive more quickly at the points you are trying to make, in light of previous work, and link the story to the findings of the current analysis. At the moment I am wondering where all this is going (particularly the paragraph above), and what the relevance is to the findings of your systematic review/meta-analysis.

L406-409: The use of brackets seems inconsistent here too.

L425-426: Please revise the writing here: “since low cardiorespiratory fitness are more susceptible to stopping early during the CPET”

L404-432: This is another very long paragraph describing previous studies in length. I think these points can be made far more concisely and interpreted more clearly in light of the general findings from your study.

L427-432: You seem to have spent a long time describing a possibility that is not actually supported by your own data analysis. I suggest turning your thinking around, and discussing what you did find and what that actually means, in relation to the specific studies analysed. I get your point, but it is not actually supported by your stats, which is the issue. Alternatively, make this point more concisely so that it is not over-inflated.

L433: Due to the sheer length of this discussion I have read the remainder with less of a focus on details. As stated above, my advice is that the writing should be clearer and more focused, with less extensive description of previous studies.

L499-501: Again, is there any data to support Poole and Jones’ suggestion here (i.e., that researchers “must”…), or is it a subjective view (because other research would suggest that this is not necessary)?

Table 1:

Title: (N = 80) – include space; CPET has not been defined here, but it has in the Fig 2 title.

Table 2:

Abbreviations for TR and CYC seem to be missing (I haven’t read every detail of Table 2, so there might be other bits missing).

Table 3:

Can you clarify in the table heading row what is meant by “Total”

For Arad et al., % Weight should probably be written to 2 d.p. (1.40%).

Figure 1:

This is a nice figure. Is it what you refer to in L167?

“Hand searches (reference lists from the previously identified studies)”

Should it be “Records excluded”? (Box to the right in “Screening”?)

Eligibility: can the horizontal arrow stem from the left box?

Figure 2:

Legend on L319: Suggest “reported as mean differences (MD) adjusted for” (and remove the final sentence on this line).

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Kerry McGawley

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Feb 17;16(2):e0247057. doi: 10.1371/journal.pone.0247057.r004

Author response to Decision Letter 1


24 Dec 2020

Review comments:

Reviewer: 1

I would like to thank the authors for their attention to my comments. I think this revised version is much better and I believe that this manuscript would be an excellent reference on the topic. From my perspective, I only have one minor suggestion that I would leave up to the authors to accept or reject. This is: As presented, the title seems to imply that a verification phase is something necessary. Would it be better to craft the title as a question? Something like “Is a verification phase for confirming maximal oxygen uptake useful in apparently healthy adults: A systematic review and meta-analysis”.

Regardless, I would like to congratulate the authors for the high quality of this meta-analysis. I often feel that this type of scientific contribution is not as meritorious as original research, but I think that with the amount of information that is currently available on this topic, this particular meta-analysis is fully warranted. Thanks!

ANSWER: Thank you for your comments. The title has been revised according to the reviewer’s suggestion.

Reviewer #2:

General Comments

Preparing this manuscript has clearly been a huge job, so I commend the authors on this significant undertaking. I have again reviewed this paper in thorough detail and have a number of observations and feedback points, specified below.

In general, I would firstly urge the native English speakers on the author list – or a professional proof reader – to take responsibility for thoroughly checking the language (especially the grammar) before resubmitting, as the current level of writing makes the text difficult to comprehend. There are particular issues in the new sections of the Discussion that need re-writing/correcting to improve the clarity and flow.

Secondly, and perhaps most importantly, the Discussion to me is too long. It makes it impossible for the reader to grasp the key findings and messages from this paper, as there is just so much detail of previous studies, their protocols and specific findings (data, P values, etc.). I personally feel that the paper would be far more comprehensible and impactful if the Discussion was more concise.

ANSWER: The discussion section has undergone a comprehensive revision to improve the grammar and make it more concise by removing specific details that provide limited information for the interpretation of the results.

Abstract:

L33-35: The use of semicolons in this list seems odd, and makes it difficult to understand how the search was conducted. Should they be commas? Please re-consider.

ANSWER: The correction has been made.

L43: Why the comma after VO2max but not after age? Also, the -1 on the VO2max unit looks too high.

ANSWER: The correction has been made.

L44: Can you clarify that the VO2 values were similar in 54 of the 80 studies, because at the moment it seems like this is a result for a total of 54 studies analyzed, which I don’t think is the case. Suggestion: “The highest mean VO2 in the CPET and verification phase was similar in 54 of the 80 studies (mean difference…)”. Also, I still don’t understand the inconsistencies in the use of square/round brackets. Can this be standardized throughout the manuscript? Also, why change to L/min in this sentence, after presenting mL/kg/min in the previous sentence? Can this be standardized?

ANSWER: Thank you for your comment. We have revised the paragraph according to the reviewer’s suggestion and it now reads as follows: “The highest mean VO2 values attained in the CPET and verification phase were similar in the 54 studies that were meta-analyzed (mean difference = 0.03 [95% CI = -0.01 to 0.06] L/min, P = 0.15).” In addition, we have standardized the use of square/round brackets throughout the manuscript. Regarding the units, VO2max relative to body mass was used as descriptive data to define the levels of cardiorespiratory fitness for the subgroup analysis, whereas absolute VO2 values were used for the meta-analysis because most studies reported VO2 in L/min. We have now also included mean ± SD VO2 values in L/min in the abstract as follows: “Eighty studies were included in the systematic review (total sample of 1,680 participants; 473 women; age 19-68 yr.; VO2max 3.3 ± 1.4 L/min or 46.9 ± 12.1 mL·kg-1·min-1).”

L51: “following a… CPETs” = incorrect grammar. Maybe write CPET

ANSWER: The correction has been made.

L51-53: I like this idea, but to me it needs to be related specifically to “your” findings, not attributed to “some [other?] researchers”. Maybe something like: “However, given the high concordance between the highest mean VO2 achieved in the CPET and verification phase, findings from the current study would question its necessity in all testing circumstances.”

ANSWER: This sentence has been rewritten according to the reviewer’s suggestion.

Introduction:

L70: Maybe mention the Douglas bag method here, since that is the main reason for the improved time-efficiency (i.e., changing from DBs to B-b-B systems). The type of protocol is merely a by-product.

ANSWER: The correction has been made.

L73-74: This sentence is not quite right. The “confirmation of VO2max attainment” is not “due to” the listed factors; the “lack of attaining” VO2max might be due to those factors. Please re-phrase.

ANSWER: Thank you for the observation. The correction has been made. The sentence now reads as follows: “One particularly problematic aspect has been the challenge in identifying a lack of VO2max attainment due to inappropriate test protocols, premature fatigue, or poor participant motivation and lack of effort.”

L82: Presumably this is average “male” body mass. Please specify (there needs to be transparency when sex biases exist in data sets).

ANSWER: The reviewer is correct, thank you for the comment. The following information has been included: “The landmark study of Taylor et al. [34] was the first to use a formal VO2 plateau criterion, which was defined as an increase in VO2 of less than 150 mL/min (or ≤ 2.1 mL·kg-1·min-1, considering an average body mass of 72 kg from 115 male subjects) (…)”
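The unit conversion in the quoted sentence can be checked with a short calculation. The sketch below assumes the plateau threshold is an absolute VO2 increment in mL/min divided by the stated average body mass; the function name is illustrative and does not come from the manuscript.

```python
def absolute_to_relative_vo2(vo2_ml_per_min: float, body_mass_kg: float) -> float:
    """Convert an absolute VO2 value (mL/min) to a relative value (mL/kg/min)."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return vo2_ml_per_min / body_mass_kg


# Threshold of 150 mL/min and an average body mass of 72 kg, as quoted above.
threshold = absolute_to_relative_vo2(150.0, 72.0)
print(f"{threshold:.1f} mL/kg/min")
```

Rounded to one decimal place, the result agrees with the 2.1 mL·kg-1·min-1 given in the revised sentence.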

L85-89: Is this a new paragraph? It seems very short (2 sentences). Maybe it should be a continuation of the previous paragraph, since it’s the same topic (VO2 plateau)?

ANSWER: It is the same paragraph, which was condensed to address the comments of both reviewers. Lines 78 to 89 describe the same topic – VO2 plateau. This is more clearly shown in the revised manuscript.

L93: “is attained”?

ANSWER: We have revised the text to “has been attained” since VO2max criteria are applied after the VO2max test has been performed.

L95-96: I’m not sure what this means: “lack of specificity in identifying individuals who did not continue the CPET to their limit of exercise tolerance”. What is meant by “specificity” in this context?

ANSWER: The following sentence has been included to address this issue: “Research has shown that some individuals can satisfy some of the secondary criteria thresholds long before the highest VO2 value observed in the CPET has been attained [2, 29, 37, 39]. The maximal RER criterion, for example, can be satisfied at VO2 values 27%-39% lower than the highest VO2 value achieved in the CPET [37, 39].”

L103: CP is specific to power (usually analogous to cycling) and not practically applicable to other exercise modes (such as running). Also, “i.e.” is specific (“that is”). So maybe change i.e. to e.g. (“for example”). Or add speed/velocity to power when referring to CP (so CS/CV).

ANSWER: The correction has been made.

L106: Suggestion (to avoid the use of “each other”): “the highest VO2 values in the CPET are consistent with the verification phase”

ANSWER: The correction has been made and the sentence now reads as follows: “The verification phase is based on the premise that when the highest VO2 values in the CPET are consistent with the verification phase (typically within 2-3% in accordance with the test-retest reliability of VO2max), this provides substantial empirical support that the highest possible VO2 has been elicited.”
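The premise in the quoted sentence can be expressed as a simple consistency check. This is a minimal sketch, assuming a symmetric tolerance expressed as a percentage of the CPET value; the function names, the 3% default, and the example values are hypothetical and are not taken from any of the reviewed studies.

```python
def percent_difference(cpet_vo2: float, verification_vo2: float) -> float:
    """Difference of the verification-phase VO2 from the CPET VO2, in percent of the CPET value."""
    if cpet_vo2 <= 0:
        raise ValueError("CPET VO2 must be positive")
    return (verification_vo2 - cpet_vo2) / cpet_vo2 * 100.0


def is_consistent(cpet_vo2: float, verification_vo2: float,
                  tolerance_pct: float = 3.0) -> bool:
    """True when the two highest VO2 values agree within the chosen tolerance."""
    return abs(percent_difference(cpet_vo2, verification_vo2)) <= tolerance_pct


# Hypothetical highest VO2 values in L/min.
print(is_consistent(3.30, 3.25))  # difference of about -1.5%
print(is_consistent(3.30, 3.50))  # difference of about +6.1%
```

A symmetric tolerance is only one possible design choice; a rule that flags only verification-phase values exceeding the CPET value would need a one-sided comparison instead.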

L108-111: Please present the “evidence” for Poole & Jones’ statement (i.e., “that a verification phase must be performed at a higher WR than attained in the ramp-based CPET protocols in all future studies”). Where/what is the evidence for this suggestion?

ANSWER: We used the exact term employed by Poole and Jones (2017). However, to avoid further controversy on this issue, we decided to replace “must be” with “should be”.

L111: The terminology “On the other hand” seems inappropriate, as this is a different topic compared with the previous sentence (VP WR vs CPET increments). The correct counter-argument here would presumably be the data showing that VP WR can actually be “lower” than WRpeak to still elicit a VO2peak/max.

ANSWER: This section of the introduction has been considerably revised, and the revision addresses this issue.

L113-116: This is now confusing, with “In contrast” following “On the other hand”. Please clarify the arguments that you are making, and the key information you’re trying to get across. I also think WRpeak needs defining. Also, is it necessary to list average W values?

ANSWER: The text has been revised as follows: “Poole and Jones [2] recently stated that to confirm the attainment of VO2max a verification phase should be performed at a higher WR than attained in the CPET in all future studies. Conversely, Iannetta et al. [25] recommended WRs within the upper limit of the severe exercise intensity domain to allow the verification phase to be maintained long enough for VO2max attainment.”

L108-120: In general, this section needs to be more clear and concise.

ANSWER: This section has been revised for conciseness, so it is now only 8 lines.

L120-132: I think all of these ideas can be incorporated into one far more clear and concise paragraph explicitly stating the aims of the study. I suggest moving the questions (at L120-125) down to L138-141 (Methods) and just focusing on the aims here, as required.

ANSWER: This text has been revised according to the reviewer’s suggestion.

Methods:

L136: “is” shown in S1 Text.

ANSWER: The correction has been made.

L138-141: These are not completely consistent with the text in the Introduction (L120-132). Also, you are still using the terminology “valid alternative to confirm”, which I understood had been changed throughout (since this is not a study of validity). And I’m not sure the study does actually identify “the most appropriate protocol for applying the verification phase”. So I would use questions similar to those currently posed in the Introduction (L120-125) here instead.

ANSWER: The correction has been made according to the reviewer’s suggestion as follows: “The main questions addressed by the present study were: To what extent does the highest VO2 attained in the CPET differ from that attained in the verification phase? Secondly, are the highest VO2 values in the CPET and verification phase affected by the verification phase characteristics (e.g. intensity, adoption of a criterion threshold, and aspects of the recovery period between the CPET and the verification phase), or even with respect to particular subgroups (e.g. sex, cardiorespiratory fitness levels, exercise test modality, and CPET protocol design) in apparently healthy adults?”

L155: participants “who” were

ANSWER: The correction has been made.

L157-158: Suggestion: “…carried out using bipedal cycle ergometry or bipedal treadmill running or walking.”

ANSWER: We agree with the reviewer’s comment and the sentence has been rewritten accordingly.

L160: Remove “included” (as you already have “involved”, above). Also, importantly, “and” should presumably be changed to “or”, since any one of those three situations would surely lead to exclusion.

ANSWER: The correction has been made.

L166-167: Writing in the first person seems odd here (we). Are you saying that a flowchart has been included in the paper? If so, where is it? If not, why mention it?

ANSWER: The correction has been made.

L170-174: There is still an inconsistent use of round and square brackets. Please re-consider (round brackets are typically used in the first instance, and [square brackets] within round brackets.)

ANSWER: We have standardized this throughout the manuscript according to your suggestion.

L175: Can you specify that you mean “other” authors than yourselves, so “authors of the original articles were contacted…”.

ANSWER: Thank you for the observation. We have revised the paragraph according to the reviewer’s suggestion and now it reads as follows: “When the relevant quantitative data were not reported, authors of the original studies were contacted to request the data”.

L185: Does “selective reporting” need to be written twice here?

ANSWER: We have shortened it to “selective reporting of outcomes”.

L187: You are inconsistent with your capitalization (or not) of the sub-title words

ANSWER: The capitalization of the sub-title words is now standardized.

L206: “the primary study groups results” requires an apostrophe somewhere on groups, depending on the specific meaning (one group or multiple groups) – please correct.

ANSWER: The text has been revised to address this issue and now reads as follows: “(mean ± standard deviation [SD] values for group VO2max and protocol duration during the CPET and verification phase from primary study groups)”.

L207: Same here – “groups” needs an apostrophe.

ANSWER: The text has been revised to address this issue and now reads as follows: “The I2 statistic measures the extent of inconsistency among the results of the primary study groups, interpreted approximately as the proportion of total variation in point estimates that is due to heterogeneity rather than sampling error.”
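For readers unfamiliar with the statistic described in this answer, the sketch below shows one common way I2 is derived from Cochran's Q under inverse-variance weighting. It is illustrative only and is not the RevMan implementation; the function name `i_squared` and the input values in the usage note are hypothetical and are not taken from the included studies.

```python
from typing import Sequence


def i_squared(effects: Sequence[float], std_errors: Sequence[float]) -> float:
    """Return I2 (as a percentage) from per-study effects and standard errors.

    Hypothetical helper: uses Cochran's Q with inverse-variance weights,
    then I2 = max(0, (Q - df) / Q) * 100.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Inverse-variance weight for each study
    weights = [1.0 / se ** 2 for se in std_errors]

    # Weighted (fixed-effect) pooled estimate used as the reference point for Q
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    if q <= 0:
        return 0.0  # identical effects: no observed inconsistency
    return max(0.0, (q - df) / q) * 100.0
```

With two made-up studies, `i_squared([0.0, 1.0], [0.5, 0.5])` gives Q = 2 on 1 degree of freedom, so half of the observed variation is attributed to heterogeneity and the function returns 50.0.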

L208: The less than or equal to symbol is still odd here. The lower “equal to” line should be horizontal.

ANSWER: The correction has been made.

L209: using “a” funnel plot, or using funnel “plots”?

ANSWER: We have added the word “a” funnel plot. Thank you for the observation.

L216: This new red text is unnecessarily complicated. Firstly, you have previously used the term WRpeak (although this doesn’t seem to have been defined anywhere in the manuscript). Secondly, sub and supra peak does not need to be reinforced with < or > 100%, that’s obvious by definition. Suggestion: (i.e. < 100% WRpeak vs. > 100% WRpeak)

ANSWER: The correction has been made.

L218-220: This sentence could be clearer, suggestion: “as the CPET or on a different day, and the duration of the verification phase (i.e. ≤ 80 s, 81–120 s, > 120 s).”

ANSWER: The correction has been made.

L222: Should this be “cut-off points”?

ANSWER: The correction has been made.

Results:

L229-230: This (“Figure 1 summarizes the screening and selection process”) seems like repetition from the Methods section (L166-167). Should the flowchart and reference to it be moved up to the Methods (L167)?

ANSWER: The flowchart has been moved up to the Methods section.

L234: Presumably your sub-section headings should use a different font/presentation from your main section headings.

ANSWER: The sub-titles have been standardized throughout the manuscript.

L235-239: In L235 you use the term “eligible studies”, in L236 you write “included studies” and in L239 you write “primary studies”. This is confusing, because I think you are talking about the same 80 studies in all cases. Can you use consistent terminology for clarity?

ANSWER: “Primary” has been removed and “eligible” has been replaced by “included” to keep the text standardized. You are correct, this means the same 80 studies.

L240-241: BMI has already been defined in the paper, so I suggest: “participants had a BMI within the normal range (mean ± SD [range]: 24.4…”. And once again, you need to be consistent with the use of round/square brackets.

ANSWER: We have incorporated your suggestion and square and round brackets are now standardized throughout the manuscript.

L242: VO2max does not need to be written twice; I suggest removing the first reference to it: “cardiorespiratory fitness (VO2max mean ± SD [range]: 46.9…”.

ANSWER: We have incorporated your suggestion.

L243: Delete “according to… [53]” as these have already been defined in the Methods (L222-223).

ANSWER: This sentence has been removed.

L252-256: I think this could be clearer, e.g.: “Thirty-three (41%) of the 80 studies included in the review adopted one or more of the traditionally reported plateau or secondary criteria to confirm the attainment of a VO2max, with 30 using a VO2 plateau criteria, 21 using the heart rate plateau or age-predicted maximal heart rate, 18 using RERmax, and 8 using post-CPET blood lactate concentration.”. Line 256 is missing a full stop in any case.

ANSWER: We have attempted to make this text clearer. The text now reads as follows: “Thirty-three (41%) of the 80 studies included in the review used one or more VO2 plateau or secondary VO2max criteria to confirm the attainment of VO2max. Thirty studies used the VO2 plateau, 21 used the heart rate plateau or a criterion based on age-predicted maximal heart rate, 18 used the RERmax, and 8 used the post-CPET blood lactate concentration.”

L258: I don’t think the second “time averages” should be hyphenated.

ANSWER: The hyphen has been removed.

L263: Should this read something like “Regarding the period between CPET and VP…”? Because you seem to be referring to the rest period after the CPET here, right?

ANSWER: We have incorporated your suggestion.

L266: What do you mean by “self-paced approach”? A self-paced recovery period? Of any duration?

ANSWER: This text has been revised and now reads as follows: “Two studies (3%) employed a combination of passive and active recovery and another (1%) used a self-paced approach where participants were permitted to choose their own WR.”

L270-271: You need to be clearer that you are now referring to the VP exercise intensity. I also think you should decide upon the terminology earlier in the paper, define it and stick to it (e.g. peak WR or WRpeak, supra or > 100%, etc.). Because at the moment there is a mix.

ANSWER: The revised text now specifically refers to the verification phase. Consistent terminology is also now used: “supra” instead of “> 100%” and “WRpeak” instead of “peak WR”.

L272: Maybe write 105-130%, consistent with the presentation of ranges in the previous paragraph.

ANSWER: The correction has been made.

L273: What do you mean by “or both peak or supra peak WR (1%)”? Is this one study, a different study from the “Eight studies” previously stated? If so, please specify. Also, do you mean both peak “and” supra peak? This is unclear to me.

ANSWER: The study of Wingo et al. (Medicine & Science in Sports & Exercise, v. 37, n. 2, p. 248-255, 2005) stated the following sentence “To ensure that a plateau in VO2 was attained, subjects completed an additional bout of cycling following 20 min of rest. Subjects cycled to exhaustion at a power output equivalent to the last workload performed during the graded test (if ≤ 1 min was completed during the last stage of the graded test) or at a power output 25 W higher than the last workload performed during the graded test (if ≥ 1 min was completed during the last stage of the graded test)”. This means that the usage of either peak (100% WRpeak) or supra peak WR (>100% WRpeak) depended on the duration of the last CPET stage. Another study (COLAKOGLU, Muzaffer et al. Stroke volume responses may be related to the gap between peak and maximal O2 consumption. Isokinetics and Exercise Science, v. 24, n. 2, p. 133-139, 2016.) also applied both peak and supra peak WRs. Considering this, the sentence has been rewritten and we think it is now clearer.

L275: Again, does 1% relate to one study? If so, you could maybe write (one study, 1%) for clarity/consistency. Also, you have previously used the term “participant” rather than “subject”.

ANSWER: “One study” has been added. “Participants” is now used consistently throughout the manuscript.

L277: “85-95% of…”

ANSWER: The correction has been made.

L281-282: “1.5-2.2”; ”50-150” (consistent with previously stated ranges).

ANSWER: The correction has been made.

L294: Figure 2 (not Figures).

ANSWER: The correction has been made.

L322: I suggest removing the two commas, as this “middle” information is imperative to understanding the analyses.

ANSWER: The correction has been made.

L338-339: Please add units to the IQR, presumably s.

ANSWER: The units have been added.

L339-340: This very last line of the Results section: “There were no significant differences between the CPET and verification phase for VO2max (P = 0.18 to P = 0.71)”… Can you be more specific about what the sentence, and the range of P values, relate to. For example, how does the information differ from that stated in L295-297 (i.e., “Notably, the mean highest VO2 values were similar between the CPET and verification phase [mean difference = 0.03 (95% CI = -0.01 to 0.06) L/min, P = 0.15].”)? Presumably you are referring to the sub-groups, but it’s currently unclear from what you have written.

ANSWER: We have added the following sentence: “Considering all sub-analyses presented in the Table 4…”. In Table 4, the smallest P value is 0.18 (high cardiorespiratory fitness) and the largest is 0.71 (female participants).

Discussion:

In general, the writing in this section, particularly the new parts in red, needs reviewing and revising to improve the clarity of the messages being communicated. The grammar is poor in places, which makes the important information difficult to understand. Also, the Discussion to me is far too long. There are unnecessarily lengthy descriptions of studies and protocols throughout, so I suggest writing more concisely. Also, there are lengthy descriptions relating to points not supported by the current data analysis. In my mind, the authors need to make a significant overhaul of the discussion and cut it down in length, such that the important points and messages are communicated far more clearly.

ANSWER: The discussion section has been comprehensively revised to improve the grammar and make it more concise, including removal of the unnecessary descriptions of studies.

L357-359: The word “repeated” appears out of place here, since the VP is not repeated (it is only carried out once). Also “compared to that observed” is unclear – what are you comparing to? The maximal effort and VO2max (in which case, plural = those)? Please review and revise this sentence for clarity.

ANSWER: The correction has been made. We have replaced “repeated” by “carried out” and “that” by “highest VO2” observed in the preceding CPET, which is the dependent variable measured in the two protocols.

L360: “To date” should not be hyphenated (the term also seems odd/unnecessary). Also, consider re-phrasing: “has demonstrated that… have shown that…”.

ANSWER: Thank you for the observation. We have removed “to-date” from the sentence.

L364: “findings from 54 CPET…”. Also, I don’t really understand what this sentence adds to the previous sentence. They seem to say the same thing. Can they be combined?

ANSWER: The first sentence refers to results from the systematic review, which includes all 80 studies, while the second sentence refers to the meta-analyzed data, which include only 54 of the 80 studies. That is, the qualitative results (systematic review) support the quantitative ones (meta-analysis). Hypothetically, data from the systematic review could indicate VO2max attainment in both protocols without this being confirmed by the statistical analysis. Although the two sentences appear similar, they reinforce the consistency of the results across a different number of studies in each case (80 vs. 54).

L366: Please revisit the grammar and revise this sentence.

ANSWER: This sentence has been removed from the revised discussion section.

L369-370: I suggest removing this important information from the brackets and supporting the statement with a reference (i.e., that this is within the most commonly adopted measures of test variability at 2-3%).

ANSWER: The correction has been made.

L371: Revise the grammar.

ANSWER: This text has been removed from the revised discussion section.

L374-375: Revise the grammar.

ANSWER: This text has been removed from the revised discussion section.

L383: “for example” appears out of place here, in a new paragraph.

ANSWER: This text has been removed from the revised discussion section.

L383-393: This is a long account of a previous study. What point is being made?

ANSWER: This relevant study by Day et al. reported that traditional VO2max criteria are protocol dependent (the plateau occurrence is not a commonly observed phenomenon in continuous step-incremented tests and even less so in individuals with low cardiorespiratory fitness). In contrast, a verification bout induced no difference in VO2 compared to that observed in the CPET. In summary, the verification phase does not appear to be affected by protocol design and is not population dependent. This is an important point and the findings by Day et al. are consistent with the results from the current meta-analysis, because all sub-analyses indicated no differences according to protocol and individual characteristics.

L383-403: This is a long paragraph describing a few previous studies and their results, but there is no reference to the overall findings of current study. Please revise.

ANSWER: Rossiter et al. investigated similar questions as those by Day et al., which already have been commented on in our previous answer.

L404: A new paragraph should not begin with “On the other hand”. Also, is this point (“54 studies meta-analyzed”) now in relation to your meta-analysis? Please clarify. In general, I suggest you revise your paragraph structures and arguments to arrive more quickly at the points you are trying to make, in light of previous work, and link the story to the findings of the current analysis. At the moment I am wondering where all this is going (particularly the paragraph above), and what the relevance is to the findings of your systematic review/meta-analysis.

ANSWER: “On the other hand” has been removed from the sentence. The 54 studies are from our meta-analysis. The next paragraphs discuss the small proportion of the included studies (~11%) that found significant mean differences between the CPET and verification phase, and we have tried to explain the possible reasons for these differences. When analyzed individually, 48 of the 54 studies showed no difference in the highest VO2 between the CPET and verification phase, which is consistent with the present meta-analysis. We have comprehensively revised the discussion section so that it arrives more quickly at the points being made, in line with the reviewer’s comment.

L406-409: The use of brackets seems inconsistent here too.

ANSWER: The round and square brackets are now standardized throughout the manuscript.

L425-426: Please revise the writing here: “since low cardiorespiratory fitness are more susceptible to stopping early during the CPET”

ANSWER: We have added “individuals with” low...

L404-432: This is another very long paragraph describing previous studies in length. I think these points can be made far more concisely and interpreted more clearly in light of the general findings from your study.

ANSWER: We have removed the unnecessary detail as part of the comprehensive revision of the discussion section.

L427-432: You seem to have spent a long time describing a possibility that is not actually supported by your own data analysis. I suggest turning your thinking around, and discussing what you did find and what that actually means, in relation to the specific studies analysed. I get your point, but it is not actually supported by your stats, which is the issue. Alternatively, make this point more concisely so that it is not over-inflated.

ANSWER: The percentages are from our data analysis. The difference across cardiorespiratory fitness levels was not significant, supporting the use of the verification phase regardless of individual characteristics, which contrasts with the low occurrence of the VO2 plateau during a CPET in subjects with low cardiorespiratory fitness.

L433: Due to the sheer length of this discussion I have read the remainder with less of a focus on details. As stated above, my advice is that the writing should be clearer and more focused, with less extensive description of previous studies.

ANSWER: The discussion section has undergone a comprehensive revision to improve the grammar and make it more concise by removing specific details that provide limited information for the interpretation of the results.

L499-501: Again, is there any data to support Poole and Jones’ suggestion here (i.e., that researchers “must”…), or is it a subjective view (because other research would suggest that this is not necessary)?

ANSWER: The correction has been made.

Table 1:

Title: (N = 80) – include space; CPET has not been defined here, but it has in the Fig 2 title.

ANSWER: A space has been included and CPET has now been defined.

Table 2:

Abbreviations for TR and CYC seem to be missing (I haven’t read every detail of Table 2, so there might be other bits missing).

ANSWER: We have double-checked the abbreviations to ensure there are none missing.

Table 3:

Can you clarify in the table heading row what is meant by “Total”

ANSWER: “Total” means the number of participants who performed both the CPET and the verification phase. Because we used Review Manager to calculate the statistics, the software layout requires the input data in this order for each intervention: mean, standard deviation, and number of participants.

For Arad et al., % Weight should probably be written to 2 d.p. (1.40%).

ANSWER: Thank you, the correction has been made.

Figure 1:

This is a nice figure. Is it what you refer to in L167?

ANSWER: Yes. This figure is a flowchart with the process of inclusion and exclusion of the studies.

“Hand searches (reference lists from the previously identified studies)”

ANSWER: Thank you. The correction has been made.

Should it be “Records excluded”? (Box to the right in “Screening”?)

ANSWER: Yes. We have changed to “Records excluded”.

Eligibility: can the horizontal arrow stem from the left box?

ANSWER: Figure 1 has been formatted according to PRISMA guidelines and we would prefer to keep it in its current format.

Figure 2:

Legend on L319: Suggest “reported as mean differences (MD) adjusted for” (and remove the final sentence on this line).

ANSWER: We have incorporated your suggestion.

Attachment

Submitted filename: PlosOnes_response letter.docx

Decision Letter 2

Laurent Mourot

14 Jan 2021

PONE-D-20-25408R2

Is a verification phase useful for confirming maximal oxygen uptake in apparently healthy adults? A systematic review and meta-analysis

PLOS ONE

Dear Dr. Cunha,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the minor points raised during the review process.

Please submit your revised manuscript by Feb 28 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Laurent Mourot

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: Congratulations on producing a great paper! I have once again read the manuscript in detail and have a series of small suggestions. Most reflections relate to the Discussion, but I am happy to accept the manuscript following a final revision by the authors.

Abstract:

Really clear.

Introduction:

Really informative. Just a few small suggestions…

L68: “fast-responding” – NOTE that this rule should be applied throughout the manuscript in my opinion… two words forming an adjective to describe a subsequent noun should be hyphenated (there are lots of examples, but e.g. verification-phase characteristics, verification-phase duration, highly-trained endurance athletes, test-protocol dependent, 10-min recovery, different-day variability etc.).

L100: 27-39%

L117: Could you just double check whether WRpeak has already been defined. If not then I suggest you explicitly define the abbreviation here.

Methods:

Really clear. Just a few small suggestions…

L135: Suggest writing the PROSPERO weblink and reg number in one single sentence.

L161: This final criteria seems redundant when you have specified (at L158) “bipedal treadmill running or walking”. I would remove it (otherwise the reader could wonder why you haven’t included outdoor/track cycling as an exclusion criteria).

L204: You are missing a full stop.

Results:

Really thorough. Some simple suggestions to consider…

L231-233: “All potential… for eligibility”. These two sentences seem like methods. I suggest removing them.

L263-266: This makes 82 studies in total (103%). I suggest that the two studies using both are not included in the other lists, so that they are not duplicated (if that’s what’s happened) and so that the total = 80. If not then at least re-phrase “and another two”, because that’s not strictly true. It’s two of those already mentioned (so something like “two of which”). I would actually re-order the list to have continuous, then discontinuous, then both cont+discont, then self-paced, just to improve the logical flow for the reader. But it’s up to you!

L269: Could you double check that the RERmax abbreviation has been defined already. If not then I suggest you explicitly define it here.

L271-275: It’s a bit unclear why you’ve included %ages for the first two examples, but not throughout the rest of the sentence (especially when %ages are included in the previous and following sentences). Maybe it’s all the small numbers, but something to re-consider.

L293: Forty-two studies (53%) employed

L296: researchers’ laboratories

L313: were judged to have a low risk of bias…

Discussion:

I have some minor suggestions and reflections…

L362-363: of these studies, 90% of which have been published since 2009.

L363: I would probably remove “of the review”, because it’s both a review and a MA.

L367-369: I suggest removing these two sentences as both seem out of place, with neither relating directly to the aims of the study. Firstly, you didn’t explicitly investigate safety across fitness groups. And the second sentence implies that you could run a VP as a stand-alone test to measure VO2max, which you didn’t investigate (and shouldn’t be implied, as the effect of the previous CPET is a confounder).

L373: Is “error” the correct term here? Isn’t it just a “difference”?

L375: You didn’t measure agreement, specifically. Is a better term “similarity”?

L379-382: Have you got this the wrong way round? Are low fitness groups not “less” likely to exhibit a VO2 plateau as a result of VO2 not decelerating?

L385: When you write “mean VO2 values”, do you actually mean “mean VO2max values”? There is of course an important difference between mean VO2 and mean VO2max. Also in this sentence, you should probably write “by” 0.03 and 0.04 L/min (not “of”).

L386-389: This sentence is confusing. Suggestion: “However, sub-group analyses revealed that while maximal VO2 in the CPET was higher than that attained in the verification phase for participants with moderate and high cardiorespiratory fitness, the opposite was true for those with lower cardiorespiratory fitness.”

L394: Same here as in L385, do you mean “mean VO2max”, rather than just “mean VO2”?

L399: Regarding verification phase… (or “In regards to…”)

L402: who performed five repeated treadmill CPET trials appended by a verification phase…

L406: I would remove the term “inappropriately”, unless you can support the inappropriateness clearly with evidence/a reference.

L408-409: which allowed sufficient time (i.e., ~ XXX s) for VO2max attainment [can you include the VP duration here, to clarify how much greater it was than the previously highlighted 80 s].

L412-413: found no difference for verification phase durations of ≤ 80 s, 81-120 s, and > 120 s

L414: Again, where is the justification for deeming this duration “inappropriate”? Please provide evidence/a reference.

L422: 1-hr should not be hyphenated.

L423-424: What about fatigue?

L426: The opposite is also plausible, though, as prior exercise (warm-up/priming) is known to have favorable effects on VO2. So I think you need to justify the greater advantage of no prior “fatiguing” exercise compared to no warm-up.

L436: Would “effectiveness” be better than “utility”?

L440: Suggest inserting a comma after [128]

L441: Suggest removing the comma after “phase”

L442: suggested that researchers…

L448: is sufficiently long.

L456-458: To me the problem seems to be more an issue of confounding results rather than limited data. There are loads of studies, as you’ve highlighted, but they all show different results!

L458: evidence-based

L459: 10-20 min

L463: Where is the evidence that this “might be better tolerated”? Is there evidence that performing the VP on the same day is not well tolerated? I would remove this idea and just write “An alternative method is to perform the VP on a separate day…”

L498: Suggest changing “patient” to “participant”, as this is not just relevant to clinical populations, but also healthy/athletes, etc.

L500: I don’t understand the term “would be ideally indicated in”, in this context. Something like “is applicable to” might be better.

L502: Does “those” refer to the wheelchair athletes? Presumably not, so maybe write more clearly: “individuals with spinal-cord injuries”.

L504-505: This seems repetitive from L499-500, so I suggest removing this sentence.

Tables:

Table 3 heading: Would it be better as “Overall comparisons in the meta-analyzed studies…”

Table 4 (see also L242-243): I might have missed it, but have you referenced/justified your CR fitness level classifications (Low, Moderate, High) in the Methods? Please check and add if necessary.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: Yes: Kerry McGawley

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Feb 17;16(2):e0247057. doi: 10.1371/journal.pone.0247057.r006

Author response to Decision Letter 2


16 Jan 2021

Reviewer #2:

General Comments

Reviewer #2: Congratulations on producing a great paper! I have once again read the manuscript in detail and have a series of small suggestions. Most reflections relate to the Discussion, but I am happy to accept the manuscript following a final revision by the authors.

ANSWER: Thank you for your comments and for the obvious substantial time and effort you have given to help improve our paper.

Abstract:

Really clear.

ANSWER: Thank you for your comment.

Introduction:

Really informative. Just a few small suggestions…

L68: “fast-responding” – NOTE that this rule should be applied throughout the manuscript in my opinion… two words forming an adjective to describe a subsequent noun should be hyphenated (there are lots of examples, but e.g. verification-phase characteristics, verification-phase duration, highly-trained endurance athletes, test-protocol dependent, 10-min recovery, different-day variability etc.).

ANSWER: You are correct. The corrections have been made.

L100: 27-39%

ANSWER: The alteration has been made.

L117: Could you just double check whether WRpeak has already been defined. If not then I suggest you explicitly define the abbreviation here.

ANSWER: Thank you. We have included its definition: […] a verification phase should be performed at a higher WR than the last load attained in the CPET (i.e. > WRpeak) […].

Methods:

Really clear. Just a few small suggestions…

L135: Suggest writing the PROSPERO weblink and reg number in one single sentence.

ANSWER: We have incorporated your suggestion.

L161: This final criterion seems redundant when you have specified (at L158) “bipedal treadmill running or walking”. I would remove it (otherwise the reader could wonder why you haven’t included outdoor/track cycling as an exclusion criterion).

ANSWER: Thank you. We have removed the final criterion.

L204: You are missing a full stop.

ANSWER: A full stop has been added.

Results:

Really thorough. Some simple suggestions to consider…

L231-233: “All potential… for eligibility”. These two sentences seem like methods. I suggest removing them.

ANSWER: The two sentences have been removed.

L263-266: This makes 82 studies in total (103%). I suggest that the two studies using both are not included in the other lists, so that they are not duplicated (if that’s what’s happened) and so that the total = 80. If not then at least re-phrase “and another two”, because that’s not strictly true. It’s two of those already mentioned (so something like “two of which”). I would actually re-order the list to have continuous, then discontinuous, then both cont+discont, then self-paced, just to improve the logical flow for the reader. But it’s up to you!

ANSWER: We have revised the sentence according to your suggestion and it now reads as follows: “Seventy-three studies (91%) used continuous step-incremented or ramp/pseudo-ramp CPET protocols. Three (4%) used only discontinuous step-incremented protocols. Two studies (3%) used both discontinuous and continuous step-incremented protocols and another two studies (3%) applied self-paced protocols.”
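
The counts in the revised sentence can be checked with a short sketch. The category labels, the helper name `pct_half_up`, and the round-half-up rule are assumptions made for illustration; the letter does not state how the percentages were rounded.

```python
# Study counts per CPET protocol type, as quoted in the revised sentence.
counts = {
    "continuous": 73,
    "discontinuous only": 3,
    "both discontinuous and continuous": 2,
    "self-paced": 2,
}


def pct_half_up(count, total):
    """Whole-number percentage, rounding halves up (assumed rule), via integer arithmetic."""
    return (200 * count + total) // (2 * total)


total = sum(counts.values())
percentages = {name: pct_half_up(n, total) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name}: {n} studies ({percentages[name]}%)")
print(f"total: {total} studies; rounded percentages sum to {sum(percentages.values())}%")
```

Under these assumptions the four categories add up to the 80 included studies, and the rounded percentages sum to slightly more than 100% only because each one is rounded separately.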

L269: Could you double check that the RERmax abbreviation has been defined already. If not then I suggest you explicitly define it here.

ANSWER: We have included its definition: […] 18 used the maximal RER attained in the CPET (RERmax) […].

L271-275: It’s a bit unclear why you’ve included %ages for the first two examples, but not throughout the rest of the sentence (especially when %ages are included in the previous and following sentences). Maybe it’s all the small numbers, but something to re-consider.

ANSWER: We have added the percentages so every count in the paragraph has a corresponding percentage.

L293: Forty-two studies (53%) employed

ANSWER: The alteration has been made.

L296: researchers’ laboratories

ANSWER: The alteration has been made.

L313: were judged to have a low risk of bias…

ANSWER: The alteration has been made.

Discussion:

I have some minor suggestions and reflections…

ANSWER: OK.

L362-363: of these studies, 90% of which have been published since 2009.

ANSWER: The alteration has been made.

L363: I would probably remove “of the review”, because it’s both a review and a MA.

ANSWER: “Review” has been removed.

L367-369: I suggest removing these two sentences as both seem out of place, with neither relating directly to the aims of the study. Firstly, you didn’t explicitly investigate safety across fitness groups. And the second sentence implies that you could run a VP as a stand-alone test to measure VO2max, which you didn’t investigate (and shouldn’t be implied, as the effect of the previous CPET is a confounder).

ANSWER: These sentences have been removed.

L373: Is “error” the correct term here? Isn’t it just a “difference”?

ANSWER: We have changed “error” to “difference”.

L375: You didn’t measure agreement, specifically. Is a better term “similarity”?

ANSWER: We have changed “agreement” to “similarity”.

L379-382: Have you got this the wrong way round? Are low fitness groups not “less” likely to exhibit a VO2 plateau as a result of VO2 not decelerating?

ANSWER: You are correct. Thank you for spotting this error. The correction has been made.

L385: When you write “mean VO2 values”, do you actually mean “mean VO2max values”? There is of course an important difference between mean VO2 and mean VO2max. Also in this sentence, you should probably write “by” 0.03 and 0.04 L/min (not “of”).

ANSWER: We have written “mean VO2max values”. The word “max” means the highest VO2 the individuals could achieve. The correction has been made.

L386-389: This sentence is confusing. Suggestion: “However, sub-group analyses revealed that while maximal VO2 in the CPET was higher than that attained in the verification phase for participants with moderate and high cardiorespiratory fitness, the opposite was true for those with lower cardiorespiratory fitness.”

ANSWER: Thank you. We have incorporated your suggestion.

L394: Same here as in L385, do you mean “mean VO2max”, rather than just “mean VO2”?

ANSWER: We have revised the text to include “mean VO2max”.

L399: Regarding verification phase… (or “In regards to…”)

ANSWER: The text has been changed to “Regarding verification intensity…”

L402: who performed five repeated treadmill CPET trials appended by a verification phase…

ANSWER: The alteration has been made.

L406: I would remove the term “inappropriately”, unless you can support the inappropriateness clearly with evidence/a reference.

ANSWER: We have removed “inappropriately” as suggested.

L408-409: which allowed sufficient time (i.e., ~ XXX s) for VO2max attainment [can you include the VP duration here, to clarify how much greater it was than the previously highlighted 80 s].

ANSWER: The suggested information has been added.

L412-413: found no difference for verification phase durations of ≤ 80 s, 81-120 s, and > 120 s

ANSWER: The correction has been made.

L414: Again, where is the justification for deeming this duration “inappropriate”? Please provide evidence/a reference.

ANSWER: We have removed the term “inappropriate”, consistent with our response to your previous comment on this issue.

L422: 1-hr should not be hyphenated.

ANSWER: The correction has been made.

L423-424: What about fatigue?

ANSWER: We have added information on fatigue and the sentence now reads as follows: “It is feasible that the procedures performed before the maximal CPET may have led to poor participant motivation, lack of effort and premature fatigue in the following test”.

L426: The opposite is also plausible, though, as prior exercise (warm-up/priming) is known to have favorable effects on VO2. So I think you need to justify the greater advantage of no prior “fatiguing” exercise compared to no warm-up.

ANSWER: The aim of this sentence is to highlight the potential effect of residual fatigue from the maximal CPET on subsequent verification phase performance. We have added the word “maximal” to the sentence to help clarify this.

L436: Would “effectiveness” be better than “utility”?

ANSWER: We have incorporated your suggestion.

L440: Suggest inserting a comma after [128]

ANSWER: The comma has been inserted.

L441: Suggest removing the comma after “phase”

ANSWER: The comma has been removed.

L442: suggested that researchers…

ANSWER: The correction has been made.

L448: is sufficiently long.

ANSWER: The correction has been made.

L456-458: To me the problem seems to be more an issue of confounding results rather than limited data. There are loads of studies, as you’ve highlighted, but they all show different results!

ANSWER: We have changed “limited data” to “confounding results”.

L458: evidence-based

ANSWER: The alteration has been made.

L459: 10-20 min

ANSWER: The correction has been made.

L463: Where is the evidence that this “might be better tolerated”? Is there evidence that performing the VP on the same day is not well tolerated? I would remove this idea and just write “An alternative method is to perform the VP on a separate day…”

ANSWER: We have incorporated your suggestion.

L498: Suggest changing “patient” to “participant”, as this is not just relevant to clinical populations, but also healthy/athletes, etc.

ANSWER: We have incorporated your suggestion.

L500: I don’t understand the term “would be ideally indicated in”, in this context. Something like “is applicable to” might be better.

ANSWER: We have incorporated your suggestion.

L502: Does “those” refer to the wheelchair athletes? Presumably not, so maybe write more clearly: “individuals with spinal-cord injuries”.

ANSWER: We have incorporated your suggestion.

L504-505: This seems repetitive from L499-500, so I suggest removing this sentence.

ANSWER: We have removed this sentence.

Tables:

Table 3 heading: Would it be better as “Overall comparisons in the meta-analyzed studies…”

ANSWER: We have incorporated your suggestion.

Table 4 (see also L242-243): I might have missed it, but have you referenced/justified your CR fitness level classifications (Low, Moderate, High) in the Methods? Please check and add if necessary.

ANSWER: We provided the Astorino et al. 2015 reference for the CR classifications in line 226 (line 224 in the revised manuscript).

Attachment

Submitted filename: PlosOnes_response letter.docx

Decision Letter 3

Laurent Mourot

1 Feb 2021

Is a verification phase useful for confirming maximal oxygen uptake in apparently healthy adults? A systematic review and meta-analysis

PONE-D-20-25408R3

Dear Dr. Cunha,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Laurent Mourot

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: Well done, I look forward to seeing this paper in print.

-------------------------------------------

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: Yes: Kerry McGawley

Acceptance letter

Laurent Mourot

4 Feb 2021

PONE-D-20-25408R3

Is a verification phase useful for confirming maximal oxygen uptake in apparently healthy adults? A systematic review and meta-analysis

Dear Dr. Cunha:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr Laurent Mourot

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Checklist. PRISMA 2009 checklist.

    (DOCX)

    S1 Text. Search strategy.

    (DOCX)

    Attachment

    Submitted filename: Comments to the authors.pdf

    Attachment

    Submitted filename: PlosOnes_response letter.docx

    Data Availability Statement

    All relevant data are within the paper and its Supporting information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
