PLOS One. 2022 May 5;17(5):e0268074. doi: 10.1371/journal.pone.0268074

Reproducibility of strength performance and strength-endurance profiles: A test-retest study

Benedikt Mitter 1,*, Robert Csapo 1,#, Pascal Bauer 1,#, Harald Tschan 1,#
Editor: Mathieu Gruet 2
PMCID: PMC9070879  PMID: 35511896

Abstract

The present study was designed to evaluate the test-retest consistency of repetition maximum tests at standardized relative loads and determine the robustness of strength-endurance profiles across test-retest trials. Twenty-four resistance-trained males and females (age, 27.4 ± 4.0 y; body mass, 77.2 ± 12.6 kg; relative bench press one-repetition maximum [1-RM], 1.19 ± 0.23 kg·kg⁻¹) were assessed for their 1-RM in the free-weight bench press. After 48 to 72 hours, they were tested for the maximum number of achievable repetitions at 90%, 80% and 70% of their 1-RM. A retest was completed for all assessments one week later. Gathered data were used to model the relationship between relative load and repetitions to failure with respect to individual trends using Bayesian multilevel modeling and applying four recently proposed model types. The maximum number of repetitions showed slightly better reliability at lower relative loads (ICC at 70% 1-RM = 0.86, 90% highest density interval: [0.71, 0.93]) compared to higher relative loads (ICC at 90% 1-RM = 0.65 [0.39, 0.83]), whereas the absolute agreement was slightly better at higher loads (SEM at 90% 1-RM = 0.7 repetitions [0.5, 0.9]; SEM at 70% 1-RM = 1.1 repetitions [0.8, 1.4]). The linear regression model and the 2-parameters exponential regression model revealed the most robust parameter estimates across test-retest trials. Results testify to good reproducibility of repetition maximum tests at standardized relative loads obtained over short periods of time. A complementary free-to-use web application was developed to help practitioners calculate strength-endurance profiles and build individual repetition maximum tables based on robust statistical models.

Introduction

Dynamic strength endurance has previously been defined as the amount of concentric work an individual can produce in a cyclic or repetitive movement [1]. Assuming that the range of motion is approximately constant for each repetition of a given resistance training exercise, strength endurance can therefore be described by the number of repetitions performed to momentary failure (RTF) at a given load for a single sustained trial [1,2]. The evaluation of strength endurance by means of a repetition maximum test (occasionally also called repetition endurance test) usually involves an exercise being performed to momentary failure at either a fixed absolute load, expressed in a unit of mass like kg or lbs, or a fixed relative load that has been normalized to the exercise-specific one-repetition maximum (1-RM). The concept is widely applied by coaches to guide resistance training programming [1,3,4]. However, given the fact that resistance training is usually carried out across a wider spectrum of loads, assessing the RTF an individual can execute at a single load only provides limited insight into a person’s fatigue resistance. More meaningful insights into strength endurance could be obtained by studying the relationship between load and RTF (i.e., the individual “strength-endurance profile”). Additionally, knowledge of the mathematical relationship between the two variables could be used by practitioners to predict the load associated with a certain repetition maximum. This may be of particular interest for individuals seeking to control intensity of effort within a set [5] by prescribing a certain percentage of the maximum load that can be used for a given number of repetitions [6]. While other methods have been proposed to evaluate or control intensity of effort based on perceived effort or movement velocity [7], an approach using strength-endurance profiles might overcome certain limitations of these methods. 
Such limitations include inappropriate anchoring of perception [5], inaccurate subjective estimates of repetitions in reserve at lower intensity of effort [8] and dependency on technology to provide reliable feedback on movement velocity [9].

The relationship between load and RTF can be expressed through simple bivariate models. Thus far, research has proposed models that describe either a linear [10–12] or an exponential relationship [11–14]; usually, the respective model equations are then rearranged to predict the 1-RM from a repetition maximum test. However, studies conducted to test the validity of these equations often showed poor predictive accuracy, especially when the applied repetition maximum test was executed at loads allowing for 10 repetitions or more [3,12,13,15–17]. The poor validity may be related to substantial inter-individual differences in strength-endurance relationships, which models fail to account for when they do not incorporate the responsible confounding factors. Indeed, there is evidence that the number of repetitions that can be performed at a given relative load, and hence the strength-endurance relationship, may depend on various factors, such as qualitative and quantitative training background [11,18–20], fiber type composition and the capillary density of involved muscles [21,22], exercise [12,19,23] and movement cadence [14,24]. A possible solution to overcome these challenges in modeling strength-endurance relationships has been proposed by Morton and colleagues [25], who introduced the idea of creating subject-specific models, thereby treating the individual person as the population of interest. For this purpose, the authors reformulated the critical power model originally proposed by Monod and Scherrer [26] such that it may be applied to isoinertial resistance training exercises. The resulting model has recently been referred to as the critical load model and was originally presented as a non-linear function featuring three parameters [25,27].

While the individualized modeling approach may reduce variance resulting from uncontrolled confounding variables, such models are typically estimated from a limited number of available data due to the exhaustive nature of sets performed to failure [25,27]. Hence, the estimation of model parameters can be strongly affected by variability in test results, as single data points tend to have a larger influence on parameter estimates in small samples compared to large samples. Therefore, the robustness of individual strength-endurance models over short periods is crucial for their application in practice. The present study was designed to target two objectives: 1) to evaluate the consistency of the RTF at standardized relative loads and 2) to compare the reproducibility of four recently proposed models describing the individual strength-endurance relationship. A complementary, freely available web application will be provided to allow practitioners to easily calculate strength-endurance profiles based on a model that can be considered sufficiently robust to help with the design and regulation of resistance training programs.

Materials and methods

Subjects

Fifteen resistance-trained men (age = 27.2 ± 3.3 y, body mass = 85.4 ± 7.9 kg, bench press 1-RM/body mass = 1.33 ± 0.11 kg·kg⁻¹) and nine resistance-trained women (age = 27.7 ± 5.2 y, body mass = 63.6 ± 3.3 kg, bench press 1-RM/body mass = 0.96 ± 0.17 kg·kg⁻¹) volunteered to be tested for the present study. In order to participate, subjects had to be between 18 and 40 years of age, free of illness and injury and have at least one year of training experience in the bench press exercise as well as a 1-RM corresponding to at least 1.0 × body mass for men and 0.75 × body mass for women, respectively. Prior to physical testing, participants were informed about the possible risks, had to complete a modified physical activity readiness questionnaire (PAR-Q) and sign an informed consent form. The study was designed in fulfillment of the ethical guidelines communicated in the Declaration of Helsinki and approved by the host institution’s local ethical committee (no. 00461).

Experimental approach

A test-retest design was used to determine the participants’ maximum strength and strength-endurance at high loads in the free-weight bench press exercise on two occasions (T1 & T2) separated by one week (Fig 1). Maximum strength was assessed according to a progressive 1-RM test. Strength-endurance was assessed using repetition maximum tests at 90%, 80% and 70% of the 1-RM, respectively. In order to provide some rest, the 1-RM test and the repetition maximum tests were executed on two different days, separated by 48 to 72 hours, resulting in a total of four visits to the laboratory within 11 days. Importantly, the relative loads applied for the repetition maximum tests during the fourth visit were adjusted to the 1-RM achieved during the third visit. Consequently, a change in the 1-RM between T1 and T2 also implied a change in the absolute load used for the repetition maximum tests at T2, in order to ensure that trials were performed at 90%, 80% and 70% of the current 1-RM. All tests were completed at the same time of day.
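The load adjustment described above can be illustrated with a minimal sketch. The function name `prescribe_loads` and the example 1-RM values are hypothetical and not taken from the study; the sketch only shows that a change in the 1-RM between trials shifts every absolute test load while the relative loads stay fixed.

```python
def prescribe_loads(one_rm_kg, percentages=(90, 80, 70)):
    """Return {relative load in %: absolute load in kg} for a given 1-RM."""
    return {pct: one_rm_kg * pct / 100 for pct in percentages}


# Hypothetical participant: 1-RM of 100 kg at T1 and 102.5 kg at T2.
loads_t1 = prescribe_loads(100.0)
loads_t2 = prescribe_loads(102.5)

for pct in loads_t1:
    change = loads_t2[pct] - loads_t1[pct]
    print(f"{pct}% 1-RM: T1 {loads_t1[pct]:.2f} kg, T2 {loads_t2[pct]:.2f} kg (+{change:.2f} kg)")
```

Any rounding to available plate increments is left out here because the text does not state how loads were rounded.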

Fig 1. Experimental design.

1-RM, one-repetition maximum; RTF, repetitions performed to momentary failure.

Procedures

On the first day, subjects completed preliminary health screening and filled in a physical activity form to evaluate their experience with the tested exercise. Body height and body mass were assessed using a stadiometer (Seca Model 217; SECA GmbH & Co. KG., Hamburg, Germany) and scale (Seca Model 877; SECA GmbH & Co. KG., Hamburg, Germany). Participants then completed a standardized warm-up consisting of cycling for 5 min at a constant power output of 1 W per kg body mass and a rotational velocity of 80 rpm on an ergometer (Kettler X1, Trisport, Huenenberg, Switzerland), followed by a brief dynamic upper body mobilization routine. Subsequently, they were familiarized with the standardized movement technique for the bench press: each subject had to lower the barbell onto two safety pins, which were individually adjusted to a height that would allow for up to 3 cm of vertical distance between the bottom barbell position and the subject’s chest. An experienced staff member visually ensured that the barbell was placed on the safety pins without rebound, before giving the verbal command “Press!”, signaling the subject to execute the concentric phase of the bench press at maximum voluntary velocity. Participants were required to maintain their hip, shoulders and head positioned on the bench and their feet placed on the floor during each set.

Upon completion of the familiarization, participants were requested to estimate their 1-RM based on self-evaluation of their recent training performance. The subsequent 1-RM test featured a progressive loading pattern with the first five loads being fixed at 25%, 50%, 75%, 85% and 95% of the estimated 1-RM, while mean concentric barbell velocity was recorded with a linear position transducer (GymAware Power Tool, Kinetic Performance Technologies, Canberra, Australia). In the initial set, three repetitions were performed, followed by a 3-min break. Two repetitions were performed once the highest achieved velocity of the preceding set fell below 1.0 m/s, followed by a 4-min break and single repetitions were performed once it fell below 0.65 m/s, followed by a 5-min break. After successfully completing 95% of the estimated 1-RM, loads were increased individually to approximate the true 1-RM. The 1-RM was considered to be determined once a load increment of 2.5 kg from the preceding set would no longer allow the subject to complete the exercise across the full range of motion.
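The velocity-based progression rule can be sketched as follows. The function names are hypothetical, and one assumption goes beyond the text: sets are taken to continue with three repetitions and a 3-min break for as long as the best velocity of the preceding set stays at or above 1.0 m/s (the text only states this explicitly for the initial set).

```python
def opening_loads(estimated_1rm_kg):
    """Fixed opening loads at 25, 50, 75, 85 and 95% of the self-estimated 1-RM."""
    return [estimated_1rm_kg * pct / 100 for pct in (25, 50, 75, 85, 95)]


def next_set_plan(prev_best_velocity=None):
    """Return (repetitions, rest in minutes) for the upcoming set.

    prev_best_velocity is the highest mean concentric velocity (m/s) reached
    in the preceding set, or None for the initial set.
    """
    if prev_best_velocity is None or prev_best_velocity >= 1.0:
        return 3, 3  # assumption: three repetitions are kept while velocity stays >= 1.0 m/s
    if prev_best_velocity >= 0.65:
        return 2, 4  # velocity fell below 1.0 m/s
    return 1, 5      # velocity fell below 0.65 m/s


print(opening_loads(100.0))
print(next_set_plan(None), next_set_plan(0.8), next_set_plan(0.5))
```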

Repetition maximum tests were completed at 90%, 80% and 70% of the identified 1-RM, respectively, in the form of a single-visit protocol. Barbell loads were not randomized, but prescribed in a descending scheme, to minimize systematic effects of accumulated fatigue on the performance during subsequent sets [28]. In order to provide extended time for recovery in between repetition maximum tests, yet sustain warm-up effects during these periods, participants underwent the same general warm-up procedure that was used for the 1-RM test prior to each set to failure. Additionally, they performed a specific warm-up including three repetitions at 25%, three repetitions at 50% and two repetitions at 75% 1-RM prior to each set to failure. A passive rest of 3 min was provided between warm-up sets and additional 5 min before and immediately after each set to failure. Due to this methodological structure (i.e., the standardized warm-up; the standardized passive rest before and after each set to failure), the repetition maximum tests were separated by approximately 22 min each. Criteria for movement execution were kept identical to those communicated for the 1-RM test. Participants were instructed to lower the barbell in a controlled fashion on each repetition, albeit not being prescribed a fixed movement cadence. Similar to the 1-RM test, participants had to await the verbal command of the staff member before initiating the concentric phase of a repetition, in order to avoid any rebound from the safety pins. The concentric phase of each repetition had to be performed at maximum intended velocity. A repetition maximum test was terminated once the participant was unable to complete another repetition across the full range of motion despite using maximal effort, suggesting that the point of momentary failure had been reached [5].

Statistical analysis

Reproducibility of performance measures

Statistics were calculated following a Bayesian approach using weakly informative priors. To assess test-retest reliability of the 1-RM and RTF at 90%, 80% and 70% 1-RM, respectively, the following random-intercept mixed effects model was used:

$P_{ij} \sim \mathrm{Normal}(\mu + s_i + \Delta_t D_j,\ \sigma_e^2)$ [1]

In this model, $P_{ij}$ describes the analyzed performance measure as a dependent variable, $\mu$ describes the mean performance for T1, $s_i$ the random deviation from $\mu$ for subject $i$, $\Delta_t$ the fixed effect of time (i.e., the systematic difference in performance between T2 and T1), $D_j$ a binary dummy variable for trial $j$, and $\sigma_e^2$ the variance of model residuals. The random effect parameter $s_i$ was considered to be sampled from a normal distribution with a mean of 0 and a variance of $\sigma_s^2$, as suggested by Baumgartner and colleagues [29]. Posterior distributions for each model parameter were sampled using the Hamiltonian Monte Carlo algorithm of the probabilistic programming language Stan [30] controlled through an R interface (rstan R package, version 2.21.2). Based on the resulting random-intercept models, relative consistency (reliability) of each performance measure was evaluated using the Intraclass Correlation Coefficient (ICC), which was estimated and interpreted as the proportion of total variance ($\sigma_s^2 + \sigma_e^2$) attributed to the variance among subjects ($\sigma_s^2$) [29]. Furthermore, absolute consistency (agreement) of performance was quantified using the Standard Error of Measurement (SEM = $\sigma_e$), Within-Subject Coefficient of Variation (WSCV = SEM / $\mu$) and Standard Error of Prediction (SEP = $\mathrm{SD}\,(1 - \mathrm{ICC}^2)^{1/2}$) [29,31,32]. Posterior distributions of the statistics were summarized and interpreted according to the Maximum a Posteriori point estimate (MAP) and 90% Highest Density Interval (HDI) [33]. Effect directions supported by at least 90% of posterior probability were considered “clear” or “likely”.
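The consistency statistics defined above follow directly from the variance components of the random-intercept model. The sketch below computes them for a single set of parameter values; the function name and the input values are hypothetical, and SD is assumed to be the total standard deviation, i.e. the square root of the sum of between-subject and residual variance. In a Bayesian workflow the same calculation would presumably be repeated for every posterior draw before summarizing.

```python
import math


def consistency_stats(mu, sigma_s, sigma_e):
    """Derive ICC, SEM, WSCV and SEP from random-intercept model parameters.

    mu: mean performance at T1
    sigma_s: between-subject standard deviation
    sigma_e: residual (within-subject) standard deviation
    """
    var_s, var_e = sigma_s ** 2, sigma_e ** 2
    icc = var_s / (var_s + var_e)           # share of total variance among subjects
    sem = sigma_e                           # standard error of measurement
    wscv = sem / mu                         # within-subject coefficient of variation
    total_sd = math.sqrt(var_s + var_e)     # assumption: SD is the total SD
    sep = total_sd * math.sqrt(1 - icc ** 2)
    return {"ICC": icc, "SEM": sem, "WSCV": wscv, "SEP": sep}


# Hypothetical values, not estimates from the study.
stats = consistency_stats(mu=12.0, sigma_s=3.0, sigma_e=1.0)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```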

Reproducibility of strength-endurance models

To describe the relationship between relative load and RTF with respect to individual trends, four previously proposed model types were expressed according to a multilevel (mixed effects) structure:

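Lin: $\mathrm{load}_i \sim \mathrm{Normal}(a_i + b_i \cdot \mathrm{RTF}_i,\ \sigma^2)$ [2]
Ex2: $\mathrm{load}_i \sim \mathrm{Normal}(a_i \cdot e^{b_i \cdot \mathrm{RTF}_i},\ \sigma^2)$ [3]
Ex3: $\mathrm{load}_i \sim \mathrm{Normal}(c_i + a_i \cdot e^{b_i \cdot \mathrm{RTF}_i},\ \sigma^2)$ [4]
Crit: $\mathrm{load}_i \sim \mathrm{Normal}(L'_i / (\mathrm{RTF}_i - k_i) + CL_i,\ \sigma^2)$ [5]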

Eq 2 (Lin) models the relationship as a linear regression. Eqs 3 (Ex2) and 4 (Ex3) both describe exponential regression models, where Ex3 follows the structure of a commonly proposed 3-parameters model [11,12,14] and Ex2 constitutes a simplified 2-parameters version without the additive parameter $c_i$ [13]. Eq 5 (Crit) presents the previously described critical load model adapted for relative load as dependent variable, using the original parameter labels $L'$, $k$ and $CL$ [25,27]. To evaluate how much parameter estimates for Eqs 2 to 5 differ between T1 and T2, a change effect was added for each of the abovementioned subject-level parameters. For example, the parameter expression $a_i$ was extended to $(a_i + \Delta a_i D_j)$, where $a_i$ reflects the target parameter at T1, $\Delta a_i$ reflects the change effect (difference) of the target parameter between T2 and T1 and $D_j$ reflects a binary dummy variable for trial $j$. Importantly, all of the abovementioned parameters were modeled as random effects that were free to vary across subjects. The multilevel structure was realized by sampling subject-level parameters and change effects from multivariate normal distributions, applying covariance matrices to account for possible correlations among subject-level parameters and change effects, respectively. Further details on models and prior selection are provided online (Supporting information 1 in S1 Appendix).
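As a plain illustration of the mean structure of the four model types, the sketch below writes each one as a function that returns the expected relative load (% 1-RM) for a given RTF. It is not the multilevel Bayesian implementation used in the study. The function names and example parameter values are hypothetical, and the critical load model is written with the denominator RTF − k, which assumes the form implied by the negative estimates reported for k.

```python
import math


def lin(rtf, a, b):
    """Linear model: load = a + b * RTF."""
    return a + b * rtf


def ex2(rtf, a, b):
    """2-parameters exponential model: load = a * exp(b * RTF)."""
    return a * math.exp(b * rtf)


def ex3(rtf, a, b, c):
    """3-parameters exponential model: load = c + a * exp(b * RTF)."""
    return c + a * math.exp(b * rtf)


def crit(rtf, l_prime, k, cl):
    """Critical load model: load = L' / (RTF - k) + CL (assumed form)."""
    return l_prime / (rtf - k) + cl


def with_change(param_t1, delta, trial):
    """Subject-level parameter at trial j: value at T1 plus change effect times dummy (0 = T1, 1 = T2)."""
    return param_t1 + delta * trial


# Hypothetical parameter values for illustration only.
print(lin(8, a=100.0, b=-2.5))
print(ex2(8, a=100.0, b=-0.03))
print(ex3(8, a=70.0, b=-0.05, c=30.0))
print(crit(10, l_prime=3600.0, k=-30.0, cl=-20.0))
print(lin(8, a=with_change(100.0, 0.5, trial=1), b=-2.5))
```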

A posterior predictive distribution was calculated by drawing random samples from the respective group-level (fixed effects) distribution of each change effect and the draws were standardized to the scale of the associated model parameter at T1. The resulting posterior predictive distributions were summarized and compared to a threshold for acceptable differences that was set at ±0.6, reflecting a small or trivial standardized change of the parameter [34]. Change effects were also expressed as a percentage of the group-level mean of the associated model parameter at T1 to facilitate the interpretation of parameters that are exceptionally homogeneous across subjects.
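The summary steps described above can be sketched with two small helpers: one that finds the narrowest interval holding a given share of the draws (a common sample-based way to obtain an HDI) and one that standardizes change-effect draws and reports the share falling inside the ±0.6 threshold. The function names and the draws are hypothetical, and a single fixed standard deviation is used for standardization as a simplification.

```python
import math


def hdi(samples, mass=0.90):
    """Narrowest interval that contains at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    window = max(1, math.ceil(mass * n - 1e-9))  # number of samples inside the interval
    best = min(range(n - window + 1), key=lambda i: xs[i + window - 1] - xs[i])
    return xs[best], xs[best + window - 1]


def prob_within(draws, sd_t1, threshold=0.6):
    """Share of standardized change-effect draws inside [-threshold, threshold]."""
    standardized = [d / sd_t1 for d in draws]
    inside = sum(1 for z in standardized if -threshold <= z <= threshold)
    return inside / len(standardized)


# Hypothetical change-effect draws on the parameter scale; sd_t1 is the
# group-level standard deviation of the parameter at T1 (also hypothetical).
draws = [-0.5, -0.2, 0.0, 0.1, 0.2, 0.3, 0.4, 0.6, 0.9, 1.5]
print(hdi(draws, mass=0.90))
print(prob_within(draws, sd_t1=1.0))
```

With real posterior output, the draws would number in the thousands, which makes the sample-based interval far more stable than in this toy input.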

Results

The variability of 1-RM performance as well as the RTF performed at 90%, 80% and 70% 1-RM is shown in Fig 2. On average, there was an increase in performance between T1 and T2 (Δt), the 90% HDI suggesting a small systematic increase of the 1-RM, the RTF at 80% 1-RM and the RTF at 70% 1-RM. Regarding relative consistency of performance, the 1-RM yielded nearly perfect reliability, with the ICC being close to 1. The RTF, on the other hand, showed a trend for higher relative consistency at lower loads, although the difference between load conditions was not statistically clear at the 90% credibility level. Analysis of absolute consistency revealed the SEM for the 1-RM to be likely less than 2.2 kg (90th percentile). Concerning the RTF performed at submaximal loads, the SEM was likely less than 1 repetition at 90% and 80% 1-RM, and likely less than 1.5 repetitions at 70% 1-RM. Subject performance and consistency statistics are summarized in Table 1.

Fig 2. Variability of strength performance in the bench press.

A, one-repetition maximum (1-RM); B, repetitions performed to momentary failure (RTF) at 90% 1-RM; C, RTF at 80% 1-RM; D, RTF at 70% 1-RM; grey circles, data points (jittered illustration); black circles, group means; solid grey lines, individual performance changes; dashed black lines, systematic performance changes (Δt).

Table 1. Consistency statistics for strength performance in the bench press.

| Statistic | 1-RM (kg) | RTF at 90% 1-RM (n) | RTF at 80% 1-RM (n) | RTF at 70% 1-RM (n) |
|---|---|---|---|---|
| Performance | | | | |
| T1 | 93.5 ± 28.9 | 4.2 ± 1.2 | 7.8 ± 1.7 | 12.2 ± 2.6 |
| T2 | 95.4 ± 29.9 | 4.5 ± 1.1 | 8.5 ± 1.4 | 12.9 ± 2.6 |
| Δt | 1.9 [1.0, 2.7] | 0.2 [-0.1, 0.6] | 0.7 [0.3, 1.0] | 0.7 [0.2, 1.2] |
| Absolute consistency | | | | |
| SEM | 1.7 [1.4, 2.3] | 0.7 [0.5, 0.9] | 0.7 [0.6, 1.4] | 1.1 [0.8, 1.4] |
| WSCV (%) | 1.8 [1.4, 2.5] | 15.9 [12.3, 21.3] | 9.2 [7.2, 12.2] | 8.8 [6.9, 11.8] |
| SEP | 2.3 [1.6, 3.3] | 0.9 [0.7, 1.1] | 0.9 [0.7, 1.9] | 1.4 [1.0, 1.9] |
| Relative consistency | | | | |
| ICC | 1.00 [0.99, 1.00] | 0.65 [0.39, 0.83] | 0.82 [0.64, 0.93] | 0.86 [0.71, 0.93] |

Sample data are presented as mean ± standard deviation.

Statistics are presented as Maximum a Posteriori estimate [90% Highest Density Interval].

1-RM, one-repetition maximum; Δt, fixed effect of time; ICC, intraclass correlation coefficient; RTF, repetitions performed to momentary failure; SEM, standard error of measurement; SEP, standard error of prediction; T1, baseline test; T2, retest; WSCV, within-subject coefficient of variation.

Posterior predictive distributions of subject-level model parameters at T1 and T2 are summarized in Table 2. Moreover, posterior predictive distributions for standardized change effects are shown in Fig 3. The critical load model revealed a systematic positive change effect for L’ [p (ΔL’i > 0 | data) > 99.9%] and systematic negative change effects for k [p (Δki < 0 | data) > 99.9%] and CL [p (ΔCLi < 0 | data) = 97.3%]. Similarly, the 3-parameters exponential model showed a systematic positive change effect for c [p (Δci > 0 | data) = 99.7%] and systematic negative change effects for a [p (Δai < 0 | data) = 99.6%] and b [p (Δbi < 0 | data) = 96.4%]. None of the remaining models’ parameters resulted in a clear positive or negative change at the 90% credibility level. No model parameter resulted in a clearly small or trivial change at the chosen credibility level and threshold for acceptable differences. However, the slope parameter of the linear model (b) and the curvature parameter of the 2-parameters exponential model (b) indicated a probability of >80% for the change effect to be small or trivial. Furthermore, both intercept parameters (a) of the linear model and the 2-parameters exponential model indicated relative change effects close to 0 (Table 3).

Table 2. Summary of posterior predictive distributions of absolute parameter values during test (T1) and retest (T2).

| Model | Parameter | T1 | T2 | Δxi (T2 − T1) |
|---|---|---|---|---|
| Lin | a | 101.5 [100.3, 102.5] | 101.9 [100.4, 103.4] | 0.4 [-0.8, 1.6] |
| Lin | b | -2.73 [-3.77, -1.68] | -2.56 [-3.64, -1.47] | 0.09 [-0.23, 0.47] |
| Ex2 | a | 102.6 [101.6, 103.8] | 102.9 [101.5, 104.4] | 0.3 [-0.9, 1.5] |
| Ex2 | b | -0.031 [-0.044, -0.020] | -0.030 [-0.043, -0.017] | 0.002 [-0.002, 0.006] |
| Ex3 | a | 76.3 [65.1, 95.3] | 63.4 [55.0, 75.2] | -12.7 [-25.1, -4.5] |
| Ex3 | b | -0.045 [-0.068, -0.022] | -0.054 [-0.080, -0.032] | -0.010 [-0.021, -0.001] |
| Ex3 | c | 27.3 [7.2, 38.1] | 40.7 [27.9, 48.8] | 13.7 [5.2, 25.9] |
| Crit | L’ | 3638.8 [2062.5, 6422.1] | 4583.0 [2637.9, 7271.6] | 613.7 [287.6, 1129.9] |
| Crit | k | -32.0 [-47.5, -20.5] | -36.5 [-52.0, -24.3] | -4.1 [-6.6, -2.0] |
| Crit | CL | -17.4 [-43.4, 3.6] | -23.5 [-48.5, -2.5] | -3.3 [-8.8, 0.0] |

Posterior predictive distributions are summarized using the Maximum a Posteriori estimate and 90% Highest Density Interval.

Crit, critical load model; Ex2, exponential model (2 parameters); Ex3, exponential model (3 parameters); Lin, linear model; Δxi, change effect between T1 and T2.

Fig 3. Posterior predictive distributions for standardized subject-level change effects (smoothed illustration).

Dashed black lines, threshold for acceptable differences set to [-0.6, 0.6] indicating small or trivial changes; *, change effects Δa and Δc of the exponential 3-parameters model are not visibly displayed due to very large scales.

Table 3. Summary of posterior predictive distributions of relative and standardized change effects.

| Model | Change effect | Relative magnitude (%) * | Standardized magnitude ** | p (Δxi ∈ [-0.6, 0.6] \| data) ** |
|---|---|---|---|---|
| Lin | Δa | 0.3 [-0.8, 1.6] | 0.38 [-4.46, 8.98] | 27.6% |
| Lin | Δb | 4.8 [-8.9, 17.2] | 0.17 [-0.38, 0.78] | 86.9% |
| Ex2 | Δa | 0.4 [-1, 1.4] | 0.13 [-4.43, 9.35] | 29.5% |
| Ex2 | Δb | 4.6 [-7.9, 18.1] | 0.22 [-0.36, 0.8] | 84.8% |
| Ex3 | Δa | -19.1 [-28.5, -7.8] | -15.26 [-184.85, 0.71] | 0.2% |
| Ex3 | Δb | -20.4 [-48.5, -0.7] | -0.83 [-2, -0.04] | 26.1% |
| Ex3 | Δc | 42.1 [0.4, 217.4] | 18.34 [0.05, 232.88] | 0.1% |
| Crit | ΔL’ | 14.6 [6.7, 29.6] | 0.76 [0.24, 1.83] | 18.8% |
| Crit | Δk | -12.1 [-22.1, -6] | -0.63 [-1.26, -0.24] | 36.2% |
| Crit | ΔCL | -10.9 [-129.6, 7.8] | -1.2 [-6.4, 0.52] | 11.5% |

Posterior predictive distributions are summarized using the Maximum a Posteriori estimate and 90% Highest Density Interval.

*, change effects are expressed relative to the group-level mean of the associated model parameter at T1.

**, change effects are standardized to the group-level standard deviation of the associated model parameter at T1. Crit, critical load model; Ex2, exponential model (2 parameters); Ex3, exponential model (3 parameters); Lin, linear model; p (Δxi ∈ [-0.6, 0.6] | data), probability of the standardized change effect falling within the threshold for acceptable differences given the data.

Discussion

The present study was designed to address two objectives: first, we evaluated the reliability and agreement of RTF performed at 90, 80 and 70% 1-RM in the bench press exercise. Second, we aimed to analyze the reproducibility of four different models representing the individual strength-endurance relationship to identify which ones provide the most robust parameter estimates. Test-retest analysis of performance indicated very good reproducibility of the 1-RM and the RTF at high relative loads in the bench press exercise. The linear regression and the 2-parameters exponential regression yielded the most robust parameter estimates across the investigated models of the strength-endurance relationship.

The 1-RM revealed both very high relative and absolute consistency. In particular, the SEM for the 1-RM was found to be likely less than the smallest load increment applied during the 1-RM assessment in the present study (2.5 kg). These findings correspond to previous research reporting excellent reliability of 1-RM performance in the bench press exercise [4,35,36]. Similarly, the RTF at 90, 80 and 70% 1-RM revealed high absolute consistency, the SEM likely being less than 1.5 repetitions at 70% 1-RM, and less than 1 repetition at 90% and 80% 1-RM. Posterior distribution analysis revealed no systematic differences of SEM between RTF performed at 70%, 80% and 90% 1-RM. However, a slight shift of SEM posterior distributions to lower values could be observed for RTF at higher relative loads. In particular, the difference of SEM between RTF at 70% and 90% 1-RM could have exceeded the predefined threshold for systematic differences at a larger sample size. Interestingly, the ICC showed an opposing non-systematic shift of posterior distributions, with lower relative loads resulting in slightly larger ICC values. These seemingly contradictory trends arising from absolute and relative consistency might be related to the computation of the respective statistics: in the present study, the ICC was calculated as the proportion of total variance attributed to the variance among subjects. Therefore, it tends to be smaller when between-subject variance is low and SEM is large. Indeed, our data suggest a higher between-subject variance of the RTF at 70% 1-RM compared to 90% 1-RM. A similar trend for heteroscedasticity in the relationship between relative load and RTF across individuals (i.e. a mean-variance “tradeoff”) has been reported on numerous occasions [3,11,12,14,18,19,37,38]. This phenomenon could be the result of normalizing the load to the 1-RM, which homogenizes the upper end of the load spectrum. 
However, it could also be partially explained by inter-individual differences in the strength-endurance relationship.
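The interplay between SEM and ICC can be made concrete with a short back-calculation. Because the ICC was defined as the share of total variance attributable to subjects, the between-subject standard deviation implied by a given ICC and SEM can be solved for directly. The sketch below plugs in the MAP point estimates reported for the RTF at 90% and 70% 1-RM; this is only a rough illustration, since point estimates of separately summarized posterior distributions need not be jointly consistent, and the function names are hypothetical.

```python
import math


def between_subject_sd(icc, sem):
    """Solve ICC = s^2 / (s^2 + SEM^2) for the between-subject SD s."""
    return sem * math.sqrt(icc / (1 - icc))


def icc_from(sd_between, sem):
    """ICC implied by a between-subject SD and an SEM."""
    return sd_between ** 2 / (sd_between ** 2 + sem ** 2)


sd_90 = between_subject_sd(icc=0.65, sem=0.7)  # roughly 0.95 repetitions
sd_70 = between_subject_sd(icc=0.86, sem=1.1)  # roughly 2.73 repetitions

# Counterfactual: the larger SEM at 70% 1-RM combined with the smaller
# between-subject spread seen at 90% 1-RM would give a much lower ICC.
icc_counterfactual = icc_from(sd_90, sem=1.1)  # roughly 0.43

print(f"{sd_90:.2f} {sd_70:.2f} {icc_counterfactual:.2f}")
```

The larger between-subject spread at 70% 1-RM more than offsets the larger SEM, which is why the ICC rises even though absolute agreement is slightly worse.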

Conforming trends for the reliability of the RTF performed at given relative loads can be observed from other sources. For example, Anders and colleagues reported an ICC of 0.90 (95% CI: [0.58, 0.97]) for RTF completed at 70% 1-RM in the bench press [4], indicating a similar magnitude compared to the present study (ICC [90% HDI] = 0.86 [0.71, 0.93]). While the reported SEM of 0.68 repetitions was noticeably lower compared to the present study, the authors also described a lower between-subject standard deviation of ±1.5 repetitions. Similarly, Pereira and colleagues reported an ICC of 0.90 for the RTF achieved at 75% 1-RM in the bench press, when performing repetitions at a joint velocity of 100°/s. While no information on subject heterogeneity was provided, the authors also reported an ICC of 0.70 when the exercise was completed at a joint velocity of 25°/s. It could be hypothesized that the reduced movement cadence might have negatively affected the number of repetitions performed [14,24], possibly due to an increased duration of the concentric phase of each repetition and associated increases in metabolic demand [39]. Hence, a reduced movement cadence at lower loads could result in a distribution of RTF that is similar to the RTF at higher loads when repetitions are performed at maximal voluntary velocity, as was the case in the present study.

Other studies investigated the reproducibility of RTF in absolute loads. For example, Mann et al. analyzed the test-retest reliability of NCAA Division I football players in the NFL-225 test, which is a repetition maximum test using a fixed load of 225 lbs or 102.3 kg in the bench press exercise [40]. The authors reported an ICC of 0.98 to 0.99 and a typical error of 1.0 to 1.3 repetitions across three trials, the typical error corresponding to what has been calculated as SEM in the present study. While it is difficult to evaluate at what percentage of the 1-RM each participant performed the NFL-225 test in the absence of a 1-RM test, the authors estimated it to be around 67.9% 1-RM for athletes with a body mass below 100.5 kg and around 44.6% 1-RM for heavier athletes. Therefore, the majority of participants performed the NFL-225 test at lower relative loads compared to the present study. Given this fact, the reports of Mann et al. [40] correspond well to the results of the present study (SEM for RTF at 70% 1-RM [90% HDI] = 1.1 [0.8, 1.4] repetitions), especially when considering the large between-subject variance reported by the authors, which may have contributed to the large ICC, as discussed before. Finally, Rose and Ball analyzed the reliability of the RTF that could be achieved against 15.9 kg and 20.4 kg, reporting an ICC of 0.97 in both cases [36]. In their sample of 21 moderately trained women the two tested loads corresponded roughly to a mean relative load of 42% and 54% 1-RM, which supports the hypothesis of RTF tests showing higher relative consistency at low loads.

A systematic increase in the 1-RM between test and retest has previously been described on numerous occasions for various exercises [41]. Interestingly, Ribeiro and colleagues reported that this time effect did not interact significantly with participants’ experience in resistance training [42]. While the magnitude of the systematic change (Δt [90% HDI] = 1.9 kg [1.0, 2.7]) could be considered trivial in the present study, given the smallest load increment was 2.5 kg, previous research suggested that the effect may occur over the course of multiple consecutive retest trials as a result of practicing the test [42–44]. Similarly, the time effect of RTF performed at 90%, 80% and 70% 1-RM showed a high probability for being less than 1 repetition. Despite the RTF at 80% and 70% 1-RM indicating a systematic difference between T1 and T2, the magnitude of this effect is likely trivial.

To the best of our knowledge, this is the first study to evaluate and compare the reproducibility of different strength-endurance models with respect to individual trends. Not all of the investigated models resulted in robust parameter estimates over time. Most notably, the 3-parameters exponential model and the critical load model exhibited systematic changes for all parameters. These findings suggest that naturally occurring variability in strength performance likely causes parameter estimates to systematically change, even over short periods, and that the magnitude of these changes is unacceptably high in relation to the respective parameter’s group-level standard deviation. Therefore, the two models may not provide sufficient reproducibility for application in the practical field. In comparison, the linear model and the 2-parameters exponential model both resulted in a high probability for Δb (i.e., the change in slope and curvature parameters, respectively) to fall within the threshold for acceptable differences, although the effects were not clear at the selected credibility level. No clear change effect could be identified for the intercept parameter a in either case due to low between-subject variability. However, findings suggest a negligible relative magnitude for Δa in both models (Table 3). Therefore, the linear model and the 2-parameters exponential model yield the most robust parameter estimates across test-retest trials among the investigated models. To decide which of the two models to apply in a practical setting, practitioners should also consider statistical qualities other than the robustness of models. For example, both model fit and predictive validity can be considered essential characteristics of a valuable strength-endurance profile. While previous research provided some evidence that the relationship between load and RTF may be considered approximately linear at high loads [3,10–12], it has been suggested that the relationship actually follows a curvilinear trend when considering the full spectrum of loads [11,13,14]. Therefore, practitioners might want to apply the 2-parameters exponential regression rather than the linear regression to model strength-endurance profiles, as no explicit disadvantages that argue against its use have been reported.
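To make the difference between the two recommended models more concrete, the sketch below fits both to a single hypothetical individual. It assumes the functional forms load = a + b · RTF (linear) and load = a · exp(b · RTF) (2-parameters exponential), which are not spelled out in this section, and it uses ordinary least squares (on log-transformed load for the exponential form) instead of the Bayesian multilevel approach applied in this study. The data points and all function names are illustrative assumptions.

```python
import math

def ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical repetitions to failure at 90%, 80% and 70% 1-RM (NOT study data).
rtf = [4, 7, 11]
load = [90.0, 80.0, 70.0]  # relative load in % 1-RM

# Assumed linear form: load = a + b * RTF
lin_a, lin_b = ols(rtf, load)

# Assumed 2-parameter exponential form: load = a * exp(b * RTF),
# fitted here by least squares on ln(load), which is a simplification.
log_a, exp_b = ols(rtf, [math.log(value) for value in load])
exp_a = math.exp(log_a)

def predict_linear(reps):
    """Predicted load (% 1-RM) from the linear model."""
    return lin_a + lin_b * reps

def predict_exponential(reps):
    """Predicted load (% 1-RM) from the exponential model."""
    return exp_a * math.exp(exp_b * reps)

# Compare predictions inside and outside the tested range of repetitions.
for reps in (1, 5, 10, 20, 30):
    print(reps, round(predict_linear(reps), 1), round(predict_exponential(reps), 1))
```

Within the tested range the two sets of predictions stay close to each other, whereas the linear model falls away much faster once it is extrapolated to higher repetition numbers, in line with the curvilinear trend described above.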

Based on the findings of the present study, a freely available web application was developed using the R package shiny (version 1.7.1). The application provides practitioners with a user-friendly interface to enter data from repetition maximum tests and offers different algorithms to compute the individual and exercise-specific strength-endurance profile. Upon computation, it offers a graphical display of the profile, a model equation and an adjusted R² estimate to evaluate model fit. Furthermore, it produces an individual repetition-maximum table based on the estimated model parameters that predicts loads for a wider spectrum of RTF. A link to the web application is provided at the end of this article.
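The kind of output described above can be approximated in a few lines. The following Python sketch is not the code of the web application (which was built in R) but a minimal illustration under stated assumptions: the exponential form load = a · exp(b · RTF), a least-squares fit on log-transformed load, the standard adjusted R² formula with one predictor, and hypothetical input data. All function names are invented for this example.

```python
import math

def fit_exponential(reps, loads):
    """Fit load = a * exp(b * reps) by least squares on ln(load)."""
    n = len(reps)
    log_loads = [math.log(value) for value in loads]
    mean_x = sum(reps) / n
    mean_y = sum(log_loads) / n
    sxx = sum((x - mean_x) ** 2 for x in reps)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reps, log_loads))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

def adjusted_r_squared(observed, predicted, n_predictors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Hypothetical test results: RTF at 90, 80 and 70 kg (NOT study data).
tested_reps = [4, 7, 11]
tested_loads = [90.0, 80.0, 70.0]

a, b = fit_exponential(tested_reps, tested_loads)
fitted = [a * math.exp(b * r) for r in tested_reps]
adj_r2 = adjusted_r_squared(tested_loads, fitted)

# Repetition-maximum table: predicted load (kg) for 1 to 15 repetitions.
rm_table = {r: round(a * math.exp(b * r), 1) for r in range(1, 16)}

print(f"load = {a:.2f} * exp({b:.4f} * RTF), adjusted R^2 = {adj_r2:.3f}")
for r, predicted_load in rm_table.items():
    print(f"{r:>2}-RM: {predicted_load} kg")
```

With only three data points and two parameters, a single residual degree of freedom remains, so the adjusted R² should be read as a coarse indicator of fit.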

It should be pointed out that the order of repetition maximum tests was not randomized in the present study. Hence, a possible systematic effect of the earlier sets performed to momentary failure on subsequent sets and, thus, the presence of systematic bias in the RTF performed cannot be excluded. Future research should strive to compare different test protocols and identify a valid, yet practically applicable approach to acquiring the necessary data for model computation. However, the results of the present study may help practitioners understand the consistency of strength performance under standardized conditions and can assist with the selection of a reliable statistical model to calculate individual strength-endurance profiles.

Conclusions and practical applications

In conclusion, both the 1-RM and RTF at 90%, 80% and 70% 1-RM showed good reproducibility over test-retest trials in the bench press exercise for trained subjects. When modeling the relationship between load and RTF using a multilevel structure, the linear regression and 2-parameters exponential regression provide more stable parameter estimates than the 3-parameters exponential regression or critical load model.

To calculate a strength-endurance profile for a given individual and specific exercise, it is recommended to acquire the maximum number of repetitions that can be performed to momentary failure against three different loads. While the loads should be chosen according to a range of interest, practitioners should expect to experience higher absolute day-to-day variability of RTF at lower loads. For loads in the range of 70% to 100% 1-RM, a linear regression or a 2-parameters exponential regression should be applied to reliably model the relationship between tested loads and the number of achieved repetitions. To derive a robust strength-endurance profile, practitioners can access a free-to-use web application using the following link: https://strength-and-conditioning-toolbox.shinyapps.io/Strength-Endurance_Profile/.

Supporting information

S1 Appendix. Modeling details.

This file contains detailed information on priors and models.

(PDF)

S1 Table. Raw data.

This file contains the data used for the statistical analyses.

(XLSX)

Acknowledgments

The authors would like to thank all participants for contributing to the realization of the present study. The authors further want to express their gratitude to the Centre for Sport Science and University Sports, University of Vienna for providing the equipment and facilities.

Data Availability

All relevant data are within the manuscript and its Supporting Information files. Scripts used for the statistical analyses can be accessed using the DOI: https://doi.org/10.5281/zenodo.5840363.

Funding Statement

Open access funding provided by University of Vienna.

References

  • 1. Lawton TW, Cronin JB, McGuigan MR. Strength testing and training of rowers: a review. Sports Med 2011; 41(5):413–32. doi: 10.2165/11588540-000000000-00000
  • 2. American College of Sports Medicine. American College of Sports Medicine position stand. Progression models in resistance training for healthy adults. Med Sci Sports Exerc 2009; 41(3):687–708. doi: 10.1249/MSS.0b013e3181915670
  • 3. Brechue WF, Mayhew JL. Upper-body work capacity and 1RM prediction are unaltered by increasing muscular strength in college football players. J Strength Cond Res 2009; 23(9):2477–86. doi: 10.1519/JSC.0b013e3181b1ae5f
  • 4. Anders JPV, Keller JL, Smith CM, Hill EC, Housh TJ, Schmidt RJ et al. The Effects of Asparagus Racemosus Supplementation Plus 8 Weeks of Resistance Training on Muscular Strength and Endurance. J Funct Morphol Kinesiol 2020; 5(1):4. doi: 10.3390/jfmk5010004
  • 5. Steele J, Fisher J, Giessing J, Gentil P. Clarity in reporting terminology and definitions of set endpoints in resistance training. Muscle Nerve 2017; 56(3):368–74. doi: 10.1002/mus.25557
  • 6. DeWeese BH, Hornsby G, Stone M, Stone MH. The training process: Planning for strength–power training in track and field. Part 2: Practical and applied aspects. J Sport Health Sci 2015; 4(4):318–24.
  • 7. Suchomel TJ, Nimphius S, Bellon CR, Hornsby WG, Stone MH. Training for Muscular Strength: Methods for Monitoring and Adjusting Training Intensity. Sports Med 2021; 51(10):2051–66. doi: 10.1007/s40279-021-01488-9
  • 8. Hackett DA, Cobley SP, Davies TB, Michael SW, Halaki M. Accuracy in Estimating Repetitions to Failure During Resistance Exercise. J Strength Cond Res 2017; 31(8):2162–8. doi: 10.1519/JSC.0000000000001683
  • 9. García-Ramos A, Torrejón A, Feriche B, Morales-Artacho AJ, Pérez-Castilla A, Padial P et al. Prediction of the Maximum Number of Repetitions and Repetitions in Reserve From Barbell Velocity. Int J Sports Physiol Perform 2018; 13(3):353–9. doi: 10.1123/ijspp.2017-0302
  • 10. Brzycki M. Strength Testing—Predicting a One-Rep Max from Reps-to-Fatigue. JOPERD 1993; 64(1):88–90.
  • 11. Desgorces FD, Berthelot G, Dietrich G, Testa MSA. Local muscular endurance and prediction of 1 repetition maximum for bench in 4 athletic populations. J Strength Cond Res 2010; 24(2):394–400. doi: 10.1519/JSC.0b013e3181c7c72d
  • 12. Reynolds JM, Gordon TJ, Robergs RA. Prediction of One Repetition Maximum Strength from Multiple Repetition Maximum Testing and Anthropometry. J Strength Cond Res 2006; 20(3):584–92. doi: 10.1519/R-15304.1
  • 13. Mayhew JL, Johnson BD, Lamonte MJ, Lauber D, Kemmler W. Accuracy of prediction equations for determining one repetition maximum bench press in women before and after resistance training. J Strength Cond Res 2008; 22(5):1570–7. doi: 10.1519/JSC.0b013e31817b02ad
  • 14. Sakamoto A, Sinclair PJ. Effect of movement velocity on the relationship between training load and the number of repetitions of bench press. J Strength Cond Res 2006; 20(3):523–7. doi: 10.1519/16794.1
  • 15. LeSuer DA, McCormick JH, Mayhew JL, Wasserstein RL, Arnold MD. The Accuracy of Prediction Equations for Estimating 1-RM Performance in the Bench Press, Squat, and Deadlift. J Strength Cond Res 1997; 11(4):211–3.
  • 16. Ware JS, Clemens CT, Mayhew JL, Johnston TJ. Muscular endurance repetitions to predict bench press and squat strength in college football players. J Strength Cond Res 1995; 9(2):99–103.
  • 17. Wood TM, Maddalozzo GF, Harter RA. Accuracy of Seven Equations for Predicting 1-RM Performance of Apparently Healthy, Sedentary Older Adults. Meas Phys Educ Exerc Sci 2002; 6(2):67–94.
  • 18. Richens B, Cleather DJ. The relationship between the number of repetitions performed at given intensities is different in endurance and strength trained athletes. Biol Sport 2014; 31(2):157–61. doi: 10.5604/20831862.1099047
  • 19. Hoeger WW, Hopkins DR, Barette SL, Hale DF. Relationship between repetitions and selected percentages of one repetition maximum: A comparison between un-trained and trained males and females. J Strength Cond Res 1990; 4(2):47–54.
  • 20. Pick J, Becque MD. The Relationship Between Training Status and Intensity on Muscle Activation and Relative Submaximal Lifting Capacity During the Back Squat. J Strength Cond Res 2000; 14(2):175–81.
  • 21. Douris PC, White BP, Cullen RR, Keltz WE, Meli J, Mondiello DM et al. The relationship between maximal repetition performance and muscle fiber type as estimated by noninvasive technique in the quadriceps of untrained women. J Strength Cond Res 2006; 20(3):699–703. doi: 10.1519/17204.1
  • 22. Terzis G, Spengos K, Manta P, Sarris N, Georgiadis G. Fiber type composition and capillary density in relation to submaximal number of repetitions in resistance exercise. J Strength Cond Res 2008; 22(3):845–50. doi: 10.1519/JSC.0b013e31816a5ee4
  • 23. Shimano T, Kraemer WJ, Spiering BA, Volek JS, Hatfield DL, Silvestre R et al. Relationship between the number of repetitions and selected percentages of one repetition maximum in free weight exercises in trained and untrained men. J Strength Cond Res 2006; 20(4):819–23. doi: 10.1519/R-18195.1
  • 24. LaChance PF, Hortobagyi T. Influence of Cadence on Muscular Performance During Push-up and Pull-up Exercise. J Strength Cond Res 1994; 8(2):76–9.
  • 25. Morton RH, Redstone MD, Laing DJ. The Critical Power Concept and Bench Press: Modeling 1RM and Repetitions to Failure. Int J Exerc Sci 2014; 7(2):152–60.
  • 26. Monod H, Scherrer J. The Work Capacity of a Synergic Muscular Group. Ergonomics 1965; 8(3):329–38.
  • 27. Bergstrom HC, Dinyer TK, Succi PJ, Voskuil CC, Housh TJ. Applications of the Critical Power Model to Dynamic Constant External Resistance Exercise: A Brief Review of the Critical Load Test. Sports (Basel) 2021; 9(2):15. doi: 10.3390/sports9020015
  • 28. Sánchez-Medina L, González-Badillo JJ. Velocity loss as an indicator of neuromuscular fatigue during resistance training. Med Sci Sports Exerc 2011; 43(9):1725–34. doi: 10.1249/MSS.0b013e318213f880
  • 29. Baumgartner R, Joshi A, Feng D, Zanderigo F, Ogden RT. Statistical evaluation of test-retest studies in PET brain imaging. EJNMMI Res 2018; 8(1):13. doi: 10.1186/s13550-018-0366-8
  • 30. Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M et al. Stan: A Probabilistic Programming Language. J. Stat. Soft. 2017; 76(1):32.
  • 31. de Vet HCW, Terwee CB, Knol DL, Bouter LM. When to use agreement versus reliability measures. J Clin Epidemiol 2006; 59(10):1033–9. doi: 10.1016/j.jclinepi.2005.10.015
  • 32. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 2005; 19(1):231–40. doi: 10.1519/15184.1
  • 33. Makowski D, Ben-Shachar M, Lüdecke D. bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework. JOSS 2019; 4(40):1541.
  • 34. Hopkins WG, Marshall SW, Batterham AM, Hanin J. Progressive statistics for studies in sports medicine and exercise science. Med Sci Sports Exerc 2009; 41(1):3–13. doi: 10.1249/MSS.0b013e31818cb278
  • 35. Pereira MIR, Gomes PSC. Muscular strength and endurance tests: reliability and prediction of one repetition maximum–Review and new evidences. Rev Bras Med Esporte 2003; 9(5):336–46.
  • 36. Rose K, Ball TE. A Field Test for Predicting Maximum Bench Press Lift of College Women. J Strength Cond Res 1992; 6(2):103–6.
  • 37. Moss AC, Dinyer TK, Abel MG, Bergstrom HC. Methodological Considerations for the Determination of the Critical Load for the Deadlift. J Strength Cond Res 2021; 35(Suppl 1):S31–S37. doi: 10.1519/JSC.0000000000003795
  • 38. Dinyer TK, Byrd MT, Vesotsky AN, Succi PJ, Bergstrom HC. Applying the Critical Power Model to a Full-Body Resistance-Training Movement. Int J Sports Physiol Perform 2019; 14(10):1364–70. doi: 10.1123/ijspp.2018-0981
  • 39. Fountain WA, Valenti ZJ, Lynch CE, Guarnera SR, Meister BM, Carlini NA et al. Order of concentric and eccentric muscle actions affects metabolic responses. J Sports Med Phys Fitness 2021; 61(12):1587–95. doi: 10.23736/S0022-4707.21.12010-9
  • 40. Mann JB, Ivey PJ, Brechue WF, Mayhew JL. Reliability and smallest worthwhile difference of the NFL-225 test in NCAA Division I football players. J Strength Cond Res 2014; 28(5):1427–32. doi: 10.1519/JSC.0000000000000411
  • 41. Grgic J, Lazinica B, Schoenfeld BJ, Pedisic Z. Test-Retest Reliability of the One-Repetition Maximum (1RM) Strength Assessment: a Systematic Review. Sports Med Open 2020; 6(1):31. doi: 10.1186/s40798-020-00260-z
  • 42. Ribeiro AS, do Nascimento MA, Mayhew JL, Ritti-Dias RM, Avelar A, Okano AH et al. Reliability of 1RM test in detrained men with previous resistance training experience. IES 2014; 22(2):137–43.
  • 43. Mattocks KT, Buckner SL, Jessee MB, Dankel SJ, Mouser JG, Loenneke JP. Practicing the Test Produces Strength Equivalent to Higher Volume Training. Med Sci Sports Exerc 2017; 49(9):1945–54. doi: 10.1249/MSS.0000000000001300
  • 44. Dankel SJ, Counts BR, Barnett BE, Buckner SL, Abe T, Loenneke JP. Muscle adaptations following 21 consecutive days of strength test familiarization compared with traditional training. Muscle Nerve 2017; 56(2):307–14. doi: 10.1002/mus.25488

Decision Letter 0

Mathieu Gruet

21 Feb 2022

PONE-D-22-01272

Reproducibility of strength performance and strength-endurance profiles: a test-retest study

PLOS ONE

Dear Dr. Mitter,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

ACADEMIC EDITOR

Dear authors. Thank you for submitting your MS to PLOS ONE. As you will see below, both reviewers found your study interesting and rigorously conducted. They also raised some important concerns which must be addressed during the revision process. Pay particular attention to improving the discussion section according to the reviewers' suggestions.

Best wishes

Mathieu Gruet

==============================

Please submit your revised manuscript by April 7, 2022. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Mathieu Gruet, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The present study was designed to evaluate the test-retest consistency of repetition maximum tests at standardized relative loads and determine the robustness of strength-endurance profiles across test-retest trials. The topic is of interest. The introduction & results are well-written parts. As specified in my comments, I think that the methods part needs some clarification, especially to better understand the experimental design. Finally, the discussion appears as the weakest point of this manuscript, and needs to be amended before the manuscript could be accepted for publication. For the form, some typographical or grammatical errors need to be addressed. Please see my main comments below.

Introduction:

Line 30: please give more explanation for absolute vs relative loads. I guess that “relative” refers to RM; but please justify it since other subjective methods exist (e.g. RIR).

Line 35-36: I understand your rationale, but some methods that predict the load associated with a certain repetition maximum already exist. It could be good to strengthen this point.

Line 37: what is the interest to lead a repetition set to exhaustion?

Line 44: while the prediction becomes lower when the number of repetitions is higher?

Methods:

Line 75: Something is wrong in the formulation of this sentence; i.e. “twenty-four resistance-trained men (n=15)” is confusing. Please amend this sentence. Further, I am not sure that the BP 1-RM/body mass needs to appear here.

Line 92: is “work capacity” the best formulation to be used? Should you not refer to “endurance” here (as mainly performed in the introduction)?

Lines 93-94: did you control (objectively or subjectively) for muscle damage potentially induced by the 1-RM test before the beginning of the repetition maximum tests?

Lines 95-97: this is not clear. The determination of the 1-RM was not the first visit? You talk of 2 visits for T1 and T2; just after you say that participants visited the laboratory on four occasions. I think that this paragraph is important, and as it stands it is not clear (maybe a figure summarizing the experimental design could help).

Line 119-120: what does this mean to “subjectively estimate the 1RM”?

Line 132: what was the duration of the break? This is an important feature here to make sure that no fatigue was present before each rep max tests.

Lines 131-142: what about the eccentric phase during the rep max test? There is no information on that? Was it passive or active? Were specific instructions given for this eccentric phase?

Line 141: why did you choose 22 min for the resting duration?

Discussion:

The main results of your study (rather than only an objectives reminder) should appear in the first paragraph. The results section could be complicated for some readers that are not used to mathematical models, thus summarizing the main results at the beginning of the discussion could be helpful. Further, the last sentence (line 254-256) would fit better at the end of the discussion for your perspectives.

Line 263-264: I am not sure that a difference of 0.5 rep between 70% and 90 & 80% RM test could suggest (even if this is only a suggestion) that the test-retest agreement is better at higher relative loads.

Line 268: you cannot put “:” and start the following sentence with a majuscule. This appears many times in your manuscript. Please correct.

Lines 270-272: do you have objective explanation for this result?

Line 274: it could be helpful to remind the ICC obtained in your study to confront it with the one of Anders et al.

Line 282: why a decrease in movement cadence can affect the number of repetitions?

In general, the discussion lacks depth. We do not clearly identify the main results of the study and the application that these results could have. Since the results part is a bit difficult to understand (although the statistical analyses are relevant), a clear and detailed discussion (with objective clues) is needed. I am further surprised by the small number of references that are used in the discussion. The development of a web application is an interesting feature of this article, and should be more highlighted in my opinion rather than only cited in the last sentence of the conclusion.

Reviewer #2: I’d like to congratulate the authors on a simple yet elegant, and rigorously conducted, study. I actually have very little to suggest here and think that it could largely be published as is. The app is very nicely put together too. I just make a few comments below which the authors might wish to consider.

Many thanks

James Steele

Comments:

“Failure” – you use the phrase “volitional failure” and do not provide a definition for this. You might wish to consider the following article from our group that discusses definitions of terms in relation to this - https://pubmed.ncbi.nlm.nih.gov/28044366/

In a supplementary analysis for a recent meta-analysis from our group (https://sportrxiv.org/index.php/server/preprint/view/109/version/120), we collated data from some studies (https://osf.io/td26u/) reporting group level results for repetitions performed to failure at different relative loads. We did explore group level strength-endurance profiles (though in order to compare self-selected repetitions numbers to what could be performed; https://osf.io/xqz9a/). Anyway, I just thought it might be of interest considering this current study. We only fit a simple linear model to it, but it would be interesting to see how well the other models you describe might fit.

I appreciate the reasoning for not randomising the loads, though think it might be worth mentioning that this is a possible limitation that could in and of itself introduce some degree of systematic bias. Perhaps just mention it in the discussion.

I might also add to the statistical analysis when describing the models for strength-endurance profiles that the random effects for participants included both intercepts and slopes. It is clear from the equations, but not all are mathematically inclined and so explicitly mentioning this in the text might be worthwhile.

The small systematic increase in 1RM, and perhaps RTFs, might be explained by the test practice effect as Jeremy Loenneke’s group have discussed (e.g., https://pubmed.ncbi.nlm.nih.gov/27875635/, https://pubmed.ncbi.nlm.nih.gov/28463902/)

I think it would be worthwhile to include, similarly to the 1RM/RTF table, a table showing the parameter estimates from each model for T1 and T2 in addition to the change parameter estimated. It would be nice for example to compare to estimates from other studies (I appreciate the data are available so a reader could do this themselves if they wanted to though).

Lastly, you mention the mean-variance relationship in the discussion. I just thought it worth highlighting that this is very much apparent for repetitions performed, particularly for their log transformation (see meta-analytic estimate from the supplementary data in our meta-analysis mentioned: https://osf.io/fznhu?show=view&view_only=).

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Robin Souron

Reviewer #2: Yes: James Steele

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 May 5;17(5):e0268074. doi: 10.1371/journal.pone.0268074.r002

Author response to Decision Letter 0


30 Mar 2022

Overall response: We would like to express our gratitude to both reviewers for providing constructive feedback on our manuscript and giving valuable suggestions to improve the scientific quality of the article. We believe that with the help of the editor and the reviewers, we managed to eradicate any remaining ambiguities, improve the methodological transparency and provide readers a more profound discussion of our results.

Response to reviewer 1

R1: The present study was designed to evaluate the test-retest consistency of repetition maximum tests at standardized relative loads and determine the robustness of strength-endurance profiles across test-retest trials. The topic is of interest. The introduction & results are well-written parts. As specified in my comments, I think that the methods part needs some clarification, especially to better understand the experimental design. Finally, the discussion appears as the weakest point of this manuscript, and needs to be amended before the manuscript could be accepted for publication. For the form, some typographical or grammatical errors need to be addressed. Please see my main comments below.

Response: Thank you for taking the time to thoroughly review our manuscript, pointing out missing details and suggesting improvements to enhance both the practical and scientific value of the article. We hope the adjustments made are to the satisfaction of the reviewer.

Introduction:

R1.C1: Line 30: please give more explanation for absolute vs relative loads. I guess that “relative” refers to RM; but please justify it since other subjective methods exist (e.g. RIR).

Response: The authors would like to thank the reviewer for pointing out an ambiguous expression. We provided readers with a description of how absolute and relative loads should be understood in the context of the present study.

Before: […] an exercise being performed to volitional failure at either a fixed absolute or relative load.

After: Line 30-31: […] an exercise being performed to volitional failure at either a fixed absolute load, expressed in a unit of mass like kg or lbs, or a fixed relative load that has been normalized to the exercise-specific one-repetition maximum (1-RM).

R1.C2: Line 35-36: I understand your rationale, but some methods that predict the load associated with a certain repetition maximum already exist. It could be good to strengthen this point.

Response: Thank you for the pertinent suggestion. Indeed, research on the relationship between load and repetition maximum (or RTF) has a long history in exercise science. In the second paragraph of the introduction (Line 47-59), readers are provided with a short overview of published bivariate models and the limitations of a modeling approach that generalizes parameters across individuals.

We rephrased the statement addressed by the reviewer to avoid its misinterpretation as a claim for exclusivity (i.e., the individual “strength-endurance profile” being the only method to predict loads from a repetition maximum, which would not be correct).

Before: […] by studying the relationship between load and RTF (i.e., the individual “strength-endurance profile”) which would also enable practitioners to predict the load associated with a certain repetition maximum.

After: Line 37-38: […] by studying the relationship between load and RTF (i.e., the individual “strength-endurance profile”). Additionally, knowledge of the mathematical relationship between the two variables could be used by practitioners to predict the load associated with a certain repetition maximum.

R1.C3: Line 37: what is the interest to lead a repetition set to exhaustion?

Response: Thank you for requesting clarification. We believe the reviewer may have misinterpreted our statement, possibly due to an inappropriate choice of terminology (i.e., “exhaustion”). We considered that “intensity of effort” might be more suitable, since a recent article explicitly proposed its definition [1].

The sentence addressed by the reviewer was referring to the idea of controlling intensity of effort on a submaximal level, therefore not executing a set to momentary failure. To avoid misinterpretation by readers, we included a comparison to autoregulatory approaches of controlling intensity of effort.

Before: This may be of particular interest for individuals seeking to control within-set exhaustion by prescribing […].

After: Line 39-40: This may be of particular interest for individuals seeking to control intensity of effort within a set [1] by prescribing […].

Added sentences: Line 41-46: While other methods have been proposed to evaluate or control intensity of effort based on perceived effort or movement velocity [2], an approach using strength-endurance profiles might overcome certain limitations of these methods. Such limitations include inappropriate anchoring of perception [1], inaccurate subjective estimates of repetitions in reserve at lower intensity of effort [3] and dependency on technology to provide reliable feedback on movement velocity [4].

R1.C4: Line 44: while the prediction become lower when the number of repetitions is higher?

Response: The authors would like to thank the reviewer for his comment. If we understand correctly, the reviewer is addressing the influence of the number of repetitions performed in the RM test on prediction bias. Some of the referenced studies suggest a trend of most predictive equations overestimating the 1-RM with decreasing load and an increasing number of repetitions performed [5, 6]. Others reported a trend of underestimating the 1-RM at lower repetition numbers (<10) in the RM test [7, 8] or only showed systematic prediction bias for specific equations [9, 10].

In the opinion of the authors, an extensive discussion of validation studies would not be beneficial to the introduction of the present manuscript, as published research does not indicate a homogeneous trend for prediction bias that applies to all predictive equations, exercises and sample characteristics. We believe that prediction bias of the respective equations is a multifactorial phenomenon that has yet to be investigated comprehensively. However, it was not the objective of the present study, or of the project the study is embedded within, to analyze the shortcomings of these equations, but to investigate an alternative approach as a potential solution to the problem. Therefore, we decided not to go into further detail on the addressed validation studies.

Methods:

R1.C5: Line 75: Something is wrong in the formulation of this sentence; i.e. “twenty-four resistance-trained men (n=15)” is confusing. Please amend this sentence. Further, I am not sure that the BP 1-RM/body mass needs to appear here.

Response: Thank you for pointing out a confusing statement. We amended the sentence as suggested by the reviewer. Concerning the bench press 1-RM/body mass statistics, we believe it may be considered a valuable piece of information to verify that participants did indeed have substantial training experience. We also believe that the tested 1-RM is to be preferred over self-reported indicators of training experience in terms of validity and precision. Since we also defined inclusion criteria for 1-RM/body mass in the “Subjects” section, we thought it might be more appropriate to report 1-RM/body mass statistics here, rather than at the beginning of the results section.

Before: Twenty-four resistance-trained men (n = 15, age = 27.2 ± 3.3 yrs, body mass = 85.4 ± 7.9 kg, bench press 1-RM/body mass = 1.33 ± 0.11 kg•kg-1) and women (n = 9, age = 27.7 ± 5.2 y, body mass = 63.6 ± 3.3 kg, bench press 1-RM/body mass = 0.96 ± 0.17 kg•kg-1) volunteered […]

After: Line 83-84: Fifteen resistance-trained men (age = 27.2 ± 3.3 yrs, body mass = 85.4 ± 7.9 kg, bench press 1-RM/body mass = 1.33 ± 0.11 kg•kg-1) and nine resistance-trained women (age = 27.7 ± 5.2 y, body mass = 63.6 ± 3.3 kg, bench press 1-RM/body mass = 0.96 ± 0.17 kg•kg-1) volunteered […]

R1.C6: Line 92: is “work capacity” the best formulation to be used? Should you not refer to “endurance” here (as mainly performed in the introduction)?

Response: Thank you for this valuable suggestion. Indeed, we only used the term “work” at the very beginning of the introduction and it might confuse readers to switch back and forth. We changed the expression using “strength-endurance”.

Before: A test-retest design was used to determine the participants’ maximum strength and work capacity at high loads […]

After: Line 97: A test-retest design was used to determine the participants’ maximum strength and strength-endurance at high loads […]

Before: Work capacity was assessed using […]

After: Line 99: Strength-endurance was assessed using […]

R1.C7: Lines 93-94: did you control (objectively of subjectively) for muscle damage potentially induced by the 1-RM test before the beginning of the repetition maximum tests?

Response: The authors would like to thank the reviewer for this pertinent comment. No biological markers of muscle damage were collected over the course of the present study, as we wanted to omit any invasive measurements. Furthermore, no subjective markers of muscle damage were assessed to quantify the magnitude of muscle damage, as we considered it difficult to apply a valid anchoring process (i.e. “calibrating” participants’ perception of a maximum score) without a priori matching it to biological markers of muscle damage.

We did assess barbell velocity using a linear position transducer throughout every set of the test protocol. Recently, a study suggested that velocity-based estimates of the 1-RM, despite not providing good predictions of the actual 1-RM, might reflect training-induced fatigue to some degree [11]. When following a similar approach to what the authors called an MVT-based 1-RM estimate [11], using the highest mean velocity of the 3 initial warm-up sets (approx. 25%, 50% and 75% 1-RM) of the 1-RM test (session 1) and repetition maximum tests (session 2) for model computation, the resulting MVT-based 1-RM estimates did not indicate a clear decrease between session 1 and session 2 (mean difference ± SD = -1.6 ± 4.3 kg). However, we acknowledge that this indirect approach does not necessarily prove the absence of muscle damage during repetition maximum tests 48-72 h after the 1-RM test. Therefore, we decided not to include it in the present manuscript, as we believe it would not contribute to the main objective of the article.

We would argue that even if muscle damage was present over the course of the 4 sessions of this study, both the test and retest (T1 and T2) were still executed with a standardized test protocol including standardized timing between the 1-RM test and the repetition maximum test at T1 and T2. Therefore, we believe the robustness of models would not have been systematically affected by possible muscle damage.
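To make the velocity-based check described in this response more concrete, the following is a minimal sketch of how a 1-RM estimate can be extrapolated from a load-velocity line. It is not the authors' analysis code: the warm-up loads, velocities and the minimal velocity threshold (`HYPOTHETICAL_MVT`) are invented for illustration, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_1rm_from_velocity(loads_kg, velocities, mvt):
    """Extrapolate the load-velocity line to the minimal velocity threshold."""
    intercept, slope = fit_line(velocities, loads_kg)
    return intercept + slope * mvt


# Hypothetical values, NOT taken from the study.
HYPOTHETICAL_MVT = 0.17                 # assumed threshold in m/s
warmup_loads = [25.0, 50.0, 75.0]       # kg, three warm-up sets
warmup_velocities = [1.30, 0.90, 0.50]  # highest mean velocity per set, m/s

estimate = estimate_1rm_from_velocity(
    warmup_loads, warmup_velocities, HYPOTHETICAL_MVT
)
print(f"Velocity-based 1-RM estimate: {estimate:.1f} kg")
```

Repeating the same calculation for session 1 and session 2 and comparing the two estimates would give the kind of between-session difference reported above.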

R1.C8: Lines 95-97: this is not clear. The determination of the 1-RM was not the first visit? You talk of 2 visits for T1 and T2; just after you say that participants visited the laboratory on four occasions. I think that this paragraph is important, and as it stands it is not clear (maybe a figure summarizing the experimental design could help).

Response: Thank you for pointing out an unclear description of the experimental protocol. We added a figure describing the experimental setup and referenced it in the addressed section. The numbering of the subsequent figures was adapted accordingly.

Added sentence: Line 109-110: Fig 1. Experimental design. 1-RM, one-repetition maximum; RTF, repetitions performed to momentary failure.

Before: […] on two occasions (T1 & T2) separated by one week.

After: Line 98: […] on two occasions (T1 & T2) separated by one week (Figure 1).

Before: Fig.1 / Figure 1

After: Line 238 / Line 227: Fig.2 / Figure 2

Before: Fig.2 / Figure 2

After: Line 267 / Line 248: Fig.3 / Figure 3

R1.C9: Line 119-120: what does this mean to “subjectively estimate the 1RM”?

Response: The authors would like to thank the reviewer for suggesting further clarification and apologize for the ambiguous expression. We rephrased the sentence accordingly.

Before: […] participants were requested to subjectively estimate their 1-RM.

After: Line 130: […] participants were requested to estimate their 1-RM based on self-evaluation of their recent training performance.

R1.C10: Line 132: what was the duration of the break? This is an important feature here to make sure that no fatigue was present before each rep max tests.

Response: Thank you for requesting further details. The time in between rep max tests (~22 min) is mentioned at the bottom of the paragraph and was added to Figure 1. The addressed paragraph was revised, since the term “break” might suggest the application of purely passive rest. However, as described in the paragraph, the time in between rep max tests was also used to actively sustain warm-up effects, therefore providing the same warm-up routine prior to each rep max test.

Investigating the presence of fatigue induced by a single visit protocol was beyond the scope of the present study. However, we would like to inform the reviewer that our laboratory recently started data acquisition for a study on that topic, comparing a single-visit protocol to a multiple-visit protocol in a crossover design using stratified randomization.

Before: […] in the form of a single-visit protocol with prolonged breaks in between.

After: […] in the form of a single-visit protocol.

Before: In order to sustain the warm-up effect during these prolonged breaks, participants underwent the same general warm-up procedure used for the 1-RM test prior to each set to failure.

After: Line 144-146: In order to provide extended time for recovery in between repetition maximum tests, yet sustain warm-up effects during these periods, participants underwent the same general warm-up procedure that was used for the 1-RM test prior to each set to failure.

R1.C11: Lines 131-142: what about the eccentric phase during the rep max test? There is no information on that? Was it passive or active? Were specific instructions given for this eccentric phase?

Response: Thank you for suggesting further improvements to promote the reproducibility of our experimental protocol. The requested details were added to the paragraph accordingly. The authors would like to emphasize that the use of a fixed movement cadence was abandoned on purpose for the rep max tests, as a recent review explicitly suggested this aspect for future investigations [12].

Added sentences: Line 153-158: Participants were instructed to lower the barbell in a controlled fashion on each repetition, albeit not being prescribed a fixed movement cadence. Similar to the 1-RM test, participants had to await the verbal command of the staff member before initiating the concentric phase of a repetition, in order to avoid any rebound from the safety pins. The concentric phase of each repetition had to be performed at maximum intended velocity.

R1.C12: Line 141: why did you chose 22min for the resting duration?

Response: Thank you for requesting clarification. The time interval in between rep max tests was the result of the described standardized warm-up procedure and the 5 min of rest provided before and immediately after each rep max test. We acknowledge that the protocol resulting in the addressed time interval was, to some extent, an arbitrary choice by the authors in an attempt to balance recovery and duration for the session, while maintaining positive warm-up effects across all tests. However, investigating the effects of different rest intervals during a single visit assessment was beyond the scope of the present study. We would further argue that the length of the rest interval does not necessarily bias any estimate for reproducibility of performance or model robustness, as long as the protocols for T1 and T2 are identical (as is the case in the present study).

The addressed section was adapted to explain to readers how the 22 min interval came about.

Before: Due to this methodological structure, the repetition maximum tests were separated by approximately 22 min each.

After: Line 150-151: Due to this methodological structure (i.e., the standardized warm-up; the standardized passive rest before and after each set to failure), the repetition maximum tests were separated by approximately 22 min each.

Discussion:

R1.C13: The main results of your study (rather than only an objectives reminder) should appear in the first paragraph. The results section could be complicated for some readers that are not used to mathematical models, thus summarizing the main results at the beginning of the discussion could be helpful. Further, the last sentence (line 254-256) would fit better at the end of the discussion for your perspectives.

Response: We would like to thank the reviewer for suggesting structural improvements. Changes were made accordingly.

Added sentences: Line 277-280: Test-retest analysis of performance indicated very good reproducibility of the 1-RM and the RTF at high relative loads in the bench press exercise. The linear regression and the 2-parameters exponential regression yielded the most robust parameter estimates across the investigated models of the strength-endurance relationship.

Deleted: Our results may help practitioners understand the consistency of strength performance and assist with the selection of a reliable statistical model to calculate individual strength-endurance profiles.

Added sentence: Line 386-389: However, the results of the present study may help practitioners understand the consistency of strength performance under standardized conditions and can assist with the selection of a reliable statistical model to calculate individual strength-endurance profiles.

R1.C14: Line 263-264: I am not sure that a difference of 0.5 rep between 70% and 90 & 80% RM test could suggest (even if this is only a suggestion) that the test-retest agreement is better at higher relative loads.

Response: The authors would like to thank the reviewer for pointing this out. Indeed, the difference in SEM could not be deemed clear at the 90% credibility level and we agree that the discussion of our results should avoid suggestive statements that are not supported by our predefined thresholds for qualitative interpretation. However, given that the width of posterior distributions and, hence, the certainty about differences in effects is typically affected by sample size, we believe that any effect that missed the threshold by a small percentage of probability mass should still be pointed out.

Before: […] whereas the absolute agreement was better at higher loads […]

After: Line 14: […] whereas the absolute agreement was slightly better at higher loads […]

Deleted sentences: While these results would suggest that test-retest agreement for the RTF is better at higher relative loads compared to lower relative loads, the relative consistency of RTF performance was found to be slightly worse at higher relative loads compared to lower relative loads, although the difference in magnitude of the ICC could not be deemed systematic at the 90% credibility level.

Added sentences: Line 287-293: Posterior distribution analysis revealed no systematic differences of SEM between RTF performed at 70%, 80% and 90% 1-RM. However, a slight shift of SEM posterior distributions to lower values could be observed for RTF at higher relative loads. In particular, the difference of SEM between RTF at 70% and 90% 1-RM could have exceeded the predefined threshold for systematic differences at a larger sample size. Interestingly, the ICC showed an opposing non-systematic shift of posterior distributions, with lower relative loads resulting in slightly larger ICC values.

Before: The seemingly contradictory findings arising from absolute and relative consistency can, however, be explained mathematically to some degree

After: Line 293-294: These seemingly contradictory trends arising from absolute and relative consistency might be related to the computation of the respective statistics […]
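The opposing trends in SEM and ICC can be illustrated with a small numerical sketch. It assumes the common variance-ratio definition ICC = between-subject variance / (between-subject variance + error variance), with the SEM taken as the square root of the error variance; the manuscript's Bayesian computation may differ in detail. The SEM values are those reported in the manuscript, whereas the between-subject standard deviations are hypothetical.

```python
def icc_from_sds(between_subject_sd, sem):
    """ICC as the share of total variance attributable to between-subject differences."""
    between_var = between_subject_sd ** 2
    error_var = sem ** 2
    return between_var / (between_var + error_var)


# SEM values reported in the manuscript (repetitions).
sem_70, sem_90 = 1.1, 0.7

# Hypothetical between-subject SDs, chosen only to show the mechanism.
HYPOTHETICAL_SD_70 = 3.0
HYPOTHETICAL_SD_90 = 1.0

icc_70 = icc_from_sds(HYPOTHETICAL_SD_70, sem_70)
icc_90 = icc_from_sds(HYPOTHETICAL_SD_90, sem_90)

print(f"70% 1-RM: SEM = {sem_70}, ICC = {icc_70:.2f}")
print(f"90% 1-RM: SEM = {sem_90}, ICC = {icc_90:.2f}")
```

Under these assumptions the larger between-subject spread at 70% 1-RM outweighs the larger SEM, so the ICC is higher at the lower load even though the absolute agreement is worse.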

R1.C15: Line 268: you cannot put “:” and start the following sentence with a majuscule. This appears many times in your manuscript. Please correct.

Response: Thank you for pointing this out. Changes were made accordingly.

Before: […] for the bench press: Each subject […]

After: Line 121: […] for the bench press: each subject […]

Before: […] two objectives: First, we evaluated […]

After: Line 273: […] two objectives: first, we evaluated […]

Before: […] to some degree: In the present study […]

After: Line 294: […] to some degree: in the present study […]

R1.C16: Lines 270-272: do you have objective explanation for this result?

Response: Thank you for proposing further details on the heteroscedasticity of the strength-endurance relationship. We adapted this section and provided readers with a possible explanation. Adaptations were also made as part of our response to the other reviewer’s comment #7.

Before: Indeed, our data suggest a higher between-subject standard deviation of the RTF at 70% 1-RM compared to 90% 1-RM.

After: Line 297-303: Indeed, our data suggest a higher between-subject variance of the RTF at 70% 1-RM compared to 90% 1-RM. A similar trend for heteroscedasticity in the relationship between relative load and RTF across individuals (i.e. a mean-variance “tradeoff”) has been reported on numerous occasions [5, 9, 13–18]. This phenomenon could be the result of normalizing the load to the 1-RM, which homogenizes the upper end of the load spectrum. However, it could also be partially explained by inter-individual differences in the strength-endurance relationship.

R1.C17: Line 274: it could be helpful to remind the ICC obtained in your study to confront it with the one of Anders et al.

Response: Thank you for this pertinent comment. Readers were provided the associated ICC value of the present study, as suggested by the reviewer.

Before: […] Anders and colleagues reported an ICC of 0.90 (95% CI: [0.58, 0.97]) for RTF completed at 70% 1-RM in the bench press [19].

After: Line 306-307: […] Anders and colleagues reported an ICC of 0.90 (95% CI: [0.58, 0.97]) for RTF completed at 70% 1-RM in the bench press [19], indicating a similar magnitude compared to the present study (ICC [90% HDI] = 0.86 [0.71, 0.93]).

R1.C18: Line 282: why a decrease in movement cadence can affect the number of repetitions?

Response: Thank you for requesting clarification. We expanded upon our hypothetical explanation based on past research, as suggested by the reviewer.

Before: It could be hypothesized that the reduced movement cadence might negatively affect the number of repetitions that can be achieved at the respective load, therefore resulting in a distribution similar to the RTF at higher loads when repetitions are performed at maximal voluntary velocity, as was the case in the present study.

After: Line 314-317: It could be hypothesized that the reduced movement cadence might have negatively affected the number of repetitions performed [16, 20], possibly due to an increased duration of the concentric phase of each repetition and associated increases in metabolic demand [21]. Hence, a reduced movement cadence at lower loads could result in a distribution of RTF that is similar to the RTF at higher loads when repetitions are performed at maximal voluntary velocity, as was the case in the present study.

R1.C19: In general, the discussion lacks depth. We do not clearly identify the main results of the study and the application that these results could have. Since the results part is a bit difficult to understand (although the statistical analysis are relevant), a clear and detailed discussion (with objective clues) is needed. I am further surprised by the small number of references that are used in the discussion. The development of a web application is an interested feature of this article, and should be more highlighted in my opining rather that only cited in the last sentence of the conclusion.

Response: Thank you for sharing your concerns and providing valuable suggestions to improve the manuscript. We agree that both the practical relevance of the present findings and the web application should be emphasized in the discussion section. In particular, our findings on the robustness of strength-endurance models should be highlighted, as this provides a novel perspective that, to our knowledge, has not been discussed in research yet.

Regarding the small number of references, we decided to focus any comparisons of our results to studies that featured the barbell bench press exercise, preferably studies that standardized loads to the individual 1-RM. Based on the reviewer’s suggestion, we added another reference to our discussion that investigated the reliability of the NFL-225 test, which can be regarded as a rep max test using a constant load. Other studies we found on the reliability of RTF were deemed not suitable for comparison. For example, Hoeger et al. only investigated the reliability of RTF performed on resistance training machines of a 16-station Universal Gym apparatus over the course of a pilot study without specifying the time interval in between test-retest trials and other essential methodological specifications that are potentially crucial to the reproducibility of a test [14].

Before: […] the 1-RM was found to be likely less than the minimal load increment […]

After: Line 282: […] the 1-RM was found to be likely less than the smallest load increment […]

Added sentences: Line 320-333: Other studies investigated the reproducibility of RTF in absolute loads. For example, Mann et al. analyzed the test-retest reliability of NCAA Division I football players in the NFL-225 test, which is a repetition maximum test using a fixed load of 225 lbs or 102.3 kg in the bench press exercise [22]. The authors reported an ICC of 0.98 to 0.99 and a typical error of 1.0 to 1.3 repetitions across three trials, the typical error corresponding to what has been calculated as SEM in the present study. While it is difficult to evaluate at what percentage of the 1-RM each participant performed the NFL-225 test in the absence of a 1-RM test, the authors estimated it to be around 67.9% 1-RM for athletes with a body mass below 100.5 kg and around 44.6% 1-RM for heavier athletes. Therefore, the majority of participants performed the NFL-225 test at lower relative loads compared to the present study. Given this fact, the reports of Mann et al. [22] correspond well to the results of the present study (SEM for RTF at 70% 1-RM [90% HDI] = 1.1 [0.8, 1.4] repetitions), especially when considering the large between-subject variance reported by the authors, which may have contributed to the large ICC, as discussed before.

Before: […] both resulted in a high probability for Δb to fall within the threshold […]

After: Line 357-358: […] both resulted in a high probability for Δb (i.e., the change in slope and curvature parameters, respectively) to fall within the threshold […]

Added sentences: Line 373-380: Based on the findings of the present study, a freely available web application was developed using the R package shiny (version 1.7.1). The application provides practitioners with a user-friendly interface to enter data from repetition maximum tests and offers different algorithms to compute the individual and exercise-specific strength-endurance profile. Upon computation, it offers a graphical display of the profile, a model equation and an adjusted R² estimate to evaluate model fit. Furthermore, it produces an individual repetition-maximum table based on the estimated model parameters that predicts loads for a wider spectrum of RTF. A link to the web application is provided at the end of this article.
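The workflow described for the web application (fit a profile, report adjusted R², build a repetition-maximum table) could be sketched as follows. This is not the application's code, which was written in R with shiny; it is a simplified illustration using an ordinary least-squares linear model of relative load on RTF rather than the Bayesian multilevel models of the study. The test data, the 1-RM of 100 kg and all function names are hypothetical.

```python
def fit_linear_profile(rtf, load_pct):
    """Fit load_pct = intercept + slope * rtf and return (intercept, slope, adjusted R^2)."""
    n = len(rtf)
    mean_x = sum(rtf) / n
    mean_y = sum(load_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in rtf)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rtf, load_pct))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in load_pct)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(rtf, load_pct))
    r2 = 1.0 - ss_res / ss_tot
    # One predictor, so the residual degrees of freedom are n - 2.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return intercept, slope, adj_r2


def rm_table(intercept, slope, one_rm_kg, max_reps=12, increment=2.5):
    """Predict the load for each repetition maximum, rounded to the nearest plate increment."""
    table = {}
    for reps in range(1, max_reps + 1):
        load_kg = one_rm_kg * (intercept + slope * reps) / 100.0
        table[reps] = round(load_kg / increment) * increment
    return table


# Hypothetical test results: RTF achieved at 100%, 90%, 80% and 70% 1-RM.
rtf_observed = [1, 4, 8, 12]
load_observed = [100.0, 90.0, 80.0, 70.0]

intercept, slope, adj_r2 = fit_linear_profile(rtf_observed, load_observed)
table = rm_table(intercept, slope, one_rm_kg=100.0)

print(f"load% = {intercept:.1f} + ({slope:.2f}) * RTF, adjusted R^2 = {adj_r2:.3f}")
for reps, load in table.items():
    print(f"{reps:2d}-RM: {load:.1f} kg")
```

A 2-parameters exponential profile could be fitted the same way after log-transforming the load, assuming a model of the form load = a * exp(b * RTF).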

Before: To help practitioners with calculating a robust strength-endurance profile, a free-to-use web application was developed based on the findings of the present study, which can be accessed using the following link:

After: Line 403-404: To derive a robust strength-endurance profile, practitioners can access a free-to-use web application using the following link:

Response to reviewer 2

R2: I’d like to congratulate the authors on a simple yet elegant, and rigorously conducted, study. I actually have very little to suggest here and think that it could largely be published as is. The app is very nicely put together too. I just make a few comments below which the authors might wish to consider.

Many thanks

James Steele

Response: Thank you very much for these kind words and your valuable input. We hope to have addressed all points according to your expectations.

Comments:

R2.C1: “Failure” – you use the phrase “volitional failure” and do not provide a definition for this. You might wish to consider the following article from our group that discusses definitions of terms in relation to this - https://pubmed.ncbi.nlm.nih.gov/28044366/

Response: Thank you for pointing out unclear terminology that could be potentially misinterpreted by readers. Our original understanding of “volitional failure” was the point at which participants could not complete another repetition across the full range of motion using volitional contraction of target muscles despite using maximal effort. We acknowledge that this interpretation differs substantially from the definition provided in the article referenced by the reviewer and could therefore be misunderstood by readers and contribute to terminological ambiguity in the literature. We therefore changed the expression to “momentary failure” (MF) throughout the manuscript and referred readers to its definition in the article provided by the reviewer.

Line 27: volitional failure → momentary failure

Line 29: volitional failure → momentary failure

Line 239: volitional failure → momentary failure

Line 244, Table 1 (caption): volitional failure → momentary failure

Line 399: volitional failure → momentary failure

Added sentence: Line 158-161: A repetition maximum test was terminated once the participant was unable to complete another repetition across the full range of motion despite using maximal effort, suggesting that the point of momentary failure had been reached [1].

R2.C2: In a supplementary analysis for a recent meta-analysis from our group (https://sportrxiv.org/index.php/server/preprint/view/109/version/120), we collated data from some studies (https://osf.io/td26u/) reporting group level results for repetitions performed to failure at different relative loads. We did explore group level strength-endurance profiles (though in order to compare self-selected repetitions numbers to what could be performed; https://osf.io/xqz9a/). Anyway, I just thought it might be of interest considering this current study. We only fit a simple linear model to it, but it would be interesting to see how well the other models you describe might fit.

Response: We would like to thank the reviewer for sharing these valuable findings from his research group. Indeed, the scatter plot showing pooled meta-analytic data in https://osf.io/xqz9a/ suggests that a curvilinear model might provide a good fit to the bivariate relationship, especially when considering loads below 60% 1-RM. It would be interesting to see if an exponential or hyperbolic model provides a better fit for the data in the meta-analysis. There are various sources supporting the idea of the relationship being curvilinear. We therefore decided to include the topic in our discussion.

Added sentences: Line 364-372: To decide which of the two models to apply in a practical setting, practitioners should also consider statistical qualities other than the robustness of models. For example, both the model fit and predictive validity can be considered essential characteristics of a valuable strength-endurance profile. While previous research provided some evidence that the relationship may be considered approximately linear at high loads [5, 9, 15, 23], it has been suggested that the relationship actually follows a curvilinear trend when considering the full spectrum of loads [10, 15, 16]. Therefore, practitioners might want to resort to applying the 2-parameters exponential regression rather than the linear regression to model strength-endurance profiles, as research has not proposed any explicit disadvantages reasoning against its use.

R2.C3: I appreciate the reasoning for not randomising the loads, though think it might be worth mentioning that this is a possible limitation that could in and of itself introduce some degree of systematic bias. Perhaps just mention it in the discussion.

Response: Thank you for sharing your concerns. We agree that the presence of systematic bias resulting from an order effect cannot be ruled out. In fact, our laboratory recently started data acquisition for a replication study to analyze the magnitude of systematic bias resulting from a non-randomized single-visit protocol (as described in the present manuscript) compared to a multiple-visit approach where single RM tests are executed on different days in randomized order (similar to what has been applied by other researchers, e.g., references 17 and 18). The potential limitations of our methodological design were communicated to readers as suggested by the reviewer.

Added sentences: Line 381-386: It should be pointed out that the order of repetition maximum tests was not randomized in the present study. Hence, a possible systematic effect of the earlier sets performed to momentary failure on subsequent sets and, thus, the presence of systematic bias in the RTF performed cannot be excluded. Future research should strive to compare different test protocols and identify a valid, yet practically applicable approach to acquiring the necessary data for model computation.

R2.C4: I might also add to the statistical analysis when describing the models for strength-endurance profiles that the random effects for participants included both intercepts and slopes. It is clear from the equations, but not all are mathematically inclined and so explicitly mentioning this in the text might be worthwhile.

Response: The authors would like to thank the reviewer for this valuable suggestion. It was quite challenging to decide which aspects of the statistical analysis to include in the main manuscript and which ones to store as supplemental material to achieve a good balance between complexity of information and transparency. The authors agree it should be clarified in the main manuscript that all model parameters were free to vary across participants. However, we are not sure whether the expressions “slope” and “intercept” appropriately reflect the parameters of non-linear models, especially for the 3-parameters exponential regression (Ex3) and the critical load model (Crit). As suggested by Morton et al. [24], the intercept of Crit is not expressed as a single parameter in the original model, but can be accessed through reparameterization using all three model parameters CL, ALC (=L’) and k. Similarly, the intercept in Ex3 is not expressed as a single parameter, but results from the sum a + c. We therefore believe it might be better to use unspecific terminology and address them as “model parameters”.

Deleted: using subjects as a random effect

Added sentence: Line 210-211: Importantly, all of the abovementioned parameters were modeled as random effects that were free to vary across subjects.
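To illustrate why “intercept” does not map onto a single parameter in every model, the following minimal sketch evaluates a linear curve and a 3-parameter exponential curve at zero repetitions. The functional forms (load = intercept + slope · RTF and load = a · exp(b · RTF) + c), the function names and all parameter values are assumptions made for illustration only; the parameterizations actually used are given in the manuscript and in S1 Appendix.

```python
import math


def linear_load(rtf, intercept, slope):
    """Assumed linear form: load = intercept + slope * RTF."""
    return intercept + slope * rtf


def ex3_load(rtf, a, b, c):
    """Assumed 3-parameter exponential form: load = a * exp(b * RTF) + c."""
    return a * math.exp(b * rtf) + c


# Hypothetical parameter values, not estimates from the study.
lin_params = {"intercept": 100.0, "slope": -2.5}
ex3_params = {"a": 60.0, "b": -0.05, "c": 40.0}

# Evaluate both curves at RTF = 0, i.e. at the intercept.
lin_at_zero = linear_load(0, **lin_params)
ex3_at_zero = ex3_load(0, **ex3_params)

# In the linear form the intercept is one parameter; in the assumed
# Ex3 form it is the sum a + c, because exp(0) equals 1.
print(lin_at_zero, ex3_at_zero, ex3_params["a"] + ex3_params["c"])
```

Under these assumptions, the value at zero repetitions is read directly from one parameter in the linear case, whereas in the exponential case it only emerges from combining two parameters, which is the reason for preferring the neutral term “model parameters”.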

R2.C5: The small systematic increase in 1RM, and perhaps RTFs, might be explained by the test practice effect as Jeremy Loenneke’s group have discussed (e.g., https://pubmed.ncbi.nlm.nih.gov/27875635/, https://pubmed.ncbi.nlm.nih.gov/28463902/)

Response: Thank you for providing further resources to complement the discussion of our findings. We addressed the potential explanation in the discussion as suggested by the reviewer.

Added sentence: Line 338-347: A systematic increase in the 1-RM between test and retest has previously been described on numerous occasions for various exercises [25]. Interestingly, Ribeiro and colleagues reported that this time effect did not interact significantly with participants’ experience in resistance training [26]. While the magnitude of the systematic change (Δt [90% HDI] = 1.9 kg [1.0, 2.7]) could be considered trivial in the present study, given the smallest load increment was 2.5 kg, previous research suggested that the effect may occur over the course of multiple consecutive retest trials as a result of practicing the test [26–28]. Similarly, the time effect of RTF performed at 90%, 80% and 70% 1-RM showed a high probability of being less than 1 repetition. Despite the RTF at 80% and 70% 1-RM indicating a systematic difference between T1 and T2, the magnitude of this effect is likely trivial.
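The comparison between the systematic 1-RM change and the smallest load increment can be sketched as follows. The three numbers are those quoted in the added sentence; the function name and the dictionary keys are hypothetical, and the check is only an illustration, not the analysis reported in the manuscript.

```python
# Values quoted in the added sentence (1-RM change between T1 and T2).
SMALLEST_LOAD_INCREMENT_KG = 2.5
change_estimate_kg = 1.9
change_hdi_90_kg = (1.0, 2.7)


def compare_to_increment(estimate, interval, increment):
    """Relate a change estimate and its interval to a practical threshold."""
    lower, upper = interval
    return {
        # Is the point estimate smaller than one load increment?
        "estimate_below_increment": estimate < increment,
        # Does the whole interval stay below one load increment?
        "interval_entirely_below_increment": upper < increment,
        # Does the interval exclude zero (a systematic change)?
        "interval_excludes_zero": lower > 0 or upper < 0,
    }


summary = compare_to_increment(
    change_estimate_kg, change_hdi_90_kg, SMALLEST_LOAD_INCREMENT_KG
)
print(summary)
```

Read this way, the point estimate lies below one load increment and the interval excludes zero, while the upper interval bound (2.7 kg) slightly exceeds the 2.5 kg increment.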

R2.C6: I think it would be worthwhile to include, similarly to the 1RM/RTF table, a table showing the parameter estimates from each model for T1 and T2 in addition to the change parameter estimated. It would be nice for example to compare to estimates from other studies (I appreciate the data are available so a reader could do this themselves if they wanted to though).

Response: The authors would like to thank the reviewer for his suggestion. We agree that readers might want to compare parameter estimates to those of other studies and, therefore, they should be provided with absolute values in addition to relative and standardized parameters. In particular, this might help readers to understand why an interpretation of standardized change effects in the present study should absolutely be complemented by an analysis of relative change effects. As suggested by the reviewer, a table was added (Table 2), summarizing posterior predictive distributions of absolute parameter values at T1 and T2, as well as absolute change effects. Consequently, the former Table 2 (standardized and relative change effects) was renumbered to Table 3.

Added sentences: Line 261-262: Table 2. Summary of posterior predictive distributions of absolute parameter values during test (T1) and retest (T2)

Before: Table 2

After: Line 259 / Line 362: Table 3

Before: Posterior predictive distributions for […]

After: Line 246-247: Posterior predictive distributions of subject-level model parameters at T1 and T2 are summarized in Table 2. Moreover, posterior predictive distributions for […]

R2.C7: Lastly, you mention the mean-variance relationship in the discussion. I just thought it worth highlighting that this is very much apparent for repetitions performed, particularly for their log transformation (see meta-analytic estimate from the supplementary data in our meta-analysis mentioned: https://osf.io/fznhu?show=view&view_only=).

Response: Thank you for sharing these insights with us. It would be interesting to see if this relationship differs by exercise (or a similar factor like agonist muscle group) to some extent, in the sense that some exercise categories tend to result in less between-subject variance of RM across loads. We would very much welcome a further exchange of views with the reviewer independently of the content of the present article.

There is indeed some evidence to support this relationship, also when looking into individual studies that tested multiple loads or repetition maxima. As suggested by reviewers 1 and 2, we expanded upon the mean-variance relationship in the discussion to strengthen our point. The topic was addressed as part of our response to comment #16 of reviewer 1.
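A mean-variance relationship of this kind could be inspected by summarizing repetitions to failure per relative load, as in the minimal sketch below. The repetition counts are hypothetical numbers chosen for illustration and are not data from the present study or from the reviewer's meta-analysis; the variable and function names are likewise assumptions.

```python
from statistics import mean, stdev

# Hypothetical repetitions to failure for five lifters at each relative
# load (% 1-RM). These values are illustrative only, not study data.
hypothetical_rtf = {
    90: [3, 4, 4, 5, 4],
    80: [7, 8, 9, 10, 8],
    70: [11, 13, 15, 12, 16],
}


def summarize_by_load(rtf_by_load):
    """Return (mean, between-subject sample SD) of repetitions per load."""
    return {load: (mean(reps), stdev(reps)) for load, reps in rtf_by_load.items()}


rtf_summary = summarize_by_load(hypothetical_rtf)

# Print from the heaviest to the lightest load.
for load in sorted(rtf_summary, reverse=True):
    avg, sd = rtf_summary[load]
    print(f"{load}% 1-RM: mean = {avg:.1f} reps, SD = {sd:.2f} reps")
```

In this made-up example, both the mean and the between-subject standard deviation of repetitions grow as the relative load decreases, which is the pattern the term mean-variance relationship refers to.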

References

1. Steele J, Fisher J, Giessing J, Gentil P. Clarity in reporting terminology and definitions of set endpoints in resistance training. Muscle Nerve 2017; 56(3):368–74.

2. Suchomel TJ, Nimphius S, Bellon CR, Hornsby WG, Stone MH. Training for Muscular Strength: Methods for Monitoring and Adjusting Training Intensity. Sports Med 2021; 51(10):2051–66.

3. Hackett DA, Cobley SP, Davies TB, Michael SW, Halaki M. Accuracy in Estimating Repetitions to Failure During Resistance Exercise. J Strength Cond Res 2017; 31(8):2162–8.

4. García-Ramos A, Torrejón A, Feriche B, Morales-Artacho AJ, Pérez-Castilla A, Padial P et al. Prediction of the Maximum Number of Repetitions and Repetitions in Reserve From Barbell Velocity. Int J Sports Physiol Perform 2018; 13(3):353–9.

5. Brechue WF, Mayhew JL. Upper-body work capacity and 1RM prediction are unaltered by increasing muscular strength in college football players. J Strength Cond Res 2009; 23(9):2477–86.

6. Ware JS, Clemens CT, Mayhew JL, Johnston TJ. Muscular endurance repetitions to predict bench press and squat strength in college football players. J Strength Cond Res 1995; 9(2):99–103.

7. LeSuer DA, McCormick JH, Mayhew JL, Wasserstein RL, Arnold MD. The Accuracy of Prediction Equations for Estimating 1-RM Performance in the Bench Press, Squat, and Deadlift. J Strength Cond Res 1997; 11(4):211–3.

8. Wood TM, Maddalozzo GF, Harter RA. Accuracy of Seven Equations for Predicting 1-RM Performance of Apparently Healthy, Sedentary Older Adults. Meas Phys Educ Exerc Sci 2002; 6(2):67–94.

9. Reynolds JM, Gordon TJ, Robergs RA. Prediction of One Repetition Maximum Strength from Multiple Repetition Maximum Testing and Anthropometry. J Strength Cond Res 2006; 20(3):584–92.

10. Mayhew JL, Johnson BD, Lamonte MJ, Lauber D, Kemmler W. Accuracy of prediction equations for determining one repetition maximum bench press in women before and after resistance training. J Strength Cond Res 2008; 22(5):1570–7.

11. Hughes LJ, Banyard HG, Dempsey AR, Peiffer JJ, Scott BR. Using Load-Velocity Relationships to Quantify Training-Induced Fatigue. J Strength Cond Res 2019; 33(3):762–73.

12. Bergstrom HC, Dinyer TK, Succi PJ, Voskuil CC, Housh TJ. Applications of the Critical Power Model to Dynamic Constant External Resistance Exercise: A Brief Review of the Critical Load Test. Sports (Basel) 2021; 9(2):15.

13. Richens B, Cleather DJ. The relationship between the number of repetitions performed at given intensities is different in endurance and strength trained athletes. Biol Sport 2014; 31(2):157–61.

14. Hoeger WW, Hopkins DR, Barette SL, Hale DF. Relationship between repetitions and selected percentages of one repetition maximum: A comparison between un-trained and trained males and females. J Strength Cond Res 1990; 4(2):47–54.

15. Desgorces FD, Berthelot G, Dietrich G, Testa MSA. Local muscular endurance and prediction of 1 repetition maximum for bench in 4 athletic populations. J Strength Cond Res 2010; 24(2):394–400.

16. Sakamoto A, Sinclair PJ. Effect of movement velocity on the relationship between training load and the number of repetitions of bench press. J Strength Cond Res 2006; 20(3):523–7.

17. Moss AC, Dinyer TK, Abel MG, Bergstrom HC. Methodological Considerations for the Determination of the Critical Load for the Deadlift. J Strength Cond Res 2021; 35(Suppl 1):S31–S37.

18. Dinyer TK, Byrd MT, Vesotsky AN, Succi PJ, Bergstrom HC. Applying the Critical Power Model to a Full-Body Resistance-Training Movement. Int J Sports Physiol Perform 2019; 14(10):1364–70.

19. Anders JPV, Keller JL, Smith CM, Hill EC, Housh TJ, Schmidt RJ et al. The Effects of Asparagus Racemosus Supplementation Plus 8 Weeks of Resistance Training on Muscular Strength and Endurance. J Funct Morphol Kinesiol 2020; 5(1):4.

20. LaChance PF, Hortobagyi T. Influence of Cadence on Muscular Performance During Push-up and Pull-up Exercise. J Strength Cond Res 1994; 8(2):76–9.

21. Fountain WA, Valenti ZJ, Lynch CE, Guarnera SR, Meister BM, Carlini NA et al. Order of concentric and eccentric muscle actions affects metabolic responses. J Sports Med Phys Fitness 2021; 61(12):1587–95.

22. Mann JB, Ivey PJ, Brechue WF, Mayhew JL. Reliability and smallest worthwhile difference of the NFL-225 test in NCAA Division I football players. J Strength Cond Res 2014; 28(5):1427–32.

23. Brzycki M. Strength Testing - Predicting a One-Rep Max from Reps-to-Fatigue. JOPERD 1993; 64(1):88–90.

24. Morton RH, Redstone MD, Laing DJ. The Critical Power Concept and Bench Press: Modeling 1RM and Repetitions to Failure. Int J Exerc Sci 2014; 7(2):152–60.

25. Grgic J, Lazinica B, Schoenfeld BJ, Pedisic Z. Test-Retest Reliability of the One-Repetition Maximum (1RM) Strength Assessment: a Systematic Review. Sports Med Open 2020; 6(1):31.

26. Ribeiro AS, do Nascimento MA, Mayhew JL, Ritti-Dias RM, Avelar A, Okano AH et al. Reliability of 1RM test in detrained men with previous resistance training experience. IES 2014; 22(2):137–43.

27. Mattocks KT, Buckner SL, Jessee MB, Dankel SJ, Mouser JG, Loenneke JP. Practicing the Test Produces Strength Equivalent to Higher Volume Training. Med Sci Sports Exerc 2017; 49(9):1945–54.

28. Dankel SJ, Counts BR, Barnett BE, Buckner SL, Abe T, Loenneke JP. Muscle adaptations following 21 consecutive days of strength test familiarization compared with traditional training. Muscle Nerve 2017; 56(2):307–14.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Mathieu Gruet

22 Apr 2022

Reproducibility of strength performance and strength-endurance profiles: a test-retest study

PONE-D-22-01272R1

Dear Mr Benedikt Mitter

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Mathieu Gruet, Ph.D

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I have no further comments; the authors have made a substantial revision. Congratulations on that.

Reviewer #2: Thank you for your edits and responses. I have no further comments to add - again, great work on this piece!

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Robin Souron

Reviewer #2: Yes: James Steele

Acceptance letter

Mathieu Gruet

27 Apr 2022

PONE-D-22-01272R1

Reproducibility of strength performance and strength-endurance profiles: a test-retest study

Dear Dr. Mitter:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Mathieu Gruet

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Appendix. Modeling details.

    This file contains detailed information on priors and models.

    (PDF)

    S1 Table. Raw data.

    This file contains the data used for the statistical analyses.

    (XLSX)

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting Information files. Scripts used for the statistical analyses can be accessed using the DOI: https://doi.org/10.5281/zenodo.5840363.


    Articles from PLoS ONE are provided here courtesy of PLOS
