Intra- and inter-rater reliability, agreement, and minimal detectable change of the handheld dynamometer in individuals with symptomatic hip osteoarthritis

Gilvan Ferreira Vaz; Felipe Florêncio Freire; Henrique Mansur Gonçalves; Marcus Alexandre Brito de Aviz; Wagner Rodrigues Martins; João Luiz Quagliotti Durigan

doi:10.1371/journal.pone.0278086

2023 Jun 8;18(6):e0278086. doi: 10.1371/journal.pone.0278086

Editor: Theodoros M Bampouras

PMCID: PMC10249871 PMID: 37289803

Abstract

Introduction

The handheld dynamometer has been validated to measure muscle strength in different muscle groups. However, to date, it has not been tested in individuals who experience pain induced by hip osteoarthritis. The current study aimed to evaluate the intra- and inter-rater reliability, agreement, and minimal detectable change of the Lafayette handheld dynamometer, model 1165, to assess the peak force (Pk) and average peak force (Af) of hip muscles in individuals with symptomatic hip osteoarthritis.

Methods

Twenty participants with hip osteoarthritis (mean ± SD age: 58.7±15.3 years; body mass index: 28.8±4.2 kg/m²) and pain intensity on the Visual Analogue Scale ≥ 4 (8.05±1.2) were recruited to participate in this study. Pk and Af of hip flexors (seated position), abductors and adductors (supine position), and extensors (prone position) were collected in a single day by two independent raters, each one obtaining test and retest in randomly ordered separate sessions.

Results

The intra-rater intraclass correlation coefficient (ICC) was classified as good (≥0.75) or excellent (≥0.90) for all muscle groups, and all inter-rater ICCs were classified as excellent. Rater A had a lower standard error of measurement than rater B, ranging from 0.15 to 0.58 kilogram-force (kgf) compared with 0.34 to 1.25 kgf, respectively. However, the inter-rater comparison showed a minimal detectable change (MDC) of < 10% for all Pk measures and for the Af measures of hip adductors and extensors. Finally, the inter-rater Bland-Altman analysis demonstrated good agreement for abductors, adductors, and extensors.

Conclusion

Despite pain and dysfunction related to hip osteoarthritis, the mean of two measures using a handheld dynamometer was shown to be a reliable tool to assess hip muscle strength, with good to excellent intra- and inter-rater ICCs, satisfactory agreement, and small values for MDC.

Introduction

Hip Osteoarthritis (OA) is an end-stage disease from various causes, resulting in chronic hip pain, dysfunction, and stiffness. It is estimated that symptoms are present in 5 to 10% of adults older than 40 and 45 years, considering the Spanish and American populations, respectively, with a higher prevalence with increasing age [1, 2]. Chronic hip pain is associated with muscle atrophy and weakness, as demonstrated in a meta-analysis conducted with pooled data from thirteen articles. Collectively, the authors observed a reduction in muscle strength in individuals with osteoarthritis that mainly affects hip flexors (-22%) and hip extensors (-21%), or abductors (-31%) and adductors (-25%) compared to healthy control groups [3].

Muscle weakness and atrophy seem to have a central role in the dysfunction related to hip osteoarthritis, as demonstrated by imaging studies and isometric dynamometry [3–5]. Both are deeply connected to the degree of radiographic OA classification, and their progress should be avoided through participation in exercise programs that include aerobic and strengthening exercises [6, 7]. However, a reliable and easy method seems to be necessary to measure the strength of hip muscles in the clinical routine, in order to monitor disease and treatment progression in the rehabilitation process.

Measurement of Peak Force (Pk) has been considered the gold standard method for isokinetic test parameters to evaluate muscle function [8] and can be acquired with fixed or portable dynamometry. Handheld dynamometers (HHD) have been suggested as a practical, feasible, and simple tool to assess isometric lower limb muscle strength in the clinical setting [9] compared to fixed laboratory-based dynamometry, such as isokinetic dynamometers [10, 11]. In addition, manual dynamometers require little training for proficient application [10, 12] and have lower costs than fixed laboratory-based dynamometry [13, 14].

Several studies have validated and recommended the use of different HHDs to measure hip muscle strength, with good to excellent intraclass correlation coefficients (ICC). The validity of two HHDs compared with a fixed laboratory-based dynamometer was previously demonstrated [11], with good to excellent reliability, particularly for proximal muscle groups in the lower limbs of healthy subjects. Other authors also found good to excellent reliability when evaluating hip flexors and adductors of young adult football players [15], and similar results were reported when testing hip and knee strength in young, healthy adults [16]. Other studies assessed different hip muscles in various protocols and also recommended the use of manual devices to acquire hip muscle strength in healthy subjects [9, 17, 18]. Only one study tested the use of the HHD in older people (> 65 years old) and found good reliability for hip and knee muscle strength measures, without specifying lower limb articular disease [10]. There are no definitive findings about the reliability of HHD measurements in participants with symptomatic hip OA.

Accordingly, it is crucial to determine if pain intensity related to hip OA could affect the reliability of HHD in muscle strength evaluations of the hip, since comfort may be a potential limitation for strength performance [12]. Furthermore, the standard error of measurement (SEM) and minimal detectable change (MDC) need to be determined to allow comparability for routine measurements in clinical settings of symptomatic hip OA patients. The purpose of this methodological study was to analyze the reliability, agreement, and minimal detectable change of an HHD in individuals with chronic hip pain related to OA. We hypothesized that HHD could be a reliable tool to measure muscle strength for hip muscles even if symptomatic hip OA is present. Our findings could help clinicians and physical therapists to design more rational assessment strategies for individuals with chronic hip pain related to OA, using a tool that requires little training, is low cost, and minimizes the time needed by patients and clinicians.

Methods

Study design

A methodological study with repeated measures was conducted to determine the intra- and inter-rater reliability, agreement, and MDC for strength assessment of hip muscles obtained with the HHD, testing subjects who experience chronic hip pain. Participants were assessed in a single-day session. Data collection occurred between August 2021 and March 2022 after approval by the local Ethics Committee (CAAE 40347320.1.1001.0025), following the Helsinki Declaration of 1975. All participants signed an informed consent before data collection. The research was conducted at the Hospital das Forças Armadas (Brasília, Brazil) and Instituto Hospital de Base (Brasília, Brazil), following the guidelines for reporting reliability and agreement studies (GRRAS) [19].

Participants

Twenty participants {40% female, mean age 58.7 (± 15.3) years, age range = 41–79 years, body mass index = 28.8 (± 4.2) kg/m²} with symptomatic hip OA were enrolled in the present study from the Orthopedic Department of two tertiary hospitals. Eligibility and demographic data were obtained using an interview questionnaire formulated by the authors. Study procedures were explained to potential participants, and they were assigned to the study protocol if eligible and after giving written informed consent. Participants were included if they presented hip OA radiographically classified as type II (definite osteophytes, possible joint space narrowing), III (moderate osteophytes, definite joint space narrowing, some sclerosis, possible bone-end deformity), or IV (large osteophytes, marked joint space narrowing, severe sclerosis, and definite bone-end deformity) according to the Kellgren and Lawrence classification [20, 21], performed by rater B. All participants had previously been screened with x-ray images as part of their usual care, and no additional image investigation was performed. Other potential causes of the reported pain in the lower limb and back were excluded as the primary source, and hip range of motion was tested to confirm the hip as the origin of the symptoms.

Instruments

Pain intensity was assessed using the Visual Analogue Scale (VAS), with faces ranging from 0 to 10, presented to the participants at the eligibility interview and after each protocol sequence of muscle strength assessment [22, 23]. The Western Ontario and McMaster Universities Index (WOMAC) was used as a Health-related Quality of Life (HrQOL) questionnaire developed for patients with hip and knee OA as a self-reported tridimensional scale. The questionnaire evaluates pain, function, and joint stiffness (five questions for the subscale of pain, two questions for the subscale of stiffness, and seventeen questions for the subscale of function). Answer options are presented on a 5-point Likert scale. The total possible score ranges from 0 to 96; the fewer points scored, the better the patient’s HrQOL [24, 25]. Lastly, to characterize the severity of hip OA, the Harris Hip Score (HHS) was applied by one of the examiners (rater A) to evaluate four domains: Pain (0–44 points), function (0–47 points), absence of deformity (0 or 4 points), and mobility (0–5 points). Scores range from 0 to 100, with higher scores demonstrating less compromised hip joints [26–28].
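The WOMAC arithmetic described above can be checked with a short sketch. It assumes that each item on the 5-point Likert scale is scored from 0 to 4, which is consistent with the stated 0 to 96 total (24 items × 4) but is not spelled out in the text; the function and variable names are illustrative only.

```python
# Items per subscale, as described in the text (5 + 2 + 17 = 24 items).
WOMAC_ITEMS = {"pain": 5, "stiffness": 2, "function": 17}
LIKERT_MAX = 4  # assumption: 5-point Likert items scored 0-4


def womac_total(responses):
    """Sum a WOMAC questionnaire; responses maps subscale -> list of item scores."""
    total = 0
    for subscale, n_items in WOMAC_ITEMS.items():
        answers = responses[subscale]
        if len(answers) != n_items:
            raise ValueError(f"{subscale}: expected {n_items} items, got {len(answers)}")
        if any(not 0 <= a <= LIKERT_MAX for a in answers):
            raise ValueError(f"{subscale}: item scores must lie in 0..{LIKERT_MAX}")
        total += sum(answers)
    return total


# Hypothetical respondent who marks the worst option on every item.
worst = {name: [LIKERT_MAX] * n for name, n in WOMAC_ITEMS.items()}
print(womac_total(worst))  # upper bound of the scale
```

Under the same assumption, lower totals correspond to better HrQOL, matching the direction described in the text.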

Procedures

The HHD Lafayette Manual Muscle Testing System Model-01165 (Lafayette Instrument Company, Lafayette IN, USA) was used to assess hip muscle strength during a three-second maximal effort, following the protocol sequence: hip flexors (seated position), hip abductors, and adductors (supine, long-lever), and hip extensors (prone, long-lever) performed on a regular examination table and collected on the same day by both raters. This time frame was chosen considering the clinical context, that individuals with symptomatic hip OA and older adults have difficulties in moving around, which could affect adherence to a second day of evaluation. We assumed that patients would be more interested in participating in the study protocol if measurements were taken on the same day as their regular medical evaluation. In addition, our protocol aimed to mimic the clinical routine of physicians and physiotherapists when evaluating their patients, reproducing a more realistic scenario to be adopted in practice [19, 29].

Participant and rater positions have been described elsewhere [11], with some minor modifications. Hip flexors (Fig 1A) were evaluated with the participant on an examination table, seated with both legs hanging off the table and arms positioned at the sides of the body, and both knees and hips at 90°. The assessor was placed right in front of the affected lower limb, holding the HHD with both hands at the anterior aspect of the thigh, 1 to 2 cm above the superior edge of the patella. Participants were instructed to push against the HHD, trying to flex the hip with the maximal force for three seconds. Hip abductors (Fig 1B) were tested in the supine position, hands crossed in front of the chest, hip and knee at 0°, with the assessor standing by the side of the examination table and holding the HHD with both hands above the lateral malleolus (long-lever), using their own body to stabilize it. Similarly, the participant tried to abduct the affected hip against the HHD. Hip adductors (Fig 1C) were evaluated with the participant in the same position, but now, with the HHD held above the medial malleolus (long-lever) and the examiner placing their knee in the middle of both participants’ ankles. In this situation, the participant was encouraged to adduct only the affected leg. Finally, the participant was instructed to lie in the prone position to evaluate hip extensors (Fig 1D), arms crossed under the forehead, hip and knee at 0°. The rater stood immediately in front of the end of the table, holding the device with both hands, elbows extended, at 3–4 cm above the posterior calcaneal tuberosity (long-lever), followed by an attempt to extend the hip while maintaining the knee at full extension. All participants were advised not to flex the knee during hip extension.

Before every protocol sequence of muscle strength assessment, participants were instructed to push against the HHD with their maximum force for three seconds and were reminded that the test starts as they push the HHD and hear a single sound alarm and finishes as they hear a double sound alarm. A submaximal strength test trial was performed in the seated position with the non-affected limb to familiarize the participant with how the device works and the sound alarm. One demonstration was also performed in the supine and prone position to clarify how the test could be performed if required [10, 11]. None of the participants had any previous familiarity with this device.

Two independent raters performed data collection, both physicians (V.G.F.; F.F.F) with no experience with the HHD. Raters were allowed to practice the measurements protocol sequence for four months. Data were registered using REDCap (Research Electronic Data Capture) electronic data capture tools hosted at Instituto Hospital de Base [30, 31]. Each rater repeated measurements twice on the same day. To minimize any possible effect of cumulative pain resulting from test-retest, the order of data collection was defined using a randomized sequence generated on the website sealedenvelope.com (proportion of 1:1, in blocks of four). Participants were allowed to rest between each protocol sequence until they felt comfortable to start the next round [14]. The VAS for pain was measured after each sequence. Participants were given continuous encouragement to push harder against the HHD to obtain maximal isometric force during the 3 seconds of each test [11, 14].
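A 1:1 allocation in blocks of four can be sketched as follows. This is only an illustration of permuted-block randomization, not the algorithm used by sealedenvelope.com; the arm labels `A_first` and `B_first` and the seed are hypothetical, since the text does not state what each allocation arm represented.

```python
import random


def permuted_blocks(n_participants, block_size=4, arms=("A_first", "B_first"), seed=None):
    """Return an allocation list built from shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        # Each block holds the same number of every arm, then is shuffled.
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]


# Hypothetical allocation for 20 participants.
allocation = permuted_blocks(20, seed=2021)
print(allocation)
```

Because every block of four contains two of each arm, the allocation never drifts more than two participants away from a 1:1 balance during recruitment.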

Statistical analysis

Descriptive statistics were used to describe participants’ sociodemographic characteristics. The Shapiro-Wilk test was performed to confirm the normal distribution of the data. The paired t-test was used to compare VAS for pain intensity between intra- and inter-rater measures. Intra- and inter-rater reliability of the Pk and Af measures was assessed using a two-way random-effects ANOVA model, with a 95% confidence interval (95% CI), comparing test-retest measures for the intra-rater analysis and the mean of test-retest for the inter-rater analysis. To categorize the reliability between repeated measures, we assessed the intraclass correlation coefficient (ICC_2,1), and the correlation between measures was classified as poor (ICC < 0.5), moderate (0.5 ≤ ICC < 0.75), good (0.75 ≤ ICC < 0.90), and excellent (ICC ≥ 0.90) [11, 32]. To define the presence of bias in the data and establish the limits of agreement (LoA) between raters, mean values considering the two measures were plotted with a 95% CI using the Bland-Altman (BA) method [33, 34]. Absolute reliability was evaluated by calculating the Standard Error of Measurement (SEM) and its percentage value (SEM%), and the Minimal Detectable Change (MDC) and its percentage value (MDC%) for a 95% CI, using the following equations: $SEM = \sqrt{\frac{SS_{total}}{n - 1}} \times \sqrt{1 - ICC}$ and $MDC = z_{95\%\,CI} \times SEM \times \sqrt{2}$ [35, 36]. Statistical significance was assumed when p < 0.05. All statistical analyses were performed using SPSS version 25 (IBM Corp., Chicago, IL, USA), and the BA graphs were plotted by GraphPad Software (San Diego, CA, USA). The sample size was calculated using an acceptable ICC of 0.70, an expected ICC of 0.90, and assuming an α of 5% and power of 80%, with a drop-out rate of 10%, resulting in a minimal sample of 20 participants [37].
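For readers who want to see how a coefficient of this type is obtained, the sketch below computes a single-measure, two-way random-effects ICC from the ANOVA mean squares and applies the classification thresholds given above. It assumes the absolute-agreement form commonly labeled ICC(2,1); the study itself used SPSS, so this is an illustration rather than the authors' procedure, and the data are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table with one row per participant and one column per trial or rater."""
    n = len(scores)      # participants
    k = len(scores[0])   # trials or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between participants
    ms_cols = ss_cols / (k - 1)                # between trials or raters
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_icc(icc):
    """Thresholds as stated in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.90:
        return "good"
    return "excellent"


# Hypothetical test and retest peak-force values (kgf) for five participants.
trials = [[13.1, 13.4], [7.5, 8.1], [9.3, 10.2], [5.9, 6.4], [11.0, 10.6]]
value = icc_2_1(trials)
print(round(value, 3), classify_icc(value))
```

Because the absolute-agreement form penalizes systematic differences between columns, a consistent offset between test and retest (or between raters) lowers the coefficient even when the ranking of participants is preserved.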

Results

The demographic data of the sample are shown in Table 1. Approximately 85% of the participants presented a defined joint space reduction associated with sclerosis and moderate to severe osteophytes (types III/IV), representing the whole spectrum of substantial alterations in the x-ray related to osteoarthritis. Considering all daily activities during the week before inclusion in the study, the pain intensity (VAS; mean ± SD) was 8.05±1.2, 95%CI {7.47–8.62}, and together with an HHS score of 50.2±20.1 and a WOMAC score of 63.5±14.0, the data show considerable pain, dysfunction, and a reduction in quality-of-life related to hip OA.

Table 1. Characteristics of participants.

Characteristic	Sample (n = 20)
Age, mean ± SD, y	58.7 ± 15.28
Sex, n (%)
Male	12 (60)
Female	8 (40)
Radiographic disease severity, n (%)
KL II	3 (15)
KL III	6 (30)
KL IV	11 (55)
Symptom duration, n (%)
6m – 1y	1 (5)
1y – 2y	9 (45)
2y – 5y	5 (25)
> 5y	5 (25)
Body mass index, mean ± SD, kg/m²	28.82 ± 4.23
VAS (0–10), mean ± SD	8.05 ± 1.23
HHS (0–100), mean ± SD	50.2 ± 20.1
WOMAC (0–96), mean ± SD	63.5 ± 14.0

SD: standard deviation; y: year; m: month; KL: Kellgren and Lawrence classification; VAS: visual analogue scale; HHS: Harris Hip Score; WOMAC: Western Ontario and McMaster Universities.

A statistically significant difference in VAS (mean±SD, 95%CI) was observed for pain intensity between test and retest for rater A (Test: 6.11±2.96, {4.68–7.36}; Retest: 6.74±2.94, {5.32–8.15}; p = 0.01). No such difference was observed for rater B (Test: 6.42±2.52, {5.20–7.73}; Retest: 6.74±2.74, {5.73–8.04}; p = 0.49), or between raters when considering the mean VAS for pain intensity after two measures (A test-retest: 6.55 ± 2.96, {5.12–7.98}; B test-retest: 6.73 ± 2.39, {5.58–7.89}; p = 0.54). Considering the imprecision of ±20 mm related to a VAS [38] and the minimal clinically important difference in pain for hip osteoarthritis of 24 mm and 30 mm for baseline VAS intervals of 50–65 mm and >65 mm, respectively [39, 40], our results did not reach a meaningful change in intra- or inter-rater VAS between trials.

Table 2 shows the mean ± SD values of test-retest Pk and Af, relative reliability expressed as ICC_2,1, absolute reliability expressed as SEM, and MDC₉₅ for the four major hip muscle groups, comparing intra- and inter-rater reliability.

Table 2. Handheld dynamometer reliability analysis for hip muscle groups.

		Intra-rater A					Intra-rater B					Interrater
Hip muscle group	Measure	Test (mean±SD)	Retest (mean±SD)	ICC (95% CI)	SEM (SEM%)	MDC₉₅ (MDC%)	Test (mean±SD)	Retest (mean±SD)	ICC (95% CI)	SEM (SEM%)	MDC₉₅ (MDC%)	Rater A (mean±SD)	Rater B (mean±SD)	ICC (95% CI)	SEM (SEM%)	MDC₉₅ (MDC%)
Flexors	Pk	13.11±6.00	13.32±6.20	0.931^a (0.822–0.974)	0.58 (4.36)	1.60 (12.07)	11.93±3.76	12.67±6.32	0.851^a (0.612–0.942)	1.04 (8.49)	2.89 (23.52)	13.22±5.90	12.30±4.85	0.966^a (0.912–0.987)	0.28 (2.22)	0.79^b (6.16)
Flexors	Af	10.83±5.08	10.86±5.04	0.939^a (0.841–0.976)	0.42 (3.91)	1.18 (10.83)	9.54±2.95	10.24±4.90	0.761^a (0.380–0.908)	1.25 (12.68)	3.47 (35.14)	10.85±4.91	9.89±3.64	0.935^a (0.832–0.975)	0.42 (4.08)	1.17 (11.31)
Abductors	Pk	7.52±4.09	8.15±4.13	0.974^a (0.932–0.990)	0.17 (2.12)	0.46^b (5.89)	7.87±4.84	8.21±4.49	0.927^a (0.812–0.972)	0.47 (5.85)	1.30 (16.16)	7.83±4.06	8.04±4.51	0.971^a (0.924–0.989)	0.18 (2.22)	0.49^b (6.15)
Abductors	Af	5.97±2.99	6.45±3.15	0.968^a (0.917–0.988)	0.15 (2.42)	0.42^b (6.69)	6.42±4.01	6.58±3.48	0.932^a (0.822–0.974)	0.35 (5.41)	0.98 (15.01)	6.21±3.02	6.50±3.63	0.913^a (0.774–0.967)	0.40 (6.28)	1.11 (17.41)
Adductors	Pk	9.31±5.05	10.91±5.69	0.975^a (0.935–0.990)	0.26 (2.60)	0.73^b (7.20)	9.16±4.94	10.87±6.13	0.930^a (0.818–0.973)	0.57 (5.69)	1.58 (15.77)	10.11±5.32	9.97±5.33	0.982^a (0.952–0.993)	0.14 (1.36)	0.38^b (3.77)
Adductors	Af	7.08±3.71	8.31±4.08	0.957^a (0.888–0.983)	0.30 (3.85)	0.82 (10.68)	6.95±3.72	8.38±4.47	0.945^a (0.854–0.980)	0.43 (5.55)	1.18 (15.39)	7.69±3.82	7.67±3.97	0.983^a (0.955–0.993)	0.09 (1.22)	0.26^b (3.40)
Extensors	Pk	9.32±4.75	9.83±4.83	0.924^a (0.797–0.972)	0.51 (5.28)	1.40 (14.65)	8.64±4.38	9.97±4.88	0.940^a (0.839–0.977)	0.45 (4.83)	1.25 (13.40)	9.58±4.62	9.31±4.50	0.973^a (0.929–0.990)	0.17 (1.84)	0.48^b (5.09)
Extensors	Af	7.38±3.64	7.92±3.76	0.920^a (0.786–0.970)	0.42 (5.46)	1.6 (15.13)	6.57±3.11	7.77±3.57	0.942^a (0.844–0.978)	0.34 (4.76)	0.95 (13.19)	7.65±3.56	7.17±3.25	0.957^a (0.885–0.984)	0.22 (2.91)	0.60^b (8.07)

Pk: Peak Force (kgf); Af: Average Force (kgf); SD: Standard Deviation; ICC_2,1: Intra-class Correlation Coefficient; 95% CI: 95% Confidence Interval; SEM: Standard Error of Measurement; MDC₉₅: Minimal Detectable Change (95% CI)

^aGood/excellent ICC (≥0.75)

^bMDC₉₅ < 10%.
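The relation between the SEM and MDC₉₅ columns of Table 2 can be illustrated with a minimal sketch of the two equations from the statistical analysis. It assumes a z-score of 1.96 for the 95% level and takes the square root of SS_total/(n − 1) to be the standard deviation of the scores; the function names are illustrative.

```python
from math import sqrt

Z_95 = 1.96  # assumption: z-score used for the 95% confidence level


def sem_from_sd(sd, icc):
    """SEM = SD * sqrt(1 - ICC), with SD = sqrt(SS_total / (n - 1))."""
    return sd * sqrt(1.0 - icc)


def mdc95(sem):
    """MDC95 = z * SEM * sqrt(2)."""
    return Z_95 * sem * sqrt(2.0)


def as_percent(value, mean_score):
    """Express an SEM or MDC as a percentage of a mean score."""
    return 100.0 * value / mean_score


# Rater A, flexors Pk: Table 2 reports SEM = 0.58 kgf and MDC95 = 1.60 kgf.
print(round(mdc95(0.58), 2))
# Inter-rater, adductors Pk: Table 2 reports SEM = 0.14 kgf and MDC95 = 0.38 kgf.
print(round(mdc95(0.14), 2))
```

Values recomputed from the rounded SEMs agree with the tabulated MDC₉₅ only to within about 0.01 kgf, presumably because the published values were derived from unrounded SEMs.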

The HHD reliability analysis demonstrated a high to very high ICC for test-retest reliability. All rater A measurements presented an excellent correlation in the test-retest analysis, considering both peak force (Pk) and average peak force (Af), while rater B presented a good ICC for flexors Pk (ICC = 0.851; 95%CI {0.612–0.942}) and flexors Af (ICC = 0.761; 95%CI {0.380–0.908}), and excellent correlations for abductors, adductors, and extensors.

The SEM ranged from 0.15 to 0.58 kgf (kilogram-force) for rater A and from 0.34 to 1.25 kgf for rater B, with rater A being more consistent in the test-retest measurements of Pk and Af for flexor, abductor, and adductor hip muscles. In addition, rater A obtained smaller MDC values for Pk and Af in the flexor, abductor, and adductor muscles. This difference between raters was more pronounced in the flexor muscle group, which presented the highest mean values of strength for both raters in the test-retest measurements.

Nevertheless, when we consider the mean of the two measures in the inter-rater analysis of relative reliability, all ICCs for both Pk and Af were classified as excellent (≥0.90) with good precision, expressed by the 95% CI; the smallest value was found for abductor Af (ICC = 0.913; 95%CI {0.774–0.967}) and the highest value for adductor Af (ICC = 0.983; 95%CI {0.955–0.993}). The absolute reliability (SEM) ranged from 0.14 to 0.28 kgf for Pk and from 0.09 to 0.42 kgf for Af, with better consistency for adductor, followed by extensor, abductor, and flexor muscle groups for both measures. The MDC% (95%CI) values were smaller than 10% for all Pk measures analyzed, which may be considered satisfactory when comparing the mean of two measures between different raters.

The Bland-Altman plot (Fig 2) shows the distribution of the differences in mean values between raters (A-B) versus the mean of all measures. The differences were well distributed for abductor, adductor, and extensor muscle groups, demonstrated by the low bias for Pk and Af, with the lowest tendency of disagreement for hip adductors (Pk bias = 0.10 {LoA -2.69 to 2.90}, Fig 2E and Af bias = 0.02 {LoA -1.97 to 2.03}, Fig 2F), followed by hip abductors (Pk bias = -0.2 {LoA -3.05 to 2.63}, Fig 2C and Af bias = -0.28 {LoA -3.99 to 3.41}, Fig 2D), and hip extensors (Pk bias = -0.51 {LoA -5.49 to 4.47}, Fig 2G and Af bias = 0.47 {LoA -2.23 to 3.19}, Fig 2H). The regression line did not show a statistically significant proportional error for those muscle groups. On the other hand, the hip flexor bias showed that rater A's Pk measures were, on average, 0.91 kgf higher than rater B's (Pk bias = 0.91 {LoA -2.93 to 4.73}, Fig 2A), and that rater A's Af measures were, on average, 0.95 kgf higher than rater B's (Af bias = 0.95 {LoA -3.22 to 5.13}, Fig 2B). This bias seems to reflect a tendency of rater A to record higher values than rater B as mean flexor strength increased. The regression analysis showed significant deviations from zero for Pk (p = 0.01) and Af (p = 0.01) in the positive direction, with a higher proportional error for rater B.
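The bias, limits of agreement, and proportional-error slope reported above can be computed along the following lines. This is a sketch that assumes the conventional definitions (bias as the mean of the A − B differences, LoA as bias ± 1.96 sample standard deviations of the differences, and proportional error as the least-squares slope of the differences on the pair means); the plots in the study were produced with GraphPad, and the data below are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(rater_a, rater_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    if len(rater_a) != len(rater_b):
        raise ValueError("paired measurements are required")
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD (n - 1 in the denominator)
    return bias, bias - 1.96 * sd_diff, bias + 1.96 * sd_diff


def proportional_error_slope(rater_a, rater_b):
    """Least-squares slope of the differences (A - B) regressed on the pair means."""
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    means = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
    mx, my = mean(means), mean(diffs)
    sxx = sum((x - mx) ** 2 for x in means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(means, diffs))
    return sxy / sxx


# Hypothetical mean flexor Pk values (kgf) per participant for each rater.
a = [13.2, 9.8, 18.5, 7.1, 15.0, 11.4]
b = [12.1, 9.9, 16.2, 7.3, 13.6, 11.0]
bias, lower, upper = bland_altman(a, b)
print(round(bias, 2), round(lower, 2), round(upper, 2))
print(round(proportional_error_slope(a, b), 3))
```

A slope close to zero indicates that the disagreement between raters does not depend on how strong the participant is, whereas a positive slope corresponds to the pattern described for the hip flexors.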

Discussion

To our knowledge, this is the first study to assess the use of an HHD in a clinical population with symptomatic hip osteoarthritis, considering the degree of radiographic impairment and pain related to the disease. Our study was designed to reproduce a clinical situation where repeated strength measures could be collected easily in a viable routine rather than a laboratory study design. We demonstrated that the Lafayette HHD is a reliable instrument to evaluate hip muscle strength in this population, with good to excellent intra- and inter-rater reliability, satisfactory consistency, and minimal differences in the intra-rater and inter-rater analyses. Thus, clinicians can use the HHD to evaluate disuse or treatment effects on muscle strength in symptomatic hip OA patients.

Previous studies demonstrated that, among the lower limb musculature, the hip presented the strongest validity and reliability for measures of peak force when comparing the same HHD with a fixed dynamometer. Excellent reliability was also found when comparing the HHD applied by a rater or a belt system. Nevertheless, both these studies evaluated healthy and active subjects, and the authors suggest caution with generalization to the clinical population [11, 16]. Just one study assessed the HHD reliability for lower limb strength in older individuals (over 65 years old), including participants with hip and knee OA, demonstrating good intra- and inter-rater reliability for hip and knee muscle strength assessments [10]. However, only ~60% of the participants included in that study had hip or knee OA, and the pain and source of symptoms were poorly characterized, which makes comparisons between our results and those of Arnold and colleagues [10] difficult.

Interestingly, the present study demonstrated that the participants presented good tolerance for the time taken to perform the measurements (3 seconds), even when pain was perceived. Collectively, these data also corroborate previous results concerning older adults [10], suggesting that even when articular disease is present in the lower limb, notably hip OA, the reliability of the HHD is satisfactory to recommend this instrument as a tool for clinical assessment. We also provide adequate information about the characteristics of the participants’ hip OA, making it clear how much pain, dysfunction, and reduction in quality of life could be associated with the disease, in order to define more precisely the population of interest in this study. Despite the participants experiencing pain when performing the test protocol, the HHD test demonstrated good to excellent ICC, which calls into question whether patient discomfort is a limitation to performing sufficiently reliable tests, as suggested in the literature [12].

Rater A had a better correlation between test-retest measures than rater B for the flexor, abductor, and adductor measurements of Pk and Af, notably in the flexor group. These results may be explained by the difference in the raters’ anthropometric measurements and presumed strength (1.80m and 85kg versus 1.69m and 68kg), which has been demonstrated previously in the literature as a factor that could influence HHD measurements [1, 17, 41]. It is possible that the use of a stabilization belt system, particularly for hip flexors, could help solve this problem, given that it does not depend on the examiner’s strength [17, 42]. However, there are conflicting data in the literature regarding the advantage of belt stabilization for HHD, and this device does not provide a stabilization belt [16]. Adaptations to stabilize the device and the lack of a proper method of fixation could interfere with measurements and should be further tested and validated before any recommendations are made.

The most reliable muscle strength measurement was found for hip adductors, followed by extensors and abductors, demonstrated by excellent values of ICC and an adequate 95% CI, ranging from good to excellent reliability values. An exception was observed for the intra-rater reliability of rater B for hip flexors: despite showing good ICCs for Pk (ICC = 0.851, 95% CI {0.612–0.942}) and Af (ICC = 0.761, 95% CI {0.380–0.908}), rater B presented wide 95% CIs, which could be explained by the stronger participants having larger differences between test and retest for both raters. This result agrees with Kelln and colleagues (2008), who demonstrated that stronger muscles present wider differences in test-retest evaluations. Our data also suggest that the muscle strength assessment would be more feasible in situations with muscle weakness [11, 42, 43], expressed by the low SEM values in the inter-rater analysis.

The MDC% (95% CI) calculated in the intra-rater analysis was smaller for rater A (ranging from 5.89 to 15.13%) than for rater B, who demonstrated a much wider interval (13.19 to 35.14%). However, when using the mean of two measures for the inter-rater analysis, values of MDC% were considerably reduced, by around 8%, suggesting that at least two measurements should be taken to improve the MDC% and reduce random errors. Values under 10% are considered an adequate parameter to express any real difference instead of a random error of measurement, according to Prentice et al. (2004, quoted in Chamorro et al., 2017). Our protocol seems to be adequate for clinical purposes, since it can detect small variations that could be attributed to a real difference. Averaging two measures seems to be sufficient to reduce the variability that may result from the measuring instrument, raters, or characteristics of the measure taken, aligned with the theoretical assumption that an average score would better estimate the true value, minimizing the effect of random error [32]. This is consistent with a practical protocol of measurements that could easily be reproduced in a clinical scenario, capable of minimizing time requirements and reducing discomfort/pain from repeated strength tests in a compromised joint, such as a hip with osteoarthritis. Although MDC has been considered worthwhile to screen patient progression with good precision, future studies should consider economic evaluations of screening strategies concerning HHD assessment, with many specific challenges to overcome [44].
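The theoretical assumption that an average score better estimates the true value can be made explicit with a short derivation. As a sketch, assume each measurement $x_i$ equals the true value $T$ plus an independent random error $e_i$ with a common standard deviation $\sigma_e$; these symbols are introduced here for illustration and do not appear in the study.

```latex
x_i = T + e_i, \qquad
\bar{x} = \frac{1}{k}\sum_{i=1}^{k} x_i
\quad\Longrightarrow\quad
\operatorname{SD}\left(\bar{x} - T\right) = \frac{\sigma_e}{\sqrt{k}}
```

With $k = 2$, the random error of the mean is $\sigma_e/\sqrt{2} \approx 0.71\,\sigma_e$, roughly 29% lower than that of a single measurement. This holds only if the errors of the two trials are independent; if pain or fatigue makes them correlated, the gain from averaging would be smaller.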

The Bland-Altman inter-rater analysis demonstrated small values of bias for abductors, adductors, and extensors when considering the mean of the test and retest. There was a reasonable agreement with a low bias for both variables, Pk and Af, for all muscle groups evaluated, with a tendency to a proportional error only for flexors when comparing raters. However, the LoA demonstrated a large range of fixed error, especially for flexors and extensors. Future studies should evaluate the influence of experience and routine practice on the LoA fixed error range when using this device.

Some limitations should be addressed in our study. We did not perform measures on different days and in different positions, so the conclusions raised here should be restricted to conditions that replicate this protocol and compared with caution when considering studies performed in a different setting. With respect to raters, the experience level of both raters was the same; the inclusion of raters with different levels of expertise and practice with this instrument would reflect a more realistic scenario. The rater’s ability to resist hip strength is a very relevant point that could interfere with the reproducibility of measurements [9, 12]. Considering that rater B, who weighs 68kg, had some difficulty stabilizing the HHD for hip flexor measurements, we suggest that lighter raters should be intensively trained to achieve better consistency and to rigorously follow the standardized protocol, since it is possible that knowledge of biomechanics and positioning may overcome the influence of his/her body weight and presumed strength [11, 45]. Furthermore, the sample size did not allow further analysis of the subgroup related to hip osteoarthritis classification, and the relation between radiographic impairment and HHD reliability may not be inferred from our results. Future studies are needed to evaluate the reliability of the HHD in other clinical situations, such as knee osteoarthritis.

Conclusion

The HHD is a reliable method to evaluate hip muscle strength in individuals with symptomatic hip OA, with good to excellent intra- and inter-rater reliability and low values of SEM, even in the presence of pain related to the disease. The mean of at least two measures provides values with satisfactory agreement and reliability between raters, with adequate precision in an easily applied protocol. This study also provided values for the MDC, which could help to define a threshold to quantify improvements or reductions in hip muscle strength during treatment interventions or evaluation of disease progression with a low-cost, portable, and useful tool that requires little training for routine patient care assessment.

Acknowledgments

We thank all participants and collaborators involved in this study.

Data Availability

Raw data was uploaded to the repository Figshare as recommended and is available at: dx.doi.org/10.6084/m9.figshare.6025748.

Funding Statement

This study was supported by FAPDF (Fundação de Apoio a Pesquisa do Distrito Federal), process number 1008, grant number 003/2023.

References

1. Jordan JM, Helmick CG, Renner JB, Luta G, Dragomir AD, Woodard J, et al. Prevalence of hip symptoms and radiographic and symptomatic hip osteoarthritis in African Americans and Caucasians: the Johnston County Osteoarthritis Project. J Rheumatol 2009;36:809–15. doi: 10.3899/jrheum.080677
2. Blanco FJ, Silva-Díaz M, Quevedo Vila V, Seoane-Mato D, Pérez Ruiz F, Juan-Mas A, et al. Prevalence of symptomatic osteoarthritis in Spain: EPISER2016 study. Reumatol Clin 2021;17:461–70. doi: 10.1016/j.reumae.2020.01.005
3. Loureiro A, Mills PM, Barrett RS. Muscle weakness in hip osteoarthritis: a systematic review. Arthritis Care Res (Hoboken) 2013;65:340–52. doi: 10.1002/acr.21806
4. Loureiro A, Constantinou M, Diamond LE, Beck B, Barrett R. Individuals with mild-to-moderate hip osteoarthritis have lower limb muscle strength and volume deficits. BMC Musculoskelet Disord 2018;19:303. doi: 10.1186/s12891-018-2230-4
5. Zacharias A, Pizzari T, English DJ, Kapakoulakis T, Green RA. Hip abductor muscle volume in hip osteoarthritis and matched controls. Osteoarthritis Cartilage 2016;24:1727–35. doi: 10.1016/j.joca.2016.05.002
6. Kolasinski SL, Neogi T, Hochberg MC, Oatis C, Guyatt G, Block J, et al. 2019 American College of Rheumatology/Arthritis Foundation Guideline for the Management of Osteoarthritis of the Hand, Hip, and Knee. Arthritis Care Res (Hoboken) 2020;72:149–62. doi: 10.1002/acr.24131
7. Fransen M, Mcconnell S, Reichenbach S. Exercise for osteoarthritis of the hip (Review). Cochrane Database of Systematic Reviews 2014. doi: 10.1002/14651858.CD007912.pub2
8. Kannus P. Isokinetic Evaluation of Muscular Performance. Int J Sports Med 1994;15:S11–8. doi: 10.1055/s-2007-1021104
9. Krause DA, Neuger MD, Lambert KA, Johnson AE, DeVinny HA, Hollman JH. Effects of examiner strength on reliability of hip-strength testing using a handheld dynamometer. J Sport Rehabil 2014;23:56–64. doi: 10.1123/jsr.2012-0070
10. Arnold CM, Warkentin KD, Chilibeck PD, Magnus CRA. The reliability and validity of handheld dynamometry for the measurement of lower-extremity muscle strength in older adults. J Strength Cond Res 2010;24:815–24. doi: 10.1519/JSC.0b013e3181aa36b8
11. Mentiplay BF, Perraton LG, Bower KJ, Adair B, Pua Y-H, Williams GP, et al. Assessment of Lower Limb Muscle Strength and Power Using Hand-Held and Fixed Dynamometry: A Reliability and Validity Study. PLoS One 2015;10:e0140822. doi: 10.1371/journal.pone.0140822
12. Kelln BM, McKeon PO, Gontkof LM, Hertel J. Hand-held dynamometry: reliability of lower extremity muscle testing in healthy, physically active, young adults. J Sport Rehabil 2008;17:160–70. doi: 10.1123/jsr.17.2.160
13. Oliveira GDS, Ribeiro-Alvares JB de A, de Lima-E-Silva FX, Rodrigues R, Vaz MA, Baroni BM. Reliability of a Clinical Test for Measuring Eccentric Knee Flexor Strength Using a Handheld Dynamometer. J Sport Rehabil 2022;31:115–9. doi: 10.1123/jsr.2020-0014
14. Sisto SA, Dyson-Hudson T. Dynamometry testing in spinal cord injury. J Rehabil Res Dev 2007;44:123–36. doi: 10.1682/jrrd.2005.11.0172
15. Fulcher ML, Hanna CM, Raina Elley C. Reliability of handheld dynamometry in assessment of hip strength in adult male football players. J Sci Med Sport 2010;13:80–4. doi: 10.1016/j.jsams.2008.11.007
16. Florencio LL, Martins J, da Silva MRB, da Silva JR, Bellizzi GL, Bevilaqua-Grossi D. Knee and hip strength measurements obtained by a hand-held dynamometer stabilized by a belt and an examiner demonstrate parallel reliability but not agreement. Phys Ther Sport 2019;38:115–22. doi: 10.1016/j.ptsp.2019.04.011
17. Ieiri A, Tushima E, Ishida K, Inoue M, Kanno T, Masuda T. Reliability of measurements of hip abduction strength obtained with a hand-held dynamometer. Physiother Theory Pract 2015;31:146–52. doi: 10.3109/09593985.2014.960539
18. Kim S-G, Lee Y-S. The intra- and inter-rater reliabilities of lower extremity muscle strength assessment of healthy adults using a hand held dynamometer. J Phys Ther Sci 2015;27:1799–801. doi: 10.1589/jpts.27.1799
19. Kottner J, Audigé L, Brorson S, Donner A, Gajewski BJ, Hróbjartsson A, et al. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. J Clin Epidemiol 2011;64:96–106. doi: 10.1016/j.jclinepi.2010.03.002
20. Kohn MD, Sassoon AA, Fernando ND. Classifications in Brief: Kellgren-Lawrence Classification of Osteoarthritis. Clin Orthop Relat Res 2016;474:1886–93. doi: 10.1007/s11999-016-4732-4
21. Kellgren JH, Lawrence JS. Radiological assessment of osteo-arthrosis. Ann Rheum Dis 1957;16:494–502. doi: 10.1136/ard.16.4.494
22. Leão MG de S, Martins Neta GP, Coutinho LI, da Silva TM, Ferreira YMC, Dias WRV. Análise comparativa da dor em pacientes submetidos à artroplastia total do joelho em relação aos níveis pressóricos do torniquete pneumático. Rev Bras Ortop (Sao Paulo) 2016;51:672–9. doi: 10.1016/j.rbo.2016.02.002
23. Wong DL, Baker CM. Pain in children: comparison of assessment scales. Pediatr Nurs 1988;14:9–17.
24. Woolacott NF, Corbett MS, Rice SJC. The use and reporting of WOMAC in the assessment of the benefit of physical therapies for the pain of osteoarthritis of the knee: findings from a systematic review of clinical trials. Rheumatology (Oxford) 2012;51:1440–6. doi: 10.1093/rheumatology/kes043
25. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol 1988;15:1833–40.
26. Lane NE, Hochberg MC, Nevitt MC, Simon LS, Nelson AE, Doherty M, et al. OARSI Clinical Trials Recommendations: Design and conduct of clinical trials for hip osteoarthritis. Osteoarthritis Cartilage 2015;23:761–71. doi: 10.1016/j.joca.2015.03.006
27. Guimarães RP, Alves DPL, Silva GB, Bittar ST, Ono NK, Honda E, et al. Tradução e adaptação transcultural do instrumento de avaliação do quadril “Harris Hip Score.” Acta Ortop Bras 2010;18:142–7. doi: 10.1590/S1413-78522010000300005
28. Harris WH. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg Am 1969;51:737–55.
29. Terwee CB, Bot SDM, de Boer MR, van der Windt DAWM, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007;60:34–42. doi: 10.1016/j.jclinepi.2006.03.012
30. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, et al. The REDCap consortium: Building an international community of software platform partners. J Biomed Inform 2019;95. doi: 10.1016/j.jbi.2019.103208
31. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009;42:377–81. doi: 10.1016/j.jbi.2008.08.010
32. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, New Jersey: Pearson/Prentice Hall; 2015.
33. Giavarina D. Understanding Bland Altman analysis. Biochem Med (Zagreb) 2015;25:141–51. doi: 10.11613/BM.2015.015
34. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res 1999;8:135–60. doi: 10.1177/096228029900800204
35. Haley SM, Fragala-Pinkham MA. Interpreting change scores of tests and measures used in physical therapy. Phys Ther 2006;86:735–43.
36. Cardoso JR, Beisheim EH, Horne JR, Sions JM. Test-Retest Reliability of Dynamic Balance Performance-Based Measures Among Adults With a Unilateral Lower-Limb Amputation. PM R 2019;11:243–51. doi: 10.1016/j.pmrj.2018.07.005 [Retracted]
37. Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies. Stat Med 1998;17:101–10.
38. DeLoach LJ, Higgins MS, Caplan AB, Stiff JL. The visual analog scale in the immediate postoperative period: intrasubject variability and correlation with a numeric scale. Anesth Analg 1998;86:102–6. doi: 10.1097/00000539-199801000-00020
39. Stauffer ME, Taylor SD, Watson DJ, Peloso PM, Morrison A. Definition of Nonresponse to Analgesic Treatment of Arthritic Pain: An Analytical Literature Review of the Smallest Detectable Difference, the Minimal Detectable Change, and the Minimal Clinically Important Difference on the Pain Visual Analog Scale. Int J Inflam 2011;2011:1–6. doi: 10.4061/2011/231926
40. Tubach F, Ravaud P, Baron G, Falissard B, Logeart I, Bellamy N, et al. Evaluation of clinically relevant changes in patient reported outcomes in knee and hip osteoarthritis: the minimal clinically important improvement. Ann Rheum Dis 2005;64:29–33. doi: 10.1136/ard.2004.022905
41. Wikholm JB, Bohannon RW. Hand-held Dynamometer Measurements: Tester Strength Makes a Difference. Journal of Orthopaedic & Sports Physical Therapy 1991;13:191–8. doi: 10.2519/jospt.1991.13.4.191
42. Bohannon RW. Hand-held dynamometry: A practicable alternative for obtaining objective measures of muscle strength. Isokinet Exerc Sci 2012;20:301–15. doi: 10.3233/IES-2012-0476
43. Brinkmann JR. Comparison of a hand-held and fixed dynamometer in measuring strength of patients with neuromuscular disease. J Orthop Sports Phys Ther 1994;19:100–4. doi: 10.2519/jospt.1994.19.2.100
44. Iragorri N, Spackman E. Assessing the value of screening tools: reviewing the challenges and opportunities of cost-effectiveness analysis. Public Health Rev 2018;39:17. doi: 10.1186/s40985-018-0093-8
45. Morin M, Hébert LJ, Perron M, Petitclerc E, Lake S, Duchesne E. VP.28 Psychometric properties of muscle strength assessment by hand-held dynamometry in healthy adults: A reliability study. Neuromuscular Disorders 2022;32:S72. doi: 10.1016/j.nmd.2022.07.128

PLoS One. doi: 10.1371/journal.pone.0278086.r001

Decision Letter 0

Theodoros M Bampouras

16 Jan 2023

PONE-D-22-30852

Intra- and inter-rater reliability, agreement, and minimal detectable change of the handheld dynamometer in individuals with symptomatic hip osteoarthritis.

PLOS ONE

Dear Dr. Vaz,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please consider carefully the points made by both reviewers and address them accordingly. Comments by Reviewer 2, in particular, are very helpful in strengthening the practical application of the study as well as ensuring clarity on the statistical analyses used.

Please submit your revised manuscript by Mar 02 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Theodoros M. Bampouras

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

3. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: It is a very interesting study. I think that it is highly significant to appropriately evaluate the hip joint muscle strength of people with hip osteoarthritis. Please correct according to comments from the editorial committee.

L391- As a result of the hip flexion, examiner B, who weighs 68 kg, may not be able to fix HHD at less than 25% of his body-weight. It is possible that this tendency may become more pronounced in examiners weighing 68 kg or less. I think there are a lot of cases where the weight of the examiner is 68 kg or less. Therefore, the reliability of hip flexion is questionable. If the examiner's body weight is even smaller, it may affect the results of other measurement items. Therefore, the weight of the examiner to whom the results of this study are applicable is considered to be limited.

L378- The random error between examiners is around 5%, but considering that the random error within examiners is 5-20%, how should we measure in clinical practice? This is an important interpretation that will lead to clinical practice, so please add it.

L207- Are the ICCs used in this study ICC(1,1) and (2,1)?

L217- In this study, the significance level is set to 5%, so I think it is better to use MDC95 as the MDC. The random error increases accordingly, but it is also balanced with the 95% CI of the ICC.

L299- Although we consider systematic errors by the Bland-Altman method, please use the terms fixed error for the LOA and proportional error for the regression equation.

Reviewer #2: Congratulate the authors for their proposal and article. Some minor comments:

The graphics don't look good.

I recommend being consistent with the use of the point for decimals (bug in ln 292).

I recommend providing images of the tests (for example, https://doi.org/10.1080/1091367X.2020.1822363)

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Rodrigo Martin-San Agustin

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Jun 8;18(6):e0278086. doi: 10.1371/journal.pone.0278086.r002

Author response to Decision Letter 0

12 Mar 2023

Dear Editors, in an effort to answer all your questions, we resubmit our manuscript with a response letter, as well as two figure files: a new one (Fig 1) and a remade one (Fig 2). The answers follow in the attached Word file.

EDITOR TEAM COMMENTS BELOW:

Answer: Dear reviewer, thank you very much for this important comment. The use of a handheld dynamometer is very simple, and the rater's ability to resist hip strength is a very relevant point to be highlighted, since lighter raters or even evaluations in conditions such as with healthy individuals may interfere with the reproducibility of measures. However, we partially agree with your argument, as we believe that a trained assessor, with knowledge of biomechanics and positioning, may be able to perform measurements with adequate consistency, with a minor influence of his/her body weight. To clarify this limitation, we have added in line 415 a suggestion for caution concerning extrapolation of our results to raters lighter than 68kg, presuming difficulties in resisting during the tests and consequently lower agreement between measures and raters, and suggesting intensive training to achieve better intrarater reliability.

L378 – The random error between examiners is around 5%, but considering that the random error within examiners is 5-20%, how should we measure in clinical practice? This is an important interpretation that will lead to clinical practice, so please add it.

Answer: Thank you for your helpful comment. We have tried to address the issue above in line 389, emphasizing the improvement in MDC% when considering two measures in the inter-rater analysis, suggesting that the mean of at least two measurements should be used to achieve appropriate agreement and consistency.

L207- Are the ICCs used in this study ICC(1,1) and (2,1)?

Answer: We appreciate the question raised, as this important information was not clear in the first version of the manuscript. We used ICC (2,1) in our statistical analysis. To make this information available in the methods, we have added information in the Statistical Analysis (line 213) and table 2 legend (line 278 and 282).

L217- In this study, the significance level is set to 5%, so I think it is better to use MDC95 as the MDC. The random error increases accordingly, but it is also balanced with the 95% CI of the ICC.

Answer: Thank you for your suggestion. We decided to revise the report according to your request. We had some doubts about how rigorous we should be when calculating the MDC% for manual dynamometry and its impact on clinical practice. Indeed, as expressed in table 2 (page 13), the change in CI elevated MDC and MDC% for both raters in the intra-rater analysis, mainly for rater B. However, the inter-rater comparison was still considered very satisfactory and applicable in the clinical setting. Thus, changes have been made in line 42 to point to the two variables that were not under the 10% limit; in lines 220 and 222 regarding the MDC equation; table 2 was adjusted to show values with 95% CI (title, legend, and content in the 7th, 12th, and 17th columns); in line 305, without changes in the conclusion, and in line 389, in agreement with the alterations proposed in your second suggestion.

L299- Although we consider systematic errors by the Bland-Altman method, please use the terms fixed error for the LOA and proportional error for the regression equation.

Answer: Dear reviewer, in an effort to use the correct terms considering the Bland-Altman method, we have made some changes in lines 316, 322, 406, 407, and 409.

___________________________________________________________________________________

Reviewer #2: Congratulate the authors for their proposal and article. Some minor comments:

1) The graphics don't look good.

Answer: Dear reviewer, thank you for your attention regarding the graphics. We have reformulated them to give readers clearer and more precise information, including the 95%CI for bias and LoA, according to the formula provided by Giavarina et al (2015). In addition, we have improved the quality of the image. The figure is resubmitted with alterations.

Reference: Giavarina D. Understanding Bland Altman analysis. Biochem Med (Zagreb) 2015; 25:141–51. https://doi.org/10.11613/BM.2015.015.

2) I recommend being consistent with the use of the point for decimals (bug in ln 292).

Answer: Thank you very much for the comment. We have carefully reviewed the manuscript, and changes have been made in lines 301 and 302, to correct the use of the point for decimal numbers.

3) I recommend providing images of the tests for example (https://doi.org/10.1080/1091367X.2020.1822363):

Answer: Thank you again for your recommendation. We appreciate your thoughtful suggestion. To illustrate the protocol sequence we used, we have added a figure composed of four parts, to allow readers to visualize the position of the rater and participant during the test routine, named Fig1 (A-D). As a consequence, the corresponding legend has been added in line 181, references to this figure have been added in lines 161, 167, 171, and 175, and the title and legend of the graphics have been renamed as figure 2.

Attachment

Submitted filename: Response to Reviewers.docx

PLoS One. doi: 10.1371/journal.pone.0278086.r003

Decision Letter 1

Theodoros M Bampouras

29 Mar 2023

PONE-D-22-30852R1

Intra- and inter-rater reliability, agreement, and minimal detectable change of the handheld dynamometer in individuals with symptomatic hip osteoarthritis.

PLOS ONE

Dear Dr. Vaz,

You will see that the reviewers were satisfied by your replies. Please address these final, specific (minor) points below, before the manuscript is accepted for publication:

The reply to Reviewer 1's point regarding raters of <68kg mass, is more informative than what was included in the manuscript. Please expand in the manuscript on this limitation in line with your response, as it is a crucial point.
You discuss the reduction in error by averaging the two trials and state that this makes the measurement sufficiently sensitive to detect clinically meaningful changes, but provide no reference / reminder of what those changes are (and thus that the averaged measurements error allows detection of such changes). Reference 3 might be sufficient - but you need to provide a 'yardstick' for that statement.
The statement in Lines 273-275 re the increased pain not affecting the measurement can only hold true if you knew that the participants truly performed at maximum capacity; however, this is not possible to know. For example, it could be that the increased pain in the second trial resulted in lower force generation (e.g. https://www.frontiersin.org/articles/10.3389/fpsyg.2010.00210/full) and, therefore, a closer score to the first trial which would, in turn, result in higher ICC but through a submaximal second trial and a maximal first trial. On the other hand, it could be that the difference of <1 cm in pain is not clinically significant / meaningful (e.g. https://pubmed.ncbi.nlm.nih.gov/9428860/). Please revise that point to remove the statement you currently have.

Please submit your revised manuscript by May 13 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

We look forward to receiving your revised manuscript.

Kind regards,

Theodoros M. Bampouras

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #1: The resubmitted manuscript responds appropriately to comments. However, Figs 1 and 2 could not be confirmed because they were not attached. In the rehabilitation of hip osteoarthritis, I think that a measurement that can clearly indicate hip joint muscle strength is very effective. I hope that this kind of muscle strength measurement will spread in clinical practice.

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

PLoS One. 2023 Jun 8;18(6):e0278086. doi: 10.1371/journal.pone.0278086.r004

Author response to Decision Letter 1

8 May 2023

EDITOR TEAM COMMENTS BELOW:

Reviewer’s comment #1: The reply to Reviewer 1's point regarding raters of <68 kg mass is more informative than what was included in the manuscript. Please expand on this limitation in the manuscript, in line with your response, as it is a crucial point.

Answer: Dear reviewer, thank you for pointing out that the information in the manuscript was not sufficiently clear in our previous response letter. We have now added a new statement regarding the limitations of this paper in lines 421-427 aligned with our previous response, as requested.

We have expanded our previous statement from: “Considering that rater B, who weighs 68kg, had some difficulty stabilizing the HHD for hip flexor measurements, we suggest that lighter raters should be intensively trained to achieve better intra-rater reliability” to the following phrase: “The rater's ability to resist hip strength is a very relevant point that could interfere with the reproducibility of measurements [1,2]. Considering that rater B, who weighs 68kg, had some difficulty stabilizing the HHD for hip flexor measurements, we suggest that lighter raters should be intensively trained to achieve better consistency and to rigorously follow the standardized protocol, since it is possible that knowledge of biomechanics and positioning may overcome the influence of his/her body weight and presumed strength [3]”.

References:

1- Kelln BM, McKeon PO, Gontkof LM, Hertel J. Hand-held dynamometry: reliability of lower extremity muscle testing in healthy, physically active, young adults. J Sport Rehabil 2008;17:160–70. https://doi.org/10.1123/jsr.17.2.160.

2- Krause DA, Neuger MD, Lambert KA, Johnson AE, DeVinny HA, Hollman JH. Effects of examiner strength on reliability of hip-strength testing using a handheld dynamometer. J Sport Rehabil 2014;23:56–64. https://doi.org/10.1123/jsr.2012-0070.

3- Morin M, Hébert LJ, Perron M, Petitclerc É, Lake SR, Duchesne E. Psychometric properties of a standardized protocol of muscle strength assessment by hand-held dynamometry in healthy adults: a reliability study. BMC Musculoskelet Disord. 2023 Apr 14;24(1):294. doi: 10.1186/s12891-023-06400-2. PMID: 37060020; PMCID: PMC10103411.

Reviewer’s comment #2: You discuss the reduction in error by averaging the two trials and state that this makes the measurement sufficiently sensitive to detect clinically meaningful changes, but provide no reference/reminder of what those changes are (and thus that the averaged measurements error allows detection of such changes). Reference 3 might be sufficient - but you need to provide a 'yardstick' for that statement.

Answer:

Dear reviewer, we have expanded our discussion by adding new references to anchor our discussion concerning the number of measurements, as stated in lines 397-401. Our results suggest that the HHD measurement protocol is reliable. We used the mean of two measures because it yielded higher ICC values when comparing raters. Sources of variability may result from the measuring instrument, the raters, or characteristics of the measure taken [1,2]. Biological variations in generating peak force, such as muscle fibers, elicited fatigue, limb and device positioning, stabilization of the HHD, attention of subject and raters, environmental conditions, and symptoms related to the disease, among others, may influence the measurements taken. When results are averaged across trials, the sources of variability between these measurements are theoretically reduced, as stated by Portney and Watkins (2015). The authors explain that random error is unrelated to the true score, so when enough measurements are taken, positive random errors eventually cancel negative ones, making the average score a reasonable estimate of the true score. Therefore, averaging two measures would help to reduce variability and consequently improve ICC values [1]. This is consistent with a practical measurement protocol that could easily be reproduced in a clinical scenario. It can also minimize the time requirements and reduce discomfort/pain from repeated strength tests in a compromised joint, such as a hip with osteoarthritis.
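The effect of averaging on reliability described above can be illustrated with a minimal sketch. It assumes the classical test theory model (independent random errors of equal variance across trials) and uses the Spearman-Brown relation. The function names and the example values are hypothetical and are not taken from the study data.

```python
import math


def icc_of_mean(icc_single: float, k: int) -> float:
    """Spearman-Brown estimate of the reliability of the mean of k trials.

    Assumes each trial has the same single-trial reliability and that
    random errors are independent between trials.
    """
    if not 0.0 <= icc_single <= 1.0:
        raise ValueError("icc_single must lie in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * icc_single / (1.0 + (k - 1) * icc_single)


def error_sd_of_mean(error_sd_single: float, k: int) -> float:
    """SD of the random error remaining after averaging k independent trials."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return error_sd_single / math.sqrt(k)


# Illustrative inputs only (assumed, not study results).
print(round(icc_of_mean(0.80, 2), 3))
print(round(error_sd_of_mean(1.0, 2), 3))
```

Under these assumptions, averaging two trials raises a single-trial reliability of 0.80 to about 0.89 and shrinks the random-error SD by a factor of 1/√2.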

Regarding what was described in the discussion section: “seems to be reliable for clinical purposes since it can detect small variations that could be attributed to a real clinical change”, we have rephrased this statement, excluding the word “clinical”, so that readers do not mistake it for the minimal clinically important difference (MCID) [3,4]. We refer instead to the minimal detectable change (MDC), which can be attributed to a real difference in peak force and average peak force rather than to a difference resulting from random error.
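To make the distinction concrete, the sketch below computes the standard error of measurement (SEM) and the MDC with formulas that are common in the reliability literature: SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. Whether these exact formulas were applied in the manuscript is an assumption here, and the function names and input values are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from the sample SD and the ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem


def exceeds_mdc(change: float, mdc: float) -> bool:
    """True when an observed change is larger than the MDC."""
    return abs(change) > mdc


# Illustrative inputs only (assumed, not study results): SD in Kgf and ICC.
sem = sem_from_icc(2.0, 0.91)
mdc = mdc95(sem)
print(round(sem, 2), round(mdc, 2), exceeds_mdc(2.0, mdc))
```

A change larger than the MDC is unlikely to reflect measurement error alone. Whether that change matters to the patient is a separate question, which is what the MCID addresses.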

References:

1- Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, New Jersey: Pearson/Prentice Hall; 2015.

2- Bialocerkowski, A. E., & Bragge, P. (2008). Measurement error and reliability testing: Application to rehabilitation. International Journal of Therapy and Rehabilitation, 15(10), 422–427. https://doi.org/10.12968/ijtr.2008.15.10.31210

3- Suijker, J. J., Van Rijn, M., Ter Riet, G., van Charante, E. M., De Rooij, S. E., & Buurman, B. M. (2017). Minimal important change and minimal detectable change in activities of daily living in community-living older people. The journal of nutrition, health & aging, 21, 165-172.

4- Turner, D., Schünemann, H. J., Griffith, L. E., Beaton, D. E., Griffiths, A. M., Critch, J. N., & Guyatt, G. H. (2010). The minimal detectable change cannot reliably replace the minimal important difference. Journal of clinical epidemiology, 63(1), 28-36.

Reviewer’s comment #3: The statement in Lines 273-275 about the increased pain not affecting the measurement can only hold true if you knew that the participants truly performed at maximum capacity; however, this is not possible to know. For example, it could be that the increased pain in the second trial resulted in lower force generation (e.g. https://www.frontiersin.org/articles/10.3389/fpsyg.2010.00210/full) and, therefore, a closer score to the first trial which would, in turn, result in higher ICC but through a submaximal second trial and a maximal first trial. On the other hand, it could be that the difference of <1 cm in pain is not clinically significant / meaningful (e.g. https://pubmed.ncbi.nlm.nih.gov/9428860/). Please revise that point to remove the statement you currently have.

Answer: We would first like to thank you for the suggested references. They are well-conducted studies and were essential for discussing our data. We have made a modification following your comment. We searched the literature to verify whether, beyond the imprecision of the VAS for pain of around ± 20 mm pointed out by DeLoach et al. (1998), there was a definition of the MCID for our study population. We found that a clinically perceived difference in individuals with hip osteoarthritis depends on the baseline value. Therefore, if we consider the first trial as the baseline, values in the interval of 50-65 mm would require a difference of 24 mm to be considered clinically important, and for baseline values > 65 mm, a clinically perceived difference must be even higher, 30 mm. Thus, the statistical difference between the two trials of rater A was not large enough to be considered clinically relevant. Comparing the first and second trials of rater B, differences were neither statistically significant nor clinically meaningful, similar to the inter-rater comparison. In conclusion, although the differences were not clinically relevant, the test protocol was still painful, and the influence of pain on the capability of generating maximal force in all trials, and consequently on the instrument's reliability, could not be inferred from this study design. We have removed the statement: “Although there was an existing difference in VAS for pain intensity after the rater A test compared to the retest, it did not seem to have a relevant effect on intra-rater ICC, since rater A presented better ICC and lower SEM values than rater B”, and added some comments in lines 273-277 according to your suggestion.
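The baseline-dependent rule described in this answer can be sketched as follows. The thresholds (24 mm for baselines of 50-65 mm, 30 mm for baselines above 65 mm) are the ones quoted in the response. The function names, the treatment of the interval boundaries, and the use of "greater than or equal to" are assumptions made for illustration. No threshold is returned for baselines below 50 mm, because the response does not state one.

```python
from typing import Optional


def vas_mcid_threshold_mm(baseline_mm: float) -> Optional[float]:
    """Clinically important VAS difference (mm) for a given baseline VAS (mm).

    Returns None where the response quotes no threshold (baseline < 50 mm).
    """
    if baseline_mm > 65.0:
        return 30.0
    if 50.0 <= baseline_mm <= 65.0:  # boundary handling is an assumption
        return 24.0
    return None


def is_clinically_important(baseline_mm: float, change_mm: float) -> Optional[bool]:
    """Compare an observed VAS change with the baseline-dependent threshold."""
    threshold = vas_mcid_threshold_mm(baseline_mm)
    if threshold is None:
        return None
    return abs(change_mm) >= threshold


# Hypothetical example: a baseline of 80.5 mm and a change of 8 mm between trials.
print(vas_mcid_threshold_mm(80.5), is_clinically_important(80.5, 8.0))
```

With a baseline above 65 mm, a between-trial difference of less than 10 mm falls well short of the 30 mm threshold, which matches the reasoning that the observed difference was not clinically relevant.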

References:

1- Stauffer ME, Taylor SD, Watson DJ, Peloso PM, Morrison A. Definition of Nonresponse to Analgesic Treatment of Arthritic Pain: An Analytical Literature Review of the Smallest Detectable Difference, the Minimal Detectable Change, and the Minimal Clinically Important Difference on the Pain Visual Analog Scale. Int J Inflam 2011; 2011:1–6. https://doi.org/10.4061/2011/231926.

2- Tubach F, Ravaud P, Baron G, Falissard B, Logeart I, Bellamy N, et al. Evaluation of clinically relevant changes in patient reported outcomes in knee and hip osteoarthritis: the minimal clinically important improvement. Ann Rheum Dis 2005; 64:29–33. https://doi.org/10.1136/ard.2004.022905.

Attachment

Submitted filename: Response to reviwers.docx

PLoS One. doi: 10.1371/journal.pone.0278086.r005

Decision Letter 2

Theodoros M Bampouras

10 May 2023

Intra- and inter-rater reliability, agreement, and minimal detectable change of the handheld dynamometer in individuals with symptomatic hip osteoarthritis.

PONE-D-22-30852R2

Dear Dr. Vaz,

Thank you for the very thoughtful addressing of the relevant points. We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Theodoros M. Bampouras

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

PLoS One. doi: 10.1371/journal.pone.0278086.r006

Acceptance letter

Theodoros M Bampouras

1 Jun 2023

PONE-D-22-30852R2

Intra- and inter-rater reliability, agreement, and minimal detectable change of the handheld dynamometer in individuals with symptomatic hip osteoarthritis.

Dear Dr. Vaz:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Theodoros M. Bampouras

Academic Editor

PLOS ONE

Data Availability Statement

Raw data was uploaded to the repository Figshare as recommended and is available at: dx.doi.org/10.6084/m9.figshare.6025748.

1. Jordan JM, Helmick CG, Renner JB, Luta G, Dragomir AD, Woodard J, et al. Prevalence of hip symptoms and radiographic and symptomatic hip osteoarthritis in African Americans and Caucasians: the Johnston County Osteoarthritis Project. J Rheumatol 2009;36:809–15. doi: 10.3899/jrheum.080677

2. Blanco FJ, Silva-Díaz M, Quevedo Vila V, Seoane-Mato D, Pérez Ruiz F, Juan-Mas A, et al. Prevalence of symptomatic osteoarthritis in Spain: EPISER2016 study. Reumatol Clin 2021;17:461–70. doi: 10.1016/j.reumae.2020.01.005

3. Loureiro A, Mills PM, Barrett RS. Muscle weakness in hip osteoarthritis: a systematic review. Arthritis Care Res (Hoboken) 2013;65:340–52. doi: 10.1002/acr.21806

4. Loureiro A, Constantinou M, Diamond LE, Beck B, Barrett R. Individuals with mild-to-moderate hip osteoarthritis have lower limb muscle strength and volume deficits. BMC Musculoskelet Disord 2018;19:303. doi: 10.1186/s12891-018-2230-4

5. Zacharias A, Pizzari T, English DJ, Kapakoulakis T, Green RA. Hip abductor muscle volume in hip osteoarthritis and matched controls. Osteoarthritis Cartilage 2016;24:1727–35. doi: 10.1016/j.joca.2016.05.002

6. Kolasinski SL, Neogi T, Hochberg MC, Oatis C, Guyatt G, Block J, et al. 2019 American College of Rheumatology/Arthritis Foundation Guideline for the Management of Osteoarthritis of the Hand, Hip, and Knee. Arthritis Care Res (Hoboken) 2020;72:149–62. doi: 10.1002/acr.24131

7. Fransen M, Mcconnell S, Reichenbach S. Exercise for osteoarthritis of the hip (Review). Cochrane Database of Systematic Reviews 2014. doi: 10.1002/14651858.CD007912.pub2

8. Kannus P. Isokinetic Evaluation of Muscular Performance. Int J Sports Med 1994;15:S11–8. doi: 10.1055/s-2007-1021104

9. Krause DA, Neuger MD, Lambert KA, Johnson AE, DeVinny HA, Hollman JH. Effects of examiner strength on reliability of hip-strength testing using a handheld dynamometer. J Sport Rehabil 2014;23:56–64. doi: 10.1123/jsr.2012-0070

10. Arnold CM, Warkentin KD, Chilibeck PD, Magnus CRA. The reliability and validity of handheld dynamometry for the measurement of lower-extremity muscle strength in older adults. J Strength Cond Res 2010;24:815–24. doi: 10.1519/JSC.0b013e3181aa36b8

11. Mentiplay BF, Perraton LG, Bower KJ, Adair B, Pua Y-H, Williams GP, et al. Assessment of Lower Limb Muscle Strength and Power Using Hand-Held and Fixed Dynamometry: A Reliability and Validity Study. PLoS One 2015;10:e0140822. doi: 10.1371/journal.pone.0140822

12. Kelln BM, McKeon PO, Gontkof LM, Hertel J. Hand-held dynamometry: reliability of lower extremity muscle testing in healthy, physically active, young adults. J Sport Rehabil 2008;17:160–70. doi: 10.1123/jsr.17.2.160

13. Oliveira GDS, Ribeiro-Alvares JB de A, de Lima-E-Silva FX, Rodrigues R, Vaz MA, Baroni BM. Reliability of a Clinical Test for Measuring Eccentric Knee Flexor Strength Using a Handheld Dynamometer. J Sport Rehabil 2022;31:115–9. doi: 10.1123/jsr.2020-0014

14. Sisto SA, Dyson-Hudson T. Dynamometry testing in spinal cord injury. J Rehabil Res Dev 2007;44:123–36. doi: 10.1682/jrrd.2005.11.0172

15. Fulcher ML, Hanna CM, Raina Elley C. Reliability of handheld dynamometry in assessment of hip strength in adult male football players. J Sci Med Sport 2010;13:80–4. doi: 10.1016/j.jsams.2008.11.007

16. Florencio LL, Martins J, da Silva MRB, da Silva JR, Bellizzi GL, Bevilaqua-Grossi D. Knee and hip strength measurements obtained by a hand-held dynamometer stabilized by a belt and an examiner demonstrate parallel reliability but not agreement. Phys Ther Sport 2019;38:115–22. doi: 10.1016/j.ptsp.2019.04.011

17. Ieiri A, Tushima E, Ishida K, Inoue M, Kanno T, Masuda T. Reliability of measurements of hip abduction strength obtained with a hand-held dynamometer. Physiother Theory Pract 2015;31:146–52. doi: 10.3109/09593985.2014.960539

18. Kim S-G, Lee Y-S. The intra- and inter-rater reliabilities of lower extremity muscle strength assessment of healthy adults using a hand held dynamometer. J Phys Ther Sci 2015;27:1799–801. doi: 10.1589/jpts.27.1799

19. Kottner J, Audigé L, Brorson S, Donner A, Gajewski BJ, Hróbjartsson A, et al. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. J Clin Epidemiol 2011;64:96–106. doi: 10.1016/j.jclinepi.2010.03.002

20. Kohn MD, Sassoon AA, Fernando ND. Classifications in Brief: Kellgren-Lawrence Classification of Osteoarthritis. Clin Orthop Relat Res 2016;474:1886–93. doi: 10.1007/s11999-016-4732-4

21. Kellgren JH, Lawrence JS. Radiological assessment of osteo-arthrosis. Ann Rheum Dis 1957;16:494–502. doi: 10.1136/ard.16.4.494

22. Leão MG de S, Martins Neta GP, Coutinho LI, da Silva TM, Ferreira YMC, Dias WRV. Análise comparativa da dor em pacientes submetidos à artroplastia total do joelho em relação aos níveis pressóricos do torniquete pneumático. Rev Bras Ortop (Sao Paulo) 2016;51:672–9. doi: 10.1016/j.rbo.2016.02.002

23. Wong DL, Baker CM. Pain in children: comparison of assessment scales. Pediatr Nurs 1988;14:9–17.

24. Woolacott NF, Corbett MS, Rice SJC. The use and reporting of WOMAC in the assessment of the benefit of physical therapies for the pain of osteoarthritis of the knee: findings from a systematic review of clinical trials. Rheumatology (Oxford) 2012;51:1440–6. doi: 10.1093/rheumatology/kes043

25. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol 1988;15:1833–40.

26. Lane NE, Hochberg MC, Nevitt MC, Simon LS, Nelson AE, Doherty M, et al. OARSI Clinical Trials Recommendations: Design and conduct of clinical trials for hip osteoarthritis. Osteoarthritis Cartilage 2015;23:761–71. doi: 10.1016/j.joca.2015.03.006

27. Guimarães RP, Alves DPL, Silva GB, Bittar ST, Ono NK, Honda E, et al. Tradução e adaptação transcultural do instrumento de avaliação do quadril “Harris Hip Score.” Acta Ortop Bras 2010;18:142–7. doi: 10.1590/S1413-78522010000300005

28. Harris WH. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg Am 1969;51:737–55.

29. Terwee CB, Bot SDM, de Boer MR, van der Windt DAWM, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007;60:34–42. doi: 10.1016/j.jclinepi.2006.03.012

30. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O’Neal L, et al. The REDCap consortium: Building an international community of software platform partners. J Biomed Inform 2019;95. doi: 10.1016/j.jbi.2019.103208

31. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009;42:377–81. doi: 10.1016/j.jbi.2008.08.010

32. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, New Jersey: Pearson/Prentice Hall; 2015.

33. Giavarina D. Understanding Bland Altman analysis. Biochem Med (Zagreb) 2015;25:141–51. doi: 10.11613/BM.2015.015

34. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res 1999;8:135–60. doi: 10.1177/096228029900800204

35. Haley SM, Fragala-Pinkham MA. Interpreting change scores of tests and measures used in physical therapy. Phys Ther 2006;86:735–43.

36. Cardoso JR, Beisheim EH, Horne JR, Sions JM. Test-Retest Reliability of Dynamic Balance Performance-Based Measures Among Adults With a Unilateral Lower-Limb Amputation. PM R 2019;11:243–51. doi: 10.1016/j.pmrj.2018.07.005 [Retracted]

37. Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies. Stat Med 1998;17:101–10.

38. DeLoach LJ, Higgins MS, Caplan AB, Stiff JL. The visual analog scale in the immediate postoperative period: intrasubject variability and correlation with a numeric scale. Anesth Analg 1998;86:102–6. doi: 10.1097/00000539-199801000-00020

39. Stauffer ME, Taylor SD, Watson DJ, Peloso PM, Morrison A. Definition of Nonresponse to Analgesic Treatment of Arthritic Pain: An Analytical Literature Review of the Smallest Detectable Difference, the Minimal Detectable Change, and the Minimal Clinically Important Difference on the Pain Visual Analog Scale. Int J Inflam 2011;2011:1–6. doi: 10.4061/2011/231926

40. Tubach F, Ravaud P, Baron G, Falissard B, Logeart I, Bellamy N, et al. Evaluation of clinically relevant changes in patient reported outcomes in knee and hip osteoarthritis: the minimal clinically important improvement. Ann Rheum Dis 2005;64:29–33. doi: 10.1136/ard.2004.022905

41. Wikholm JB, Bohannon RW. Hand-held Dynamometer Measurements: Tester Strength Makes a Difference. Journal of Orthopaedic & Sports Physical Therapy 1991;13:191–8. doi: 10.2519/jospt.1991.13.4.191

42. Bohannon RW. Hand-held dynamometry: A practicable alternative for obtaining objective measures of muscle strength. Isokinet Exerc Sci 2012;20:301–15. doi: 10.3233/IES-2012-0476

43. Brinkmann JR. Comparison of a hand-held and fixed dynamometer in measuring strength of patients with neuromuscular disease. J Orthop Sports Phys Ther 1994;19:100–4. doi: 10.2519/jospt.1994.19.2.100

44. Iragorri N, Spackman E. Assessing the value of screening tools: reviewing the challenges and opportunities of cost-effectiveness analysis. Public Health Rev 2018;39:17. doi: 10.1186/s40985-018-0093-8

45. Morin M, Hébert LJ, Perron M, Petitclerc E, Lake S, Duchesne E. VP.28 Psychometric properties of muscle strength assessment by hand-held dynamometry in healthy adults: A reliability study. Neuromuscular Disorders 2022;32:S72. doi: 10.1016/j.nmd.2022.07.128
