PLOS ONE. 2023 Dec 27;18(12):e0289052. doi: 10.1371/journal.pone.0289052

Comparison of devices used to measure blood pressure, grip strength and lung function: A randomised cross-over study

Carli Lessof 1, Rachel Cooper 2,3, Andrew Wong 4, Rebecca Bendayan 5,6, Rishi Caleyachetty 7,8, Hayley Cheshire 9, Theodore Cosco 10, Ahmed Elhakeem 11, Anna L Hansell 12, Aradhna Kaushal 13, Diana Kuh 4, David Martin 1, Cosetta Minelli 14, Stella Muthuri 4, Maria Popham 4, Seif O Shaheen 15, Patrick Sturgis 16, Rebecca Hardy 17,18,*
Editor: Masaki Mogi 19
PMCID: PMC10752545  PMID: 38150442

Abstract

Background

Blood pressure, grip strength and lung function are frequently assessed in longitudinal population studies, but the measurement devices used differ between studies and within studies over time. We aimed to compare measurements ascertained from different commonly used devices.

Methods

We used a randomised cross-over study. Participants were 118 men and women aged 45–74 years whose blood pressure, grip strength and lung function were assessed using two sphygmomanometers (Omron 705-CP and Omron HEM-907), four handheld dynamometers (Jamar Hydraulic, Jamar Plus+ Digital, Nottingham Electronic and Smedley) and two spirometers (Micro Medical Plus turbine and ndd Easy on-PC ultrasonic flow-sensor) with multiple measurements taken on each device. Mean differences between pairs of devices were estimated along with limits of agreement from Bland-Altman plots. Sensitivity analyses were carried out using alternative exclusion criteria and summary measures, and using multilevel models to estimate mean differences.

Results

The mean difference between sphygmomanometers was 3.9mmHg for systolic blood pressure (95% Confidence Interval (CI):2.5,5.2) and 1.4mmHg for diastolic blood pressure (95% CI:0.3,2.4), with the Omron HEM-907 measuring higher. For maximum grip strength, the mean difference when either one of the electronic dynamometers was compared with either the hydraulic or spring-gauge device was 4-5kg, with the electronic devices measuring higher. The differences were small when comparing the two electronic devices (difference = 0.3kg, 95% CI:-0.9,1.4), and when comparing the hydraulic and spring-gauge devices (difference = 0.2kg, 95% CI:-0.8,1.3). In all cases limits of agreement were wide. The mean difference in FEV1 between spirometers was close to zero (95% CI:-0.03,0.03) and limits of agreement were reasonably narrow, but a difference of 0.47l was observed for FVC (95% CI:0.42,0.53), with the ndd Easy on-PC measuring higher.

Conclusion

Our study highlights potentially important differences in measurement of key functions when different devices are used. These differences need to be considered when interpreting results from modelling intra-individual changes in function and when carrying out cross-study comparisons, and sensitivity analyses using correction factors may be helpful.

Introduction

Blood pressure, grip strength and lung function are commonly assessed in longitudinal population studies. All three are non-invasive measures of physiological function that are practical for a nurse or interviewer to administer in a home or clinical setting using portable equipment. They avoid the subjectivity of self-reports of health, enable researchers and clinicians to track changes in health and function over the life course [1] and are important biomarkers of healthy ageing [2]. Their repeat assessment within longitudinal studies, and inclusion in many studies, facilitates comparisons over time and across ages and cohorts [3,4].

Although there have been a number of initiatives to encourage standardisation of these measures [5–7], different devices have been adopted by different studies for a variety of practical reasons [8,9]. Furthermore, the device used within a long-running longitudinal study will often need to change over time as obsolete or outdated models are replaced with devices that are more technologically advanced and improve or extend measurement, are less costly, more portable or easier to use. Because devices of this kind are only subject to moderate regulation [10,11], the measures obtained from different makes and models of device are unlikely to be equivalent. This has important implications for research which either compares findings across studies or considers change in function longitudinally. For example, in a study modelling age-related changes in blood pressure across the life course which used data from eight British longitudinal studies, switching from a manual sphygmomanometer to an automated device, without correction for the difference in measurement, resulted in a steeper increase in mean trajectory of systolic blood pressure [4]. Similarly, artefactual findings attributable to a change in device have been observed in studies of lung function [12,13]. Indeed, concerns about potential differences in measures due to differences in spirometry devices have contributed to study investigators in the UK discouraging within- and cross-study analyses [14,15].

There are existing studies which have shown differences between devices used to measure blood pressure [16–20], grip strength [21–24] and lung function [6,12,13,25–27], but these have not yet compared all the devices commonly used in cohort and longitudinal population studies in the UK and many other countries. Further, these are only occasionally discussed in the context of both within- and between-study comparisons. To address this gap, a randomised cross-over trial was undertaken to compare measurements between devices used to assess blood pressure, grip strength and lung function commonly used in UK longitudinal population studies within the CLOSER consortium [28].

Methods

Study design and sample

For each of blood pressure, grip strength and lung function, a randomised cross-over study was carried out, so as to make within-person measurement comparisons. The study was conducted following established (CONSORT) guidelines [29]. The target sample, based on sample size calculations (S1 Appendix), was 120 men and women from the general population aged 45 to 74 years comprising 20 men and 20 women from each of three age groups (45–54, 55–64, 65–74). Participants were drawn from a list of individuals who had participated in a market research study, consented to be re-contacted for research purposes, and were living in London and the South East of England. An invitation letter and information sheet were sent, followed up by a telephone recruitment process that included assessment of health-related exclusion criteria (S1 Appendix). Eligible participants were then invited to attend a face-to-face assessment and each participant was measured on every machine (Table 1) at a single assessment visit.

Table 1. Makes and models of devices included in study.

Measurement | Devices
Sphygmomanometer (blood pressure) | Omron 705-CP; Omron HEM-907
Hand-held dynamometer (grip strength) | Jamar Hydraulic analog hand dynamometer; Jamar Plus+ digital hand dynamometer; Nottingham electronic handgrip dynamometer; Smedley spring-gauge dynamometer
Spirometer (lung function) | Micro Medical Micro Plus turbine spirometer; ndd Easy on-PC ultrasonic flow-sensor spirometer

All 90-minute face-to-face assessments took place in central London between October 2015 and January 2016 and were conducted by one of seven researchers who were trained and tested in all relevant protocols. All participants gave informed, written consent. The analytical dataset was pseudo-anonymised, with each participant given a study number so that individuals could not be identified. Ethical approval for data collection was given by University College London (UCL) (Ethics Project Number: 6338/001) and, for analysis, by the University of Southampton (Ethics Project Number: 18498). Participants received feedback on their results, advice to contact their General Practitioner if their blood pressure was elevated, and a gift voucher.

During the assessment, each participant was assessed in the sequence shown in Table 2. Blood pressure was measured consecutively on each device and the remaining measures were ordered to ensure that there was sufficient time between the four grip strength and two spirometry measurements to avoid participant fatigue. Multiple measurements were recorded on each device as would be done in survey research. Height and weight were also measured and a short self-completion questionnaire was administered (S2 Appendix).

Table 2. Flow chart of assessment.

Randomisation
Introduction and consent module
Three minute rest period
Blood pressure 1
Two minute rest period
Blood pressure 2
Height and weight
Grip strength 1
Lung function 1
Grip strength 2
10 minute break including paper self-completion questionnaire
Grip strength 3
Lung function 2
Grip strength 4
Copy of assessment data and gift voucher given to participant

For each of the three measures, the order of devices was determined before fieldwork began, using computer-generated random numbers within each age-sex stratum. Individuals were randomly allocated to one of two possible orders of blood pressure and lung function devices and to one of 24 possible orders of grip strength devices.
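The counts of possible device orders follow from simple permutations (2! = 2 for two devices, 4! = 24 for four). The sketch below enumerates them and draws one order per measure for a single participant. It is only an illustration: the helper names, the seed and the use of unconstrained random choice are assumptions, since the paper states only that computer-generated random numbers were used within age-sex strata.

```python
import itertools
import random

BP_DEVICES = ["Omron 705-CP", "Omron HEM-907"]
GRIP_DEVICES = ["Jamar Hydraulic", "Jamar Plus+", "Nottingham", "Smedley"]
LUNG_DEVICES = ["Micro Medical", "ndd Easy on-PC"]


def possible_orders(devices):
    """Return every possible ordering of the given devices."""
    return list(itertools.permutations(devices))


def allocate_orders(rng):
    """Pick one device order per measure for one participant (hypothetical helper)."""
    return {
        "blood_pressure": rng.choice(possible_orders(BP_DEVICES)),
        "grip_strength": rng.choice(possible_orders(GRIP_DEVICES)),
        "lung_function": rng.choice(possible_orders(LUNG_DEVICES)),
    }


rng = random.Random(2015)  # arbitrary seed, chosen here only for reproducibility
allocation = allocate_orders(rng)
n_bp_orders = len(possible_orders(BP_DEVICES))
n_grip_orders = len(possible_orders(GRIP_DEVICES))
```

In a stratified design, a function like this would be called separately within each age-sex stratum, possibly with blocking to keep the orders balanced; the paper does not describe that level of detail.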

Blood pressure, grip strength and lung function measurement

Standardised measurement protocols were used as follows. For blood pressure, the participant was asked to sit on a chair with legs uncrossed and their right arm resting comfortably, palm up, on a table, with the sphygmomanometers positioned so that they could not see the display. The participant was asked to expose their right arm, making sure that rolled up sleeves did not restrict circulation and that any watches or bracelets had been removed, and the sphygmomanometer cuff was then positioned over the brachial artery. After 3 minutes of quiet rest, 3 readings with a minute’s rest between each reading were recorded using the first device. The device was then changed and, after a further 2 minutes’ rest, 3 readings were taken using the second device. There was no talking until three readings on both devices had been completed.

Grip strength assessment was based on a published measurement protocol [30]. While seated in a chair with fixed arms, participants were asked to place their forearm on the arm of the chair in the mid-prone position (the thumb facing up) with their wrist just over the end of the arm of the chair in a neutral but slightly extended position. Adjustments were made to each dynamometer to accommodate different hand sizes according to the make and model of the device. On hearing the words “And Go”, the participant was encouraged, through strong verbal instruction, to squeeze as hard as possible for a few seconds until told to stop. For each device, two measurements were carried out in each hand in the sequence Left-Right-Left-Right. The value on the display was recorded to the nearest 0.1kg for the Jamar Plus+ and Nottingham Electronic, to the nearest 0.5kg for the Smedley and to the nearest 1kg for the Jamar Hydraulic.

Lung function measurements adhered to the American Thoracic Society/European Respiratory Society (ATS/ERS) lung function protocol [6]. The procedure was explained and demonstrated, and the participant then had a practice blow without completely emptying their lungs. All measurements were carried out with the participant standing unless they felt unable to do so. During measurement, maximum effort was encouraged verbally. In addition, the ndd Easy on-PC was linked to a laptop which showed a cartoon of a child blowing up a balloon. This represents a real-time trace and, because the participant is encouraged to exhale until the balloon pops, helps ensure that a maximal FVC is achieved. After each trial the researcher recorded whether it satisfied the protocol. For example, a trial was classified as not valid if the participant did not form a tight seal around the mouthpiece or coughed during the procedure; in these instances, feedback was provided before the next attempt. Participants had up to five attempts to produce three valid measurements of lung function from each spirometer.

Readings for blood pressure, grip strength and lung function using the Micro Medical spirometer were data entered twice, independently, and compared to ensure accuracy. Lung function readings taken using the ndd Easy on-PC spirometer were downloaded directly from the laptop.

Other measures

Height was measured using a portable Marsden Leicester stadiometer and weight using Tanita 352 scales according to standardised procedures, from which body mass index was calculated as weight (kg)/height (m)². Responses to the self-completion questionnaire provided additional information on: age at completing full-time education, self-rated health, smoking history, medication use and musculoskeletal, cardiovascular and respiratory conditions which might influence performance on the functional tests (S2 Appendix).

Primary outcome measures

For the purposes of the main analyses, outcomes commonly used in epidemiological research were derived. The mean of the second and third readings of systolic blood pressure and diastolic blood pressure in millimeters of mercury (mmHg) were used. For grip strength, the maximum of the four readings in kilograms (kg) was used. For lung function, the maximum forced expiratory volume in 1 second (FEV1) and forced vital capacity (FVC) in millilitres (ml) from the highest quality readings (quality A or B) were used. Quality grade A was when 3 or more acceptable tests were achieved with repeatability within 100 ml, and B when 3 acceptable tests were achieved with repeatability within 150 ml, as per ATS/ERS criteria [6].
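The blood pressure and grip strength summaries described above can be expressed in a few lines. This is a minimal sketch with made-up readings; the function names are hypothetical, and raising an error on incomplete readings is an assumption about how missing data might be handled rather than the authors' procedure.

```python
from statistics import mean


def bp_outcome(readings):
    """Mean of the second and third of three readings (mmHg)."""
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    return mean(readings[1:3])


def grip_outcome(readings):
    """Maximum of the four readings (two per hand), in kg."""
    if len(readings) != 4:
        raise ValueError("expected exactly four readings")
    return max(readings)


# Illustrative values only, not study data
sbp = bp_outcome([132, 128, 126])
grip = grip_outcome([30.5, 33.0, 31.0, 34.5])
```

Discarding the first blood pressure reading is a common convention because the first reading tends to be higher than later ones; the sensitivity analyses in the paper examine alternative summaries.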

Statistical analyses

We described relevant characteristics by randomisation group for each measure. For each device we estimated the reliability using intraclass correlations (or Rho) and within-person standard deviations using a variance-components model [31]. To investigate order effects we used two sample t-tests to compare the difference in mean values between groups with the measurements carried out in one sequence (device A followed by device B) compared with the opposite order (BA). For grip strength where 4 devices were tested, 6 pairwise comparisons were made, ignoring the exact placement of devices within the sequence.
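One common way to obtain an intraclass correlation and a within-person standard deviation is from the mean squares of a one-way analysis of variance. The sketch below takes that route and assumes a balanced design (the same number of readings for every person). The paper reports using a variance-components model but does not give the estimation details, so this should be read as an illustration of the quantities rather than a reproduction of the analysis.

```python
from statistics import mean


def icc_and_within_sd(data):
    """Estimate (ICC, within-person SD) from balanced repeated readings.

    data: list of per-person lists, each holding k repeated readings.
    Uses one-way ANOVA mean squares to derive the variance components.
    """
    n = len(data)
    k = len(data[0])
    if any(len(person) != k for person in data):
        raise ValueError("balanced data required")
    grand = mean([x for person in data for x in person])
    person_means = [mean(person) for person in data]
    ms_between = k * sum((m - grand) ** 2 for m in person_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for person, m in zip(data, person_means) for x in person
    ) / (n * (k - 1))
    # Between-person variance component, truncated at zero
    var_between = max((ms_between - ms_within) / k, 0.0)
    icc = var_between / (var_between + ms_within)
    return icc, ms_within ** 0.5


# Illustrative values only: three people, two readings each
icc, within_sd = icc_and_within_sd([[120, 122], [140, 138], [110, 112]])
```

An intraclass correlation close to 1 means that most of the variation lies between people rather than between repeated readings on the same person.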

We calculated the differences in measurement between pairs of devices then assessed the mean within-person differences between pairs of devices using paired t-tests. The assumption that the mean differences were normally distributed was checked by plotting histograms, and Bland-Altman plots (the difference between measures versus the average of the measures from the two devices for each individual) were used to assess whether the variation was dependent on the magnitude of the measurements [32,33]. The mean difference in values between the two devices, and the 95% limits of agreement, which give the range in which we would expect 95% of future differences in measurements between the two devices to lie, were plotted [33,34].
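The quantities behind a Bland-Altman plot are straightforward to compute: the per-person difference and average, the mean difference (bias), and the 95% limits of agreement, conventionally the bias ± 1.96 standard deviations of the differences. The following is a minimal sketch with made-up paired values; the function name and return structure are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(device_a, device_b):
    """Return bias, 95% limits of agreement and plot points for paired readings."""
    if len(device_a) != len(device_b):
        raise ValueError("paired readings must have equal length")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    averages = [(a + b) / 2 for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the within-person differences
    return {
        "bias": bias,
        "sd": sd,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
        "points": list(zip(averages, diffs)),  # (x, y) pairs for the plot
    }


# Illustrative values only, not study data
result = bland_altman([10, 12, 14, 16], [9, 12, 12, 15])
```

Plotting `points` with horizontal lines at the bias and the two limits gives the usual Bland-Altman display; a trend in the points suggests that the difference depends on the magnitude of the measurement.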

We also performed a series of sensitivity analyses to test the robustness of the results. We repeated analyses having: (i) excluded measurements where the devices were administered in the incorrect order (n = 2 for blood pressure, n = 5 for grip strength and n = 1 for lung function); (ii) removed extreme outliers identified using scatter plots (n = 1 for blood pressure and n = 2 for grip strength) and; (iii) used alternative outcome definitions commonly used in analyses. For blood pressure, we considered the mean of three readings [35] and the second reading only [36] and for grip strength, the mean of the four readings [37,38]. For lung function, we used the highest reading of FEV1 and FVC drawn from all available readings irrespective of whether they adhered to the ATS/ERS quality criteria.

Finally, we used multilevel modelling, as an alternative statistical approach, to estimate the differences between devices, using all available readings rather than a summary measure, in order to account for variance between readings. The models treat the repeated readings as Level 1 and the individual as Level 2 to account for non-independence of measurements from the same person. Model 1 included device treated as a fixed effect. Model 2 also included covariates to account for the order in which the devices were administered and the position of the reading in the sequence (1 to 3 for blood pressure, 1 or 2 for the dominant and non-dominant hands for grip strength, and 1 to 5 for lung function). Model 3 was additionally adjusted for age, sex and, for blood pressure only, body mass index.
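As an illustration of the structure described above, Model 1 for a two-device comparison could be written as a random-intercept model. The notation and the random-intercept form are assumptions made here for clarity; the paper specifies only the two levels and the fixed device effect.

```latex
% Reading i (Level 1) nested within person j (Level 2); D_{ij} = 1 for the second device, 0 for the first
y_{ij} = \beta_0 + \beta_1 D_{ij} + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2),
\qquad e_{ij} \sim N(0, \sigma_e^2)
```

Under this form, \beta_1 estimates the mean difference between devices, u_j captures stable between-person differences and e_{ij} captures reading-to-reading variation. Models 2 and 3 would add further fixed-effect terms for device order, reading position and participant characteristics; with four dynamometers, D_{ij} would become a set of three indicator variables.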

Data cleaning and management were carried out using Excel, IBM-SPSS Version 22 and STATA 14.0 and analyses were conducted using STATA 15.0.

Results

During fieldwork, 118 assessments were completed, with 18–21 participants in each of the age-sex strata (S1 Table). Of the seven researchers, three carried out 20–30 assessments, two carried out 10–20 assessments and two carried out fewer than ten assessments.

The socio-demographic characteristics of the randomised groups were reasonably well balanced as were key aspects of cardiovascular, musculoskeletal and respiratory health (Tables 3 and 4). The reliability of every device was good. The intraclass correlations were lowest for blood pressure (0.89–0.94), due to the acknowledged within-person variation in this measure (S2 Table). The values for grip strength of the dominant hand were above 0.95 for all devices except the Smedley dynamometer (0.92). Reliability was best for lung function (≥0.96), where within-person standard deviations were small. Reliability was slightly better when including only assessments adhering to the ATS/ERS quality criteria because two measures must be within 150ml of each other. There was no evidence of order effects for blood pressure or lung function. For grip strength, there was evidence of an order effect for the comparison between the Nottingham Electronic and Smedley dynamometers (difference = -3.08kg, 95% CI: -5.93, -0.23, p = 0.03) (S3 Table). Histograms show that for all three measures, the mean differences between devices were approximately normally distributed (S1 Fig).

Table 3. Characteristics of the study population by first device used (N = 118).

First device | Blood pressure: Omron 705-CP (n = 58) (a) | Blood pressure: Omron HEM-907 (n = 60) | Grip strength: Jamar Hydraulic (n = 30) | Grip strength: Smedley (n = 28) | Grip strength: Nottingham (n = 30) | Grip strength: Jamar Plus+ (n = 30) (a) | Lung function: Micro Medical (n = 59) | Lung function: ndd Easy on-PC (n = 59) (a)
Mean (SD)
Age (years) | 59.4 (8.2) | 59.8 (7.8) | 58.5 (8.2) | 59.8 (7.4) | 59.0 (9.0) | 61.2 (7.4) | 59.8 (7.8) | 59.4 (8.3)
Weight (kg) | 76.9 (21.1) | 77.3 (16.7) | 73.8 (16.1) | 82.1 (22.2) | 77.3 (17.6) | 75.5 (19.4) | 77.3 (16.5) | 76.9 (21.2)
Height (cm) | 168.5 (9.0) | 167.6 (8.9) | 166.5 (8.2) | 170.2 (9.3) | 165.9 (9.9) | 169.6 (7.9) | 168.2 (9.6) | 167.8 (8.2)
BMI (kg/m²) | 27.5 (4.6) | 27.4 (4.9) | 26.5 (4.7) | 28.2 (6.0) | 27.8 (4.6) | 27.5 (3.6) | 27.2 (4.5) | 27.7 (5.1)
N (%)
Sex
Men | 29 (50) | 30 (50) | 14 (47) | 15 (54) | 14 (47) | 16 (53) | 30 (51) | 29 (49)
Women | 29 (50) | 30 (50) | 16 (53) | 13 (46) | 16 (53) | 14 (47) | 29 (49) | 30 (51)
Age (years) left full time education
Under 16 | 20 (34) | 19 (32) | 6 (20) | 8 (29) | 11 (37) | 14 (47) | 24 (41) | 15 (25)
17/18 | 7 (12) | 10 (17) | 5 (17) | 6 (21) | 3 (10) | 3 (10) | 5 (8) | 12 (20)
19 + | 31 (54) | 31 (52) | 19 (63) | 14 (50) | 16 (53) | 13 (43) | 30 (51) | 32 (54)
Self-reported health
Excellent/Very good | 32 (55) | 32 (53) | 17 (57) | 15 (54) | 18 (60) | 14 (47) | 32 (54) | 32 (55)
Good | 20 (36) | 22 (37) | 7 (23) | 12 (43) | 11 (37) | 12 (40) | 20 (34) | 22 (38)
Poor/Very poor | 6 (10) | 6 (10) | 6 (20) | 1 (4) | 1 (3) | 4 (13) | 7 (12) | 4 (7)

a Sample size reduced by 1 for body mass index (BMI).

Table 4. Cardiovascular, musculoskeletal and respiratory health status of the study population by first device used (N = 118).

Blood pressure, N (%)
First device | Omron 705-CP (n = 58) | Omron HEM-907 (n = 60)
CV condition (a) | 4 (7) | 9 (15)
Hypertension | 18 (31) | 19 (32)
On blood pressure medication | 14 (24) | 17 (28)

Grip strength, N (%)
First device | Jamar Hydraulic (n = 30) | Smedley (n = 28) | Nottingham (n = 30) | Jamar Plus+ (n = 30)
Dominant hand, right | 29 (97) | 25 (89) | 27 (90) | 27 (90)
Arthritis | 6 (20) | 5 (18) | 4 (13) | 5 (17)
Some/a lot of difficulty gripping | 5 (17) | 8 (29) | 6 (20) | 5 (17)

Lung function, N (%)
First device | Micro Medical (n = 59) | ndd Easy on-PC (n = 59)
Respiratory condition (b) | 24 (41) | 32 (54)
On medication for condition | 4 (7) | 2 (3)
Current smoker | 13 (22) | 8 (14)
Ever smoker | 21 (36) | 27 (46)

a Includes doctor diagnosed heart attack, angina and other heart condition

b Includes eczema, hay fever, asthma, COPD, bronchitis, emphysema and other respiratory problems.

Blood pressure

Three participants were excluded from analyses due to missing readings, leaving 115 for analysis. The mean difference in SBP between the two devices was 3.9mmHg (95% CI: 2.5, 5.2, p<0.001) and for DBP was 1.4mmHg (95% CI: 0.3, 2.4, p = 0.01), with the Omron HEM-907 measuring higher than the Omron 705-CP (Table 5). The Bland-Altman plots showed that as blood pressure increased, the difference between the two devices remained approximately constant (Figs 1 and 2). The limits of agreement were wide, being -10.6 to 18.3mmHg for SBP and -9.8 to 12.5mmHg for DBP.

Table 5. Differences in mean and limits of agreement for each pair of devices used to measure blood pressure, grip strength and lung function.

Measures compared | N | Difference (95% CI) | p-value* | Lower limit of agreement | Upper limit of agreement
Blood pressure, mean of readings 2 and 3 (mmHg)
SBP, Omron HEM-907 – Omron 705-CP | 115 | 3.9 (2.5, 5.2) | <0.001 | -10.6 | 18.3
DBP, Omron HEM-907 – Omron 705-CP | 115 | 1.4 (0.3, 2.4) | 0.01 | -9.8 | 12.5
Grip strength, max of 4 readings (kg)
Nottingham – Jamar Plus+ | 118 | 0.3 (-0.9, 1.4) | 0.6 | -12.1 | 12.7
Jamar Hydraulic – Smedley | 118 | 0.2 (-0.8, 1.3) | 0.7 | -10.8 | 11.3
Jamar Plus+ – Jamar Hydraulic | 118 | 4.5 (3.9, 5.1) | <0.001 | -2.0 | 10.9
Jamar Plus+ – Smedley | 118 | 4.7 (3.7, 5.7) | <0.001 | -6.3 | 15.7
Nottingham – Jamar Hydraulic | 118 | 4.7 (3.6, 5.9) | <0.001 | -7.9 | 17.3
Nottingham – Smedley | 118 | 5.0 (3.5, 6.4) | <0.001 | -10.6 | 20.5
Lung function, maximum (litres), American Thoracic Society criteria
FEV1, Micro Medical – ndd Easy on-PC | 74 | 0.00 (-0.03, 0.03) | 0.9 | -0.25 | 0.25
FVC, Micro Medical – ndd Easy on-PC | 67 | -0.47 (-0.53, -0.42) | <0.001 | -0.92 | -0.03

* p-value from paired t-test.
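Because 95% limits of agreement are conventionally the mean difference ± 1.96 standard deviations of the differences, the rows of Table 5 can be checked for internal consistency: the standard deviation can be recovered from the limits and then used to approximate the confidence interval for the mean difference. The sketch below does this for the SBP row. It uses a normal approximation (1.96) rather than the t-distribution and works from rounded published values, so it is expected to match only approximately.

```python
from math import sqrt


def sd_from_limits(lower, upper):
    """Recover the SD of the differences from 95% limits of agreement."""
    return (upper - lower) / (2 * 1.96)


def approx_ci(mean_diff, sd, n):
    """Approximate 95% CI for the mean difference (normal approximation)."""
    half_width = 1.96 * sd / sqrt(n)
    return mean_diff - half_width, mean_diff + half_width


# SBP row of Table 5: difference 3.9, limits -10.6 to 18.3, N = 115
sd_sbp = sd_from_limits(-10.6, 18.3)
ci_sbp = approx_ci(3.9, sd_sbp, 115)
```

For the SBP row this gives a standard deviation of roughly 7.4 mmHg and an interval that agrees with the reported 2.5 to 5.2 within rounding.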

Fig 1. Bland Altman plot for SBP.

Plot of the difference in mean SBP (mmHg) between the Omron 705-CP and Omron HEM-907 by the average SBP with 95% limits of agreement.

Fig 2. Bland Altman plot for DBP.

Plot of the difference in mean DBP (mmHg) between the Omron 705-CP and Omron HEM-907 by the average DBP with 95% limits of agreement.

Grip strength

All 118 participants were included in the analyses. There was no evidence of a difference in mean maximum grip strength when comparing the two electronic dynamometers, the Nottingham Electronic and Jamar Plus+ (difference = 0.3kg, 95% CI: -0.9, 1.4, p = 0.6), or when comparing the hydraulic and spring-gauge dynamometers, the Jamar Hydraulic and Smedley (difference = 0.2kg, 95% CI: -0.8, 1.3, p = 0.7). However, there were mean differences in maximum grip strength of between 4 and 5kg when comparing either of the electronic dynamometers with either the hydraulic or spring-gauge dynamometer (Table 5). The limits of agreement varied depending on the pair of devices being compared. For example, they were narrower (-2.0 and 10.9 kg) when comparing the Jamar Plus+ and Jamar Hydraulic but very wide (-10.6 and 20.5 kg) when comparing the Nottingham Electronic and Smedley dynamometers. Even in cases where the mean difference was near zero, the limits of agreement indicated substantial differences in measurement between devices. The Bland-Altman plots (Figs 3–8) showed that for the comparisons of the Smedley dynamometer with all other devices, the difference increased at higher magnitudes of mean grip strength (Figs 4, 6 and 8).

Fig 3. Bland Altman plots of grip strength (Jamar Plus+–Nottingham).

Plot of the difference in maximum grip strength (kg) between devices by average maximum grip strength on both devices with 95% limits of agreement.

Fig 4. Bland Altman plots of grip strength (Jamar Hydraulic–Smedley).

Plot of the difference in maximum grip strength (kg) between devices by average maximum grip strength on both devices with 95% limits of agreement.

Fig 5. Bland Altman plots of grip strength (Jamar Plus+–Jamar Hydraulic).

Plot of the difference in maximum grip strength (kg) between devices by average maximum grip strength on both devices with 95% limits of agreement.

Fig 6. Bland Altman plots of grip strength (Jamar Plus+–Smedley).

Plot of the difference in maximum grip strength (kg) between devices by average maximum grip strength on both devices with 95% limits of agreement.

Fig 7. Bland Altman plots of grip strength (Nottingham–Jamar Hydraulic).

Plot of the difference in maximum grip strength (kg) between devices by average maximum grip strength on both devices with 95% limits of agreement.

Fig 8. Bland Altman plots of grip strength (Nottingham–Smedley).

Plot of the difference in maximum grip strength (kg) between devices by average maximum grip strength on both devices with 95% limits of agreement.

Lung function

Twelve participants had missing lung function measures and around a third (n = 32 for FEV1 and n = 39 for FVC) of the remaining participants were excluded because there were no readings of a sufficiently high quality. There was no evidence of a difference in mean FEV1 between devices (difference = 0.00 litres (95% CI:-0.03,0.03, p = 0.9)) but there was evidence of a difference in FVC (-0.47 litres (95% CI:-0.53,-0.42, p<0.001)) with the ndd Easy on-PC measuring higher than the Micro Medical (Table 5). The Bland-Altman plots suggested that for FEV1, the difference between the two devices was approximately constant as measurements increased and close to zero (Fig 9) with reasonably narrow limits of agreement (-0.25 and 0.25 litres). The plot for FVC suggested that the difference between devices remained constant as values of FVC increased (Fig 10) but the limits of agreement were wider (-0.92 and -0.03 litres).

Fig 9. Bland Altman plot for FEV1.

Plot of the difference in mean maximum FEV1 between the Micro Medical and ndd Easy on-PC ultrasonic flow-sensor by average maximum FEV1 on both devices with 95% limits of agreement.

Fig 10. Bland Altman plot for FVC.

Plot of the difference in mean maximum FVC between the Micro Medical and ndd Easy on-PC ultrasonic flow-sensor by average maximum FVC on both devices with 95% limits of agreement.

Sensitivity analyses

When we repeated the analyses having excluded measurements where the devices were administered in the incorrect order (n = 8), removed outliers (n = 3), included the lung function readings that did not meet ATS/ERS criteria (n = 32 for FEV1 and n = 39 for FVC), and used alternative definitions of outcomes, there were only small changes in the estimated differences between devices such that the conclusions were unaltered (S4 Table). The only differences found were a small number of additional order effects (S5 Table), but these had no impact on the findings when order of device was controlled for through multilevel analysis. Indeed, when the data were reanalysed using multilevel models, the estimates of differences between devices showed only marginal changes, though the standard errors were reduced (S6–S8 Tables).

Discussion

In a randomised cross-over study of 118 adults aged 45–74 years, we found evidence of differences in measurement of blood pressure, grip strength and lung function when assessed using different devices. For blood pressure, the newer Omron HEM-907 measured higher than the older Omron 705-CP with wide limits of agreement. For grip strength, the two electronic dynamometers recorded measurements on average 4-5kg higher than either the hydraulic or the spring-gauge dynamometer, but there were only small mean differences when comparing the two electronic dynamometers or the hydraulic and spring-gauge dynamometers. However, limits of agreement were wide for all comparisons. For lung function, the ndd Easy on-PC measures of FVC were an average of 0.47 litres higher than those for the Micro Medical, but there was no difference between measures of FEV1 and the limits of agreement were reasonably narrow.

We are aware of only a few studies that have compared combinations of dynamometers previously. For example, King [21] compared the Jamar Hydraulic with the Jamar Plus+ dynamometer and, in contrast to our findings, reported that the electronic dynamometer had consistently lower readings than the hydraulic device and narrower limits of agreement. However, the study population was a convenience sample of 40 men and women who were younger, with an average age of 32 years, and may have had better function than our older sample, which could influence comparability across machines. Another study reported a difference of 3.2kg (limits of agreement -6.3 to 12.6) when comparing the Smedley dynamometer and the Jamar Hydraulic dynamometer, which contrasts with our finding of a smaller mean difference (0.2kg) but wider limits of agreement (-10.8 to 11.3) [22]. However, this other study was carried out in an older, smaller sample of 55 participants aged 65–99 years recruited from a retirement home and social day care centre. Another study [23] found that the Smedley dynamometer measured lower than the Jamar Plus+ Digital, similar to our study, although in this other study there were other potentially important variations in measurement protocol: measures using the Smedley device were undertaken in a standing position and those using the Jamar device were undertaken seated. Our findings provide some reassurance that there is a lack of bias in measurement between specific device combinations (i.e. the Jamar Plus+ and Nottingham electronic; the Jamar Hydraulic and Smedley), although the limits of agreement suggest that the variation can still be substantial.

We have not identified a comparison of Micro Medical or other turbine spirometers with the ndd Easy on-PC spirometer. However, in a study of 35 volunteers, the Micro Medical turbine spirometer, used in our study, gave lower readings compared with the Vitalograph Micro pneumotachograph spirometer [13], for both FEV1 (mean difference of 0.24l) and FVC (0.34l). Another study of 49 volunteers found that the handheld ndd Easy on-PC spirometer produced systematically lower values than a pneumotachograph spirometer (Masterscreen) [25], for both FEV1 (mean difference of 0.24l) and for FVC (0.37l).

For lung function, the accuracy of measurement relies primarily on optimal coaching: maximally deep breath, a rapid blast and appropriate encouragement as well as a full seal around the mouthpiece and correct body posture [6]. The ndd Easy on-PC spirometer presents visualisation of the volume-time graph in real time, meaning that the participant can be encouraged to blow until the curve has reached a plateau, that is, when the true FVC has been achieved. In the absence of this visual display the forced manoeuvre may be terminated prematurely, and the FVC underestimated. We propose that this is the most likely explanation for the substantially higher FVC values obtained using the ndd Easy on-PC device than the Micro Medical device in our study, while there was no difference for FEV1. For FEV1 the mean difference between the two spirometers was zero and is, therefore, within the 150ml ATS/ERS criteria for replication of measurement. In addition, the limits of agreement did not exceed the 350ml criterion set in previous spirometry studies [27]. Whether using a group correction for FVC is valid, however, remains debatable: in the SAPALDIA study, a group correction from a quasi-experimental study was found not to be adequate, and an approach using spirometer-specific reference equations from longitudinal measurements to describe individualised correction terms was preferred [12].

In considering the potential clinical significance of the differences between devices, we have referred to published normative or predicted values of blood pressure, grip strength and lung function [3,39,40]. Based on analysis of age-related differences in mean blood pressure in the Health Survey for England 2016, the mean differences in SBP and DBP between devices that we observed are equivalent to an age difference of approximately five years, although the possible non-linearity of change with age in diastolic blood pressure across the age range of interest [41] makes that comparison more difficult. Further, the within-person standard deviation for systolic blood pressure is larger than the mean difference between devices. For grip strength [3] the observed 4-5kg difference is equivalent to an age difference of approximately 5 years among men and 10 years among women aged 65 years and above. For lung function, based on the National Health and Nutrition Examination Survey (NHANES) III data [42], predicted values for five-year age-groups (with male height of 175cm and female height of 160cm) show that a difference of 0.47l in FVC is equivalent to an age difference of around 15 years, between 45–75 years. Therefore, together with the wide limits of agreement and good measurement reliability for each device, the differences that we observed between devices are likely to have important practical implications for both grip strength and lung function. For example, the differences in dynamometers may result in discrepancies in clinical diagnoses which use cut-points when identifying an individual as sarcopenic [43]. Similarly, the difference in FVC, but not FEV1, between machines will have implications for defining participants with COPD based on the ratio FEV1/FVC.
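The point about the FEV1/FVC ratio can be made concrete with a small worked example. The participant values below are invented, and the fixed-ratio cut-off of 0.70 is an assumption introduced here for illustration (the paper does not state which threshold a given study would use). The only figure taken from the paper is the 0.47 litre mean difference in FVC between spirometers.

```python
def fev1_fvc_ratio(fev1_l, fvc_l):
    """Return the FEV1/FVC ratio from values in litres."""
    return fev1_l / fvc_l


THRESHOLD = 0.70    # assumed fixed-ratio cut-off, not stated in the paper
FVC_SHIFT_L = 0.47  # mean FVC difference between spirometers reported above

# Hypothetical participant: FEV1 is taken to be the same on both devices
fev1 = 2.50
fvc_micro_medical = 3.30
fvc_ndd = fvc_micro_medical + FVC_SHIFT_L

ratio_micro_medical = fev1_fvc_ratio(fev1, fvc_micro_medical)
ratio_ndd = fev1_fvc_ratio(fev1, fvc_ndd)

below_on_micro_medical = ratio_micro_medical < THRESHOLD
below_on_ndd = ratio_ndd < THRESHOLD
```

In this invented case the same person falls above the assumed cut-off on one spirometer and below it on the other, purely because of the shift in FVC.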

Maintaining consistency in the make and model of device used in studies reduces the likelihood of measurement differences, but is not always realistic given that equipment becomes obsolete and new technology can improve measurement, for example through automation (as is the case with the Omron HEM-907), the transition from analogue to digital (as is the case with the transition from the Jamar Hydraulic to Jamar Plus+ devices) or the introduction of visual encouragement and specific feedback (as provided by the ndd Easy on-PC). An important implication of our findings is, therefore, that it would be advisable for researchers to include simple experiments to assess machine comparability when a new device is introduced into a study. Conducting external comparison studies, such as ours, would also help interpretation for both within-study and between-study comparisons. In addition, the differences between devices need to be considered in the context of the reliability of measurements for each device being compared. Our analysis showed good reliability of measurements, particularly for the dynamometers and spirometers, suggesting the differences observed are important. The ATS/ERS quality control for lung function ensures excellent reliability, but does result in exclusion of those who cannot meet the criteria.

A key strength of this study design was that it used the same standardised measurement protocols for all devices. This is important because, for all three functional measures, the type of device is only one of several factors which can affect measurements unless these other factors are kept constant, as in our study. Blood pressure is affected by multiple factors [10] including the participant talking, actively listening, being exposed to cold, ingesting alcohol, having a distended bladder, recent smoking [44] and also by measurement protocols such as arm position and cuff size [45]. For grip strength, the values and precision of measurements have also been shown to be influenced by a range of factors [30,37] including whether allowance is made for hand size and hand-dominance [46], dynamometer handle shape [47], position of the elbow [48] and wrist during testing [49], setting of the dynamometer [50,51], effort and encouragement, frequency of testing and time of day and training of the assessor [30,51]. The study also included a relatively large sample size, based on a priori sample size calculation, compared with other similar studies, and implemented a randomised design. While confidence in the results rests primarily on this randomised design [29], the fact that participants were drawn from a large database of members of the public, who had been involved in previous market research and consented to be re-contacted, suggests they may be more representative of the general population than the small-scale volunteer samples used in many previous studies. We also acknowledge the limitations of the study. The study findings cannot be generalised beyond the parameters of the research design; for example, results might differ for those outside the sampled age range (i.e., 45 to 74), and while the trial compared devices most commonly used in UK population-based studies, no comment can be made about device combinations which were not included [15].
While standardising the measurement protocols was an important aspect of the research design, it meant deviating from the protocol for the Smedley dynamometer (normally assessed standing rather than sitting) and so may limit the applicability of the findings for this device [30]. Furthermore, in the primary analyses of lung function, a number of participants were excluded due to missing or low-quality readings, particularly on the ndd Easy on-PC, thus reducing the sample size and power of these analyses. Nevertheless, sensitivity analyses using all available readings, irrespective of quality, suggested that this did not have a big impact on findings. Indeed, sensitivity analyses considering outliers, incorrectly ordered tests and alternative coding of measures, all showed that our results were robust. Assessor may be a source of variation in our study which we have not accounted for, although this variation was minimised by the consistent training and protocol, and is not likely to have had a substantial impact on differences between devices since this was a within-person comparison and the same researcher assessed the same person on all machines.

In conclusion, this randomised cross-over study showed measurement differences between devices commonly used to assess blood pressure, grip strength and lung function which researchers should be aware of when carrying out comparative research between studies and within studies over time.

Supporting information

S1 Table. Sample size by age group and sex.

(DOCX)

S2 Table. Reliability of the devices included in the study.

(DOCX)

S3 Table. Assessment of order effects for all measures.

(DOCX)

S4 Table. Sensitivity analysis for differences in mean and limits of agreement for all measures.

(DOCX)

S5 Table. Sensitivity analysis of order effects for all measures.

(DOCX)

S6 Table. Sensitivity analysis using multilevel models for blood pressure.

(DOCX)

S7 Table. Sensitivity analysis using multilevel models for grip strength.

(DOCX)

S8 Table. Sensitivity analysis using multilevel models for lung function.

(DOCX)

S1 Fig. Histograms of mean differences in SBP (mmHg), DBP (mmHg), grip strength (kg) and lung function (FEV1 and FVC, litres) for all device combinations.

(DOCX)

S1 Appendix. Supplementary methods.

(DOCX)

S2 Appendix. Questionnaire.

(DOCX)

Acknowledgments

We thank the study participants, Kantar who provided access to the study sample, George Kyriakopoulos, now at BP (British Petroleum), for providing study advice, and Shaun Scholes, at UCL, for assistance with the analysis of Health Survey for England data referred to in the discussion.

Data Availability

All relevant data are publicly available from the ReShare repository (https://dx.doi.org/10.5255/UKDA-SN-856306).

Funding Statement

The project was funded by CLOSER (https://closer.ac.uk/), whose mission is to maximise the use, value and impact of longitudinal studies. CLOSER is funded by the UK Economic and Social Research Council (grant reference: ES/K000357/1). CL was funded by an ESRC Doctoral Training Programme grant at the University of Southampton (ES/J500161/1). The UK Medical Research Council funded AE, AK, AW, DH, MP (MC_UU_12019/1), RB, RH, RCa (MC_UU_12019/2) and RCo, SM, TC (MC_UU_12019/4) when undertaking this work. RB is funded in part by the King’s College London UK Medical Research Council Skills Development Fellowship programme (MR/R016372/1) and by the National Institute for Health Research (NIHR) Biomedical Research Centre (IS-BRC-1215-20018) at South London and Maudsley NHS Foundation Trust and King’s College London. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Kuh D, Cooper R, Hardy R, Richards M, Ben-Shlomo Y. A life course approach to healthy ageing. Oxford: Oxford University Press; 2014.
  • 2. Lara J, Cooper R, Nissan J, Ginty AT, Khaw K-T, Deary IJ, et al. A proposed panel of biomarkers of healthy ageing. BMC Medicine. 2015;13(1):222. doi: 10.1186/s12916-015-0470-9
  • 3. Dodds RM, Syddall HE, Cooper R, Benzeval M, Deary IJ, Dennison EM, et al. Grip strength across the life course: normative data from twelve British studies. PLoS One. 2014;9(12):e113637. doi: 10.1371/journal.pone.0113637
  • 4. Wills AK, Lawlor DA, Matthews FE, Sayer AA, Bakra E, Ben-Shlomo Y, et al. Life course trajectories of systolic blood pressure using longitudinal data from eight UK cohorts. PLoS Med. 2011;8(6):e1000440. doi: 10.1371/journal.pmed.1000440
  • 5. Standardization of methods of measuring the arterial blood pressure. A joint report of the committees appointed by the Cardiac Society of Great Britain and Ireland and the American Heart Association. 1939;1(3):261–7.
  • 6. Miller MR, Hankinson J, Brusasco V, Burgos F, Casaburi R, Coates A, et al. Standardisation of spirometry. European Respiratory Journal. 2005;26(2):319–38. doi: 10.1183/09031936.05.00034805
  • 7. Reuben DB, Magasi S, McCreath HE, Bohannon RW, Wang YC, Bubela DJ, et al. Motor assessment using the NIH Toolbox. Neurology. 2013;80(11 Suppl 3):S65–75. doi: 10.1212/WNL.0b013e3182872e01
  • 8. Goisis A, Brown M, Kumari M, Sullivan A. Overview of bio measures in longitudinal and life course research. 2014.
  • 9. Tolonen H, Koponen P, Naska A, Männistö S, Broda G, Palosaari T, et al. Challenges in standardization of blood pressure measurement at the population level. BMC Medical Research Methodology. 2015;15(1):33. doi: 10.1186/s12874-015-0020-3
  • 10. Jones DW, Appel LJ, Sheps SG, Roccella EJ, Lenfant C. Measuring blood pressure accurately: new and persistent challenges. JAMA. 2003;289(8):1027–30. doi: 10.1001/jama.289.8.1027
  • 11. Mohandas A, Foley KA. Medical devices: adapting to the comparative effectiveness landscape. Biotechnol Healthc. 2010;7(2):25–8.
  • 12. Bridevaux PO, Dupuis-Lozeron E, Schindler C, Keidel D, Gerbase MW, Probst-Hensch NM, et al. Spirometer Replacement and Serial Lung Function Measurements in Population Studies: Results From the SAPALDIA Study. Am J Epidemiol. 2015;181(10):752–61. doi: 10.1093/aje/kwu352
  • 13. Orfei L, Strachan DP, Rudnicka AR, Wadsworth M. Early influences on adult lung function in two national British cohorts. Archives of disease in childhood. 2008;93(7):570–4. doi: 10.1136/adc.2006.112201
  • 14. Craig R, Mindell J. Health Survey for England 2010. Respiratory health. 2011.
  • 15. McFall S, Petersen J, Kaminska O, Lynn P. Understanding Society Waves 2 and 3 Nurse Health Assessment, 2010–2012. Guide to Nurse Health Assessment ISER, University of Essex. 2014.
  • 16. Stang A, Moebus S, Mohlenkamp S, Dragano N, Schmermund A, Beck EM, et al. Algorithms for converting random-zero to automated oscillometric blood pressure values, and vice versa. Am J Epidemiol. 2006;164(1):85–94. doi: 10.1093/aje/kwj160
  • 17. Campbell NRC, McKay DW. Accurate blood pressure measurement. Why does it matter? 1999;161(3):277–8.
  • 18. Wan Y, Heneghan C, Stevens R, McManus RJ, Ward A, Perera R, et al. Determining which automatic digital blood pressure device performs adequately: a systematic review. J Hum Hypertens. 2010;24(7):431–8. doi: 10.1038/jhh.2010.37
  • 19. Skirton H, Chamberlain W, Lawson C, Ryan H, Young E. A systematic review of variability and reliability of manual and automated blood pressure readings. J Clin Nurs. 2011;20(5–6):602–14. doi: 10.1111/j.1365-2702.2010.03528.x
  • 20. Bolling K. The Dinamap 8100 calibration study: HM Stationery Office; 1994.
  • 21. King TI, 2nd. Interinstrument reliability of the Jamar electronic dynamometer and pinch gauge compared with the Jamar hydraulic dynamometer and B&L Engineering mechanical pinch gauge. Am J Occup Ther. 2013;67(4):480–3.
  • 22. Guerra RS, Amaral TF. Comparison of hand dynamometers in elderly people. J Nutr Health Aging. 2009;13(10):907–12. doi: 10.1007/s12603-009-0250-3
  • 23. Kim M, Shinkai S. Prevalence of muscle weakness based on different diagnostic criteria in community-dwelling older adults: A comparison of grip strength dynamometers. Geriatr Gerontol Int. 2017;17(11):2089–95. doi: 10.1111/ggi.13027
  • 24. Svens B, Lee H. Intra- and inter-instrument reliability of Grip-Strength Measurements: GripTrack™ and Jamar® hand dynamometers. The British Journal of Hand Therapy. 2005;10(2):47–55.
  • 25. Milanzi EB, Koppelman GH, Oldenwening M, Augustijn S, Aalders-de Ruijter B, Farenhorst M, et al. Considerations in the use of different spirometers in epidemiological studies. Environ Health. 2019;18(1):39. doi: 10.1186/s12940-019-0478-2
  • 26. Hosie HE, Nimmo WS. Measurement of FEV1 and FVC. Comparison of a pocket spirometer with the Vitalograph. Anaesthesia. 1988;43(3):233–8.
  • 27. Gerbase MW, Dupuis-Lozeron E, Schindler C, Keidel D, Bridevaux PO, Kriemler S, et al. Agreement between spirometers: a challenge in the follow-up of patients and populations? Respiration. 2013;85(6):505–14. doi: 10.1159/000346649
  • 28. O’Neill D, Benzeval M, Boyd A, Calderwood L, Cooper C, Corti L, et al. Data resource profile: cohort and longitudinal studies enhancement resources (CLOSER). International journal of epidemiology. 2019;48(3):675–6i. doi: 10.1093/ije/dyz004
  • 29. Schulz KF, Altman DG, Moher D, CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. Trials. 2010;11(1):32.
  • 30. Roberts HC, Denison HJ, Martin HJ, Patel HP, Syddall H, Cooper C, et al. A review of the measurement of grip strength in clinical and epidemiological studies: towards a standardised approach. Age and ageing. 2011;40(4):423–9. doi: 10.1093/ageing/afr051
  • 31. Rabe-Hesketh S, Skrondal A. Multilevel and longitudinal modeling using Stata: Stata Press; 2008.
  • 32. Bland JM, Altman D. Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet. 1986;327(8476):307–10.
  • 33. Chhapola V, Kanwal SK, Brar R. Reporting standards for Bland–Altman agreement analysis in laboratory research: a cross-sectional survey of current practice. Annals of Clinical Biochemistry. 2015;52(3):382–6. doi: 10.1177/0004563214553438
  • 34. Giavarina D. Understanding Bland Altman analysis. Biochemia medica. 2015;25(2):141–51. doi: 10.11613/BM.2015.015
  • 35. Chobanian AV, Bakris GL, Black HR, Cushman WC, Green LA, Izzo JL Jr, et al. Seventh report of the joint national committee on prevention, detection, evaluation, and treatment of high blood pressure. Hypertension. 2003;42(6):1206–52. doi: 10.1161/01.HYP.0000107251.49515.c2
  • 36. Hardy R, Wadsworth ME, Langenberg C, Kuh D. Birthweight, childhood growth, and blood pressure at 43 years in a British birth cohort. International Journal of Epidemiology. 2004;33(1):121–9. doi: 10.1093/ije/dyh027
  • 37. Sousa-Santos A, Amaral T. Differences in handgrip strength protocols to identify sarcopenia and frailty-a systematic review. BMC geriatrics. 2017;17(1):238. doi: 10.1186/s12877-017-0625-y
  • 38. Mathiowetz V, Weber K, Volland G, Kashman N. Reliability and validity of grip and pinch strength evaluations. Journal of Hand Surgery. 1984;9(2):222–6. doi: 10.1016/s0363-5023(84)80146-x
  • 39. Scholes S, Neave A. Health Survey for England 2016: Physical activity in adults. Leeds: Health and Social Care Information Centre. 2017.
  • 40. NHANES normative values. Available from: https://vitalograph.com/resources/nhanes-normal-values
  • 41. Mutz J, Lewis CM. Lifetime depression and age-related changes in body composition, cardiovascular function, grip strength and lung function: sex-specific analyses in the UK Biobank. Aging (Albany NY). 2021; 3:17038–17079. doi: 10.18632/aging.203275
  • 42. Thomas ET, Guppy M, Straus SE, Bell KJ, Glasziou P. Rate of normal lung function decline in ageing adults: a systematic review of prospective cohort studies. BMJ open. 2019;9(6):e028150. doi: 10.1136/bmjopen-2018-028150
  • 43. Cooper R, Lessof C, Wong A, Hardy R. The impact of variation in the device used to measure grip strength on the identification of low muscle strength: Findings from a randomised cross-over study. Journal of Frailty, Sarcopenia and Falls. 2021;6(4):225–230. doi: 10.22540/JFSF-06-225
  • 44. Handler J. The importance of accurate blood pressure measurement. The Permanente Journal. 2009;13(3):51. doi: 10.7812/tpp/09-054
  • 45. Bilo G, Sala O, Perego C, Faini A, Gao L, Głuszewska A, et al. Impact of cuff positioning on blood pressure measurement accuracy: may a specially designed cuff make a difference? Hypertens Res. 2017;40(6):573–80. doi: 10.1038/hr.2016.184
  • 46. Incel NA, Ceceli E, Durukan PB, Erdem HR, Yorgancioglu ZR. Grip strength: effect of hand dominance. Singapore medical journal. 2002;43(5):234–7.
  • 47. Amaral JF, Mancini M, Novo Júnior JM. Comparison of three hand dynamometers in relation to the accuracy and precision of the measurements. Brazilian Journal of Physical Therapy. 2012;16(3):216–24. doi: 10.1590/s1413-35552012000300007
  • 48. Balogun JA, Akomolafe CT, Amusa LO. Grip strength: effects of testing posture and elbow position. Archives of physical medicine and rehabilitation. 1991;72(5):280–3.
  • 49. O’Driscoll SW, Horii E, Ness R, Cahalan TD, Richards RR, An K-N. The relationship between wrist position, grasp size, and grip strength. The Journal of hand surgery. 1992;17(1):169–77. doi: 10.1016/0363-5023(92)90136-d
  • 50. Firrell JC, Crain GM. Which setting of the dynamometer provides maximal grip strength? The Journal of hand surgery. 1996;21(3):397–401. doi: 10.1016/S0363-5023(96)80351-0
  • 51. Fess E. Clinical assessment recommendations. American society of hand therapists. 1981:6–8.

Decision Letter 0

Masaki Mogi

2 Feb 2023

PONE-D-22-31125

Comparison of devices used to measure blood pressure, grip strength and lung function: a randomised cross-over study

PLOS ONE

Dear Dr. Hardy,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The manuscript is well assessed by the two reviewers; however, several major critiques are raised of the manuscript in its present form. Read the suggestions carefully and respond to them appropriately.

Please submit your revised manuscript by Mar 19 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Masaki Mogi

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match.

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

3. Please note that in order to use the direct billing option the corresponding author must be affiliated with the chosen institute. Please either amend your manuscript to change the affiliation or corresponding author, or email us at plosone@plos.org with a request to remove this option.

4. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

5. We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 5 in your text; if accepted, production will need this reference to link the reader to the Table.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Carli Lessof et al. used a randomized cross-over study to compare different devices for their measurement of blood pressure, grip strength and lung function. All the measurements and analyses were performed to a good technical standard. Measurement differences between devices are bound to exist, however, the article does not provide information for us to choose a better device. I recommend comparing these devices to standard devices such as mercury sphygmomanometer.

Reviewer #2: Review of PONE-D-22-31125

Thank you for the invitation to review the manuscript “Comparison of devices used to measure blood pressure, grip strength and lung function: a randomised cross-over study”.

The manuscript is well-written, and the research question is clearly defined. The study is scientifically sound, and I have no significant concerns. I therefore provide only minor comments and suggestions which I hope the authors will find useful in improving the presentation of their work.

Abstract

It would be useful to state that multiple measurements were taken for each device.

L58: consider providing quantitative estimates with the statement that “differences were small”, for example the range of mean differences.

Key implications of the findings for statistical analyses could be more specific, addressing how these findings should be considered when “modelling intra-individual changes in function and when carrying out cross-study analyses.”

Introduction

A brief paragraph discussing the test-retest reliability of these devices would be useful to contextualise the findings regarding difference between devices. How do measurements taken with the same device vary in magnitude compared to differences between devices?

L89-91: could these differences instead be accounted for in the analyses? It would be useful if the authors would be more specific about what they mean by “in some instances”.

L94: although the authors cite multiple studies, they state that the “evidence is limited”.

L98 (and elsewhere): I suggest avoiding unnecessary acronyms such as “BP” for blood pressure.

Methods

L104: consider providing the name of the guidelines (CONSORT) here.

L109: “and the South East” of England.

L119: “anonymised” -> “pseudo anonymised”.

Please provide the justification/rationale for using a randomised cross-over study design. This might seem obvious, but it is useful to state this explicitly in the text.

As noted above, it would be useful to state earlier that multiple measurements were taken with each device.

Data on other standard metrics for lung function (e.g., peak expiratory flow or the ratio of FEV1/FVC) were not considered here and might be worth adding, if available.

More details on quality control should be provided for the lung function data.

L210-211: additional guidance should be provided for readers regarding the interpretation of 95% limits of agreement.

Results

Table 3: “V good” and “V poor” should be spelled out.

Table 4: “Some/lot difficulting gripping” needs revision.

The data on reliability provided in Table S2 are of interest. Could the authors provide information about what these reliability estimates translate to in terms of variation in units of measurement? This would likely help readers better interpret the results regarding between-device differences.

L275: “… the analyses”.

The large differences between digital dynamometers and manual ones are of interest. Did these discrepancies vary by experimenter?

Discussion

L340: please explain why the age of the sample would be a relevant consideration here.

L345: “55 participants aged …” would make for easier reading.

L364: given the strong non-linear associations between age and several of these measures (e.g., Mutz et al. 2021, Aging, doi: 10.18632/aging.203275 in a large UK sample), it is perhaps difficult to talk about the equivalent of a 5-year age difference, especially for measures like diastolic blood pressure.

L405: although this suggestion is appropriate within studies, I am not sure about the feasibility of conducting such experiments when making comparison between studies (which are typically secondary data analyses / meta-analyses).

L426: “irrepsctive” -> “irrespective”.

Could the authors comment on how statistically significant differences, for example, between groups should be interpreted if such differences are smaller in magnitude than some of the differences observed here between devices for the same device across multiple measurements?

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: Review_PONE-D-22-31125.docx

PLoS One. 2023 Dec 27;18(12):e0289052. doi: 10.1371/journal.pone.0289052.r002

Author response to Decision Letter 0


28 Apr 2023

We are grateful to the reviewers for the time taken to review this paper and the helpful comments provided.

Reviewer #1

Carli Lessof et al. used a randomized cross-over study to compare different devices for their measurement of blood pressure, grip strength and lung function. All the measurements and analyses were performed to a good technical standard. Measurement differences between devices are bound to exist, however, the article does not provide information for us to choose a better device. I recommend comparing these devices to standard devices such as mercury sphygmomanometer.

Response: We thank the reviewer for acknowledging that our study was performed to a good technical standard. The intention of the study was not to aid selection of a “better device” nor to compare new devices with a standard machine, but rather to aid within- and between-study comparison of devices already commonly used in longitudinal studies. We therefore selected devices which had commonly been used in key UK longitudinal population studies for which there was a lack of previous comparison information. There have been multiple previous studies comparing manual sphygmomanometers, including mercury ones, with automated devices (see references 16-20), but few comparing different automated devices. We have made edits to the introduction to make clearer the rationale for the choice of machines compared in our study (lines 81-104).

Reviewer #2:

The manuscript is well-written, and the research question is clearly defined. The study is scientifically sound, and I have no significant concerns. I therefore provide only minor comments and suggestions which I hope the authors will find useful in improving the presentation of their work.

Response: We thank the reviewer for their positive comments and for the very helpful suggestions which we feel have improved the manuscript.

Abstract

It would be useful to state that multiple measurements were taken for each device.

Response: This information has been added to the abstract (line 49).

L58: consider providing quantitative estimates with the statement that “differences were small”, for example the range of mean differences.

Response: The estimates and 95% confidence intervals have been added (lines 59 and 60).

Key implications of the findings for statistical analyses could be more specific, addressing how these findings should be considered when “modelling intra-individual changes in function and when carrying out cross-study analyses.”

Response: The implications have been edited to make this clearer and highlight that we might expect sensitivity analyses to be carried out (lines 66-67).

Introduction

A brief paragraph discussing the test-retest reliability of these devices would be useful to contextualise the findings regarding difference between devices. How do measurements taken with the same device vary in magnitude compared to differences between devices?

Response: On review of the literature, there was little good quality information on test-retest reliability across the devices used in this study. We therefore decided not to include a paragraph in the introduction, but we acknowledge that this is an important point to consider. Therefore, in response to this point, and others below, we have added the within-person standard deviations to Table S2 (and discuss in lines 254-260) and further discussion of reliability in interpretation of the findings (lines 391-394, 406-407, 413-416, 430-435).

L89-91: could these differences instead be accounted for in the analyses? It would be useful if the authors would be more specific about what they mean by “in some instances”.

Response: We have clarified this sentence to indicate that it relates to specific recommendations regarding the use of spirometers in UK studies (line 94).

L94: although the authors cite multiple studies, they state that the “evidence is limited”.

Response: We have edited the final paragraph of the introduction to make clearer our choice of machines to compare (lines 97-104).

L98 (and elsewhere): I suggest avoiding unnecessary acronyms such as “BP” for blood pressure.

Response: We have used blood pressure instead of BP and also replaced LOA with limits of agreement and BMI with body mass index. We welcome editorial guidance on any other acronyms that should be replaced.

Methods

L104: consider providing the name of the guidelines (CONSORT) here.

Response: “CONSORT” has been added (line 110).

L109: “and the South East” of England.

Response: This has been corrected (line 115).

L119: “anonymised” -> “pseudo anonymised”.

Response: This has been corrected (line 125).

Please provide the justification/rationale for using a randomised cross-over study design. This might seem obvious, but it is useful to state this explicitly in the text.

Response: We thank the reviewer for highlighting the need for clarification. The rationale for using a randomised cross-over study design has now been added: “…so as to make within-person measurement comparisons” (line 109).

As noted above, it would be useful to state earlier that multiple measurements were taken with each device.

Response: This has now been stated in the “Study Design and Sample” section (lines 135-136).

Data on other standard metrics for lung function (e.g., peak expiratory flow or the ratio of FEV1/FVC) were not considered here and might be worth adding, if available.

Response: FEV1 and FVC are the gold standard measures for lung function and therefore are the ones that are derived from spirometers in longitudinal population studies and used in analyses. PEFR is seen as an approximation of FEV1, and is therefore no longer used where FEV1 measures are available. FEV1 and FVC were thus chosen a priori as primary outcome measures. Given these measures were defined in our protocol, we feel that it would not be good practice to add PEFR as an outcome at this stage and, although PEFR data are available, our protocol was not set up to standardise PEFR measurement. We do acknowledge the interest in the FEV1/FVC ratio, particularly in detecting COPD, and we have thus added a sentence to the discussion on the likely impact of our findings on FEV1/FVC (lines 418-419). Given that this outcome is a ratio of two measurements, a comparison between devices would not be informative as differences would be a combination of the variation in the individual components. If adjustment for a change in device were to be made, it would be to the component measures (FEV1 or FVC) rather than the ratio.

More details on quality control should be provided for the lung function data.

Response: We have added details on the quality control (lines 201-203). In relation to this, we also clarified that participants had five attempts to produce three valid measures (lines 179-180).

L210-211: additional guidance should be provided for readers regarding the interpretation of 95% limits of agreement.

Response: This has been added (lines 220-221), as has a clearer description of the Bland-Altman plot which also aids interpretation (lines 217-218).
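
As an aid to interpreting the 95% limits of agreement mentioned above, the following is a minimal sketch of the standard Bland-Altman calculation: the mean of the paired differences (the bias) plus or minus 1.96 times the standard deviation of those differences. It is not the authors' analysis code; the function name `limits_of_agreement` and the readings are hypothetical and chosen purely for illustration.

```python
from statistics import mean, stdev


def limits_of_agreement(device_a, device_b, z=1.96):
    """Return (bias, lower, upper) for paired readings from two devices.

    bias  = mean of the within-person differences (a - b)
    lower = bias - z * SD of the differences
    upper = bias + z * SD of the differences
    """
    if len(device_a) != len(device_b):
        raise ValueError("readings must be paired, one per participant per device")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical systolic readings (mmHg) for five participants; not study data.
readings_a = [120.0, 132.0, 118.0, 141.0, 127.0]
readings_b = [117.0, 127.0, 116.0, 135.0, 123.0]

bias, lower, upper = limits_of_agreement(readings_a, readings_b)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

Under the usual assumption that the differences are approximately normally distributed, about 95% of individual between-device differences would be expected to fall between the lower and upper limits, which is why wide limits indicate poor agreement for individuals even when the mean difference is small.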

Results

Table 3: “V good” and “V poor” should be spelled out.

Response: These have been spelled out.

Table 4: “Some/lot difficulting gripping” needs revision.

Response: This has been edited.

The data on reliability provided in Table S2 are of interest. Could the authors provide information about what these reliability estimates translate to in terms of variation in units of measurement? This would likely help readers better interpret the results regarding between-device differences.

Response: We have added the within-person standard deviations, which are in the original units, for each device and measurement to Table S2. We have added additional text describing the content of this table, including highlighting that repeatability is part of the spirometry quality control (lines 254-260).

L275: “… the analyses”.

Response: This has been corrected.

The large differences between digital dynamometers and manual ones are of interest. Did these discrepancies vary by experimenter?

Response: Given that this is a within-person comparison study, and the same experimenter (assessor) tested the same person on all machines, we did not anticipate that assessor would have had a great impact on the differences, even though the individual measurements may have varied according to assessor (due, for example, to different levels of encouragement). However, we did consider the potential impact of assessor variation in preliminary multilevel model analyses. These models showed that the variation by assessor, as anticipated, was small (and was statistically non-significant), and inclusion of this source of variation did not change the main findings. We therefore chose not to include assessor in our final analyses and conclude that assessor does not explain the wide variation in differences in grip strength. We do now acknowledge that assessor is an additional source of variation which we have not accounted for in the study and have added this to the discussion (lines 448-449).

Discussion

L340: please explain why the age of the sample would be a relevant consideration here.

Response: This has been explained. “However, the study population was younger, with an average age of 32 years, comprising a convenience sample of 40 men and women and may have better function than our older sample which could influence comparability across machines” (lines 361-362).

L345: “55 participants aged …” would make for easier reading.

Response: This has been changed (line 366).

L364: given the strong non-linear associations between age and several of these measures (e.g., Mutz et al. 2021, Aging, doi: 10.18632/aging.203275 in a large UK sample), it is perhaps difficult to talk about the equivalent of a 5-year age difference, especially for measures like diastolic blood pressure.

Response: We have added a caveat to this interpretation for diastolic blood pressure and acknowledge the limitation of this comparison (lines 404-406).

L405: although this suggestion is appropriate within studies, I am not sure about the feasibility of conducting such experiments when making comparisons between studies (which are typically secondary data analyses / meta-analyses).

Response: This point has been edited for clarity. “Conducting external comparison studies, such as ours, would also help interpretation for both within-study and between-study comparisons.” (lines 429-430)

L426: “irrepsctive” -> “irrespective”.

Response: This has been corrected.

Could the authors comment on how statistically significant differences, for example, between groups should be interpreted if such differences are smaller in magnitude than some of the differences observed here between devices for the same device across multiple measurements?

Response: As indicated above we have added the within-person standard deviations in Table S2. We have also edited the discussion to add further comment on the interpretation (lines 391-394, 406-407, 413-416, 430-435).

Attachment

Submitted filename: Response to reviewers.docx

Decision Letter 1

Masaki Mogi

23 May 2023

PONE-D-22-31125R1

Comparison of devices used to measure blood pressure, grip strength and lung function: a randomised cross-over study

PLOS ONE

Dear Dr. Wong,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

The manuscript still needs a minor revision. See the suggestions from the reviewer and respond to them appropriately.

==============================

Please submit your revised manuscript by Jul 07 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Masaki Mogi

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The study showed measurement differences between devices commonly used to assess BP, grip strength and lung function. The results may help within- and between-study comparison of devices used in longitudinal studies. I only provide minor suggestions which I hope the authors will find useful in improving the presentation of their work.

1. I suggest presenting the order of assessment (Table 2) in the form of a flowchart.

2. Please add the limitations of this study in the Discussion.

Reviewer #2: Review of PONE-D-22-31125_R1

Thank you for the invitation to review the revised version of this manuscript. I thank the authors for addressing each of my previous comments and suggestions. I have no further concerns.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

PLoS One. 2023 Dec 27;18(12):e0289052. doi: 10.1371/journal.pone.0289052.r004

Author response to Decision Letter 1


3 Jul 2023

Response to additional reviewers' comments from 23/05/2023

Reviewer #1: The study showed measurement differences between devices commonly used to assess BP, grip strength and lung function. The results may help within- and between-study comparison of devices used in longitudinal studies. I only provide minor suggestions which I hope the authors will find useful in improving the presentation of their work.

1. I suggest presenting the order of assessment (Table 2) in the form of a flowchart.

Response: Thank you for this suggestion. We have provided a flowchart.

2. Please add the limitations of this study in the Discussion.

Response: We have made the limitations of this study more explicit and expanded the discussion.

Attachment

Submitted filename: Response to reviewers2.docx

Decision Letter 2

Masaki Mogi

11 Jul 2023

Comparison of devices used to measure blood pressure, grip strength and lung function: a randomised cross-over study

PONE-D-22-31125R2

Dear Dr. Wong,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Masaki Mogi

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Masaki Mogi

3 Aug 2023

PONE-D-22-31125R2

Comparison of devices used to measure blood pressure, grip strength and lung function: a randomised cross-over study

Dear Dr. Wong:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Masaki Mogi

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Table. Sample size by age group and sex.

    (DOCX)

    S2 Table. Reliability of the devices included in the study.

    (DOCX)

    S3 Table. Assessment of order effects for all measures.

    (DOCX)

    S4 Table. Sensitivity analysis for differences in mean and limits of agreement for all measures.

    (DOCX)

    S5 Table. Sensitivity analysis of order effects for all measures.

    (DOCX)

    S6 Table. Sensitivity analysis using multilevel models for blood pressure.

    (DOCX)

    S7 Table. Sensitivity analysis using multilevel models for grip strength.

    (DOCX)

    S8 Table. Sensitivity analysis using multilevel models for lung function.

    (DOCX)

    S1 Fig. Histograms of mean differences in SBP (mmHg), DBP (mmHg), grip strength (kg) and lung function (FEV1 and FVC, litres) for all device combinations.

    (DOCX)

    S1 Appendix. Supplementary methods.

    (DOCX)

    S2 Appendix. Questionnaire.

    (DOCX)

    Data Availability Statement

    All relevant data are publicly available from the ReShare repository (https://dx.doi.org/10.5255/UKDA-SN-856306).


    Articles from PLOS ONE are provided here courtesy of PLOS