PLOS ONE. 2021 Jan 12;16(1):e0245278. doi: 10.1371/journal.pone.0245278

Within-session test-retest reliability of pressure pain threshold and mechanical temporal summation in healthy subjects

Catherine Mailloux, Louis-David Beaulieu, Timothy H Wideman, Hugo Massé-Alarie
Editor: Alison Rushton
PMCID: PMC7802960  PMID: 33434233

Abstract

Objective

To determine the absolute and relative intra-rater within-session test-retest reliability of pressure pain threshold (PPT) and mechanical temporal summation of pain (TSP) at the low back and the forearm in healthy participants and to test the influence of the number and sequence of measurements on reliability metrics.

Methods

In 24 participants, three PPT and TSP measures were assessed at four sites (2 at the low back, 2 at the forearm) in two blocks of measurements separated by 20 minutes. The standard error of measurement, the minimal detectable change (MDC) and the intraclass correlation coefficient (ICC) were investigated for five different sequences of measurements (e.g. measurement 1, 1–2, 1-2-3).

Results

The MDC for the group (MDCgr) for PPT ranged from 28.71 to 50.56 kPa across the sites tested, whereas MDCgr for TSP varied from 0.33 to 0.57 out of 10 (numeric scale). Almost all ICCs showed excellent relative reliability (between 0.80 and 0.97), except when only the first measurement was considered (moderate). Although differences in absolute PPT reliability between the different sequences were minimal, using only the first measurement generally increased measurement error. Three TSP measures reduced the measurement error.

Discussion

We established that two measurements of PPT and three of TSP reduced the measurement error and demonstrated an excellent relative reliability. Our results could be used in future pain research to confirm the presence of true hypo/hyperalgesia for paradigms such as conditioned pain modulation or exercise-induced hypoalgesia, indicated by a change exceeding the measurement variability.

1. Introduction

The experience of pain is highly variable and influenced by biological, psychological and social factors [1]. One essential feature of the experience of pain is the capacity of the nervous system to modulate pain through the interplay of multiple areas and mechanisms [2]. The complexity of pain modulation mechanisms makes them difficult to evaluate. In humans, psychophysical experimental paradigms have been developed in research as proxies of pain modulation. Conditioned pain modulation (CPM) and exercise-induced hypoalgesia (EIH), for instance, are used to approximate the efficacy of pain inhibition [3–5]. Recently, these paradigms have been increasingly used in research to determine whether individuals with chronic pain have an altered pain inhibition response. For example, some studies reported an alteration of pain inhibition in individuals with chronic low back pain (CLBP) using CPM [6–9] and EIH [10,11].

CPM refers to the decrease in pain sensitivity after the administration of a painful stimulus on a remote body part (e.g. cold water immersion of the hand [12]). EIH represents the decrease in pain sensitivity that occurs following an isometric, resistance or aerobic exercise [13]. Opioidergic, serotonergic and noradrenergic systems contribute to both CPM [2,14–17] and EIH [18–24]. Experimental protocols developed to assess EIH and CPM generally use a within-session design in which pain sensitivity measures are collected before and after a conditioning stimulus; the effect is quantified as the change in pain sensitivity between test and retest. The pressure pain threshold (PPT) is the most used pain sensitivity measure for these paradigms [13,25,26]. PPT is a static measure of pain and would reflect the basal state of pain perception [4,27]. Temporal summation of pain (TSP) has also been used as a pain sensitivity measure to quantify CPM/EIH [28–30]; it is instead a dynamic measure of pain sensitivity that refers to the perception of increasing pain in response to stable (continuous or repeated) noxious stimuli [31–33]. To ensure the validity of these paradigms, it is essential that the pain sensitivity measures chosen as test stimuli (PPT and TSP) present a small measurement error. This makes it possible to define a minimal level of true change, exceeding the measurement variability, that confirms the presence of hyper- or hypoalgesia following a conditioning stimulus. In addition, reliability needs to be considered with respect to the site tested. Therefore, with a view to studying the CLBP population, measuring the variability of PPT and TSP at the low back area and at a remote site is essential to provide a general view of pain sensitivity and pain modulation functioning. Some studies documented PPT reliability at the low back [34–37], but the sample sizes tested in these studies remain limited. For TSP, no study has measured reliability at the low back area. Moreover, the number of measures needed to reach an acceptable level of test-retest reliability for TSP is not known. This is essential to ensure the applicability and the feasibility of measuring PPT and TSP in research and clinical practice, and to reduce the number of painful stimuli needed to obtain reliable and valid data.

The objectives were to 1) determine the absolute and relative intra-rater within-session test-retest reliability of PPT and mechanical TSP at the low back and at the forearm in healthy participants and 2) test the influence of the number and sequence of measurements for PPT and TSP on reliability metrics. Our study focused on absolute reliability, which provides the minimal change in pain sensitivity that must be exceeded to be considered real; this threshold is essential to interpret the presence or absence of hyper/hypoalgesia using pain modulation paradigms such as CPM and EIH.

2. Materials and methods

2.1 Participants

Twenty-four healthy subjects (12 females, 12 males; 28.3 ± 11.0 years old) aged between 18 and 65 years were recruited between December 2018 and June 2019 (see Table 1 for descriptive statistics). Convenience sampling was used, with emails sent to the Laval University community (60,000 individuals comprising students and employees) and solicitation at the research center. Selection criteria were based on the consensus statement by the EUROPAIN and NEUROPAIN consortia for quantitative sensory testing (QST)-based studies to ensure validity of the data [38]. Exclusion criteria were: 1) pain lasting three months or longer, located anywhere in the body, 2) severe health problem (such as cancer, major rheumatoid, cardiac, neurologic or psychiatric disease), 3) low back pain lasting more than 7 days in the last 6 months, 4) consultation with a health professional because of low back pain in the last 6 months, 5) current bilateral wrist or forearm pain and 6) current pregnancy and/or having given birth in the last year. Subjects who were currently taking medication such as antidepressants, opioids, neuroleptics, anticonvulsive drugs or steroids were also excluded. This study was approved by the local ethics committee (CIUSSS-Capitale Nationale, project #2019–1547) and all participants provided informed written consent prior to experimentation. The body mass index was calculated for each participant and the Global Physical Activity Questionnaire (GPAQ) was self-administered to rate the level of physical activity [39].

Table 1. Descriptive statistics (n = 24).

| Characteristic | Value |
| --- | --- |
| Age, years, mean ± SD | 28.3 ± 11.0 |
| Sex (male: female) | 12: 12 |
| Race, % | |
| Caucasian | 92 |
| Asian | 4 |
| Black | 4 |
| BMI | 23.0 ± 3.12 |
| GPAQ | 2763 ± 2018 |
| Dominance, Right, % | 83 |
| Dominant site assessed, % | 42 |

SD standard deviation, BMI Body Mass Index, GPAQ Global Physical Activity Questionnaire.

Missing data for 2 participants.

Missing data for one participant.

2.2 Study design

All measurements were collected in a single session at the research center by the same rater (CM). The rater was a physical therapist with five years of clinical experience who had undertaken a QST training with experienced researchers.

PPT and TSP were tested in two blocks, lasting approximately 15 minutes each, separated by a pause of 20 minutes (Fig 1). The 20-min pause was chosen to match the time interval between blocks of testing when assessing EIH and CPM (e.g. the interval between PPT measures taken before and after the conditioning stimulus) [26,40–43]. This was done to ensure that PPT and TSP reliability were tested in a design reproducing CPM/EIH protocols.

Fig 1. Study procedure and sequences of measurements analyzed for PPT.

2.3 Quantitative sensory testing (QST)

PPT and TSP measures were collected in the same environment (same room with stable conditions regarding light, temperature and noise) and QST testing order within each testing block was randomized. Side was randomized independently of dominance and measures. If one side (wrist or low back) presented impairments (not related to pain) other than exclusion criteria, the opposite side was tested (1 participant had a limited wrist extension due to a scaphoid fracture 10 years ago [non-painful] and 1 participant had a low back lipoma). QST measures were first tested on the calf or the thigh for familiarization with the procedure.

2.3.1 Pressure pain threshold

PPT was assessed with a handheld digital algometer (1-cm² probe, FPIX, Wagner Instruments, Greenwich, CT, USA) for 22 participants and with a handheld dial algometer (1-cm² probe, FPK, Wagner Instruments, Greenwich, CT, USA) for 2 participants. Since the FPK algometer cannot measure between 0 and 1 kg/cm², the FPIX algometer was used for the remaining participants. To determine whether the use of the FPK algometer affected reliability, we performed the analysis with and without the first two participants. Because both analyses provided similar results, we included the first two participants in our analysis. Pressure was applied at a rate of ~0.5 kg/cm²/s at two back and two upper limb sites: i) lumbar erector spinae (LES), 2–3 cm lateral to L4/L5, ii) S1 spinous process, iii) dorsal aspect of the wrist on capitatum (WD) and iv) wrist flexor muscles (WF), 10 cm distal to the medial humeral epicondyle on a line from the medial epicondyle to the styloid process of the ulna, over the muscle bulk of the wrist flexors. All sites were located and marked before testing. The back was assessed in a prone lying position with a pillow under the abdomen, and the forearm in a sitting position with the arm supported. Standardized verbal instructions were given based on German Research Network on Neuropathic Pain (DFNS) recommendations: “This is a test of your sensitivity to deep pain. Now I will press this pressure meter against your back/wrist/forearm and will gradually increase the pressure. Please say ‘Now’ as soon as the pressure starts to be painful. Remember that this is not a pain tolerance test, it is a pain threshold test” [32]. Instructions were freely translated into French. PPT was measured three times with a one-minute break between measurements. To reduce the variability of the measurement and the impact of potential outliers, a fourth measure was taken if the following two conditions were met: 1) the standard deviation (SD) of the three measures was larger than 1 kg/cm² and 2) one measure was outside the mean ± SD interval (Fig 1). The same criteria were applied to the four values to determine whether three or four values were kept for analysis. Because of the accuracy limit of the FPIX algometer, PPT data above 11.5 kg/cm² were recorded as 11.5 kg/cm² (2 trials at S1). For the FPK, PPT values between 0 and 1 kg/cm² were recorded as 1 kg/cm² (4 trials at WD, first 2 participants). This may have caused a small overestimation of the reliability of the PPT measurement at S1 and WD. PPT values were converted from kg/cm² to kPa (1 kg/cm² = 98.07 kPa) to facilitate comparisons with the PPT literature.
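As an illustration only, the rule for taking a fourth PPT measure and the unit conversion can be sketched in Python. The function names are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not state which SD was computed.

```python
from statistics import mean, stdev

KPA_PER_KG_CM2 = 98.07  # conversion factor stated in the text


def needs_fourth_measure(ppt_kg_cm2, sd_limit=1.0):
    """Return True when both conditions for an extra PPT trial are met.

    Condition 1: SD of the trials is larger than sd_limit (kg/cm2).
    Condition 2: at least one trial lies outside the mean +/- SD interval.
    Assumption: sample SD (n - 1); the paper does not specify this.
    """
    m = mean(ppt_kg_cm2)
    sd = stdev(ppt_kg_cm2)
    any_outside = any(abs(value - m) > sd for value in ppt_kg_cm2)
    return sd > sd_limit and any_outside


def to_kpa(ppt_kg_cm2):
    """Convert a pressure value from kg/cm2 to kPa."""
    return ppt_kg_cm2 * KPA_PER_KG_CM2


# Hypothetical trials (kg/cm2), not data from the study
trials = [3.2, 5.6, 4.1]
print(needs_fourth_measure(trials))
print(round(to_kpa(mean(trials)), 2))
```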

2.3.2 Temporal summation of pain

TSP was tested using a pinprick stimulator (256 mN, MRC Systems GmbH, Heidelberg, Germany) and a series of ten punctate stimuli at 1 Hz over (i) the L4/L5 interspinous process line and (ii) the hand dorsum. The 1-Hz frequency was displayed to the rater by a light metronome out of the participant’s sight. TSP was defined as the difference between the highest pain rating across the ten stimuli and the pain rating after a single stimulus, both scored on a numeric rating scale (NRS) anchored between 0 (no pain) and 10 (worst imaginable pain) [27,44]. Standardized verbal instructions were given based on DFNS recommendations: “This is a test of repeated pinpricks. I will now apply a single pinprick. Please give a number between ‘0’ and ‘10’ for the pain of that stimulus. I will now apply a series of 10 pinpricks in a row. Please give a number between ‘0’ and ‘10’ for the highest pain over that whole series of 10 pinpricks. This procedure will be repeated 3 times, with a 30-s break between” [32].
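The subtraction-based TSP score described above can be sketched as follows. This is a minimal illustration with hypothetical function names; averaging the scores of the three series within a block is an assumption about how a sequence such as 1-2-3 is summarized.

```python
from statistics import mean


def tsp_score(single_rating, series_max_rating):
    """TSP for one series: highest NRS rating over the 10-stimulus train
    minus the NRS rating of the single stimulus."""
    for rating in (single_rating, series_max_rating):
        if not 0 <= rating <= 10:
            raise ValueError("NRS ratings must lie between 0 and 10")
    return series_max_rating - single_rating


def mean_tsp(pairs):
    """Average TSP over several (single, series_max) pairs.

    Assumption: the series of a block are summarized by their mean.
    """
    return mean(tsp_score(single, highest) for single, highest in pairs)


# Hypothetical ratings for three series, not data from the study
block = [(1, 4), (2, 4), (1, 3)]
print(mean_tsp(block))
```

Unlike a ratio-based score, the difference remains defined when the single stimulus is rated 0/10.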

2.4 Statistical analysis

SPSS software was used for statistical analysis (IBM SPSS 25 for Mac, Armonk, NY, USA). First, the Shapiro-Wilk test and visual inspection of histograms and normal Q-Q plots were used to assess the normality of PPT and TSP distributions. Second, the presence of outliers was assessed by inspection of boxplots for values greater than 1.5 box-lengths from the edge of the box (representing a 99.3% confidence interval). Third, homoscedasticity, i.e. the absence of correlation between the size of the error and the magnitude of the observed scores [45,46], was assessed. This was done by visual inspection of Bland-Altman plots of the differences between the two values collected at test and retest for each participant against the means of these two values [47,48]. The coefficient of determination (R²) between the absolute differences and the mean values was also calculated for each variable by linear regression analysis [49]. Heteroscedasticity was considered present when the two following conditions were met: 1) R² greater than 0.1 [49] and 2) p<0.05 for the linear regression model. A natural logarithmic transformation was applied to the data in the presence of non-normality or heteroscedasticity, and normality/homoscedasticity were re-tested to ensure that these assumptions were met.
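The first of the two heteroscedasticity conditions (R² between absolute test-retest differences and their means) can be sketched without any statistics package. The function name is hypothetical, and the regression p-value (the second condition) is deliberately left to dedicated statistical software.

```python
def heteroscedasticity_r2(test, retest):
    """R^2 of the simple linear regression of |test - retest| on the
    mean of test and retest, computed per participant.

    Returns 0.0 when either variable has no variance (no linear relation
    can be estimated in that case).
    """
    means = [(a + b) / 2 for a, b in zip(test, retest)]
    abs_diffs = [abs(a - b) for a, b in zip(test, retest)]
    n = len(means)
    mean_x = sum(means) / n
    mean_y = sum(abs_diffs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(means, abs_diffs))
    sxx = sum((x - mean_x) ** 2 for x in means)
    syy = sum((y - mean_y) ** 2 for y in abs_diffs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


# Hypothetical PPT values (kPa) where the error grows with the score
test_values = [100, 200, 300, 400]
retest_values = [110, 220, 330, 440]
r2 = heteroscedasticity_r2(test_values, retest_values)
print(r2 > 0.1)  # first condition only; the p-value must be checked too
```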

2.4.1 Absolute reliability

Absolute reliability was assessed using the standard error of the measurement (SEMeas) [50]. SEMeas represents the within-subject variation and is defined as “the standard deviation of errors of measurement that is associated with the test scores for a specific group of test takers” [50,51]. SEMeas was estimated as the square root of the mean square error (WMS) calculated by a one-way repeated-measures ANOVA (ANOVA_RM) applied to test and retest measurements: $\mathrm{SEMeas} = \sqrt{\mathrm{WMS}}$ [45,52].
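A minimal sketch of this estimate is given below. It assumes that WMS is the within-subject mean square of a one-way ANOVA with participants as the grouping factor; the function names and the data are hypothetical.

```python
from math import sqrt


def within_mean_square(scores):
    """Within-subject mean square (WMS).

    scores: one list per participant holding that participant's k
    measurements (k >= 2, same k for everyone), e.g. [test, retest].
    """
    n = len(scores)
    k = len(scores[0])
    ss_within = 0.0
    for row in scores:
        row_mean = sum(row) / k
        ss_within += sum((x - row_mean) ** 2 for x in row)
    return ss_within / (n * (k - 1))


def sem_meas(scores):
    """Standard error of the measurement: square root of WMS."""
    return sqrt(within_mean_square(scores))


# Hypothetical [test, retest] PPT pairs (kPa) for three participants
pairs = [[10, 12], [20, 18], [30, 30]]
print(round(sem_meas(pairs), 3))
```

With two occasions (k = 2), WMS reduces to the sum of squared test-retest differences divided by 2n.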

To evaluate the responsiveness of the measures, the minimal detectable change for each individual (MDCind) was estimated for PPT and TSP with the following formula: $\mathrm{MDC_{ind}} = 1.96 \times \mathrm{SEMeas} \times \sqrt{2}$ [53]. MDC for the group (MDCgr) was also computed: $\mathrm{MDC_{gr}} = \mathrm{MDC_{ind}} / \sqrt{n}$, where “n” represents the sample size [54–56]. The values of SEMeas, MDCind and MDCgr were also presented as percentages of pooled means (average of test and retest measurements) [50,52], designated as %SEMeas, %MDCind and %MDCgr. %SEMeas and %MDC allow comparisons between studies and facilitate interpretation, since they are dimensionless (unit-less) measures [50,52]. If SEMeas and MDC were obtained from log-transformed data, an antilog (i.e. exponential, $e^{x}$) was applied, resulting in a multiplication/division (×/÷) factor to be applied to the test data. This factor indicates the random error component of the mean bias [45]. A constant was added to the distributions before the log-transformation to obtain positive, non-zero values.
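As a worked check of these formulas, the sketch below recomputes MDCind, MDCgr and %MDCgr for PPT at S1 with sequence 1–2, using the SEMeas from Table 3 and the test and retest means from Table 2. The function names are hypothetical.

```python
from math import sqrt


def mdc_individual(sem, z=1.96):
    """MDCind = z * SEMeas * sqrt(2), with z = 1.96 for a 95% level."""
    return z * sem * sqrt(2)


def mdc_group(mdc_ind, n):
    """MDCgr = MDCind / sqrt(n), where n is the sample size."""
    return mdc_ind / sqrt(n)


def percent_of_pooled_mean(value, test_mean, retest_mean):
    """Express a reliability metric as a percentage of the pooled mean."""
    pooled_mean = (test_mean + retest_mean) / 2
    return 100 * value / pooled_mean


# PPT at S1, sequence 1-2: SEMeas from Table 3, means from Table 2
sem = 54.97
mdc_ind = mdc_individual(sem)
mdc_gr = mdc_group(mdc_ind, n=24)
pct_mdc_gr = percent_of_pooled_mean(mdc_gr, 521.79, 530.32)
print(round(mdc_ind, 2), round(mdc_gr, 2), round(pct_mdc_gr, 2))
```

To rounding, these values agree with the 152.37 kPa, 31.10 kPa and 5.91% reported for that row of Table 3.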

2.4.2 Relative reliability

The relative reliability was quantified using the intraclass correlation coefficient (ICC). ICC model 2 was used since each subject was assessed by the same rater and the rater represents the population of possible raters [57]. The ICC was calculated using the Fisher’s test of a two-way ANOVA_RM. The “random effects, absolute agreement, average measures” model (2,k) was used for sequences including more than one measurement, and the “single measure” model (2,1) was used for the sequence of one measurement only (see section 2.4.3 for the description of the different sequences of measurements). The ICC makes it possible to detect systematic bias between test and retest (difference between means) [46], and the ICC 95% confidence interval was calculated to represent ICC variability. An ICC > 0.80 was considered “excellent”, 0.61 to 0.80 “good”, 0.41 to 0.60 “moderate”, 0.21 to 0.40 “acceptable” and 0 to 0.20 “poor” [58].
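For readers without SPSS, the two ICC forms can be sketched from the mean squares of a two-way ANOVA using the standard absolute-agreement formulas. This is an illustration under assumptions, not the authors' procedure: the function names are hypothetical, confidence intervals are omitted, and treating the test and retest occasions as the k columns is an assumption.

```python
def two_way_mean_squares(scores):
    """Mean squares for subjects (rows), occasions (columns) and error.

    scores: one list per participant with k measurements each.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return ms_rows, ms_cols, ms_error, n, k


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure."""
    msr, msc, mse, n, k = two_way_mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def icc_2_k(scores):
    """ICC(2,k): two-way random effects, absolute agreement, average measures."""
    msr, msc, mse, n, _ = two_way_mean_squares(scores)
    return (msr - mse) / (msr + (msc - mse) / n)


def classify_icc(icc):
    """Apply the interpretation bands given in the text to a 2-decimal ICC."""
    value = round(icc, 2)
    if value > 0.80:
        return "excellent"
    if value >= 0.61:
        return "good"
    if value >= 0.41:
        return "moderate"
    if value >= 0.21:
        return "acceptable"
    return "poor"


# Hypothetical [test, retest] values for four participants
data = [[520, 535], [410, 400], [630, 650], [300, 310]]
print(classify_icc(icc_2_1(data)), classify_icc(icc_2_k(data)))
```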

Considering that the ICC is influenced by the variability of the measurement, the coefficient of variation (CV) of the data was also calculated using the formula: $\mathrm{CV} = (\mathrm{SD} / \bar{X}) \times 100$, where SD is the standard deviation of test-retest data and $\bar{X}$ is the mean of test-retest data. CV measures the relative spread of the data and helps to interpret ICC results in light of this variability [50,52].

2.4.3 Comparison of different sequences of measurements

To determine how many measurements provided the best reliability metrics, five sequences of measurements were considered for PPT (Fig 1): the first measurement only (1), the first two measurements (1–2), the first three measurements (1-2-3), the first three or the first four when applicable (1-2-3(4)) and the last two measurements, plus the fourth when applicable (2-3-4). The sequence 2-3-4 was included because some studies suggest that the first measure of PPT tends to be higher [36,37,59], but this remains debated [34]. For TSP, three sequences were evaluated: three measurements (1-2-3), the first two (1–2) and the first only (1). For all sequences, relative and absolute within-subject short-term test-retest reliabilities were assessed.

3. Results

3.1 Pain sensitivity outcomes

3.1.1 Missing data

One participant’s TSP data were excluded from the analysis because of a technical problem with the pinprick stimulator during data collection (n = 23). For PPT, all twenty-four participants were included in the analysis.

3.1.2 Assumptions validation and data transformation

Two distributions required natural logarithmic (ln) transformation because of non-normality: 1) TSP at the back at retest for sequence 1–2 (p = 0.02) and 2) TSP at the hand at test for sequence 1 (p = 0.03). Data were normally distributed after the transformation. One or two outliers were detected for 20 and 3 out of 52 data distributions, respectively. To determine whether outliers affected the reliability analysis, we calculated the reliability with and without outliers. Because both analyses yielded similar reliability values that did not change the interpretation of the results, outliers were retained. All data distributions were homoscedastic.

3.1.3 Raw PPT and TSP data

Group means and SD for PPT and TSP at test and retest are reported in Table 2. The results of the one-way ANOVA_RM showed no statistically significant difference between test and retest occasions for PPT and TSP at all sites for all sequences (all p ≥ 0.076; Table 2).

Table 2. Means of PPT and TSP at test and retest occasions for each sequence of measurements.
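| Outcome | Site | Sequence of measurements | Test mean | Test SD | Retest mean | Retest SD | F | p |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PPT^a (kPa) | S1 | 1-2-3(4) | 519.50 | 205.61 | 533.48 | 214.72 | 0.608 | 0.443 |
| PPT^a (kPa) | S1 | 1-2-3 | 522.87 | 206.88 | 534.07 | 211.79 | 0.351 | 0.559 |
| PPT^a (kPa) | S1 | 2-3-4 | 517.20 | 205.77 | 532.58 | 212.41 | 0.548 | 0.466 |
| PPT^a (kPa) | S1 | 1–2 | 521.79 | 216.07 | 530.32 | 210.10 | 0.289 | 0.596 |
| PPT^a (kPa) | S1 | 1 | 536.77 | 222.51 | 527.45 | 209.86 | 0.248 | 0.623 |
| PPT^a (kPa) | LES | 1-2-3(4) | 547.29 | 199.65 | 551.32 | 169.86 | 0.027 | 0.871 |
| PPT^a (kPa) | LES | 1-2-3 | 551.09 | 201.84 | 557.17 | 172.31 | 0.059 | 0.810 |
| PPT^a (kPa) | LES | 2-3-4 | 549.62 | 207.49 | 559.74 | 175.09 | 0.154 | 0.698 |
| PPT^a (kPa) | LES | 1–2 | 549.52 | 196.11 | 548.79 | 174.74 | 0.001 | 0.976 |
| PPT^a (kPa) | LES | 1 | 553.79 | 199.52 | 553.18 | 181.51 | 0.001 | 0.981 |
| PPT^a (kPa) | WD | 1-2-3(4) | 371.67 | 149.28 | 355.95 | 123.96 | 1.072 | 0.311 |
| PPT^a (kPa) | WD | 1-2-3 | 370.54 | 149.03 | 358.91 | 129.38 | 0.631 | 0.435 |
| PPT^a (kPa) | WD | 2-3-4 | 365.02 | 155.98 | 357.37 | 130.20 | 0.225 | 0.640 |
| PPT^a (kPa) | WD | 1–2 | 373.63 | 142.29 | 355.61 | 127.02 | 1.285 | 0.269 |
| PPT^a (kPa) | WD | 1 | 381.89 | 141.27 | 359.97 | 134.01 | 1.821 | 0.190 |
| PPT^a (kPa) | WF | 1-2-3(4) | 368.55 | 121.92 | 354.45 | 115.34 | 0.916 | 0.348 |
| PPT^a (kPa) | WF | 1-2-3 | 369.47 | 122.57 | 353.77 | 115.16 | 1.089 | 0.307 |
| PPT^a (kPa) | WF | 2-3-4 | 361.41 | 127.11 | 349.83 | 115.80 | 0.509 | 0.483 |
| PPT^a (kPa) | WF | 1–2 | 371.86 | 126.16 | 356.19 | 115.64 | 0.969 | 0.335 |
| PPT^a (kPa) | WF | 1 | 389.94 | 135.22 | 365.73 | 127.69 | 1.518 | 0.230 |
| TSP^b (NRS) | Hand | 1-2-3 | 2.29 | 1.45 | 2.61 | 1.51 | 3.461 | 0.076 |
| TSP^b (NRS) | Hand | 1–2 | 2.21 | 1.43 | 2.47 | 1.52 | 1.731 | 0.202 |
| TSP^b (NRS) | Hand | 1 | 2.24** | 1.63** | 2.28** | 1.71** | 0.011* | 0.917* |
| TSP^b (NRS) | Back | 1-2-3 | 1.92 | 1.32 | 1.95 | 1.58 | 0.015 | 0.905 |
| TSP^b (NRS) | Back | 1–2 | 1.85** | 1.31** | 1.98** | 1.69** | 0.016* | 0.902* |
| TSP^b (NRS) | Back | 1 | 1.96 | 1.32 | 1.91 | 1.71 | 0.022 | 0.884 |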

PPT pressure pain threshold, TSP temporal summation of pain, NRS numeric rating scale, SD standard deviation, S1 spinous process of S1, LES lumbar erector spinae, WD dorsal aspect of the wrist, WF wrist flexors muscles.

* ANOVA were done on natural log-transformation of the data distributions.

** Means and SD of the non-transformed raw data are presented to facilitate interpretation.

F: Fisher’s test from a one-way ANOVA_RM.

p: p-value from the one-way ANOVA_RM.

a n = 24.

b n = 23.

3.2 Reliability analysis

3.2.1 Absolute reliability for PPT and TSP

Considering that (i) MDCgr is proportional to SEMeas and MDCind, and that (ii) %MDCgr, %SEMeas and %MDCind results present the same pattern between sequences, only MDCgr and %MDCgr are described to facilitate the presentation of the results. Each metric is detailed in Table 3. In general, differences in the measures of absolute reliability for PPT were small across the sequences and seemed to depend on the site tested (Fig 2). For S1, MDCgr varied from 31.10 kPa for sequence 1–2 to 40.71 kPa for sequence 2-3-4 (%MDCgr: 5.91% (1–2) to 7.76% (2-3-4)). For LES, MDCgr ranged from 46.13 kPa (1–2) to 50.56 kPa (1) (%MDCgr: 8.40% (1–2) to 9.14% (1)). For WD, MDCgr varied from 28.71 kPa (1-2-3) to 31.84 kPa (1) (%MDCgr: 7.87% (1-2-3) to 8.76% (2-3-4)). For WF, MDCgr ranged from 28.87 kPa (1-2-3(4)) to 38.52 kPa (1) (%MDCgr: 7.99% (1-2-3(4)) to 10.19% (1)). Sequences 1-2-3(4) and 1-2-3 presented similar results at WD and WF and seemed to present a smaller measurement error than the other sequences. At the low back (S1 and LES), sequence 1–2 presented the lowest MDCgr, although the difference from the other sequences remained minimal. Across sites, S1 presented the lowest measurement error, while the three other sites were very similar. Again, these differences were small.

Table 3. Within-session reliability parameters for PPT for each site using different sequences of measurements (n = 24).
| Site | Sequence | ICC (95% CI) | CV (%) | SEMeas (kPa) | SEMeas (%) | MDCind (kPa) | MDCind (%) | MDCgr (kPa) | MDCgr (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| S1 | 1-2-3(4) | 0.96 (0.90–0.98) | 39.52 | 62.08 | 11.79 | 172.08 | 32.68 | 35.12 | 6.67 |
| S1 | 1-2-3 | 0.95 (0.89–0.98) | 39.21 | 65.49 | 12.39 | 181.54 | 34.35 | 37.06 | 7.01 |
| S1 | 2-3-4 | 0.94 (0.86–0.97) | 39.44 | 71.95 | 13.71 | 199.45 | 38.00 | 40.71 | 7.76 |
| S1 | 1–2 | 0.97 (0.92–0.99) | 40.09 | 54.97 | 10.45 | 152.37 | 28.96 | 31.10 | 5.91 |
| S1 | 1 | 0.91 (0.81–0.96) | 40.22 | 64.83 | 12.18 | 179.90 | 33.77 | 36.68 | 6.89 |
| LES | 1-2-3(4) | 0.89 (0.74–0.95) | 33.39 | 85.21 | 15.51 | 236.20 | 43.00 | 48.21 | 8.78 |
| LES | 1-2-3 | 0.89 (0.73–0.95) | 33.51 | 86.56 | 15.62 | 239.93 | 43.30 | 48.98 | 8.84 |
| LES | 2-3-4 | 0.88 (0.73–0.95) | 34.25 | 89.36 | 16.11 | 247.70 | 44.66 | 50.56 | 9.12 |
| LES | 1–2 | 0.90 (0.76–0.96) | 33.46 | 81.53 | 14.85 | 225.98 | 41.15 | 46.13 | 8.40 |
| LES | 1 | 0.79 (0.57–0.90) | 34.09 | 89.37 | 16.15 | 247.71 | 44.76 | 50.56 | 9.14 |
| WD | 1-2-3(4) | 0.92 (0.82–0.97) | 37.38 | 52.62 | 14.46 | 145.85 | 40.09 | 29.77 | 8.18 |
| WD | 1-2-3 | 0.93 (0.84–0.97) | 37.89 | 50.74 | 13.91 | 140.66 | 38.57 | 28.71 | 7.87 |
| WD | 2-3-4 | 0.92 (0.82–0.97) | 39.37 | 55.92 | 15.48 | 154.99 | 42.91 | 31.64 | 8.76 |
| WD | 1–2 | 0.91 (0.79–0.96) | 36.68 | 55.07 | 15.10 | 152.66 | 41.87 | 31.16 | 8.55 |
| WD | 1 | 0.83 (0.65–0.92) | 36.84 | 56.28 | 15.17 | 155.99 | 42.05 | 31.84 | 8.58 |
| WF | 1-2-3(4) | 0.90 (0.77–0.96) | 32.54 | 51.02 | 14.11 | 141.42 | 39.12 | 28.87 | 7.99 |
| WF | 1-2-3 | 0.89 (0.76–0.95) | 32.61 | 52.12 | 14.41 | 144.47 | 39.95 | 29.49 | 8.15 |
| WF | 2-3-4 | 0.88 (0.73–0.95) | 33.86 | 56.22 | 15.81 | 155.84 | 43.82 | 31.81 | 8.95 |
| WF | 1–2 | 0.88 (0.74–0.95) | 32.96 | 55.14 | 15.15 | 152.84 | 41.99 | 31.20 | 8.57 |
| WF | 1 | 0.73 (0.47–0.87) | 34.59 | 68.08 | 18.02 | 188.71 | 49.94 | 38.52 | 10.19 |

ICC intraclass correlation coefficient, with a 95% confidence interval, CV coefficient of variation, SEMeas standard error of the measurement, MDCind individual minimal detectable change, MDCgr group minimal detectable change, S1 spinous process of S1, LES lumbar erector spinae, WD dorsal aspect of the wrist, WF wrist flexors muscles.

Fig 2. %MDCgr for PPT at all sites for the five sequences of measurements tested.

For TSP, absolute reliability is presented in Table 4. MDCgr at the hand varied from 0.33/10 for sequence 1-2-3 to 0.40/10 for sequence 1–2 (%MDCgr: 13.58% (1-2-3) to 17.29% (1–2)). At the back, MDCgr ranged from 0.46/10 for sequence 1-2-3 to 0.57/10 for sequence 1 (%MDCgr: 23.52% (1-2-3) to 29.22% (1)). Increasing the number of TSP measurements improved absolute reliability. %MDCgr was lower at the hand than at the back. Although the non-transformed data are also presented in Table 4 for the two log-transformed distributions, comparisons with other sequences must be made with caution. For log-transformed data, calculation of %SEMeas and %MDC was not applicable because transformed results do not correspond to the same ratio scale [54,55]. Antilog factors (×/÷) derived from SEMeas and MDCind were also calculated and are presented in Table 4.

Table 4. Within-session reliability parameters for TSP for each site using different sequences of measurements (n = 23).
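| Site | Sequence of measurements | ICC (95% CI) | CV (%) | SEMeas (NRS) | SEMeas (%) | MDCind (NRS) | MDCind (%) | MDCgr (NRS) | MDCgr (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hand | 1-2-3 | 0.91 (0.79–0.96) | 60.29 | 0.58 | 23.50 | 1.59 | 65.16 | 0.33 | 13.58 |
| Hand | 1–2 | 0.88 (0.71–0.95) | 62.74 | 0.70 | 29.91 | 1.94 | 82.91 | 0.40 | 17.29 |
| Hand | 1* (non-transformed) | 0.73 (0.46–0.88) | 73.09 | 0.86 | 38.03 | 2.38 | 105.39 | 0.50 | 21.98 |
| Hand | 1 (ln) | 0.78 (0.54–0.90) | NA | 0.15 | NA | 0.41 | NA | 0.09 | NA |
| Hand | 1 (antilog) | | | ×/÷1.16 | | ×/÷1.51 | | | |
| Back | 1-2-3 | 0.83 (0.59–0.93) | 74.47 | 0.79 | 40.70 | 2.18 | 112.81 | 0.46 | 23.52 |
| Back | 1–2* (non-transformed) | 0.81 (0.55–0.92) | 78.32 | 0.85 | 44.60 | 2.37 | 123.63 | 0.49 | 25.78 |
| Back | 1–2 (ln) | 0.81 (0.55–0.92) | NA | 0.17 | NA | 0.47 | NA | 0.10 | NA |
| Back | 1–2 (antilog) | | | ×/÷1.18 | | ×/÷1.60 | | | |
| Back | 1 | 0.58 (0.23–0.80) | 78.12 | 0.98 | 50.56 | 2.71 | 140.15 | 0.57 | 29.22 |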

ICC intraclass correlation coefficient, with a 95% confidence interval, CV coefficient of variation, SEMeas standard error of the measurement, MDCind individual minimal detectable change, MDCgr group minimal detectable change, NA: not applicable.

* Reliability outcomes are presented for non-transformed data distributions in order to facilitate interpretation but considerations of these data must be done cautiously because the assumption of the normality of the distribution was violated.

3.2.2 Relative reliability for PPT and TSP

For PPT, almost all ICC values were between 0.80 and 0.97, denoting that almost all sequences presented excellent relative reliability. The exception was sequence 1 at LES and WF, for which good reliability was observed, with ICCs of 0.79 and 0.73 respectively (Table 3). Sequence 1 presented lower ICCs at all sites, associated with wide confidence intervals at LES, WD and WF. As illustrated in Fig 3, ICCs of sequence 1–2 were larger than those of the other sequences at S1 and LES. ICCs for WD and WF were similar among sequences, except for sequence 1, which presented lower ICCs.

Fig 3. ICC for PPT for all sites by each sequence of measurements.

For TSP, results are presented in Table 4. Sequences 1–2 and 1-2-3 showed excellent reliability, with ICCs ranging between 0.81 and 0.91 and narrow confidence intervals. Visually, wider ICC confidence intervals were present at the back compared to the hand. As illustrated in Fig 4, there was a large difference between sequences 1 and 1-2-3 at the two sites, with sequence 1-2-3 presenting larger ICCs than sequence 1.

Fig 4. ICC for TSP at the hand and at the back by each sequence of measurements.

4. Discussion

The objectives of this study were to determine the reliability of PPT and TSP and to determine the sequence of measurements providing the best reliability metrics. The design of the study was established to measure the minimal change exceeding the variability of PPT and TSP technique within a session (20 minutes apart). Our results could be used in future pain research to confirm the presence of ‘true’ hypo- or hyperalgesia (i.e. change over normal variability) for paradigms such as CPM or EIH. In addition, we established that two measures of PPT and three measures of TSP reduced the measurement error and demonstrated an excellent relative reliability.

4.1 PPT reliability

Four studies investigated the intra-rater reliability of PPT at the low back and at the wrist in healthy participants [34–37], but only two of them [34,37] evaluated the within-session test-retest reliability of PPT at the low back. The two other studies based their reliability analysis on the comparison of two or three consecutive measurements from a single block of measurements. Therefore, their results are not suitable in a context of pain modulation evaluation such as CPM/EIH and cannot be directly compared to our results. Balaguier et al. [34] evaluated absolute reliability at the low back over two sessions separated by one hour and observed an MDCind ranging from 94 to 253 kPa. These results are similar to our findings for the low back, which varied from 152.37 to 247.71 kPa. In addition, our ICCs for PPT are consistent with the literature. One study reported excellent relative reliability (ICCs: 0.86 to 0.99) at 14 anatomical locations at the low back and another study reported good to excellent reliability (ICC: 0.40 to 0.99) for three sites over lumbar erector spinae (L1, L3, L5) [37]. Previous results must be interpreted with caution because of the limited sample sizes (n = 5 [37] and n = 15 [34]).

4.2 TSP reliability

Only three studies investigated mechanical TSP reliability [60–62], but none assessed the low back area. In two studies [60,62], 5 consecutive series of ten punctate stimuli were used instead of two blocks of 3 series tested 20 min apart. One study [60] observed poor reliability for mechanical TSP at the face, hand and foot, and another study reported poor to good intra-rater reliability at the tongue, face and gingiva [62]. The third study investigated the test-retest reliability of mechanical TSP at the hand over a two-week period and observed acceptable reliability in younger adults and moderate to good reliability in older adults [61]. These findings are inconsistent with the excellent reliability observed for mechanical TSP at back and hand sites in our study. This discrepancy may be caused by differences in the sites tested, study designs and calculation of TSP. In two of the above-mentioned studies [60,62], TSP reliability was analyzed using the wind-up ratio (WUR = the mean pain rating of the trains divided by the mean pain rating of the single stimulus). WUR cannot be calculated if the single pinprick stimulus is rated as non-painful (NRS = 0/10, meaning a null denominator), leading to an undefined division and limiting the number of participants included in the analysis (e.g. up to ~27% of participants excluded from the analysis [60]). We considered that the subtraction method used for TSP also reflects the facilitation of nociceptive inputs, with the advantage of including more participants in the analysis.

Antilog factors were calculated for two TSP sequences because a log-transformation was applied in the presence of normality violation. For example, an antilog factor for the MDCind of ×/÷1.60 was obtained for sequence 1–2 at the back. This factor is applied to the test score to obtain the lower and upper limits that the retest score must cross to be considered a real change. For example, if a TSP of 2/10 is measured at test, a TSP ≥ 3.20/10 (2×1.60) or a TSP ≤ 1.25/10 (2÷1.60) at retest for the same participant is considered a true change [45,47,54]. Some methodological limitations arise from the application of the antilog factor. For example, the antilog factor cannot be used if TSP at test was rated 0/10. Also, considering that TSP represents ordinal data, interpretation appears to be less precise. Therefore, antilog factor calculation constitutes an alternative way to analyze reliability in the presence of normality violation, but its concrete and clinical applicability remains challenging with TSP data.
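The use of the antilog factor can be sketched as follows, with hypothetical function names. The sketch ignores the constant that was added to the data before the log-transformation, so it illustrates the arithmetic of the example rather than the full back-transformation.

```python
from math import exp


def antilog_factor(log_scale_value):
    """Back-transform a metric obtained on ln-transformed data (e.g. MDCind)
    into a multiplication/division factor."""
    return exp(log_scale_value)


def real_change_limits(test_score, factor):
    """Lower and upper retest limits beyond which a change is considered real.

    The factor cannot be applied to a test score of 0 (see text), so
    non-positive scores are rejected.
    """
    if test_score <= 0:
        raise ValueError("the antilog factor cannot be applied to a score of 0")
    return test_score / factor, test_score * factor


# MDCind on the ln scale for TSP at the back, sequence 1-2 (Table 4)
factor = antilog_factor(0.47)
lower, upper = real_change_limits(2, round(factor, 2))
print(round(factor, 2), round(lower, 2), round(upper, 2))
```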

4.3 Comparison of sequences of measurements

For PPT, there was substantial heterogeneity in the sequence of measurements producing the least measurement error, which differed depending on the site tested. Absolute reliability was superior with sequence 1–2 for S1 and LES, with sequence 1-2-3 for WD and with sequence 1-2-3-(4) for WF. However, these differences were minimal across sequences. In addition, all the sequences tested for PPT demonstrated excellent relative reliability at all sites, except sequence 1, which showed good relative reliability at LES and WF. Our results are generally in accordance with the current literature suggesting that the mean of two or three consecutive measurements is enough to provide reliable PPT [34,36]. However, one study proposed that using only the first measure constitutes an excellent method [34], in opposition to our findings. Some studies observed that the first PPT measurement tends to be significantly higher than the subsequent ones and recommended excluding it from the analysis to reduce the variability [36,37,59]. Our results suggest that excluding the first measurement (2-3-4) did not reduce the measurement error. However, when the first measure was used alone, reliability usually worsened, meaning that adding only one more measurement seemed sufficient to stabilize the measure. The disparity regarding the first measure may be explained by the fact that some studies did not include a familiarization trial prior to PPT evaluation [36,37] or evaluated PPT at sites different from those evaluated in the present study (i.e. biceps brachii) [59]. Thus, we recommend measuring PPT at least twice in each block of testing to improve reliability.

For TSP, our results suggest that an increase in the number of measurements leads to increased absolute reliability. Two and three measurements showed excellent relative reliability at the hand and the back. In contrast, a single measurement demonstrated good relative reliability at the hand and moderate relative reliability at the low back. These results may have been influenced by the sample heterogeneity, as represented by high coefficients of variation for TSP. Although some research groups recommend measuring five series of ten punctate stimuli [32,60,62], our results suggest that a sequence of three series separated by 30 seconds provides excellent reliability. Taking three measures instead of five may reduce the duration of the test and minimize skin irritation due to repeated pinprick stimulations. Therefore, we suggest taking at least three TSP measurements to improve TSP reliability.

4.4 Applicability of absolute reliability in research

The MDC obtained in our study provides specific thresholds to determine whether a change in PPT or TSP in a within-session design exceeds the variability of these measures for a group (MDCgr) or an individual (MDCind). In research, the MDC can serve as a cut-off value to determine whether the change in pain sensitivity (as measured by PPT or TSP) exceeds measurement error and can be considered true hypo- or hyperalgesia following the conditioning stimulus in pain modulation paradigms (e.g. CPM and EIH). For instance, a previous study investigating EIH reported a significant within-session PPT change of 29.78 kPa at the low back in healthy controls following a repetitive lifting task [11]. We determined that the lowest MDCgr at the low back was 46.13 kPa. Thus, a change smaller than this value cannot be considered a real change despite its statistical significance.
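The comparison described above can be sketched in Python. The two MDC helper functions use the conventional definitions (MDCind = SEM × 1.96 × √2 and MDCgr = MDCind ÷ √n); these are assumed here for illustration, since this section does not restate the formulas, and all function names are hypothetical.

```python
import math


def mdc_individual(sem: float, z: float = 1.96) -> float:
    """Conventional individual-level MDC (assumed): SEM x z x sqrt(2)."""
    return sem * z * math.sqrt(2)


def mdc_group(mdc_ind: float, n: int) -> float:
    """Conventional group-level MDC (assumed): MDCind / sqrt(n)."""
    return mdc_ind / math.sqrt(n)


def exceeds_mdc(observed_change: float, mdc: float) -> bool:
    """True when the absolute change is larger than the MDC."""
    return abs(observed_change) > mdc


# Values quoted in the text: reported EIH change vs. lowest MDCgr at the low back.
print(exceeds_mdc(29.78, 46.13))  # False
```

With the values quoted in the text, the reported change of 29.78 kPa stays below the 46.13 kPa threshold, so it would not be classed as a real change under this rule.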

Also, the response to a conditioning stimulus (e.g. CPM) has been used to stratify healthy and chronic pain participants according to the direction of the change in pain sensitivity (decrease vs. increase) [9,63,64], suggesting a bias toward inhibitory or facilitatory descending control (e.g. anti- vs. pro-nociceptive). However, this stratification is usually done without considering the measurement error. This may result in stratifying participants for whom changes remain within the measurement error (i.e. pain sensitivity did not change following the conditioning stimulus). Our MDCind (or %MDCind) could be used as cut-off values to subgroup participants in future studies using a similar design (CPM/EIH). For example, an individual change of ~35% is necessary to exceed the %MDCind at S1. Considering that this change is large, it calls into question the validity of stratification methods using PPT.
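A stratification rule that takes measurement error into account could look like the following Python sketch. It is a hypothetical illustration only: the function name, the labels and the example PPT values are assumptions, and the 35% cut-off is the approximate %MDCind at S1 quoted above.

```python
def classify_ppt_response(pre_kpa: float, post_kpa: float,
                          pct_mdc_ind: float) -> str:
    """Classify an individual PPT change relative to the %MDCind cut-off.

    A higher PPT after conditioning means lower pain sensitivity (hypoalgesia).
    """
    if pre_kpa <= 0:
        raise ValueError("baseline PPT must be positive")
    pct_change = (post_kpa - pre_kpa) / pre_kpa * 100.0
    if pct_change > pct_mdc_ind:
        return "hypoalgesia"
    if pct_change < -pct_mdc_ind:
        return "hyperalgesia"
    # Change stays within measurement error: no subgroup should be assigned.
    return "no detectable change"


# Hypothetical participant: a 20% PPT increase does not exceed a 35% cut-off.
print(classify_ppt_response(300.0, 360.0, 35.0))  # no detectable change
```

Under such a rule, participants whose change stays within the cut-off would form a third group rather than being labelled as inhibitory or facilitatory responders.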

4.5 Methodological considerations

Our study did not conduct the reliability analysis in different groups for each sequence, and this could have led to an underestimation of PPT and TSP measurement variability. Also, considering that this study was conducted in a healthy pain-free population, the current results are not generalizable to other populations. It is acknowledged that factors such as race/ethnicity, sex (e.g. phases of the menstrual cycle [65]) and age could influence pain sensitivity [66,67], but considering that we measured the reliability of pain sensitivity in a single session rather than pain sensitivity per se, the effect of these factors on our results is likely to remain limited. Future studies must be conducted in participants with chronic pain, such as chronic low back pain.

4.6 Summary

This study shows that PPT and TSP at back and hand sites have a small error of measurement and excellent relative reliability using a within-session test-retest design. Our results also suggest that at least two PPT and three TSP consecutive measures are needed to optimize reliability, and these recommendations may be used in future research and in clinical practice. Our results also provide cut-off values that may be used with pain modulation paradigms such as CPM and EIH to confirm that changes following a conditioning stimulus exceed PPT and TSP measurement error (true hypo-/hyperalgesia). Further studies are warranted to investigate the within-session test-retest reliability of these parameters in chronic pain populations.

Data Availability

All files are available from the Scholar Portal Dataverse database (https://doi.org/10.5683/SP2/URLJLN, Scholars Portal Dataverse, V1, UNF:6:g+Lz6E4IvbaXIy5wuxkzPQ==).

Funding Statement

This study is financed by a grant from the Quebec Pain Research Network (QPRN; https://qprn.ca/fr), the Réseau Provincial de Recherche en Adaptation-Réadaptation (REPAR; https://repar.ca) and the Canadian Musculoskeletal Rehab Research Network (http://mskrehabnet.com). HMA is supported by salary awards from Fonds de recherche du Québec - Santé (281961 - FRQS; http://www.frqs.gouv.qc.ca). CM received grants from the Canadian Institute of Health Research (CIHR; https://cihr-irsc.gc.ca/e/193.html) and the Ordre professionnel de la physiothérapie du Québec (OPPQ; https://oppq.qc.ca). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Moseley GL, Butler DS. Explain pain supercharged: The clinician’s manual. First ed: NOIgroup Publications; 2017. 225 p.
  • 2. Ossipov MH, Dussor GO, Porreca F. Central modulation of pain. J Clin Invest. 2010;120(11):3779–87. doi: 10.1172/JCI43766
  • 3. Starkweather AR, Heineman A, Storey S, Rubia G, Lyon DE, Greenspan J, et al. Methods to measure peripheral and central sensitization using quantitative sensory testing: A focus on individuals with low back pain. Appl Nurs Res. 2016;29:237–41. doi: 10.1016/j.apnr.2015.03.013
  • 4. Arendt-Nielsen L, Yarnitsky D. Experimental and clinical applications of quantitative sensory testing applied to skin, muscles and viscera. J Pain. 2009;10(6):556–72. doi: 10.1016/j.jpain.2009.02.002
  • 5. Graven-Nielsen T, Arendt-Nielsen L. Assessment of mechanisms in localized and widespread musculoskeletal pain. Nat Rev Rheumatol. 2010;6(10):599–606. doi: 10.1038/nrrheum.2010.107
  • 6. Neelapala YVR, Bhagat M, Frey-Law L. Conditioned Pain Modulation in Chronic Low Back Pain: A Systematic Review of Literature. Clin J Pain. 2020;36(2):135–41. doi: 10.1097/AJP.0000000000000778
  • 7. Correa JB, Costa LO, de Oliveira NT, Sluka KA, Liebano RE. Central sensitization and changes in conditioned pain modulation in people with chronic nonspecific low back pain: a case-control study. Exp Brain Res. 2015;233(8):2391–9. doi: 10.1007/s00221-015-4309-6
  • 8. Aoyagi K, He J, Nicol AL, Clauw DJ, Kluding PM, Jernigan S, et al. A Subgroup of Chronic Low Back Pain Patients with Central Sensitization. Clin J Pain. 2019. doi: 10.1097/AJP.0000000000000755
  • 9. Rabey M, Poon C, Wray J, Thamajaree C, East R, Slater H. Pro-nociceptive and anti-nociceptive effects of a conditioned pain modulation protocol in participants with chronic low back pain and healthy control subjects. Man Ther. 2015;20(6):763–8. doi: 10.1016/j.math.2015.02.011
  • 10. Falla D, Gizzi L, Tschapek M, Erlenwein J, Petzke F. Reduced task-induced variations in the distribution of activity across back muscle regions in individuals with low back pain. Pain. 2014;155(5):944–53. doi: 10.1016/j.pain.2014.01.027
  • 11. Kuithan P, Heneghan NR, Rushton A, Sanderson A, Falla D. Lack of Exercise-Induced Hypoalgesia to Repetitive Back Movement in People with Chronic Low Back Pain. Pain Pract. 2019. doi: 10.1111/papr.12804
  • 12. Yarnitsky D, Bouhassira D, Drewes AM, Fillingim RB, Granot M, Hansson P, et al. Recommendations on practice of conditioned pain modulation (CPM) testing. European Journal of Pain. 2015;19(6):805–6. doi: 10.1002/ejp.605
  • 13. Naugle KM, Fillingim RB, Riley JL, 3rd. A meta-analytic review of the hypoalgesic effects of exercise. J Pain. 2012;13(12):1139–50. doi: 10.1016/j.jpain.2012.09.006
  • 14. Sprenger C, Bingel U, Buchel C. Treating pain with pain: supraspinal mechanisms of endogenous analgesia elicited by heterotopic noxious conditioning stimulation. Pain. 2011;152(2):428–39. doi: 10.1016/j.pain.2010.11.018
  • 15. Yarnitsky D, Granot M, Granovsky Y. Pain modulation profile and pain therapy: between pro- and antinociception. Pain. 2014;155(4):663–5. doi: 10.1016/j.pain.2013.11.005
  • 16. Pud D, Granovsky Y, Yarnitsky D. The methodology of experimentally induced diffuse noxious inhibitory control (DNIC)-like effect in humans. Pain. 2009;144(1–2):16–9. doi: 10.1016/j.pain.2009.02.015
  • 17. Gebhart GF. Descending modulation of pain. Neurosci Biobehav Rev. 2004;27(8):729–37. doi: 10.1016/j.neubiorev.2003.11.008
  • 18. Da Silva Santos R, Galdino G. Endogenous systems involved in exercise-induced analgesia. J Physiol Pharmacol. 2018;69(1):3–13. doi: 10.26402/jpp.2018.1.01
  • 19. Rice D, Nijs J, Kosek E, Wideman T, Hasenbring MI, Koltyn K, et al. Exercise-Induced Hypoalgesia in Pain-Free and Chronic Pain Populations: State of the Art and Future Directions. J Pain. 2019. doi: 10.1016/j.jpain.2019.03.005
  • 20. Sluka KA, Frey-Law L, Hoeger Bement M. Exercise-induced pain and analgesia? Underlying mechanisms and clinical translation. Pain. 2018;159 Suppl 1:S91–S7. doi: 10.1097/j.pain.0000000000001235
  • 21. Paungmali A, Joseph LH, Punturee K, Sitilertpisan P, Pirunsan U, Uthaikhup S. Immediate Effects of Core Stabilization Exercise on beta-Endorphin and Cortisol Levels Among Patients With Chronic Nonspecific Low Back Pain: A Randomized Crossover Design. J Manipulative Physiol Ther. 2018;41(3):181–8. doi: 10.1016/j.jmpt.2018.01.002
  • 22. Crombie KM, Brellenthin AG, Hillard CJ, Koltyn KF. Endocannabinoid and Opioid System Interactions in Exercise-Induced Hypoalgesia. Pain Med. 2018;19(1):118–23. doi: 10.1093/pm/pnx058
  • 23. Tour J, Lofgren M, Mannerkorpi K, Gerdle B, Larsson A, Palstam A, et al. Gene-to-gene interactions regulate endogenous pain modulation in fibromyalgia patients and healthy controls-antagonistic effects between opioid and serotonin-related genes. Pain. 2017;158(7):1194–203. doi: 10.1097/j.pain.0000000000000896
  • 24. Koltyn KF, Brellenthin AG, Cook DB, Sehgal N, Hillard C. Mechanisms of exercise-induced hypoalgesia. J Pain. 2014;15(12):1294–304. doi: 10.1016/j.jpain.2014.09.006
  • 25. Yarnitsky D, Arendt-Nielsen L, Bouhassira D, Edwards RR, Fillingim RB, Granot M, et al. Recommendations on terminology and practice of psychophysical DNIC testing. Eur J Pain. 2010;14(4):339. doi: 10.1016/j.ejpain.2010.02.004
  • 26. Kennedy DL, Kemp HI, Ridout D, Yarnitsky D, Rice AS. Reliability of conditioned pain modulation: a systematic review. Pain. 2016;157(11):2410–9. doi: 10.1097/j.pain.0000000000000689
  • 27. Coronado RA, Bialosky JE, Robinsen ME, George SZ. Pain sensitivity subgroups in individuals with spine pain: potential relevance to short-term clinical outcome. Physical Therapy. 2014;94(8):1111–22. doi: 10.2522/ptj.20130372
  • 28. Nahman-Averbuch H, Yarnitsky D, Granovsky Y, Gerber E, Dagul P, Granot M. The role of stimulation parameters on the conditioned pain modulation response. Scand J Pain. 2013;4(1):10–4. doi: 10.1016/j.sjpain.2012.08.001
  • 29. Marchand S, Arsenault P. Spatial summation for pain perception: interaction of inhibitory and excitatory mechanisms. Pain. 2002;95:201–6. doi: 10.1016/s0304-3959(01)00399-2
  • 30. Tousignant-Laflamme Y, Page S, Goffaux P, Marchand S. An experimental model to measure excitatory and inhibitory pain mechanisms in humans. Brain Res. 2008;1230:73–9. doi: 10.1016/j.brainres.2008.06.120
  • 31. Price DD, Hu JW, Dubner R, Gracely RH. Peripheral suppression of first pain and central summation of second pain evoked by noxious heat pulses. Pain. 1977;3:57–68. doi: 10.1016/0304-3959(77)90035-5
  • 32. Rolke R, Baron R, Maier C, Tolle TR, Treede RD, Beyer A, et al. Quantitative sensory testing in the German Research Network on Neuropathic Pain (DFNS): standardized protocol and reference values. Pain. 2006;123(3):231–43. doi: 10.1016/j.pain.2006.01.041
  • 33. Herrero JF, Laird JMA, Lopez Garcia JA. Wind-up of spinal cord neurones and pain sensation: much ado about something? Progress in neurobiology. 2000;61:169–203. doi: 10.1016/s0301-0082(99)00051-9
  • 34. Balaguier R, Madeleine P, Vuillerme N. Is One Trial Sufficient to Obtain Excellent Pressure Pain Threshold Reliability in the Low Back of Asymptomatic Individuals? A Test-Retest Study. PLoS One. 2016;11(8):e0160866. doi: 10.1371/journal.pone.0160866
  • 35. Waller R, Straker L, O'Sullivan P, Sterling M, Smith A. Reliability of pressure pain threshold testing in healthy pain free young adults. Scand J Pain. 2015;9(1):38–41. doi: 10.1016/j.sjpain.2015.05.004
  • 36. Lacourt TE, Houtveen JH, van Doornen LJP. Experimental pressure-pain assessments: Test-retest reliability, convergence and dimensionality. Scand J Pain. 2012;3(1):31–7. doi: 10.1016/j.sjpain.2011.10.003
  • 37. Farasyn A, Meeusen R. Pressure pain thresholds in healthy subjects: influence of physical activity, history of lower back pain factors and the use of endermology as a placebo-like treatment. Journal of Bodywork and Movement Therapies. 2003;7(1):53–61.
  • 38. Gierthmuhlen J, Enax-Krumova EK, Attal N, Bouhassira D, Cruccu G, Finnerup NB, et al. Who is healthy? Aspects to consider when including healthy volunteers in QST-based studies-a consensus statement by the EUROPAIN and NEUROPAIN consortia. Pain. 2015;156(11):2203–11. doi: 10.1097/j.pain.0000000000000227
  • 39. Riviere F, Widad FZ, Speyer E, Erpelding ML, Escalon H, Vuillemin A. Reliability and validity of the French version of the global physical activity questionnaire. J Sport Health Sci. 2018;7(3):339–45. doi: 10.1016/j.jshs.2016.08.004
  • 40. Vaegter HB, Handberg G, Emmeluth C, Graven-Nielsen T. Preoperative Hypoalgesia After Cold Pressor Test and Aerobic Exercise is Associated With Pain Relief 6 Months After Total Knee Replacement. Clin J Pain. 2017;33(6):475–84. doi: 10.1097/AJP.0000000000000428
  • 41. Vaegter HB, Dorge DB, Schmidt KS, Jensen AH, Graven-Nielsen T. Test-Retest Reliabilty of Exercise-Induced Hypoalgesia After Aerobic Exercise. Pain Med. 2018;19(11):2212–22. doi: 10.1093/pm/pny009
  • 42. Vaegter HB, Lyng KD, Yttereng FW, Christensen MH, Sorensen MB, Graven-Nielsen T. Exercise-Induced Hypoalgesia After Isometric Wall Squat Exercise: A Test-Retest Reliabilty Study. Pain Med. 2019;20(1):129–37. doi: 10.1093/pm/pny087
  • 43. Gehling J, Mainka T, Vollert J, Pogatzki-Zahn EM, Maier C, Enax-Krumova EK. Short-term test-retest-reliability of conditioned pain modulation using the cold-heat-pain method in healthy subjects and its correlation to parameters of standardized quantitative sensory testing. BMC Neurol. 2016;16:125. doi: 10.1186/s12883-016-0650-z
  • 44. Anderson RJ, Craggs JG, Bialosky JE, Bishop MD, George SZ, Staud R, et al. Temporal summation of second pain: variability in responses to a fixed protocol. Eur J Pain. 2013;17(1):67–74. doi: 10.1002/j.1532-2149.2012.00190.x
  • 45. Atkinson G, Nevill AM. Statistical methods for assessing measurement error (Reliability) in variables relevant to sports medicine. Sports Med. 1998;26(4):217–38. doi: 10.2165/00007256-199826040-00002
  • 46. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research. 2005;19(1):231–40. doi: 10.1519/15184.1
  • 47. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet. 1986:307–10.
  • 48. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. International Journal of Nursing Studies. 2010;47(8):931–6.
  • 49. Damron LA, Dearth DJ, Hoffman RL, Clark BC. Quantification of the corticospinal silent period evoked via transcranial magnetic stimulation. J Neurosci Methods. 2008;173(1):121–8. doi: 10.1016/j.jneumeth.2008.06.001
  • 50. Hopkins WG. Measures of reliability in sports medicine and science. Sports Med. 2000;1(30):1–15. doi: 10.2165/00007256-200030010-00001
  • 51. Harvill LM. Standard error of measurement. Instructional Topics in Educational Measurement. 1991:33–41.
  • 52. Lexell JE, Downham DY. How to assess the reliability of measurements in rehabilitation. Am J Phys Med Rehabil. 2005;84(9):719–23. doi: 10.1097/01.phm.0000176452.17771.20
  • 53. Beckerman H, Roebroeck ME, Lankhorst GJ, B J.G., Bezemer PD, Verbreek ALM. Smallest real difference, a link between reproducibility and responsiveness. Quality of Life Research. 2001;10:571–8. doi: 10.1023/a:1013138911638
  • 54. Beaulieu LD, Masse-Alarie H, Ribot-Ciscar E, Schneider C. Reliability of lower limb transcranial magnetic stimulation outcomes in the ipsi- and contralesional hemispheres of adults with chronic stroke. Clin Neurophysiol. 2017;128(7):1290–8. doi: 10.1016/j.clinph.2017.04.021
  • 55. Schambra HM, Ogden RT, Martinez-Hernandez IE, Lin X, Chang YB, Rahman A, et al. The reliability of repeated TMS measures in older adults and in patients with subacute and chronic stroke. Front Cell Neurosci. 2015;9:335. doi: 10.3389/fncel.2015.00335
  • 56. Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60(1):34–42. doi: 10.1016/j.jclinepi.2006.03.012
  • 57. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychological bulletin. 1979;86(2):420–8. doi: 10.1037//0033-2909.86.2.420
  • 58. Portney LG, Watkins MP. Foundations of clinical research applications to practice. 3rd ed. Upper Saddle River (New Jersey): Pearson Education Inc.; 2009. 912 p.
  • 59. Nussbaum EL, Downes L. Reliability of clinical pressure-pain algometric measurements obtained on consecutives days. Physical Therapy. 1998;78(2).
  • 60. Geber C, Klein T, Azad S, Birklein F, Gierthmuhlen J, Huge V, et al. Test-retest and interobserver reliability of quantitative sensory testing according to the protocol of the German Research Network on Neuropathic Pain (DFNS): a multi-centre study. Pain. 2011;152(3):548–56. doi: 10.1016/j.pain.2010.11.013
  • 61. Ohlman T, Miller L, Naugle K. (169) Comparison of Temporal Stability of Conditioned Pain Modulation and Temporal Summation of Pain in Healthy Older and Younger Adults. The Journal of Pain. 2019;20(4).
  • 62. Pigg M, Baad-Hansen L, Svensson P, Drangsholt M, List T. Reliability of intraoral quantitative sensory testing (QST). Pain. 2010;148(2):220–6. doi: 10.1016/j.pain.2009.10.024
  • 63. Fingleton C, Smart KM, Doody CM. Exercise-induced Hypoalgesia in People With Knee Osteoarthritis With Normal and Abnormal Conditioned Pain Modulation. Clin J Pain. 2017;33(5):395–404. doi: 10.1097/AJP.0000000000000418
  • 64. O'Neill S, Manniche C, Graven-Nielsen T, Arendt-Nielsen L. Association between a composite score of pain sensitivity and clinical parameters in low-back pain. Clin J Pain. 2014;30(10):831–8. doi: 10.1097/AJP.0000000000000042
  • 65. Pogatzki-Zahn EM, Drescher C, Englbrecht JS, Klein T, Magerl W, Zahn PK. Progesterone relates to enhanced incisional acute pain and pinprick hyperalgesia in the luteal phase of female volunteers. Pain. 2019;160(8):1781–93. doi: 10.1097/j.pain.0000000000001561
  • 66. Ostrom C, Bair E, Maixner W, Dubner R, Fillingim RB, Ohrbach R, et al. Demographic Predictors of Pain Sensitivity: Results From the OPPERA Study. J Pain. 2017;18(3):295–307. doi: 10.1016/j.jpain.2016.10.018
  • 67. Riley JL 3rd, Cruz-Almeida Y, Glover TL, King CD, Goodin BR, Sibille KT, et al. Age and race effects on pain sensitivity and modulation among middle-aged and older adults. J Pain. 2014;15(3):272–82. doi: 10.1016/j.jpain.2013.10.015

Decision Letter 0

Alison Rushton

3 Dec 2020

PONE-D-20-27453

Within-session test-retest reliability of pressure pain threshold and mechanical temporal summation in healthy subjects

PLOS ONE

Dear Dr. Massé-Alarie,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jan 17 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Alison Rushton

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Additional Editor Comments:

Please address the reviewers' comments detailed below.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 

Comments:

- The authors propose a very interesting study to determine how many measurements provide the best reliability metrics. Thus, the authors presented a well-designed study design, in which they evaluated the minimum change that exceeds the variability of the pressure pain threshold (PPT) technique and the mechanical time sum of pain (TSP) with 3 measures from two tested blocks with a 20-minute break and retest.

- The authors used convenience sampling. Even if it's only for the studied group, the main consequence is to generalize the study. Although it can offer us valuable information in countless circumstances, especially when there are no fundamental reasons that differentiate accessible individuals. The type of sampling for convenience in this study I believe will not introduce bias in relation to the total population, the results obtained can be used for the studied universe (pain in specific sites).

Considerations and one question:

In the item materials and methods, in the description of the gender of the participants (line 92), there is an incoherence of the authors in relation to table 1 (descriptive statistics - line 106). The authors reported 12 females from the 24 participants, in table 1 they report ills (50%). Exclusion criteria are presented for females. Was the study carried out with both? It is not clear. Table 1 must be offset.

-If the study was conducted with females, did they not exclude menstrual periods and menopause? These conditions create pain. Could they interfere with the measurements?

The age range in the age of 18 to 65 years, could generate bias, although it was 28.3 the average age.

Reviewer #2: 

Please provide operational definition for healthy subjects. Clarify the inclusion criteria....the cause/s of pain in the subjects included in the study. Please add discussion about patient characteristics, any influencing factors that could have affected the study result

Reviewer #3: 

The study was purported to determine the absolute and relative intra-rater within-session test-retest reliability of pressure pain threshold (PPT) and temporal summation of pain (TSP) and the influence of the sequence of the measurements. Authors established that at least two PPT and three TSP consecutive measures are needed to optimize reliability. The study was well conducted, well written to clearly communicate the results and has added additional evidence to the reliability of pain measurements at the low back and forearm in a healthy population. However, there are a few suggestions to consider to improve the manuscript.

Results

Line 249 – 251: The second sentence seems incomplete. Are the first-two sentences supposed to be linked as one sentence?

Figure 2. The color coding for LES and WF looks identical and difficult to follow. Consider using a different shape(s).

Discussion

Line 344: The sentence “Some methodological limitations derived from the application of the antilog factor” looks incomplete.

Lines 359 – 364. It’s interesting that there are some conflicting outcomes with the first PPT measurement from other studies. It would be great to briefly discuss possible factors in these studies that could have led to the different findings relative to the finding in your study.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Izabel Cristina Custódio de Souza

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jan 12;16(1):e0245278. doi: 10.1371/journal.pone.0245278.r002

Author response to Decision Letter 0


22 Dec 2020

Ms. Ref. No.: PONE-D-20-27453

Title: Within-session test-retest reliability of pressure pain threshold and mechanical temporal summation in healthy subjects

PLOS One

Reviewers' comments:

We thank the Reviewers for their comments, which will substantially improve the quality of the manuscript. We have carefully addressed each of them. Our responses and modifications are presented below. Changes in the manuscript are shown in red.

Reviewer #1:

Comments:

- The authors propose a very interesting study to determine how many measurements provide the best reliability metrics. They present a well-designed study in which they evaluated the minimum change that exceeds the variability of the pressure pain threshold (PPT) technique and the mechanical temporal summation of pain (TSP), with 3 measures in each of two blocks (test and retest) separated by a 20-minute break.

- The authors used convenience sampling, whose main consequence is to limit how far the study can be generalized beyond the studied group. It can nevertheless offer valuable information in many circumstances, especially when there are no fundamental reasons that differentiate accessible individuals. I believe that convenience sampling in this study will not introduce bias relative to the total population; the results obtained can be used for the studied universe (pain at specific sites).

Considerations and one question:

In the item materials and methods, in the description of the gender of the participants (line 92), there is an incoherence of the authors in relation to table 1 (descriptive statistics - line 106). The authors reported 12 females from the 24 participants, in table 1 they report ills (50%). Exclusion criteria are presented for females. Was the study carried out with both? It is not clear. Table 1 must be offset.

RESPONSE: At line 92, the number of females is presented to specify the sex of the recruited participants. This study was therefore carried out with 12 females and 12 males, for a total recruited sample of 24 participants. The exclusion criteria presented afterwards apply to both females and males. To reduce confusion, Table 1 was modified to state the number of both males and females.

- If the study was conducted with females, did they not exclude menstrual periods and menopause? These conditions create pain. Could they interfere with the measurements?

The age range of 18 to 65 years could generate bias, although the average age was 28.3.

RESPONSE: We agree with Reviewer #1 that the phases of the menstrual cycle and menopause could create pain in females and could influence pain sensitivity. Indeed, a study by Pogatzki-Zahn et al. showed that pinprick pain is greater in the luteal phase of the menstrual cycle and is predicted by progesterone (1). We did not set menstrual periods and menopause as exclusion criteria in females because pain related to menstrual periods is mostly short-lived and episodic, and is considered intermittent rather than chronic pain. However, we agree that it may introduce variability in the results. We excluded participants presenting with pain located anywhere in the body and lasting more than three months. This substantially reduced the possibility of pain-related adaptations within the central nervous system, which helps ensure that our pain-free sample is homogeneous and that pain-free participants are truly healthy. Also, considering that we studied reliability and that the test and retest sessions were performed on the same day, we believe that the impact of the phases of the menstrual cycle on the results is limited. We nevertheless added a sentence on methodological considerations to the Discussion to address these points:

p. 19-20; ln 403-407: “It is acknowledged that factors such as race/ethnicity, sex (e.g. phases of menstrual cycle (1)) and age could influence pain sensitivity (2, 3), but considering that we measured reliability of pain sensitivity in a single session rather than pain sensitivity per se, the effect of these factors on our results remain limited.”

Reviewer #2:

Please provide operational definition for healthy subjects. Clarify the inclusion criteria....the cause/s of pain in the subjects included in the study. Please add discussion about patient characteristics, any influencing factors that could have affected the study result

RESPONSE: We understand the concern of Reviewer #2. Participants included in our study did not report any clinical pain; they were pain-free participants. Pain was induced using quantitative sensory testing (QST). We added all the other participant characteristics collected to Table 1 in order to better describe the participants (body mass index (BMI) and score on the Global Physical Activity Questionnaire (GPAQ)). We also added a sentence to Materials and Methods to introduce these characteristics.

Ln 105-107 p.5: “The body mass index was calculated for each participant and the Global Physical Activity Questionnaire (GPAQ) was self-administered to rate the level of physical activity (4).”

We added another sentence in the Discussion to specify that pain sensitivity could be affected by personal characteristics and factors, but considering the within-subject design used, we estimate that their influence on results is limited.

Ln 403-407 p.19-20: “It is acknowledged that factors such as race/ethnicity, sex (e.g. phases of menstrual cycle (1)) and age could influence pain sensitivity (2, 3), but considering that we measured reliability of pain sensitivity in a single session rather than pain sensitivity per se, the effect of these factors on our results remain limited.”

Reviewer #3:

The study was purported to determine the absolute and relative intra-rater within-session test-retest reliability of pressure pain threshold (PPT) and temporal summation of pain (TSP) and the influence of the sequence of the measurements. Authors established that at least two PPT and three TSP consecutive measures are needed to optimize reliability. The study was well conducted, well written to clearly communicate the results and has added additional evidence to the reliability of pain measurements at the low back and forearm in a healthy population. However, there are a few suggestions to consider to improve the manuscript.

Results

Line 249 – 251: The second sentence seems incomplete. Are the first-two sentences supposed to be linked as one sentence?

RESPONSE: We totally agree with the Reviewer #3. The sentence was reformulated as follows.

Ln 253-255, p.12: “Considering that (i) MDCgr is proportional to SEMeas and MDCind, and that (ii) %MDCgr, %SEMeas and %MDCind results present the same pattern between sequences, only MDCgr and %MDCgr will be described to facilitate the presentation of the results”.
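The proportionality invoked in this sentence can be illustrated with a minimal sketch. It assumes the conventional formulas from the reliability literature (SEMeas = SD × √(1 − ICC), MDCind = 1.96 × √2 × SEMeas, MDCgr = MDCind / √n); this correspondence does not reproduce the manuscript's own equations, so these may differ from what the authors computed. The function names and the SD, ICC and mean values are hypothetical; only n = 24 comes from the study.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement (assumed formula: SD * sqrt(1 - ICC))."""
    return sd * math.sqrt(1.0 - icc)


def mdc_individual(sem: float, z: float = 1.96) -> float:
    """Minimal detectable change for one individual (assumed: z * sqrt(2) * SEM)."""
    return z * math.sqrt(2.0) * sem


def mdc_group(mdc_ind: float, n: int) -> float:
    """Minimal detectable change for a group of n participants (assumed: MDCind / sqrt(n))."""
    return mdc_ind / math.sqrt(n)


def percent_of_mean(value: float, mean: float) -> float:
    """Express an absolute reliability metric as a percentage of the mean."""
    return 100.0 * value / mean


# Hypothetical PPT values in kPa; n = 24 is the sample size reported in the study.
sd, icc, mean, n = 100.0, 0.90, 400.0, 24

sem = sem_from_icc(sd, icc)
mdc_ind = mdc_individual(sem)
mdc_gr = mdc_group(mdc_ind, n)

print(f"SEMeas = {sem:.2f} kPa, MDCind = {mdc_ind:.2f} kPa, MDCgr = {mdc_gr:.2f} kPa")
print(f"%MDCgr = {percent_of_mean(mdc_gr, mean):.2f} %")
```

Under these assumed formulas, MDCind and MDCgr are fixed multiples of SEMeas for a given n, which is why the three metrics follow the same pattern across measurement sequences and describing MDCgr alone is sufficient.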

Figure 2. The color coding for LES and WF looks identical and difficult to follow. Consider using a different shape(s).

RESPONSE: As recommended by Reviewer #3, the colors in Figure 2 were modified to better distinguish each studied site. The same shapes were kept to keep the figure uniform.

Discussion

Line 344: The sentence “Some methodological limitations derived from the application of the antilog factor” looks incomplete.

RESPONSE: As recommended by the Reviewer #3, the sentence was modified as follows.

Ln 348-350, p.17: “Some methodological limitations derived from the application of the antilog factor, for example, the antilog factor could not be used if TSP at test was rated 0/10”.

Lines 359 – 364. It’s interesting that there are some conflicting outcomes with the first PPT measurement from other studies. It would be great to briefly discuss possible factors in these studies that could have led to the different findings relative to the finding in your study.

RESPONSE: As recommended, the following sentence was added to suggest an explanation for the disparity between our results and the current literature.

Ln 367-370, p.18: “Disparity regarding the first measure can be caused by the fact that some studies did not include familiarization trial prior to PPT evaluation (5, 6) or evaluated PPT at different sites than those evaluated in the present study (i.e. biceps brachii) (7).”

Attachment

Submitted filename: Reviewers_comments.docx

Decision Letter 1

Alison Rushton

28 Dec 2020

Within-session test-retest reliability of pressure pain threshold and mechanical temporal summation in healthy subjects

PONE-D-20-27453R1

Dear Dr. Massé-Alarie,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Alison Rushton

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Thank you for addressing all comments from the reviewers to a satisfactory level.

I hope that you agree the changes have improved the quality of the manuscript further.

Reviewers' comments:

No further review required.

Acceptance letter

Alison Rushton

4 Jan 2021

PONE-D-20-27453R1

Within-session test-retest reliability of pressure pain threshold and mechanical temporal summation in healthy subjects

Dear Dr. Massé-Alarie:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Alison Rushton

Academic Editor

PLOS ONE

Associated Data

    Data Availability Statement

    All files are available from the Scholars Portal Dataverse database (https://doi.org/10.5683/SP2/URLJLN, Scholars Portal Dataverse, V1, UNF:6:g+Lz6E4IvbaXIy5wuxkzPQ==).

