Abstract
Background
The pull test and the push-and-release test evaluate postural instability in Parkinson’s disease (PD). We systematically evaluated the impact of within- and between-assessor variability in test delivery by the clinician on the test outcome. We also evaluated whether using standardized treadmill-based mechanical perturbations may enhance the consistency of the patient’s test outcomes.
Methods
Fifty persons with PD underwent a series of backward balance perturbations: three repetitions of both the pull test and the push-and-release test delivered by each of three different assessors (i.e., nine repetitions of each test), plus five standardized treadmill-induced perturbations at 1.5 m/s², in pseudo-random order.
Results
We found substantial within-assessor variability on both manual tests. A difference in scores of 2 points or more was found in 30% of participants for the pull tests, and in 42% for the push-and-release tests. Similarly, large variability in scores was observed between assessors. Inconsistent test delivery was demonstrated by a wide range of sternum and center of mass displacements following the pull test and body inclination angles in the push-and-release test. Across five repeated treadmill-based perturbations at 1.5 m/s², a difference of ≥ 2 points in test outcomes was found in 18% of participants, with significantly greater consistency in sternum and center of mass displacements.
Conclusions
Variability in the patient’s balance test scores can be attributed to substantial variability in test delivery, as well as inconsistent performance of the individual patient. Assessment of postural instability may benefit from standardizing test delivery, e.g., using treadmill-induced perturbations.
Keywords: Postural instability, Parkinson’s disease, Pull test, Push-and-release test, Balance control, Balance perturbation
Introduction
Postural instability is one of the cardinal signs in persons with moderate to severe Parkinson’s disease (PD). It leads to falls and fall-related injuries, and negatively impacts on mobility, quality of life, and independence [1–3]. Valid assessment of postural instability is key to optimize patient management, observe treatment efficacy, and monitor disease progression. Yet, a study that evaluated disease progression data from three clinical trials found that more than half of the patients in whom an onset of postural instability had been clinically identified reverted to normal at a subsequent visit [4]. This inconsistency suggests potential flaws in the application of clinical tests for identifying postural instability, raising concerns about their reliability.
In daily practice, two tests are commonly used and regarded as the gold standard to evaluate postural instability [5]. The ‘Pull test’, part of the Movement Disorders Society Unified Parkinson Disease Rating Scale (MDS-UPDRS), examines a person's reaction to a sudden pull on the shoulders strong enough to induce a backward loss of balance, forcing the person to take at least one step backward [6]. When using the ‘Push-and-release test,’ a backward loss of balance is induced by a sudden release of the examiner’s hands placed on the person's back who is actively applying a backward force [7]. With both tests, the absence or the number of balance-correcting steps taken indicates the degree of postural instability on a five-point ordinal scale.
Although it is a common clinical observation that the test outcome following both the pull and the push-and-release test can vary between clinicians and between subsequent perturbations within the same participant, remarkably few studies have formally evaluated the consistency in test delivery by therapists in persons with PD. One study in people with PD demonstrated that errors in pull test delivery (e.g., pull intensity, ‘if the pull was administered at the shoulders’, or ‘whether the pull was performed steady and continuously’) were highly prevalent among a group of 25 assessors, resulting in almost 80% of the pull tests conducted being unsuitable for adequately assessing postural instability. In addition, one study observed moderate-to-low inter-rater reliability when examining variability in test delivery between assessors in single patients, but somewhat surprisingly, this variability in test delivery did not influence test performance [8]. Another study in a mixed population reported differences in the actual test outcomes across the ten repetitions within the same assessor in four of the 12 PD participants (33%), with small but significant effects on the participants’ responses in step time and length [9]. Furthermore, recurrent clinical evaluations of postural instability in persons with PD are commonly conducted by different health professionals. A study in healthy individuals reported poor to moderate consistency between assessors in push-and-release test delivery (i.e., initial body lean angle) [10]. Surprisingly, the impact of between-assessor test delivery variability on test performance has not yet been evaluated in persons with PD.
Here, we aimed to systematically evaluate the impact of within- and between-assessor variability in the delivery of the pull test and the push-and-release test on postural instability test outcome (i.e., test scores according to the respective rating scale) in persons with PD. To this end, we used a comprehensive testing protocol in 50 participants with PD (Hoehn and Yahr stages 1–3), with three assessors each administering three repetitions of both the pull and the push-and-release test. To study variability in test delivery by the therapists, we quantified the magnitude of the perturbation by assessing the sternum displacement in the pull test and the initial body lean angle in the push-and-release test at perturbation onset. We expected to observe variability in retropulsion performance in both the pull test and the push-and-release test within people with PD, in parallel with substantial variability in test delivery, more so between than within assessors.
In addition, we evaluated variability in test outcomes following standardized treadmill-based mechanical perturbations. Recent work has demonstrated that using standardized treadmill-based perturbations may outperform the pull test in differentiating between persons with PD and healthy controls, possibly due to reduced test delivery variability [11]. We expected such standardized test delivery to be reflected in greater consistency in imposed body displacements, and consequently, lower variability in test outcomes than in the manual tests. Nonetheless, we expected to find residual variability in test outcomes as these are not only determined by variability in test delivery, but also by the consistency of the patient’s performance (Fig. 1).
Fig. 1.
Variability in the patient’s test outcomes is determined by variable test delivery within and between assessors, as well as inconsistent performance of the patient him/herself
Methods
Participants
In this observational cohort study, we included 50 people with PD (defined according to accepted international criteria), in the dopaminergic ON state, who were able to stand unaided (Table 1). Persons with PD were excluded if any other neurological, musculoskeletal, or orthopedic condition affected balance capacity, or if they were unable to follow instructions. This study was conducted in accordance with the Declaration of Helsinki [10] and with approval of the local ethical committee (Medical Ethical Committee Arnhem-Nijmegen dossier ‘2021–13298’). All participants provided written informed consent before participating.
Table 1.
Population characteristics
Age (years) | 67 ± 7 |
Men (n, %) | 25 (50%) |
Time since diagnosis (years) | 6.5 ± 4.7 |
MDS-UPDRS part III score (median, [0–136]) | 32.5 [4–72] |
Hoehn and Yahr stage (median, [1–5]) | 2 [1–3] |
Fallen in the past year (n, %) | 21 (42%) |
More than one fall in the past year (n, %) | 14 (28%) |
Subjective presence of freezing of gait^a (n, %) | 12 (24%) |
Mini-BESTest score (median, [0–28]) | 23 [10–28] |
MoCA score (median, [0–30]) | 28 [22–30] |
Values are represented as mean ± SD, unless otherwise specified
^a Defined by a score of > 0 on the NFOG-Q
Experimental protocol
Measurements took place at the Radboud University Medical Center (Radboudumc, Nijmegen, the Netherlands). Information on demographics, disease duration, and fall history was collected. Disease severity was assessed using the MDS-UPDRS part III [6]. Subjective presence of freezing of gait was identified using the New Freezing of Gait Questionnaire (NFOG-Q) [12]. Cognitive assessment involved the Montreal Cognitive Assessment (MoCA) [13]. Furthermore, balance capacity was clinically evaluated using the mini-BESTest [14].
Participants stood on the surface of an instrumented dual-belt treadmill (Motek, Amsterdam, NL). To prevent falls and injuries, participants wore a safety harness attached to the ceiling. Safety bars were attached to the treadmill, but participants were instructed not to use these unless necessary. In total, participants underwent a series of 28 trials involving backward balance perturbations. These included the pull test delivered three times by each of three trained assessors with similar experience (nine pull tests in total), the push-and-release test delivered three times by each of three assessors (nine push-and-release tests in total), five standardized treadmill-induced balance perturbations at 1.5 m/s², and five at 2.5 m/s² (Fig. 2). The perturbation waveform comprised an acceleration phase of 300 ms, a constant velocity phase of 700 ms, and a deceleration phase of 300 ms. Perturbations delivered at 2.5 m/s² were not included in the analyses since this intensity clearly exceeded those of the pull and push-and-release tests. The first trial always involved a treadmill-induced balance perturbation at 1.5 m/s². Subsequent trials were conducted in a pseudo-randomized order that was different across participants to balance general habituation effects across perturbation types. The assessors always delivered three consecutive trials of either the pull test or the push-and-release test. Assessors never observed another assessor performing either test before executing the test themselves. All trials were videotaped.
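The waveform described above can be illustrated with a short sketch. It assumes a constant acceleration during the first phase and a linear deceleration back to standstill in the last phase; the text only gives the phase durations and the acceleration, so the exact shape of the profile and the function names are assumptions made for illustration.

```python
def belt_velocity(t, accel=1.5, t_acc=0.3, t_const=0.7, t_dec=0.3):
    """Belt velocity (m/s) at time t (s) after perturbation onset.

    Assumed shape: constant acceleration, constant velocity,
    then linear deceleration to standstill.
    """
    v_peak = accel * t_acc  # velocity reached at the end of the acceleration phase
    if t < 0:
        return 0.0
    if t < t_acc:
        return accel * t
    if t < t_acc + t_const:
        return v_peak
    if t < t_acc + t_const + t_dec:
        # Assumption: velocity falls linearly to zero over t_dec
        return v_peak * (1 - (t - t_acc - t_const) / t_dec)
    return 0.0


def belt_displacement(accel=1.5, t_acc=0.3, t_const=0.7, t_dec=0.3):
    """Total belt displacement (m): area under the assumed velocity profile."""
    v_peak = accel * t_acc
    return 0.5 * v_peak * t_acc + v_peak * t_const + 0.5 * v_peak * t_dec
```

Under these assumptions, a 1.5 m/s² perturbation reaches a belt velocity of about 0.45 m/s and displaces the belt by about 0.45 m, whereas the 2.5 m/s² perturbation reaches about 0.75 m/s.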
Fig. 2.
Example of a pseudo-randomized trial sequence for a participant. The first trial was always a treadmill perturbation at 1.5 m/s². PRT: Push & release test
The pull test was conducted according to the instructions of the MDS-UPDRS part III, item 12 [6]. Before each pull test, the participants were instructed to stand upright with their feet comfortably wide and parallel next to each other. In addition, participants were allowed to take a step backward to prevent a fall. The assessors executed the pull test by applying a quick and forceful pull to the shoulders, forceful enough to displace the center of mass so that at least one step was needed to prevent a fall. Pull test trials were rated offline using the video data and according to the ordinal scale of the MDS-UPDRS part III item 12 [6], i.e., (0) recovery with one or two steps; (1) three to five steps, but participant recovers unaided; (2) more than five steps, but participant recovers unaided; (3) stands safely, but with absence of postural response; falls if not caught by assessor; (4) very unstable, tends to lose balance spontaneously or with just a gentle pull on the shoulders. A score of 4 did not occur in our study, because the ability to stand unaided was an inclusion criterion. Whenever a participant utilized the safety bars or used multiple steps but failed to recover balance before reaching the end of the treadmill surface and stepped into an assessor, the trial received a rating of 3.
The push-and-release test was conducted following the respective test instruction [7]. Before each trial, participants were instructed to stand upright with their feet comfortably wide and parallel. Participants were given instructions to do whatever was required to regain balance, including taking a step backward. After the assessor placed their palms on the participant’s scapulae, the participant was instructed to push backward. Flexion of the assessor’s elbows was required to allow backward movement of the participant’s trunk. Assessors released the participant when their shoulders and hips were behind the participant’s heels. Push-and-release test trials were rated using the push-and-release ordinal rating scale [7], i.e., (0) recovers independently with 1 step; (1) two or three backward steps, but recovers unaided; (2) four or more backward steps, but recovers unaided; (3) steps backward, but needs assistance to prevent a fall; (4) falls without taking a step or is unable to stand without aid. Whenever a participant used the safety bars or used multiple steps but failed to recover balance before reaching the end of the treadmill surface and stepped into an assessor, they received a rating of 3 or 4 (depending on whether a step was taken).
In the treadmill-induced balance perturbations, participants were instructed to stand upright with their feet comfortably wide and parallel next to each other. Participants were instructed to try to recover from the balance perturbation by taking one step backward. Treadmill-induced balance perturbation trials were rated using the ordinal scale of the MDS-UPDRS part III item 12 [6]. Whenever a participant used the safety bars or used the safety harness to regain balance, the trial received a rating of 3.
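The two rating scales described above can be written as simple mapping functions. This is a minimal sketch based only on the scale descriptions in this section; the function names and the two input arguments (number of steps, and whether recovery was unaided) are illustrative assumptions, and score 4 of the MDS-UPDRS item is omitted because it concerns spontaneous loss of balance rather than the stepping response.

```python
def pull_test_score(steps, recovered_unaided):
    """MDS-UPDRS item 3.12 rating (0-3) as described in the text.

    Also applied here to treadmill perturbations. A trial in which the
    safety bars, harness, or assessor were needed counts as aided.
    Assumption: an unaided recovery without any step is rated 0.
    """
    if not recovered_unaided:
        return 3
    if steps <= 2:
        return 0
    if steps <= 5:
        return 1
    return 2


def push_release_score(steps, recovered_unaided):
    """Push-and-release rating (0-4) as described in the text."""
    if not recovered_unaided:
        # Aided recovery: 3 if at least one step was taken, otherwise 4
        return 3 if steps >= 1 else 4
    if steps <= 1:
        return 0
    if steps <= 3:
        return 1
    return 2
```

Written this way, the difference between the scales is easy to see: an unaided recovery with two steps maps to 0 on the pull test scale but to 1 on the push-and-release scale.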
Data acquisition
A 10-camera motion capture system (Vicon Motion System Ltd., Oxford, UK) was used to record 3D kinematics at a sampling rate of 100 Hz. Reflective markers were attached to the surface of the participant’s skin. We placed markers on the sternum, pelvis and the feet. To correct for the movement of the treadmill, an additional marker was placed on the treadmill’s belt. The raw data was preprocessed in Vicon Nexus (Vicon Motion Systems Ltd., Oxford, UK).
Data analysis
Vicon data were analyzed using Matlab 2022a (MathWorks Inc., Natick, MA, USA). Data were filtered using a second-order zero-lag Butterworth filter with a cutoff frequency of 10 Hz. The marker at the top of the sternum was used to identify sternum displacement, whereas the center of the four pelvis markers was used to estimate the CoM. To identify perturbation onset in the pull test and push-and-release test, we first localized the peak acceleration of the sternum marker. We then determined the start of the slope, with the assumption that the acceleration had to be a positive value. In the treadmill condition, the start of the perturbation was determined by the belt marker’s velocity exceeding a threshold of 0.1 m/s. The relative displacement of the sternum and CoM at 250 ms following perturbation onset was identified to determine the variability in delivery of the pull test and the treadmill perturbations. This instant was chosen to minimize the influence of corrective torques generated by the participant on the observed CoM and sternum displacements [15]. We used the belt marker to correct for the movement of the belt. Push-and-release test delivery was expressed using the body inclination angle at perturbation onset. This involved the angle of the sternum marker relative to the middle of the participant’s toe markers.
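The onset detection and delivery metrics described above can be sketched as follows. This is not the authors' Matlab code but one possible reading of the procedure: acceleration is obtained by central differences from an already filtered position signal, the backward direction is taken as positive, the "start of the slope" is interpreted as the first sample of the run of positive acceleration leading up to the peak, and the inclination angle is measured from the vertical. All function names and these conventions are assumptions.

```python
import math

FS = 100  # sampling rate in Hz, as reported for the motion capture system


def acceleration(pos, fs=FS):
    """Central-difference acceleration; the two endpoints are set to 0."""
    dt = 1.0 / fs
    acc = [0.0] * len(pos)
    for i in range(1, len(pos) - 1):
        acc[i] = (pos[i + 1] - 2 * pos[i] + pos[i - 1]) / dt ** 2
    return acc


def manual_onset_index(pos, fs=FS):
    """Onset for pull / push-and-release trials (assumed interpretation).

    Locate the peak acceleration, then walk back while the preceding
    sample still has positive acceleration.
    """
    acc = acceleration(pos, fs)
    onset = max(range(len(acc)), key=lambda i: acc[i])
    while onset > 0 and acc[onset - 1] > 0:
        onset -= 1
    return onset


def displacement_after_onset(pos, onset, fs=FS, delay_s=0.25):
    """Relative displacement at delay_s (default 250 ms) after onset."""
    return pos[onset + round(delay_s * fs)] - pos[onset]


def inclination_angle_deg(sternum, toe_mid):
    """Angle (degrees) of the toe-to-sternum line relative to vertical.

    Both points are (backward, up) coordinates in the sagittal plane.
    """
    dx = sternum[0] - toe_mid[0]
    dz = sternum[1] - toe_mid[1]
    return math.degrees(math.atan2(dx, dz))
```

For treadmill trials, the same displacement function could be applied after subtracting the belt marker position and taking onset as the first sample at which belt velocity exceeds 0.1 m/s.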
Statistical analysis
Variability in the pull and push-and-release test outcomes is presented descriptively as (1) maximum within-participant difference in ratings across all nine trials; (2) difference in first trial and in median ratings between each assessor; (3) difference in ratings between three repetitions within each assessor. Variability in outcomes following standardized perturbations on the treadmill is presented as the maximum within-participant difference in ratings across five repetitions. As the very first trial of the protocol always involved a treadmill perturbation at 1.5 m/s², and a first trial is known to yield distinctly different behavioral responses compared to any subsequent trial, we also present within-participant differences across trials two to five to eliminate this first-trial effect.
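The three descriptive measures listed above can be computed per participant as in the following sketch. The data layout (a dictionary mapping each assessor to that assessor's three consecutive ratings) and the function name are assumptions chosen for illustration.

```python
from statistics import median


def score_variability(scores):
    """Descriptive variability of ordinal ratings for one participant.

    scores: dict mapping assessor label -> list of three ratings,
    in the order in which that assessor delivered the trials.
    """
    all_scores = [s for trials in scores.values() for s in trials]
    first_trials = [trials[0] for trials in scores.values()]
    medians = [median(trials) for trials in scores.values()]
    return {
        # (1) maximum difference across all nine trials
        "max_diff_all": max(all_scores) - min(all_scores),
        # (2) between-assessor difference in first-trial and median ratings
        "first_trial_diff": max(first_trials) - min(first_trials),
        "median_diff": max(medians) - min(medians),
        # (3) within-assessor difference across the three repetitions
        "within_assessor_diff": {
            a: max(t) - min(t) for a, t in scores.items()
        },
    }
```

For a hypothetical participant rated `{"A": [0, 0, 1], "B": [3, 0, 0], "C": [1, 1, 1]}`, this yields a maximum difference of 3 points across all trials, a 3-point difference between first trials, and a 1-point difference between assessor medians.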
Variability in delivery of the pull test is descriptively presented as single-trial values of sternum and CoM displacements, as well as the observed range (1) across all nine repetitions within each participant; (2) across the first trials of each of the three assessors; and (3) across the three repetitions within each assessor. We used independent samples T-tests to test whether the range of sternum and CoM displacements across the nine trials differed from that of standardized treadmill perturbations. T-tests were also used to test whether the range in displacements within assessors differed from that between assessors. Similar statistics were used for comparing ranges in body inclination angles in the push-and-release test within and between assessors.
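As a rough illustration of this comparison, the sketch below computes the per-participant range of a delivery metric and a Student's independent-samples t statistic with pooled variance. It assumes equal-variance pooling, which is consistent with the 98 degrees of freedom reported for two sets of 50 ranges, but the exact test variant used by the authors is not stated; the function names are hypothetical and no p-value is computed.

```python
import math
from statistics import mean, variance


def value_range(values):
    """Range (max minus min) of displacements or angles across trials."""
    return max(values) - min(values)


def independent_t(a, b):
    """Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df
```

In use, `value_range` would be applied to each participant's nine pull test displacements and five treadmill displacements, and the two resulting lists of 50 ranges passed to `independent_t`.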
Results
Distributions of outcomes on the pull test, push-and-release test and treadmill perturbations across all participants are shown in Fig. 3. In all executed pull tests (n = 450), 70% of the pulls resulted in balance recovery with one (42%, n = 191) or two (27%, n = 123) steps (score 0, n = 314), 14% in three to five steps (score 1, n = 63), 1% in unaided recovery in more than five steps (score 2, n = 4), and 15% in an aided (i.e., by the assessor, handrails or the safety harness) recovery due to an absence of stepping response (score 3, n = 69). In 56% of the treadmill perturbations people recovered with one (26%, n = 64) or two (30%, n = 76) steps (score 0, n = 140), 35% with three to five steps (score 1, n = 88), 1% with more than five steps but unaided (score 2, n = 3), and 8% in an aided recovery (score 3, n = 19). In the push-and-release test, 25% of the trials resulted in recovery with one step (score 0, n = 111), 52% in two or three steps (score 1, n = 233), 4% in four or more steps with unaided recovery (score 2, n = 18), 16% in backward steps with aided recovery (score 3, n = 74), and 3% in an aided recovery without taking a step backward (score 4, n = 14).
Fig. 3.
Distribution of ordinal scores across all recorded trials. A Scores of the pull test (n = 450) are rated according to the MDS-UPDRS part 3.12. B Treadmill perturbations (n = 250) are rated according to the MDS-UPDRS part 3.12. C The push-and-release test (n = 450) is rated according to the ordinal scale used in the push-and-release test
Variability in outcome on the pull test, the push-and-release test, and treadmill perturbations is visualized in Fig. 4. Across nine pull tests of each participant, we observed variability in outcomes in 56% of the participants (n = 28), with differences in scores ranging from one point (26%) to three points (22%). When leaving out the first trial of each assessor, variability in outcomes was seen in 44% of participants, with differences in scores ranging from one point (18%) to three points (20%). Between the three assessors, the scores on their first pull showed variability in 50% of the participants (n = 25), whereas median pull test scores differed between the three assessors in 26% of participants (n = 13). Likewise, between the three consecutive pulls administered by the same assessor, 52% of the participants demonstrated variable scores in at least one of the three assessors (n = 26).
Fig. 4.
Variability within participants in performance on the pull test, the push-and-release test and the treadmill perturbations
Across the nine push-and-release tests, we observed variability in outcomes in 70% of participants (n = 35), with differences in scores ranging from one point (28%) to three points (18%). When leaving out the first trial of each assessor, variability in outcomes was seen in 56% of participants, with differences in scores ranging from one point (32%) to three points (10%). When looking at the first push-and-release test of each of the three assessors, 58% of participants showed different scores between assessors (n = 29), whereas median scores differed in 44% of participants (n = 22). Within-assessor variability was observed in a comparable percentage of participants, with 56% showing different scores in three consecutive push-and-release tests of at least one of the three assessors (n = 28).
Seventy-eight percent of participants exhibited variability in outcomes across five treadmill-induced balance perturbations (n = 39). Yet, as expected, it was often the outcome in response to the first perturbation (i.e., always the very first trial of the experimental protocol) that differed from the other four trials. When discarding this very first trial, 54% of the participants showed variability in outcome scores (n = 27), with differences ranging from one point in 46% to three points in 6%.
Variability in the pull test delivery (i.e., the magnitude of the pull) is visualized in Fig. 5. Sternum displacement ranged from 0.2 to 12.1 cm in the pull test. While the range across all trials was evidently wider than following treadmill perturbations (1.8–4.1 cm), the average sternum displacements showed a minor 2 mm difference (2.8 ± 1.6 vs 2.6 ± 0.4 cm, respectively; t(687) = 2.244, p = 0.025). Similarly, the range of CoM displacements was larger for the pull test (range: − 0.6 to 5.4 cm) compared to the treadmill condition (range: 1.6–4.5 cm), but average CoM displacements were smaller (0.6 ± 0.7 cm vs 2.6 ± 0.3 cm, respectively; t(681) = 42.683, p < 0.001).
Fig. 5.
Sternum displacement and CoM displacement at 250 ms after pull onset, for the pull test and the treadmill perturbations. A and B represent all data points of the pull test and the treadmill perturbations. C and D represent ranges over 9 trials within the same patient. E and F depict the range in the first trial of each of the three assessors and the mean range within the same patient
Across nine trials of the pull test within each patient, the range of sternum displacements varied between 1.4 and 9.7 cm (3.3 ± 1.5 cm), which was larger than the range observed within the five treadmill perturbations (0.2–1.7 cm; 0.7 ± 0.3 cm; t(98) = 12.015, p < 0.001). For the CoM displacement, the range within each patient varied between 0.4 and 6.0 cm (1.4 ± 1.1 cm) in the pull test, which was larger than the range within the five treadmill perturbations (0.2–1.9 cm, 0.6 ± 0.3 cm; t(98) = 5.499, p < 0.001).
Between each of the three assessors, the range of the sternum and CoM displacements in the first trials of the pull test (between-assessor variability in test delivery) varied between 0.5 and 9.2 cm (Mdn: 1.9 cm, IQR: 1.7 cm) and 0.1 and 6.0 cm (Mdn: 0.5 cm, IQR: 0.7 cm), respectively. The mean range of the sternum and CoM displacements within the same patient (within-assessor test delivery variability) varied from 0.3 to 2.5 cm (Mdn: 1.2 cm, IQR: 0.5 cm) and 0.2 to 2.1 cm (Mdn: 0.5 cm, IQR: 0.4 cm), respectively. Between-assessor variability in sternum and CoM displacements was significantly larger compared to within-assessor variability (t(98) = 21.707, p < 0.001 and t(98) = 2.382, p = 0.019).
Variability in the push-and-release test delivery (i.e., the body inclination angle at perturbation onset) is visualized in Fig. 6. Body inclination angles across all trials ranged from 0.4 to 16.0 degrees (Mdn: 8.4 degrees, IQR: 2.7 degrees). The range of the body inclination angles across nine trials within the same patient varied between 2.0 and 9.7 degrees (Mdn: 4.6 degrees, IQR: 3.3 degrees). Between assessors, the range of the body inclination angle of the first push-and-release test (between-assessor test delivery variability) was 0.6–8.8 degrees (Mdn: 3.1 degrees, IQR: 2.3 degrees). When looking at the variability in test delivery within assessors, the mean range of the body inclination angles of three test repetitions within the same patient varied between 0.7 and 5.5 degrees (Mdn: 1.9 degrees, IQR: 1.0 degrees). Between-assessor variability was significantly larger compared to the within-assessor variability (t(98) = 4.764, p < 0.001).
Fig. 6.
Push & release test: body inclination angles at perturbation onset. Variability across all data points and ranges across nine trials, the first three trials of each of the assessors and the mean range within each of the assessors
Discussion
This observational cohort study in 50 persons with PD involves the first systematic evaluation of variability in postural instability test outcomes (i.e., test scores according to an established rating scale) within and between assessors. In line with our hypotheses, we found substantial variability in pull test and push-and-release test outcomes within participants. Across nine repetitions within each participant, a difference in pull test scores of 2 points or more was observed in 30% of participants, and in the push-and-release test scores in 42%. Variability in outcomes was rather similar within and between assessors. Across five repetitions of standardized treadmill-based perturbations at 1.5 m/s², we observed a difference of 2 points or more in 18% of the participants. This number further dropped to 8% when the very first trial was discarded, while the proportion of participants with consistent test scores (difference of 0 points) was comparable to that of the manual pull test (40–50%). Discarding the first trial of each assessor on the manual pull and the push-and-release tests also lowered the observed variability in test outcomes to some extent, but a difference of 2 points or more was still present in 26% and 24% of participants, respectively. We observed large variability in pull test and push-and-release test delivery, with sternum displacement and body inclination angles showing greater between- than within-assessor variability. Variability in sternum and CoM displacements was greatly reduced following standardized treadmill perturbations.
We are unaware of any previous systematic evaluation of between-assessor variability in retropulsion outcomes within the same participants with PD. In light of clinical experience, the presence of such variability may seem rather unsurprising, but the degree of variability in test outcomes was remarkably large. Variability in test scores on the pull test, as quantified using the ordinal scale on the MDS-UPDRS item 3.12, could be as large as 3 points within participants, indicating the difference between ‘no postural instability’ and ‘moderate postural instability’, with assistance required to recover from the backward balance perturbation. Variability in the push-and-release test scores was even more pronounced, which is likely due to differences between the ordinal rating scales of the two tests. In particular, recovery with two steps is considered ‘normal’ (score of 0) in the pull test, whereas this is regarded as mild postural instability (score of 1) in the push-and-release test. Indeed, as can be seen in Fig. 3, the relatively higher number of participants with a score of 0 in the pull as compared to the push-and-release test can largely be attributed to the participants who needed 2 steps to recover. These observations suggest that variability in pull test outcomes may have been underestimated due to the high proportion of participants in our study with a score of 0 in combination with the lack of sensitivity of the pull test in identifying mild postural instability. Collectively, these findings raise questions regarding the validity of postural instability assessments and warrant reconsideration of using these tests as a gold standard [4].
The finding of within-assessor variability in test outcomes is in line with a previous study in a small sample of PD participants (n = 12) [9]. In this study, one experienced physical therapist performed 10 backward push-and-release tests in each participant, with a difference in clinical rating scores observed in four participants (33%). As outlined in Fig. 1, variability in test delivery is proposed to at least partly explain the variability in test outcomes. Indeed, the present findings as well as the results from this previous study demonstrate that individual assessors appear to have difficulties calibrating their pull or push-and-release for consistent test delivery, even when administering multiple repetitions within the same patient. Leaving out the first trial of each assessor takes away only a small part of the variability in test outcomes (Fig. 4). Unsurprisingly, when multiple assessors conducted these manual tests in the same patient, the observed greater ranges of imposed sternum and CoM displacements demonstrate that inconsistency in test delivery was further aggravated. These observations highlight the need for standardization of perturbation delivery, using, e.g., pulley machines [16, 17], instrumented pull tests [18, 19], or movable platforms [11, 20–24]. Our results show that delivering standardized treadmill-based balance perturbations substantially decreased the variability in imposed sternum and CoM displacements, and in line with our expectations, such standardization did take away some of the variability in test outcome. Yet, despite standardization, we observed residual variability in test outcomes, presumably due to inconsistencies in the participant’s test performance (Fig. 1).
The obvious disadvantages of treadmill-based perturbations are the size and costs of the set-up, which limit their application in daily clinical practice. As a more practical approach, wearable sensors and markerless video-based tracking methods have been proposed for quantifying manual pull intensity. Such metrics could potentially be used to provide assessors with feedback on their pull intensity and help standardize test delivery [8, 19, 25]. However, identifying a parameter that gives a reliable readout is not straightforward for two main reasons. First, pull intensity depends on multiple parameters, e.g., the energy delivered by the assessor is not only determined by the peak acceleration, but also by its duration. Second, because reactive responses of the participants commence soon after the onset of perturbation, metrics derived from movement trajectories of the participant’s body do not merely reflect the imposed perturbation, but are increasingly influenced by these responses. For the push-and-release test, such standardization methods may potentially be easier to implement, since perturbation intensity mainly depends on the inclination angle at the start of the perturbation. Yet, despite maximum standardization of test delivery on the treadmill, we still observed residual variability in test outcomes, presumably due to inconsistencies in the participant’s test performance (Fig. 1). Future work should establish the minimum number of repetitions needed for reliable quantification of postural instability. Interestingly, previous work from our group demonstrated that healthy older adults successfully recovered balance from standardized perturbations at 1.5 m/s² in the vast majority of trials [24], whereas the PD participants in the present study managed to do so in only 26% of all trials.
This observation indicates that postural instability was highly prevalent in our study population despite 70% of all pull tests yielding a score of 0 (i.e., no postural instability). In addition to the aforementioned rating differences (1 vs 2 steps), we speculate that some of the pulls may not have been strong enough to reveal postural instability. A previous study evaluated videotapes of pull test deliveries of 25 experienced assessors who performed a total of 66 pulls, which were evaluated by four raters with respect to pull technique and application (e.g., recovery space for the participant, feet placement of the patient, pull intensity, ‘if the pull was administered at the shoulders’, or ‘whether the pull was performed steady and continuously’) [26]. The most commonly agreed-on error (i.e., agreed by at least two of the four raters) was that the participant was pulled too lightly (n = 51, 77%). Indeed, the wide range of observed sternum displacements (Fig. 5) also included trials where the displacement at 250 ms after perturbation onset did not even exceed 1 cm. Such light pulls likely result in frequent false-negative test outcomes.
While standardized delivery of sufficiently strong perturbations holds promise in addressing important shortcomings of the currently used pull and push-and-release tests, assessment of postural instability may further benefit from quantification of step quality (in addition to the number of steps taken). One option to consider is measuring the leg angle (i.e., the angle of the leg of the first balance-correcting step with the vertical at step touchdown), a measure that captures the position of the foot relative to the CoM. Indeed, in response to standardized perturbations of equal magnitude, this metric was shown to distinguish between PD patients and controls [24], making it a potential candidate to evaluate postural instability on a continuous scale. Future studies may further explore the utility of the leg angle for quantifying postural instability in PD and for evaluating disease progression or effects of intervention.
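As an illustration of how a leg angle of this kind might be computed from motion capture data, the following sketch takes the angle between the vertical and a line from a pelvis marker to a marker on the stepping foot, in the sagittal plane, at step touchdown. The choice of markers, the coordinate convention (x pointing in the stepping direction, z pointing upward), and the function name are assumptions made for this example. They do not necessarily correspond to the definition used in [24].

```python
import math


def leg_angle_deg(pelvis_xz, foot_xz):
    """Angle (degrees) between the vertical and the pelvis-to-foot line.

    pelvis_xz, foot_xz : (x, z) marker positions in metres at step
    touchdown, with x in the stepping direction and z upward. Both the
    marker choice and the axis convention are assumptions.
    A positive angle means the foot landed ahead of the pelvis in the
    stepping direction.
    """
    dx = foot_xz[0] - pelvis_xz[0]  # horizontal offset of foot from pelvis
    dz = pelvis_xz[1] - foot_xz[1]  # height of pelvis above the foot
    if dz <= 0:
        raise ValueError("pelvis marker must be above the foot marker")
    return math.degrees(math.atan2(dx, dz))


if __name__ == "__main__":
    # Hypothetical touchdown positions: pelvis at 0.95 m height,
    # foot placed 0.25 m from the pelvis in the stepping direction.
    print(round(leg_angle_deg((0.0, 0.95), (0.25, 0.0)), 1))
```

Because the angle relates foot placement to the position of the upper body rather than to the floor, a larger value at touchdown indicates a foot placed further out relative to the body, which is why such a measure could serve as a continuous indicator of step quality.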
It must be noted that center of mass displacements induced by the treadmill perturbations differed from those induced by the pull tests. This is not surprising, as the point of impact differs (i.e., the shoulder compared to the platform of the treadmill) and the duration of the impact is not the same. Despite these differences, the participants likely perceived the difficulty of the manual pull test and the treadmill perturbations as rather similar, as outcome scores were comparable between the tests (Fig. 2). A limitation of the study was that we performed fewer repetitions of the treadmill-based perturbations than of the pull and push-and-release tests. As a consequence, the variability in the treadmill-based tests may have been underestimated. Yet, it must be noted that variability in test outcome across four non-consecutively conducted treadmill-based perturbations (the very first trial excluded; see Fig. 2) was still somewhat lower than that of three consecutive pull or push-and-release tests conducted by the same assessor. In addition, our protocol involved a series of relatively similar pseudo-randomized backward perturbations that may have led to habituation effects. If so, our results may underestimate the variability in test outcomes observed in daily clinical practice, where retropulsion tests are typically performed longitudinally at multiple time points.
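The dependence of observed variability on the number of repetitions can be made concrete with a short sketch. It expresses within-participant variability as the range (maximum minus minimum) of test scores across repetitions and reports the proportion of participants with a range of 2 points or more. The scores, participant labels, and function names below are hypothetical and serve only as an illustration.

```python
def score_range(scores):
    """Range (max minus min) of one participant's repeated test scores."""
    return max(scores) - min(scores)


def proportion_with_range_at_least(scores_by_participant, threshold=2):
    """Fraction of participants whose score range reaches the threshold."""
    flagged = [
        pid
        for pid, scores in scores_by_participant.items()
        if score_range(scores) >= threshold
    ]
    return len(flagged) / len(scores_by_participant)


# Hypothetical scores for four participants, three repetitions each.
example_scores = {
    "P01": [0, 0, 1],
    "P02": [0, 2, 3],
    "P03": [1, 1, 1],
    "P04": [0, 1, 2],
}

if __name__ == "__main__":
    print(proportion_with_range_at_least(example_scores))
```

Adding a repetition can only leave a participant's range unchanged or increase it, so a test administered with fewer repetitions will tend to show a smaller range. This is the reason that variability estimates based on different numbers of repetitions should be compared with caution.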
In conclusion, the present study demonstrates that an individual’s postural instability test outcomes may show substantial variability, presumably due to a combination of variable test delivery and inconsistencies in the patient’s performance. To rule out variability in test delivery, we recommend using standardized perturbations. To account for the observed variability in the patient’s performance, it seems imperative to administer multiple trials in order to identify postural instability accurately.
Funding
The research was funded by ZonMW-TopZorg (TZO) grant number [10070022010004].
Declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Footnotes
Jorik Nonnekes and Vivian Weerdesteyn have shared last authorship.
References
- 1. Crouse JJ et al (2016) Postural instability and falls in Parkinson’s disease. Rev Neurosci 27(5):549–555
- 2. Allen NE, Schwarzel AK, Canning CG (2013) Recurrent falls in Parkinson’s disease: a systematic review. Parkinsons Dis 2013:906274
- 3. Jankovic J (2008) Parkinson’s disease: clinical features and diagnosis. J Neurol Neurosurg Psychiatry 79(4):368–376
- 4. Elm JJ et al (2005) A responsive outcome for Parkinson’s disease neuroprotection futility studies. Ann Neurol 57(2):197–203
- 5. Nonnekes J et al (2015) The retropulsion test: a good evaluation of postural instability in Parkinson’s disease? J Parkinsons Dis 5(1):43–47
- 6. Goetz CG et al (2008) Movement disorder society-sponsored revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS): scale presentation and clinimetric testing results. Mov Disord 23(15):2129–2170
- 7. Jacobs JV et al (2006) An alternative clinical postural stability test for patients with Parkinson’s disease. J Neurol 253(11):1404–1413
- 8. Ellrich N et al (2024) Precision balance assessment in Parkinson’s disease: utilizing vision-based 3D pose tracking for pull test analysis. Sensors (Basel) 24(11):3673
- 9. Smith BA, Carlson-Kuhta P, Horak FB (2016) Consistency in administration and response for the backward push and release test: a clinical assessment of postural responses. Physiother Res Int 21(1):36–46
- 10. Morris A et al (2022) Interadministrator reliability of a modified instrumented push and release test of reactive balance. J Sport Rehabil 31(4):517–523
- 11. Lu C et al (2023) Postural instability in Parkinson’s disease assessed with clinical “pull test” and standardized postural perturbations: effect of medication and body weight support. J Neurol 270(1):386–393
- 12. Nieuwboer A et al (2009) Reliability of the new freezing of gait questionnaire: agreement between patients with Parkinson’s disease and their carers. Gait Posture 30(4):459–463
- 13. Nasreddine ZS et al (2005) The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J Am Geriatr Soc 53(4):695–699
- 14. Duncan RP et al (2013) Comparative utility of the BESTest, mini-BESTest, and brief-BESTest for predicting falls in individuals with Parkinson disease: a cohort study. Phys Ther 93(4):542–550
- 15. Le Goic M et al (2019) An initial passive phase that limits the time to recover and emphasizes the role of proprioceptive information (vol 9, 986, 2018). Front Neurol. 10.3389/fneur.2019.00118
- 16. Kimmell K et al (2015) Postural instability in Parkinson disease: to step or not to step. J Neurol Sci 357(1–2):146–151
- 17. Perez-Sanchez JR, Grandas F (2019) Early postural instability in Parkinson’s disease: a biomechanical analysis of the pull test. Parkinsons Dis 2019:6304842
- 18. Tan D et al (2012) Relationships between motor aspects of gait impairments and activity limitations in people with Parkinson’s disease: a systematic review. Parkinsonism Relat Disord 18(2):117–124
- 19. Tan JL et al (2018) Neurophysiological analysis of the clinical pull test. J Neurophysiol 120(5):2325–2333
- 20. Allum JH et al (2002) Age-dependent variations in the directional sensitivity of balance corrections and compensatory arm movements in man. J Physiol 542(Pt 2):643–663
- 21. Carpenter MG et al (2004) Postural abnormalities to multidirectional stance perturbations in Parkinson’s disease. J Neurol Neurosurg Psychiatry 75(9):1245–1254
- 22. Horak FB, Dimitrova D, Nutt JG (2005) Direction-specific postural instability in subjects with Parkinson’s disease. Exp Neurol 193(2):504–521
- 23. Peterson DS, Horak FB (2016) The effect of levodopa on improvements in protective stepping in people with Parkinson’s disease. Neurorehabil Neural Repair 30(10):931–940
- 24. de Kam D et al (2014) Dopaminergic medication does not improve stepping responses following backward and forward balance perturbations in patients with Parkinson’s disease. J Neurol 261(12):2330–2337
- 25. Daly S et al (2022) Using kinematics to re-define the pull test as a quantitative biomarker of the postural response in normal pressure hydrocephalus patients. Exp Brain Res 240(3):791–802
- 26. Munhoz RP et al (2004) Evaluation of the pull test technique in assessing postural instability in Parkinson’s disease. Neurology 62(1):125–127