Abstract
Objective
To extend the applicability of the Wolf Motor Function Test (WMFT) to describe the residual functional abilities of moderate-to-severely affected stroke patients. The WMFT is a motor function test for mild to moderate upper extremity weakness in stroke patients, but it has not been routinely used to evaluate more severely hemiparetic stroke patients because of its numerical characteristics.
Design
Data was collected as part of two double-blind sham controlled randomized interventional studies, the Transcranial Direct Current Stimulation (tDCS) in Chronic Stroke Recovery and tDCS Enhanced Stroke Recovery and Cortical Reorganization. Stroke patients were evaluated with the upper-extremity Fugl-Meyer (UFM) and the WMFT in the same setting prior to treatment.
Setting
University inpatient rehabilitation and outpatient clinic.
Participants
32 stroke patients with moderate-to-severe hemiparesis enrolled in the tDCS in Chronic Stroke Recovery and tDCS Enhanced Stroke Recovery and Cortical Reorganization studies.
Intervention
Not applicable.
Main Outcome Measures
WMFT scores were calculated using (1) the traditional median performance time and (2) a new calculation based on the mean rate of performance. We compared the distribution of values from the two methods and examined the WMFT-UFM correlation for the traditional and the new calculation.
Results
WMFT rate values were more evenly distributed across their range than median WMFT time scores. The association between the WMFT rate and UFM was as good as the association between the median WMFT time scores and UFM (Spearman rs 0.84 vs −0.79).
Conclusions
The new WMFT mean rate of performance is valid and a more sensitive measure in describing the functional activities of the moderate to severely affected upper extremity of stroke subjects and avoids the pitfalls of the median WMFT time calculations.
Keywords: Hemiparesis, Rehabilitation, Assessment, Stroke
Stroke is a leading cause of long-term disability in the United States1 and worldwide2. Among the 795,000 Americans who suffer a stroke annually,1 about 60–70% suffer initial upper extremity paresis3–5. Only 15% of all acute stroke patients who enter rehabilitation for upper extremity weakness ever regain full useful function of their limbs6, while 63% of those with severe upper extremity hemiparesis are discharged to institutionalized care7. With such a bleak outlook for stroke patients, the promise of even partial recovery of upper extremity function has inspired dedicated study of new therapies.
Evaluating the effectiveness of interventional therapy to improve upper extremity weakness requires a sensitive and reliable assessment of functional activity in addition to evaluation of the impairment 8. One valid and commonly used assessment tool of upper extremity functional ability is the Wolf Motor Function Test (WMFT)9–13. The WMFT tests a broad range of upper extremity function through two strength measurements and a series of 15 functional tasks that progress from simple movements in proximal joint areas to complex movements in distal joint areas. Each of the 15 tasks is timed to completion, up to a maximum of 120 seconds. Functional ability sub-scores represent the quality of the movement during the performance of these functional tasks. WMFT was found to be a valid and reliable measure of upper extremity function in mild 9, 10, 13 to moderately involved 12, 14 subjects with high test-retest and inter-rater reliability 15. High correlation between the WMFT and upper extremity Fugl-Meyer (UFM) scores established the criterion validity of WMFT in mild 10 and moderately14 impaired stroke patients. In these prior tests UFM was chosen as the criterion test because it is a validated test that measures multi-joint upper extremity function reliably following stroke.16–18
There is a pressing need for a similar time-based, reliable, and sensitive functional motor measurement for moderate-to-severely impaired stroke patients, especially early after the stroke when deficits are more severe,12 or for chronic stroke patients who did not recover well during their customary rehabilitation therapy. However, the numerical characteristics of WMFT time data present several distinct problems when applied to severely affected subjects. First, task times do not even remotely approximate a normal distribution, precluding the use of many classical statistical analyses (e.g., ANOVA). Second, a floor effect prevents accurate representation of the performance of severely impaired subjects who cannot complete at least half the tasks9, 12, 14: the median time becomes 120 seconds, regardless of how well the subject performed on the tasks that he or she was able to complete. This is particularly troublesome because it means the median cannot quantify overall changes in performance in response to treatment in moderate to severely impaired subjects. Third, using mean task-completion times instead of median times does utilize data from all 15 tasks, but the arbitrary 120-second limit introduces a correspondingly arbitrary amount of skewing into the calculations.
To offer a solution, we propose to calculate WMFT measurements as rate of performance, where we calculate “how many times would a person have completed the task, had he or she been performing it continuously for 60 seconds”. That is, we are proposing a simple reciprocal transformation of task-time data into task-rate data:
$$\text{task rate} = \frac{60}{\text{task completion time (seconds)}}$$
If an individual could not perform the action in 120 seconds, we propose assigning a score of 0, signifying the inability of the subject to perform a task. We also propose calculating the overall WMFT value as the simple average (arithmetic mean) of the rates of the 15 time based functional tasks. By replacing the arbitrarily assigned 120-second time score with a zero rate score, which is perfectly meaningful for subjects unable to complete a task, we can use the mean of timed measurements to sensitively detect variability among severely affected subjects. By the central limit theorem, the mean of a set of numbers will be more normally distributed than the individual numbers, and this approach to normality will tend to be much faster when non-performance is signified by the reasonable value zero than when it is signified by the large and arbitrary value of 120.
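The proposed calculation can be illustrated with a minimal Python sketch. The function names, the use of `None` to mark a task that was not completed, and the two example subjects are assumptions made for illustration only; they are not study data. The sketch shows how two hypothetical subjects with very different residual ability share the same median time of 120 seconds but receive different mean rates.

```python
from statistics import mean, median

TIME_LIMIT_S = 120.0  # maximum time allowed per task, as stated in the text


def task_rate(time_s):
    """Repetitions per 60 s; None (task not completed) maps to a rate of 0."""
    if time_s is None:
        return 0.0
    return 60.0 / time_s


def median_time(times):
    """Traditional score: non-completed tasks are scored as 120 s."""
    return median([TIME_LIMIT_S if t is None else t for t in times])


def mean_rate(times):
    """Proposed score: arithmetic mean of the per-task rates."""
    return mean([task_rate(t) for t in times])


# Hypothetical subjects (illustrative values only, 15 timed tasks each).
subject_a = [2, 3, 4, 5, 6, 10, 12] + [None] * 8   # completes 7 of 15 tasks
subject_b = [20, 30] + [None] * 13                  # completes 2 of 15 tasks

for name, times in (("A", subject_a), ("B", subject_b)):
    print(name, median_time(times), round(mean_rate(times), 2))
```

Both hypothetical subjects hit the 120-second floor on the median, while the mean rate still separates them because every completed task contributes to the average.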
This study investigates the validity of calculating mean WMFT performance rates in moderate to severely affected stroke subjects as compared to the standard median performance times of WMFT.
METHODS
Subjects
Subjects from two double-blind interventional stroke studies were included in this study if they had suffered a single symptomatic ischemic stroke affecting motor function in the upper extremities and had no other neurological disease. Additional inclusion criteria were age 18–80 years, a UFM score <60, a Modified Ashworth Scale score of <3, and a hand/wrist MRC Scale score of ≥1. Subjects were excluded for the following: more than one symptomatic stroke, bilateral motor impairment, history of substance abuse, psychiatric illness (severe depression, poor motivation) or serious cognitive deficits, severe language disturbance, severe uncontrolled medical problems, pregnancy, pacemaker, metallic implants in the head, antiadrenergic medication, and seizure disorder. All subjects provided informed consent to the studies’ IRB-approved procedures.
Subjects were 16 women and 16 men (N=32) (Table 1).
Table 1. Subject Characteristics.
Acute and chronic study participants had similar age and motor deficit. Means, standard deviations (SD), ranges of values, and confidence intervals (CI) are shown. The National Institutes of Health Stroke Scale (NIHSS) showed that the acute stroke study participants had a slightly higher multi-domain impairment level due to deficits other than upper extremity weakness. We compared numerical values using a two-sample t-test.
| | Acute Study mean ±SD | Chronic Study mean ±SD | Difference (Chronic-Acute) [95% CI] | p-value | Pooled mean ±SD |
|---|---|---|---|---|---|
| Subject number (women) | 20 (12) | 12 (4) | N/A | N/A | 32 (16) |
| Age (years) | 62.3±11.7 (range: 31–79) | 56.2±9.0 (range: 40–72) | −6.1 [CI: −13.6 to +1.4] | 0.11 | 59.6±11.2 (range 31–79) |
| Time post stroke | 8.2±3.7 days (range: 5–15) | 16.9±10.8 months (range: 7–44) | N/A | N/A | 6.3±10.3 months (5 days - 44 months) |
| NIHSS | 7±3.2 (range: 2–14) | 3.9±3.4 (range: 1–11) | −3.1 [CI: −4.3 to −1.9] | 0.02 | 5.9±3.6 (range: 1–14) |
| UFM | 24.5±15.6 (range: 4–54) | 31.3±14.4 (range: 12–53) | +6.8 [CI: −4.3 to +17.9] | 0.22 | 27.6±15.5 (range: 4–54) |
| WMFT rate | 7.6±9.3 (range: 0–31.8) | 16.1±19.5 (range: 2.3–70.9) | +8.5 [CI: −3.7 to +20.7] | 0.18 | 10.8±14.3 (range: 0–70.9) |
| WMFT time (sec) | 69.7±57.1 (range: 2.4–120) | 64.2±53.8 (range: 2.3–120) | −5.5 [CI: −46.5 to +35.5] | 0.79 | 67.7±55.1 (range: 2.3–120) |
Because subjects were taken from two different studies, the time since onset of stroke fell within 5–15 days (acute study) or >3 months (chronic study). The two groups did not differ in age or in upper extremity weakness as measured by the UFM or the WMFT (Table 1). One subject from the chronic study group was excluded from the current analysis due to a pre-existing Dupuytren contracture of the 4th and 5th fingers of his affected hand that had decoupled his UFM and WMFT scores. Our subjects’ arm function ranged from those who could not use the paretic arm even in a supportive role, to those who could use the affected arm and hand with some adaptive equipment or other supports. Sixteen of our 32 patients could perform fewer than half of the WMFT timed tasks.
Test Administration
The UFM and WMFT were both administered in the same session by a trained tester who was unaware of the purpose of this study. For the UFM, the tester gave clear instructions and demonstrations for testing flexor synergy, extensor synergy, movement combining synergies, movement out of synergy, wrist function, hand function, and coordination/speed. The subjects first performed the movement with their non-affected limb, and then performed each of the tested movements with their affected limb three times. The highest-scoring movement out of three attempts was recorded for each movement, save for the coordination/speed test, which was performed only once. In the same session, the tester administered the modified version of WMFT[11] that included two strength measures (grip strength, weight-lifting ability) and 15 functional tasks progressing from simple movements in proximal joints (e.g., lifting forearm, extending elbow) to complex tasks in distal joints (e.g., flipping cards, lifting a pencil) using only the affected limb. Use of the less affected arm was allowed only for the bimanual task of folding a towel [11]. Tables were pre-marked to indicate object placement, and subjects were given clear verbal instructions and demonstrations for each task to ensure understanding and best effort. The subjects first performed the movement with their non-affected limb, and then performed each of the tested movements with their affected limb twice, before moving on to the next item on the test. The mean of the two measurements was calculated for each test item for both the median and the mean rate values. The UFM was administered once, followed by the WMFT.
Data Analysis
Timed task-completion data was analyzed using the standard median time calculation and also the new calculation, which consisted of calculating the rate of task performance over 60 seconds (60/performance time). Subjects unable to complete a task within 120 seconds were given a rate of 0.
To determine the criterion validity of the rate-of-performance calculation, we calculated the correlation coefficient (Spearman’s rs) between the WMFT mean performance rate data and the UFM data and compared it to the corresponding coefficient between the WMFT median performance time data and the UFM, in this moderate to severely affected group. We used Spearman’s correlation because the median WMFT time values were not normally distributed. We fitted locally weighted scatterplot smoothing (LOWESS) lines to the scatterplots to evaluate the relative merits of each calculation. Exploratory subgroup analysis according to severity groups was carried out. We compared the correlation coefficients between the measures by the standard method using the Fisher r to z transformation.
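The Fisher r to z comparison mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the textbook test for two independent correlation coefficients, not a reproduction of the R or SigmaPlot routines used in the study; the function name is hypothetical. The transformation is derived for Pearson coefficients, so applying it to Spearman coefficients is an approximation. The example inputs are the rounded magnitudes of the median-time coefficients for the chronic (n=11) and acute (n=20) groups, so the output need not match the reported p-values exactly.

```python
import math


def fisher_z_compare(r1, n1, r2, n2):
    """Two-tailed test for a difference between two independent correlations.

    Returns the z statistic and the two-tailed p-value.
    """
    z1 = math.atanh(r1)  # Fisher r-to-z transformation
    z2 = math.atanh(r2)
    # Standard error of the difference between two independent z values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-tailed p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Illustrative call with rounded coefficient magnitudes and group sizes.
z_stat, p_value = fisher_z_compare(0.90, 11, 0.76, 20)
print(round(z_stat, 2), round(p_value, 2))
```

The test assumes the two coefficients come from independent samples, which holds for a chronic versus acute comparison but not for two measures taken on the same subjects.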
Statistical analyses were carried out using the statistical/graphical software packages R v2.14.0 and SigmaPlot v12.0.
RESULTS
The scatter plots in Fig 1 show that when median times and mean rates of performance are plotted against UFM scores, median time data points segregate into two distinct areas on the graph, agreeing with a fitted line only at the upper and lower ends of WMFT performance. On the other hand, the mean rate-of-performance data are more evenly distributed across the range of the data (Fig 1). The association between WMFT and UFM was more clearly discernible when WMFT was displayed as mean rate rather than median time.
Figure 1. Scatter Charts Demonstrate Consistent Relationship between the WMFT Mean Rate of Performance and UFM.
Scatter plots of median task times (on the left) and mean rates of performance (on the right) from the WMFT vs. UFM scores from the acute study (first row), the chronic study (second row), and the combined dataset (third row). Each graph contains a LOWESS line superimposed on the data points. Please note that shorter task completion times indicate better performance for the median time data, therefore the sign of correlation is reversed.
The correlation between UFM and WMFT (see Table 2) was overall slightly better with the mean performance rate data than with the median performance time data, but the difference was not statistically significant (p=0.60, Fisher r to z, two-tailed). The acute and chronic studies showed similar correlation coefficients.
Table 2. Correlation Between UFM Data and WMFT Median and Rate Data.
Spearman’s correlation coefficients are shown to validate the median WMFT timed test data and the mean WMFT rate data against the UFM scores of the chronic study subjects (n=11), the acute study subjects (n=20), and from a combined pool of all subjects from the acute and chronic study (n=31, p<0.001 for all correlation coefficients). Please note that shorter task completion times indicate better performance for the median time data, therefore the sign of correlation is reversed.
| | Median WMFT Time | Mean WMFT Rate |
|---|---|---|
| Acute (n=20) | rs = −0.76 | rs = 0.81 |
| Chronic (n=11) | rs = −0.90 | rs = 0.95 |
| Pooled (n=31) | rs = −0.79 | rs = 0.84 |
| Chronic vs. Acute | p = 0.26 | p = 0.12 |
We performed an exploratory subgroup analysis on the relative strength of correlation between the median WMFT time scores and UFM and the mean WMFT rate scores and UFM (Table 3). The mean WMFT rate of performance correlated well with the UFM in the lower functioning individuals, producing a Spearman rs similar to that in the higher functioning subjects. Sixteen subjects had a score of 120 on the median time score, but only 5 of them scored 0 on the new scale.
Table 3. Subgroup analysis by severity of impairment.
Median WMFT times have a floor effect in the weaker subjects that is reflected by the adjusted Spearman correlation coefficient of −0.49 (adjustment was made for the “ties”: 14 out of 17 subjects scored 120). On the other hand, the mean WMFT rate calculation showed as good a correlation in the weaker subgroup as in the less impaired subgroup. Differences between the groups did not reach statistical significance due to the small sample size.
| Severity group | 0–25 UFM (N=17) | >25 UFM (N=14) | 0–25-vs- >25 UFM |
|---|---|---|---|
| Median Time vs UFM | rs =−0.49 | rs =−0.74 | p=0.32 |
| Mean Rate vs UFM | rs =0.78 | rs =0.68 | p=0.59 |
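The effect of ties on Spearman’s rs can be illustrated with a short Python sketch. One common way to handle ties is to assign tied values the average of their ranks and then compute the Pearson correlation of the ranks; whether this is the exact adjustment applied in Table 3 is an assumption. The function names and the example values below are hypothetical and are not study data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rs computed as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical weak subgroup: most median times sit at the 120 s ceiling.
ufm = [5, 8, 10, 12, 15, 20, 24]
median_times = [120, 120, 120, 120, 120, 60, 30]
print(round(spearman(ufm, median_times), 2))
```

With most subjects tied at 120 seconds, the ranks carry little information about differences within the tied group, which is how a floor effect can weaken the observed correlation.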
DISCUSSION
This study extends the applicability of the WMFT, a widely used stroke functional outcome measure, to describe the residual functional abilities of moderate to severely affected stroke patients.9, 14 In the weaker stroke patients, we report Spearman correlation coefficients comparable to those of previously reported studies that examined higher functioning stroke patients (Wolf et al. −0.54 to −0.68 in 19 patients9, Whitall et al. −0.69 to −0.89 in a mild to moderate severity group14).
Our criterion validity (concurrent validity) tests show an acceptable degree of correlation between mean performance rate data and the Fugl-Meyer Assessment; in fact, the Spearman’s rs correlation test shows as good or slightly better correlation between UFM and mean WMFT performance rate than between UFM and median WMFT performance time and this holds true for the weakest patients also.
Comparison of correlation coefficients is of limited usefulness in assessing the relative merits of the two methods of expressing WMFT scores. Such calculations are usually reserved for independent datasets, which limits their applicability to this situation. In addition, the WMFT-UFM relationships may not be linear, and the WMFT times are not even roughly normally distributed.
The main advantages of using mean rates instead of median times come from (1) their ability to utilize times from all 15 timed tasks, even for subjects who can perform fewer than half of them, (2) the more uniform spread of mean rates across their range, (3) minimizing the impact of the “floor effect”, and (4) the elimination of the arbitrary 120 scores. It is most telling to examine the scatter charts of median WMFT times vs. UFM scores, in which the times tend to be segregated into a nearly dichotomous pattern, with all the actual completion times near the left side of the graph, and all the “120 second” non-completion times at the right side. In contrast, scatter charts of mean performance rates vs. UFM scores show rates evenly distributed throughout their range, indicating a more consistent relationship with UFM and a more smoothly graded measure of performance, which renders the WMFT performance rate a more sensitive and more useful measure of arm function in moderate to severe upper extremity weakness (see Figure 1).
Furthermore, unlike the previous 120 median score, a total mean score of 0 in the new rate calculation is unlikely to miss any meaningful functional abilities on the upper extremity, due to the WMFT’s extensive testing of proximal and distal, simple and complex movements, therefore avoiding a clinically meaningful floor effect for the motor evaluation. In summary, the new mean WMFT performance rate calculation shifts the floor effect of the WMFT from approximately 15–31 (see figure 1) on the UFM to match the lower end of the range of the UFM.
Study limitations
We used data from two single-site Phase II clinical trials, both based on a small sample of stroke patients. These findings should be confirmed in a larger multi-site sample, and further severity-adjusted analyses should be performed using the new method.
Using our new, simple transformation, scores between 0 and 0.5 (the rate corresponding to the 120-second limit, 60/120) are unobtainable, creating a small “gap” in the distribution. However, in practice values fitting in this range would be obtained exceedingly rarely, as most tests that the subjects cannot perform in 120 seconds would probably never be completed, even if longer time were allowed, due to their limited functional ability and fatigue. In fact, most of the time we abort the test early when subjects are clearly unable to perform a task to avoid excessive frustration and fatigue. In theory, therefore, when the subject is not able to complete the task the time of performance would be infinity. Our suggested mathematical transformation results in 0 in the case when the subject could not perform the task, allowing meaningful mean calculations.
The reciprocal transformation will decrease the effect of outliers caused by the longer performance times in lower functioning patients that have been hampering mean WMFT time calculations, making the new calculation theoretically more stable in this patient group. One caveat of this calculation is that while measurement error in lower functioning individuals would be expected to be smaller, measurement error in higher performing individuals may be magnified. Such measurement error could be lowered by using the mean of two (or more) repeated measurements, as was done in our study, without significantly adding to subject burden. However, the applicability of this new WMFT calculation in higher functioning individuals would require further study.
CONCLUSIONS
It is important to include weaker patients in stroke motor recovery studies with accurate assessments in order to be able to develop more effective treatments for this patient group that is most in need of rehabilitation. This calculation will allow assessment of stroke patients early after stroke, when their upper extremity is expected to be weaker. Our study shows that this new calculation could be used to produce valid and accurate assessments of functional ability among moderate to severely affected subjects.
Acknowledgments
This study was supported by the National Institutes of Health (grant no. 5K23HD050267) and the Mobility Foundation (grant no. 50921).
We thank Alexander Dromerick, M.D. for his critical reading and advice on the manuscript.
Abbreviations
- WMFT
Wolf motor function test
- tDCS
Transcranial Direct Current Stimulation
- UFM
Upper extremity Fugl-Meyer
- ANOVA
Analysis of Variance
- LOWESS
locally weighted scatterplot smoothing
Footnotes
Part of the chronic stroke patient data was presented in a poster format at the Society for Neuroscience meeting in 2010.
Financial Disclosure:
We certify that no party having a direct interest in the results of the research supporting this article has or will confer a benefit on us or on any organization with which we are associated AND, if applicable, we certify that all financial and material support for this research (eg, NIH or NHS grants) and work are clearly identified in the title page of the manuscript.
References
- 1. Roger VL, Go AS, Lloyd-Jones DM, Adams RJ, Berry JD, Brown TM, et al. Heart disease and stroke statistics--2011 update: a report from the American Heart Association. Circulation. 2011;123(4):e18–e209. doi: 10.1161/CIR.0b013e3182009701.
- 2. Sousa RM, Ferri CP, Acosta D, Albanese E, Guerra M, Huang Y, et al. Contribution of chronic diseases to disability in elderly people in countries with low and middle incomes: a 10/66 Dementia Research Group population-based survey. Lancet. 2009;374(9704):1821–30. doi: 10.1016/S0140-6736(09)61829-8.
- 3. Jorgensen HS. The Copenhagen Stroke Study experience. J Stroke Cerebrovasc Dis. 1996;6(1):5–16. doi: 10.1016/s1052-3057(96)80020-6.
- 4. Nakayama H, Jorgensen HS, Raaschou HO, Olsen TS. Recovery of upper extremity function in stroke patients: the Copenhagen Stroke Study. Arch Phys Med Rehabil. 1994;75(4):394–8. doi: 10.1016/0003-9993(94)90161-9.
- 5. Wade DT, Langton-Hewer R, Wood VA, Skilbeck CE, Ismail HM. The hemiplegic arm after stroke: measurement and recovery. J Neurol Neurosurg Psychiatry. 1983;46(6):521–4. doi: 10.1136/jnnp.46.6.521.
- 6. Sunderland A, Tinson DJ, Bradley EL, Fletcher D, Langton Hewer R, Wade DT. Enhanced physical therapy improves recovery of arm function after stroke. A randomised controlled trial. J Neurol Neurosurg Psychiatry. 1992;55(7):530–5. doi: 10.1136/jnnp.55.7.530.
- 7. Hunter S, Crome P. Hand function and stroke. Reviews in Clinical Gerontology. 2002;12(01):68–81.
- 8. Duncan PW, Jorgensen HS, Wade DT. Outcome measures in acute stroke trials: a systematic review and some recommendations to improve practice. Stroke. 2000;31(6):1429–38. doi: 10.1161/01.str.31.6.1429.
- 9. Wolf SL, Catlin PA, Ellis M, Archer AL, Morgan B, Piacentino A. Assessing Wolf motor function test as outcome measure for research in patients after stroke. Stroke. 2001;32(7):1635–9. doi: 10.1161/01.str.32.7.1635.
- 10. Wolf SL, Lecraw DE, Barton LA, Jann BB. Forced use of hemiplegic upper extremities to reverse the effect of learned nonuse among chronic stroke and head-injured patients. Exp Neurol. 1989;104(2):125–32. doi: 10.1016/s0014-4886(89)80005-6.
- 11. Woodbury M, Velozo CA, Thompson PA, Light K, Uswatte G, Taub E, et al. Measurement structure of the Wolf Motor Function Test: implications for motor control theory. Neurorehabil Neural Repair. 2010;24(9):791–801. doi: 10.1177/1545968310370749.
- 12. Lin JH, Hsu MJ, Sheu CF, Wu TS, Lin RT, Chen CH, et al. Psychometric comparisons of 4 measures for assessing upper-extremity function in people with stroke. Phys Ther. 2009;89(8):840–50. doi: 10.2522/ptj.20080285.
- 13. Edwards DF, Lang CE, Wagner JM, Birkenmeier R, Dromerick AW. An Evaluation of the Wolf Motor Function Test in Motor Trials Early after Stroke. Arch Phys Med Rehabil. 2011 doi: 10.1016/j.apmr.2011.10.005. In press.
- 14. Whitall J, Savin DN, Jr, Harris-Love M, Waller SM. Psychometric properties of a modified Wolf Motor Function test for people with mild and moderate upper-extremity hemiparesis. Arch Phys Med Rehabil. 2006;87(5):656–60. doi: 10.1016/j.apmr.2006.02.004.
- 15. Morris DM, Uswatte G, Crago JE, Cook EW, 3rd, Taub E. The reliability of the wolf motor function test for assessing upper extremity function after stroke. Arch Phys Med Rehabil. 2001;82(6):750–5. doi: 10.1053/apmr.2001.23183.
- 16. Wood-Dauphinee SL, Williams JI, Shapiro SH. Examining outcome measures in a clinical study of stroke. Stroke. 1990;21(5):731–9. doi: 10.1161/01.str.21.5.731.
- 17. Sanford J, Moreland J, Swanson LR, Stratford PW, Gowland C. Reliability of the Fugl-Meyer assessment for testing motor performance in patients following stroke. Phys Ther. 1993;73(7):447–54. doi: 10.1093/ptj/73.7.447.
- 18. DeWeerdt W, Harrison M. Measuring recovery of arm-hand function in stroke patients: a comparison of the Brunnstrom-Fugl-Meyer test and Action Research Arm test. Physiotherapy Canada. 1985;(70):542–8.

