Abstract
Background
Patients often are asked to report walking distances before joint arthroplasty and when discussing their results after surgery, but little evidence demonstrates whether patient responses accurately represent their activity.
Questions/purposes
Are patients accurate in reporting distance walked, when compared with distance measured by an accelerometer, within a 50% margin of error?
Methods
Patients undergoing THA or TKA were recruited over a 16-month period. One hundred twenty-one patients were screened and 66 patients (55%) were enrolled. There were no differences in mean age (p = 0.68), proportion of hips versus knees (p = 0.95), or sex (p = 0.16) between screened and enrolled patients. Each patient wore a FitBit™ Zip accelerometer for 1 week and was blinded to its measurements. The patients reported their perceived walking distance in miles daily. Data were collected preoperatively and 6 to 8 weeks postoperatively. Responses were normalized against the accelerometer distances and Wilcoxon one-tailed signed-rank testing was performed to compare the mean patient error with a 50% margin of error, our primary endpoint.
Results
We found that patients’ self-reported walking distances were not accurate. The mean error of reporting was > 50% both preoperatively (p = 0.002) and postoperatively (p < 0.001). The mean magnitude of error was 69% (SD 58%) preoperatively and 93% (SD 86%) postoperatively and increased with time (p = 0.001).
Conclusions
Patients’ estimates of daily walking distances differed substantially from those patients’ walking distances as recorded by an accelerometer, the accuracy of which has been validated in treadmill tests. Providers should exercise caution when interpreting patient-reported activity levels.
Level of Evidence
Level III, diagnostic study.
Introduction
As the American population ages, there is an increasing demand for THAs and TKAs for the treatment of osteoarthritis [15]. With this increase in demand, quality and value have become more intensely scrutinized. Both objective and subjective endpoints are commonly used with the former including metrics such as time until revision and the latter focusing increasingly on patient-reported outcome measures (PROMs). Improvements in pain and quality of life after arthroplasty have been well documented, but the best method for assessing improvement in activity after total joint arthroplasty (TJA) remains unclear [4, 18].
Patients often are asked to report their physical activity before joint arthroplasty and when discussing their results after surgery. Physicians may ask patients about their walking distance, for example, “How far can you walk without having to stop?” A variation of this question is even a component of the Harris hip score and Knee Society Score (number of blocks walked) [12, 13]. Distance walked can be used as a measure of patient activity limitation and of postoperative functional improvement, because it can be standardized among patients and in principle is not subject to the patient’s interpretation of what qualifies as physical activity. Walking distance has been investigated as a tool for measuring clinical improvement in patients undergoing TJA and correlates with objective measures of functional improvement [3]. However, patient-reported walking distance depends on the patient’s recall and individual biases. This subjectivity can be avoided with the use of devices such as pedometers, accelerometers, and GPS tracking. The utility and validity of these devices to accurately measure steps, walking distance, and activity levels of the wearer have been demonstrated, supporting their value as a tool for assessing activity in surgical patients [5, 24]. One group has already used these tools to examine activity level before and after TJA [22]. Despite this, there are few studies comparing the patient’s perception of their walking distance with objective findings such as concurrent treadmill or accelerometer data, and these were limited to short distances [9] and relatively small groups [6, 10] and were not specific to the TJA population.
The aim of this study is to investigate the question: Are patients accurate in reporting distance walked, when compared with distance measured by an accelerometer, within a 50% margin of error?
Patients and Methods
This study was approved by the institutional review board. A member of the research team approached adult patients, ages 18 to 85 years, who were scheduled to undergo elective THA or TKA (including revisions) for recruitment at their preoperative visit. Some patients were also contacted during a prescreening phone call. We excluded patients if they had a history of cognitive impairment or dementia, were being recruited for other research studies, or were non-English-speaking. Written informed consent was obtained from patients at the time of their enrollment. Patients were permitted to withdraw from the study at any time. The study protocol also allowed us to remove patients from the study if they did not adhere to the protocol requirements or if their surgical procedure changed. Data through the time of withdrawal were included in analysis.
Over the recruitment period of 16 months, a total of 862 primary and revision THAs and TKAs were performed at this institution. In all, 66 patients were recruited into the study and 55 patients failed screening and recruitment. Of the 55 screening failures, 24 (44%) declined to participate, 13 (24%) did not have sufficient time between the preoperative visit and surgery to obtain 7 days of data, 13 (24%) were not contacted because the research coordinator was unavailable at the time of the preoperative visit, two (3.6%) had a preoperative visit already completed and could not be approached for consent, two (3.6%) could not provide consent as a result of dementia or cognitive impairment, and one (1.8%) exceeded the upper age limit of 85 years. There were no differences between the screening failure and enrolled groups in mean age (screened, 66.7 ± 10.1 years versus enrolled, 72.3 ± 8.7 years, p = 0.68) or in the proportion of THAs versus TKAs performed (42% hips in both groups, p = 0.95) or proportion of females versus males (screened, 67% female versus enrolled, 55% female, p = 0.16). There were more white patients in the enrolled group (100%) than among those screened (93% white, 7% black; p = 0.03). One patient who was enrolled never underwent an arthroplasty and was excluded from data analysis; 65 enrolled patients initially participated in the study. After excluding the patients with incomplete data sets and those who withdrew, 45 patients (69%) were included for preoperative analysis and 35 patients (54%) were included for 6- to 8-week postoperative analysis. Overall, 29 patients had data sets available for both preoperative and postoperative analysis, 16 had preoperative data only, and six patients had postoperative data only, for a total of 51 participants in the study (Table 1). Twenty-three patients underwent THA, including four revisions, and 28 patients underwent TKA, including two revisions.
Overall, 23 patients (23 of 45 [51%]) had at least one data set of seven excluded from preoperative analysis, most commonly resulting from inadequate compliance wearing the FitBit™ Zip (San Francisco, CA, USA) such as wearing it < 75% of the time. Fifteen patients (15 of 35 [43%]) had at least one data set excluded postoperatively. Forty-five patients were completely excluded from analysis either preoperatively or postoperatively (Table 2).
Table 1.
Table 2.
Data collection occurred in two phases, within 1 month preoperatively and at 6 to 8 weeks postoperatively. At the time of consent, patients were given an activity diary and were administered a survey investigating activity level and use of gait aids. During each data collection phase, the patients were given a FitBit Zip for a 1-week period. This accelerometer was chosen because it is inexpensive, readily commercially available, easy to use, and has been validated in laboratory and free-living conditions for many populations, including the elderly [1, 8, 14, 19, 21, 24]. This device is worn clipped to the belt, shirt, pocket, or other clothing. The commercial FitBit Zip has a screen that outputs step count; this was covered with an opaque adhesive (Fig. 1) to prevent the patient from monitoring their steps. The patients were instructed to wear the device from the time they woke up until the time they went to bed for 7 consecutive days. At the end of each day (Days 1-7), the patients estimated the distance walked that day and recorded it in their activity diary. They also estimated the percentage of time that they actually wore the FitBit Zip that day, choosing one of four categories: 0% to 25%, 25% to 50%, 50% to 75%, or 75% to 100%. At the end of the data collection period, the patients returned the device and their activity diary to the research team.
Step counts from the devices were then uploaded wirelessly via Bluetooth® (Kirkland, WA, USA) into the online FitBit database. The database allowed customization of each user profile associated with a given device to accurately reflect the patient’s height, weight, and estimated stride length. A member of the research team then accessed the week’s data and recorded the patient responses and accelerometer step count data into a spreadsheet for analysis. Step counts were converted to distances walked (based on the FitBit website calculations from the patient’s weight- and height-based estimated stride length).
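The step-to-distance conversion described above reduces to simple arithmetic once a stride length is fixed. The following is a minimal sketch of that arithmetic only; it does not reproduce the FitBit website's stride estimate, which the study relied on. The function name `steps_to_miles` and the example stride length of 26.4 inches are illustrative assumptions, not values from the study.

```python
INCHES_PER_MILE = 63_360  # 5,280 feet x 12 inches


def steps_to_miles(steps: int, stride_length_in: float) -> float:
    """Convert a daily step count to miles for a fixed stride length in inches.

    The stride length must be supplied by the caller; how it is estimated
    from height and weight is outside the scope of this sketch.
    """
    if steps < 0 or stride_length_in <= 0:
        raise ValueError("steps must be >= 0 and stride length must be > 0")
    return steps * stride_length_in / INCHES_PER_MILE


# Hypothetical patient: 4,000 steps in a day with an assumed 26.4-inch stride.
print(round(steps_to_miles(4000, 26.4), 2))
```

Because distance scales linearly with stride length, any error in the estimated stride carries over proportionally into the "true" distance against which patient estimates are judged.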
We checked the eligibility of each patient’s data set for inclusion. Only data points in which the patient reported wearing the accelerometer for 75% to 100% of daytime hours, from the time they woke until the time they went to bed, were included. If a patient had < 4 days’ worth of data, meaning they did not routinely wear the accelerometer, we excluded them from the analysis. Additionally, individual data points were excluded if the mismatch was thought to be unreasonably large (presumably as a result of device recording error) to prevent artificial inflation of the patient error. For example, a single data point was excluded if the patient reported walking 2 miles but only 0.03 miles were recorded. This occurred for 10 patients preoperatively (17 data points eliminated) and 11 patients postoperatively (16 data points eliminated). After excluding ineligible days, the daily differences between the patients’ estimated walking distances and the accelerometer-recorded actual walking distances were calculated and normalized against the “true” walking distances measured by the accelerometer: error (%) = (accelerometer distance − estimated distance) / accelerometer distance × 100. Negative values represented an overestimate of the true walking distance and positive values an underestimate. We obtained the absolute value of these errors to determine the magnitude of error daily, and we calculated the mean magnitude of error for each patient. The data sets were checked for normality and were found not to be normally distributed. As such, the data were then compared with a threshold of a 50% magnitude of error using one-tailed Wilcoxon signed-rank testing; this comparison was our primary study outcome. This was performed for the preoperative and postoperative measurements. Fifty percent was chosen as the threshold because it is easily conceptualized. We felt that a “reasonable” estimate would fall within 50% of the true value.
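The error calculation and threshold test described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code. The sign convention follows the text (negative means overestimate), and the wear-time filter and 4-day minimum follow the stated eligibility rules. The judgment-based removal of implausible data points is not modeled. The test is assumed to be applied to per-patient mean magnitudes, and it uses a normal approximation without tie or continuity correction. All function names and the sample diary values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

MIN_ELIGIBLE_DAYS = 4          # patients with fewer eligible days are excluded
ELIGIBLE_WEAR = "75-100%"      # only days in this self-reported wear category count


def percent_error(estimated_miles, measured_miles):
    """Signed error as a percentage of the measured distance (negative = overestimate)."""
    return (measured_miles - estimated_miles) / measured_miles * 100.0


def mean_magnitude(days):
    """Mean absolute percent error over eligible days, or None if too few days.

    `days` is a list of (estimated_miles, measured_miles, wear_category) tuples.
    """
    errors = [abs(percent_error(est, meas))
              for est, meas, wear in days
              if wear == ELIGIBLE_WEAR and meas > 0]
    return mean(errors) if len(errors) >= MIN_ELIGIBLE_DAYS else None


def wilcoxon_greater(values, threshold=50.0):
    """One-sided Wilcoxon signed-rank test that values tend to exceed `threshold`.

    Returns (W_plus, p) using the large-sample normal approximation.
    """
    diffs = [v - threshold for v in values if v != threshold]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("no non-zero differences to rank")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average rank shared by tied magnitudes
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    p = 1.0 - NormalDist().cdf((w_plus - mu) / sigma)
    return w_plus, p


# Hypothetical one-week diary: (estimated miles, measured miles, wear category).
week = [(2.0, 1.5, "75-100%"), (1.0, 1.2, "75-100%"), (1.5, 1.0, "50-75%"),
        (2.0, 2.5, "75-100%"), (1.0, 0.8, "75-100%")]
print(mean_magnitude(week))
print(wilcoxon_greater([60.0, 70.0, 80.0, 90.0, 100.0]))
```

For small samples an exact signed-rank distribution is preferable to the normal approximation, so a statistics package would normally be used in practice.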
Preoperative and postoperative data were compared to determine if patient estimates changed over time, including only the 29 patients who had both sets of data. We compared the mean magnitude of error preoperatively and postoperatively for each patient. The data were checked for normality and were not normally distributed. A two-tailed Wilcoxon signed-rank test was used to compare the preoperative and postoperative errors and the walking distances. We also performed a subjective comparison of patient errors to look at general trends in reporting errors and performed chi-square analysis: was the patient an overestimator, an underestimator, or did he or she do both? The mean population accelerometer and estimate data were also compared between the preoperative and postoperative periods with two-tailed Wilcoxon signed-rank testing to evaluate for changes over time. All statistical testing was set at a 5% significance level.
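The grouping of patients into overestimators, underestimators, or both can be illustrated with a simple rule applied to the signed daily errors. The study describes this comparison as subjective, so the strict rule below (any day with a negative error counts as an overestimate, any day with a positive error as an underestimate) is an assumption made for illustration, as are the function name `classify_reporter` and the sample data.

```python
from collections import Counter


def classify_reporter(signed_errors):
    """Label a patient from signed daily percent errors (negative = overestimate)."""
    over = any(e < 0 for e in signed_errors)
    under = any(e > 0 for e in signed_errors)
    if over and under:
        return "both"
    if over:
        return "overestimator"
    if under:
        return "underestimator"
    return "exact"  # every day matched the accelerometer exactly


# Hypothetical signed daily errors (%) for three patients.
patients = {
    "A": [-40.0, -120.0, -15.0, -60.0],
    "B": [25.0, 10.0, 55.0, 30.0],
    "C": [-80.0, 20.0, -10.0, 45.0],
}
labels = {pid: classify_reporter(errs) for pid, errs in patients.items()}
print(Counter(labels.values()))
```

Tallies of this kind for the preoperative and postoperative periods form the contingency table on which a chi-square comparison would operate.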
The patients were also asked to complete a survey about their activity level and use of assistive devices. Thirteen patients reported the use of an assistive device preoperatively: one patient with one crutch, nine patients with one cane, two with a walker, and one without details. Most patients, 27 of 45, reported walking > 30 minutes each day (Table 3). One patient did not complete the postoperative survey. Of the 34 respondents, 10 reported needing an assistive device postoperatively (five with one cane, two with a walker, and three with either one cane or a walker). Most patients again reported a reasonable effort, with 21 of 34 claiming to ambulate > 30 minutes daily and 15 patients reporting an activity level of > 1 hour daily (Table 3).
Table 3.
Results
Patient estimates of walking distance differed from the accelerometer-recorded walking distances by a margin of error of > 50%. Preoperatively, individual errors ranged from an overestimation of as much as 900% to an underestimation of 90% (Fig. 2A). The mean magnitude of the daily error preoperatively was 69% (minimum 18%, maximum 271%, SD ± 58%). The mean reporting error was found to be > 50% (p = 0.002). A large spread of daily errors was seen; many patients were inconsistent in their errors and both overestimated and underestimated walking distance. Postoperatively, the same large individual errors were seen, ranging from an overestimation of 810% to an underestimation of 87% (Fig. 2B). The mean magnitude of error was 93% (minimum 12%, maximum 334%, SD ± 86%). The mean magnitude of patient reporting error was > 50% (p < 0.001).
The mean magnitude of error actually increased with time (mean preoperative magnitude of error 69%; mean postoperative magnitude of error 91%; p = 0.001; Table 4). Preoperatively, four patients were overestimators consistently, 12 underestimators, and 13 were under- and overestimators. Interestingly, these proportions changed: postoperatively eight patients were consistently overestimators, 18 were both, and only three patients systematically underestimated their walking distance (p = 0.02). Only nine patients demonstrated consistency in their reporting errors (six both underestimating and overestimating each time, one consistently underestimating, and two consistently overestimating). The mean population walking distance decreased from 2.1 miles daily to 1.9 miles daily (p = 0.02; Fig. 3). Interestingly, the mean estimate of walking distance based on the daily diary entries increased from 1.8 miles preoperatively to 2.0 miles postoperatively (p < 0.001).
Table 4.
Discussion
Patients are often asked about their activity level in terms of maximum attainable walking distance. Reported limitations in a patient’s ability to walk may factor into both a surgeon’s decision to recommend surgery as well as in the evaluation of TJA success. Additionally, reimbursement may be increasingly linked to PROMs to maximize value of care [2]. The accuracy of patient estimates of distance, however, has not been well established. Other authors have attempted to approach activity level after TJA from an objective standpoint. One group demonstrated a good correlation of a categorically assigned activity level to activity and a weekly pedometer recording but did not investigate patient-reported walking distance [7]. Our goal was to determine how well patients could estimate their total daily walking distance, because we feel this is reflective of their overall ability to evaluate walking distance and thus function. We found that not only were patients unable to accurately estimate their daily walking distance within a 50% margin of error, but that there was also no consistency to their errors, with both overestimations and underestimations often reported by the same patient.
There were several limitations to this study. First, this study relied heavily on patient compliance with the protocol. Unfortunately, many of the patients withdrew from the study or were ineligible based on their noncompliant responses. We only included data if patients reported wearing their device ≥ 75% of their waking hours. Because many of the patients reported wearing their device a lower proportion of the time, and some patients even had to be eliminated from consideration because of this, it is reasonable to assume that patients generally responded honestly to this assessment and that the data included in this study are reflective of a comprehensive accelerometer-based record of their activity (for example, one patient reported 0%-25% 2 days, 25%-50% 1 day, 50%-75% 1 day, and 75%-100% 3 days, requiring exclusion from the analysis because of noncompliance). We chose to exclude the noncompliant data points (< 75% of the day wear time) and those that appeared to be reflective of a malfunctioning FitBit because we wanted to minimize the risk of falsely inflating the errors and a false-positive to our study question. The second limitation is related to patient attrition. Wearable activity trackers in theory may assist providers in evaluating patient function while minimizing biases or estimation errors [16]. Although wearing a FitBit and keeping a daily walking diary seemed simple at the onset, a surprising number of patients found compliance to be difficult. Fourteen patients withdrew from the study (22%). Some patients even failed to return the devices. Our study demonstrates that both patient-reported estimates and accelerometer-recorded values in free-living conditions must be used with caution. One group reported good compliance rates (26.7 days of use of 30 tracked) with FitBit wear while tracking patients’ activity after arthroplasty but do not report how they monitored compliance [23]. 
Some of the patients in their series had step counts as low as five steps in a given day, which raises the question of whether the device was truly being worn consistently. A more sophisticated device (for example, one that could also track heartbeat to confirm that patients were wearing the devices at all times) may be helpful for activity tracking. Despite this attrition, however, the differences reached statistical significance. Finally, the patients may have changed their behavior knowing that they were being monitored for this metric. Self-reported activity levels are subject to recall bias and social desirability bias, which in this case is a patient’s desire to appear more favorable to others by reporting that they are more active than they really are [20]. Given the high magnitudes of disparity between the measured distances and the patient-reported distances, we feel it is unlikely that patients were independently measuring their own activity by other means to appear more accurate in their estimations. We assumed that patients believed that the results of this study did not influence either the care provided to them or reimbursement for the procedure. However, in the future, if patients presume that monitored activity might affect them or the provider financially or otherwise, the Hawthorne effect must not be ignored; that is, patients may be more cognizant of their walking distances if they know they are being studied.
We investigated changes over time in both activity level and patient estimates. The magnitudes and directions of individual patients’ estimation errors were not consistent before and after surgery. Some patients improved dramatically, but others worsened over time. There did not appear to be any consistency in the types of errors patients made either. One might expect a patient to “get used” to estimating their activity level by the time of the postoperative measurements and to improve in accuracy, but this was not the case. The population walking distance decreased with time, whereas the population estimate increased with time. This is consistent with other studies investigating activity level and perception after TJA. One group found that in the setting of TJA, patients reported a decrease in their pain and increase in their activity level despite no change in “activity counts” measured by an accelerometer [11]. One study suggested that patients may overestimate their functional level after arthroplasty when compared with objective measures [17]. Our group is collecting 1-year postoperative data as well to determine whether long-term activity level and perception change over time.
Providers should exercise caution when interpreting patient-reported walking distance. Although individual patients may occasionally be able to accurately judge their walking distances, we found that estimates generally were not accurate within a 50% margin of error. Patients’ inabilities to accurately and reproducibly estimate their walking distances are problematic when one considers that such outcomes may eventually be used as a measure of procedural success. A patient may report that he or she is walking less after surgery, but this study demonstrates that these reports may not be accurate, at least in the early postoperative period. In these situations, other objective measures of success (such as ROM, strength) may carry more weight. There remains a need for reliable, objective, and quantifiable assessments of improvements in physical activity after THA and TKA. However, unless compliance is tracked in a systematic manner, we recommend using caution in the adoption of accelerometers as adjuncts to outcome measurements, because patients had difficulty wearing the FitBit in this study routinely for even a week. The results of this study suggest that subjective activity-related PROMs must be carefully evaluated and validated, especially if they are to be used in the calculation of the value of care or evaluation of quality of a given procedure.
Acknowledgments
We thank Susan Hassenbein and Bartolo Torre for their contributions to this research.
Footnotes
All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.
Clinical Orthopaedics and Related Research® neither advocates nor endorses the use of any treatment, drug, or device. Readers are encouraged to always seek additional information, including FDA approval status, of any drug or device before clinical use.
Each author certifies that his or her institution approved the human protocol for this investigation and that all investigations were conducted in conformity with ethical principles of research.
References
- 1. Case MA, Burwick HA, Volpp KG, Patel MS. Accuracy of smartphone applications and wearable devices for tracking physical activity data. JAMA. 2015;313:625-626.
- 2. Centers for Medicare & Medicaid Services (CMS), HHS. Medicare program; comprehensive care for joint replacement payment model for acute care hospitals furnishing lower extremity joint replacement services. Final rule. Fed Regist. 2015;80:73273-73554.
- 3. Crizer M, Kazarian G, Fleischman A, Lonner J, Maltenfort M, Chen A. Stepping toward objective outcomes: a prospective analysis of step count after total joint arthroplasty. J Arthroplasty. 2017;32:S162-S165.
- 4. Dailiana Z, Papakostidou I, Varitmidis S, Liaropoulos L, Zintzaras E, Karachalios T, Michelinakis E, Malizos K. Patient-reported quality of life after primary major joint arthroplasty: a prospective comparison of hip and knee arthroplasty. BMC Musculoskelet Disord. 2015;16:366.
- 5. Evenson K, Goto M, Furberg R. Systematic review of the validity and reliability of consumer-wearable activity trackers. Int J Behav Nutr Phys Act. 2015;12:159.
- 6. Faucheur AL, Abraham P, Jaquinandi V, Bouye P, Saumet JL, Noury-Desvaux B. Measurement of walking distance and speed in patients with peripheral arterial disease: a novel method using a global positioning system. Circulation. 2008;117:897-904.
- 7. Feller J, Kay P, Hodgkinson J, Wroblewski B. Activity and socket wear in the Charnley low-friction arthroplasty. J Arthroplasty. 1994;9:341-345.
- 8. Ferguson T, Rowlands AV, Olds T, Maher C. The validity of consumer-level, activity monitors in healthy adults worn in free-living conditions: a cross-sectional study. Int J Behav Nutr Phys Act. 2015;12:42.
- 9. Frans FA, Zagers MB, Jens S, Bipat S, Reekers JA, Koelemay MJ. The relationship of walking distances estimated by the patient, on the corridor, and on a treadmill, and the Walking Impairment Questionnaire in intermittent claudication. J Vasc Surg. 2013;57:720-727.
- 10. Giantomaso T, Makowsky L, Ashworth NL, Sankaran R. The validity of patient and physician estimates of walking distance. Clin Rehabil. 2003;17:394-401.
- 11. Harding P, Hollane A, Delany C, Hinman R. Do activity levels increase after total hip and knee arthroplasty? Clin Orthop Relat Res. 2014;472:1502-1511.
- 12. Harris WH. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. J Bone Joint Surg Am. 1969;51:737-755.
- 13. Insall JN, Dorr LD, Scott RD, Scott WN. Rationale of the Knee Society Clinical Rating System. Clin Orthop Relat Res. 1989;248:13-14.
- 14. Kooiman T, Dontie M, Sprenger S, Krijnen W, van der Schans C, de Groot M. Reliability and validity of ten consumer activity trackers. BMC Sport Sci Med Rehabil. 2015;7:24-34.
- 15. Kurtz S, Ong K, Lau E, Mowat F, Halpern M. Projections of primary and revision hip and knee arthroplasty in the United States from 2005 to 2030. J Bone Joint Surg Am. 2007;89:780-785.
- 16. Lyman S, Yin KL. Patient-reported outcome measurement for patients with total knee arthroplasty. J Am Acad Orthop Surg. 2017;25:S44-47.
- 17. Mizner R, Petterson S, Clements K, Zeni J, Irrgang J, Snyder-Mackler L. Measuring functional improvement after total knee arthroplasty requires both performance-based and patient-report assessments: a longitudinal analysis of outcomes. J Arthroplasty. 2011;26:728-737.
- 18. Nunez M, Nunez E, del Val J, Ortega R, Segur J, Hernandez M, Lozano L, Sastre S, Macule F. Health-related quality of life in patients with osteoarthritis after total knee replacement: factors influencing outcomes at 36 months of follow-up. Osteoarthritis Cartilage. 2007;15:1001-1007.
- 19. Paul SS, Tiedemann A, Hassett LM, Ramsay E, Kirkham C, Chagpar S, Sherrington C. Validity of the Fitbit activity tracker for measuring steps in community-dwelling older adults. BMJ Open Sport Exerc Med. 2015;1:e000013.
- 20. Sallis J, Saelens B. Assessment of physical activity by self-report: status, limitations, and future directions. Res Q Exerc Sport. 2000;71:1-14.
- 21. Schneider M, Chau L. Validation of the Fitbit Zip for monitoring physical activity among free-living adolescents. BMC Res Notes. 2016;9:448.
- 22. Seichriest F, Kyle R, Marek D, Spates J, Saleh K, Kuskowski M. Activity level in young patients with primary total hip arthroplasty: a 5-year minimum follow-up. J Arthroplasty. 2007;22:39-47.
- 23. Toogood PA, Abdel MP, Spear JA, Cook SM, Cook DJ, Taunton MJ. The monitoring of activity at home after total hip arthroplasty. Bone Joint J. 2016;98:1450-1454.
- 24. Tully M, McBride C, Heron L, Hunter R. The validation of Fitbit Zip™ physical activity monitor as a measure of free-living physical activity. BMC Res Notes. 2014;7:952-956.