Abstract
Background
Patient-reported outcome measures (PROMs) rely on individual interpretation of questions and recall ability, which limits their standardization. We developed an extended reality (XR) PROM, the X-DASH, which transforms questions from the Disabilities of the Arm, Shoulder, and Hand (DASH) measure into standardized tasks, and assessed its usability, validity, and reliability. We hypothesized that the X-DASH would show preliminary validity and reliability equivalent to those of conventionally administered DASH questions (C-DASH).
Methods
The X-DASH simulated 6 activities corresponding to the DASH questionnaire (jar opening, key turning, surface cleaning, back washing, bread cutting, and hammering) and was evaluated in 2 phases. In phase 1, usability was assessed with 20 healthy subjects using the System Usability Scale. In both phases, versions of the questionnaire were delivered conventionally via computer (C-DASH) and using X-DASH. Phase 2 included 40 patients with documented shoulder pathology using a randomized, cross-sectional design.
Results
Phase 1: X-DASH scores were significantly higher (ie, worse) than C-DASH scores (P = .019), specifically due to item 4 (wash your back). No subjects reported higher scores in C-DASH. System Usability Scale scores showed good usability. Phase 2: Composite scores for the C-DASH were higher on average than those for the X-DASH (P < .001). The structural validity of each showed a similar average variance explained (C-DASH 57%, X-DASH 50%). Order of administration (C-DASH or X-DASH given first) had negligible effects. Twelve patients had large score discrepancies (absolute composite score difference > 10). Five patients gave extreme responses ('unable' or 'extreme'), and such responses were significantly more likely in the C-DASH format (P = .009). For items 3 and 6 (with XR tasks) and items 7 and 10 (without XR tasks), significantly more participants scored higher on the C-DASH than the X-DASH (items 3 and 6: P < .001; item 7: P = .036; item 10: P = .035). C-DASH and X-DASH demonstrated similar measurement reliability. All patients reported that the XR system was easy to use.
Conclusion
To our knowledge, this is the first study to evaluate XR-based PROM delivery. The X-DASH shows similarities to C-DASH in our proof of concept, though scoring differences compared to C-DASH warrant further investigation.
Keywords: Outcome measures, Disability assessment, Shoulder, Virtual reality, Extended reality, Patient reported outcome measures
Patient-reported outcome measures (PROMs) have advanced the reporting of condition states and treatment outcomes in orthopedic surgery. The Disabilities of the Arm, Shoulder, and Hand (DASH) questionnaire and its short form, the Quick DASH, are common PROMs used to assess patients with shoulder conditions.18 Conventional PROM delivery rests solely upon the patient's perception.1,11,18,35 Though validated PROMs are informative, they have 2 main issues. First, they rely on a patient's recollections of disabilities, introducing the potential for overestimating treatment benefits and difficulties in categorizing limitations.7,16,30,36 Second, questions are open to various interpretations.
These issues limit standardization. For example, the Quick DASH includes general questions: item 2 asks about the ability to perform “heavy household chores,” and item 6 asks about “recreational activities,” using hammering as an example (Appendix A). While subjectivity can help account for variability in lifestyle and background,28 extended reality (XR) provides a new method for PROMs, offering greater specificity through standardized interactive simulations. XR encompasses a range of technologies, including virtual reality and augmented reality, enabling users to interact with simulations of real-world experiences. In XR-based PROMs, patients can answer standard questions as well as perform specific activities that expose their functional abilities and limitations.
Conventional PROMs vary in their assessment of patients' functional abilities, and different delivery methods have been described, including app-based platforms that integrate with electronic health records.17,20,28,31 Benefits of PROMs include increasing patient involvement, shifting the focus of consultations, improving the quality of care, facilitating patient monitoring, and enhancing patient–physician relationships.6 However, conventional PROMs have limitations, and both the European Medicines Agency and Food and Drug Administration have recommended including objective functional outcomes as adjuncts.6,8,13 XR offers a new platform for PROM delivery, providing opportunities for improved standardization and objectivity.
To date, the use of XR in musculoskeletal care has focused on surgical education, with its potential for patient assessments remaining underexplored.14,21,22,23,24,25,26,32 Given XR's ability to engage users in targeted activities, customize simulated experiences, and capture performance metrics, its potential in clinical assessments warrants further investigation.3,9,15,27,32,33,34 To examine this potential, we developed an XR version of the DASH instrument (X-DASH) that guides users through specific activities before answering conventional DASH questions (Video 1).
Materials and methods
This investigation was conducted in 2 phases. Institutional review board approval was obtained prior to enrollment.
Study participants
Phase 1 involved healthy, English-speaking adults (>18 years) without upper-extremity–related musculoskeletal conditions. Twenty subjects (8 women, 12 men) between the ages of 23 and 73 (mean [M] = 40.3 years, standard deviation [SD] = 16.8) were enrolled.
For phase 2, patients presenting for an evaluation and/or treatment of an upper-extremity condition were screened. Forty English-speaking patients (19 women, 21 men) between the ages of 28 and 80 (M = 60.1, SD = 12.6) were enrolled. Patient diagnoses included rotator cuff tears (n = 24), glenohumeral arthritis (n = 12), biceps tendonitis (n = 3), and bursitis (n = 1).
X-DASH design
The application was designed in Unity3D (Unity Technologies, San Francisco, CA), running from a personal computer and deployed using the Meta Quest Pro XR device (Meta, Menlo Park, CA). Using the Quick DASH as a model, we designed interactive simulations for 6 items and included 5 items asking about impacts on work, social activities, and sleep as questions (Fig. 1). The newly derived X-DASH is thus divergent from the original DASH instrument. Subjects completed tasks in XR, followed by the conventional questions. We substituted the key-turning question from the DASH for the bag-carrying question of Quick DASH, allowing for a fully seated test. This modified computer-administered DASH was termed the C-DASH. Evaluation was performed while seated to minimize contributions from lower-extremity or thoracolumbar pathology and accommodate patients with disabilities in these domains.
Figure 1.
(a) A view from the user's perspective in XR showing an assessment task and disability survey question rating difficulty level “using a knife to cut food.” (b) A person using an XR headset and controllers to demonstrate the X-DASH assessment system. (c) A 3D plot of motion data, contextualized within the virtual environment, displaying the movement trajectories captured during a performance of this assessment task. XR, extended reality.
The C-DASH was administered electronically, asking all 11 questions on a five-point scale. The X-DASH was administered in XR, consisting of 6 simulated activities, each followed by its respective question, and 5 stand-alone questions. The questions and response options were worded identically in both versions, and composite scores were calculated using the DASH scoring rubric for both versions.20
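As an illustration of the scoring step, the sketch below applies the published Quick DASH formula (mean item response minus 1, multiplied by 25) to an 11-item response set. It assumes that the modified item set is scored like the Quick DASH, including the rule that at most one item may be missing; the function name `composite_score` and the missing-item handling are illustrative and are not taken from the study software.

```python
from typing import Optional, Sequence


def composite_score(responses: Sequence[Optional[int]]) -> float:
    """Return a 0-100 disability score from 11 items rated 1 (best) to 5 (worst).

    Missing items are passed as None. Scoring follows the Quick DASH rule
    (assumed here): at least 10 of the 11 items must be answered.
    """
    if len(responses) != 11:
        raise ValueError("expected 11 item responses")
    answered = [r for r in responses if r is not None]
    if len(answered) < 10:
        raise ValueError("at most one item may be missing")
    if any(r not in (1, 2, 3, 4, 5) for r in answered):
        raise ValueError("responses must be integers from 1 to 5")
    mean_response = sum(answered) / len(answered)
    return (mean_response - 1) * 25


# Hypothetical response sets, for illustration only.
no_difficulty = [1] * 11
one_mild_item = [1] * 10 + [2]
print(round(composite_score(no_difficulty), 2))  # 0.0
print(round(composite_score(one_mild_item), 2))  # 2.27
```

Under this formula, a one-step change on a single item shifts the composite by 25/11, or about 2.27 points, which is the same granularity as the phase 1 medians reported in the Results.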
Phase 1 execution
In phase 1, participants completed the C-DASH followed by the X-DASH. They then completed the System Usability Scale (SUS),5 an instrument for assessing usability challenges. SUS items are scored using a 5-point Likert scale and converted to a range of 0-100. Scores above 68 are considered above average, with 70-80 classified as “good.”2 An investigator-developed user experience (UX) survey was also administered, including 17 questions and open-response fields to gather data about the X-DASH experience (Table I).4,37
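The conversion of SUS item ratings to a 0-100 score can be sketched as follows. This uses the standard SUS scoring rule (odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5); the function name and the example ratings are hypothetical and do not come from the study data.

```python
from typing import Sequence


def sus_score(ratings: Sequence[int]) -> float:
    """Convert ten SUS ratings (1-5, in questionnaire order) to a 0-100 score."""
    if len(ratings) != 10:
        raise ValueError("the SUS has exactly 10 items")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    total = 0
    for index, rating in enumerate(ratings):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical participant: agrees with positive items, disagrees with negative ones.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```

A score of 75.0 would fall in the 70-80 band described above as “good.”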
Table I.
UX survey results with questions, frequencies of responses, and percentages.
| UX question | Response | Frequency | Percent |
|---|---|---|---|
| Was the visual experience comfortable throughout the VR session? | No | 0 | 0.00 |
| | Yes | 40 | 100.00 |
| Did you experience any eye strain? | No | 37 | 92.50 |
| | Yes | 3 | 7.50 |
| Did you feel dizzy or nauseous? | No | 1 | 2.50 |
| | Yes | 39 | 97.50 |
| Did you find the VR headset comfortable to wear? | No | 2 | 5.00 |
| | Yes | 38 | 95.00 |
| Did you find it easy to learn how to use the VR system? | No | 0 | 0.00 |
| | Yes | 40 | 100.00 |
| Was it easy for you to use the controllers to complete the tasks? | No | 3 | 7.50 |
| | Yes | 37 | 92.50 |
| Was grabbing objects in VR intuitive? | No | 9 | 22.50 |
| | Yes | 31 | 77.50 |
| Were you frustrated by any of the tasks? | No | 31 | 77.50 |
| | Yes | 9 | 22.50 |
| Did you enjoy completing the tasks? | No | 3 | 7.50 |
| | Yes | 37 | 92.50 |
| Did you feel aware of your real-world surroundings while using the VR system? | No | 23 | 57.50 |
| | Yes | 17 | 42.50 |
| Was visibility of the real-world distracting? | No | 40 | 100.00 |
| | Yes | 0 | 0.00 |
| Did you find the audiovisual feedback of the system useful and interesting? | No | 4 | 10.00 |
| | Yes | 36 | 90.00 |
| Is there anything you would want to be presented differently? | No | 30 | 75.00 |
| | Yes | 10 | 25.00 |
| Would you feel confident in setting up and using a system like this at home without assistance? | No | 13 | 32.50 |
| | Yes | 27 | 67.50 |
| Would you prefer to be given the X-DASH over the C-DASH when preparing for an appointment with your doctor? | No | 20 | 50.00 |
| | Yes | 20 | 50.00 |
| How easy/difficult was it to focus on the text instructions? Very easy (1) – very difficult (5) | Very easy (1) | 21 | 52.50 |
| | Easy (2) | 13 | 32.50 |
| | Neutral (3) | 3 | 7.50 |
| | Difficult (4) | 2 | 5.00 |
| | Very difficult (5) | 1 | 2.50 |
| The objects presented in the virtual environment looked real. | Strongly disagree (1) | 3 | 7.50 |
| | Disagree (2) | 1 | 2.50 |
| | Neutral (3) | 4 | 10.00 |
| | Agree (4) | 21 | 52.50 |
| | Strongly agree (5) | 11 | 27.50 |
| Performing the simulated tasks felt real. | Strongly disagree (1) | 5 | 12.50 |
| | Disagree (2) | 3 | 7.50 |
| | Neutral (3) | 11 | 27.50 |
| | Agree (4) | 15 | 37.50 |
| | Strongly agree (5) | 6 | 15.00 |
| The simulated tasks accurately reflected the questions from the traditional DASH survey. | Strongly disagree (1) | 2 | 5.00 |
| | Disagree (2) | 2 | 5.00 |
| | Neutral (3) | 4 | 10.00 |
| | Agree (4) | 20 | 50.00 |
| | Strongly agree (5) | 12 | 30.00 |
| How frequently do you use VR technology? | Never (1) | 31 | 77.50 |
| | Once or a few times previously (2) | 8 | 20.00 |
| | (3) | 0 | 0.00 |
| | A few times a mo (4) | 1 | 2.50 |
| | (5) | 0 | 0.00 |
UX, user experience; VR, virtual reality; DASH, Disabilities of the Arm, Shoulder, and Hand.
Phase 2 execution
Participants in phase 2 were randomized to complete the C-DASH or X-DASH first in a counterbalanced design. Collected variables included age, sex, handedness, unilateral or bilateral pain, clinical diagnosis, previous treatment, and time to complete the X-DASH.
Statistical methods
The primary outcome measure was the composite score for the C-DASH and X-DASH. Secondary outcome measures included SUS scores in phase 1 and UX responses in phase 2. The validity and reliability of the C-DASH and X-DASH were determined using average variance explained and intraclass correlation coefficients (ICCs). The ICC was calculated between groups and does not represent test–retest reliability. Descriptive statistics were used to summarize participant variables, and categorical variables were presented as frequencies and percentages.
For phase 1, Spearman rank-order correlation was used to assess the association between C-DASH and X-DASH, with Wilcoxon signed-rank tests for both cumulative scores and individual items. For phase 2, Pearson correlation and paired t-tests were used to compare the cumulative scores of C-DASH and X-DASH, while Wilcoxon signed-rank tests were used to compare individual items. A two-way repeated measures analysis of variance was conducted to examine the effect of administration order. Tests were performed on cumulative and question-level data; because these scores are ordinal, the results of parametric tests may be skewed. Data were analyzed using IBM SPSS (v29; IBM Corp., Armonk, NY, USA) and R (v4.3.3; R Foundation for Statistical Computing, Vienna, Austria).
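For readers unfamiliar with the Wilcoxon signed-rank test used for the item-level comparisons, a minimal standard-library sketch follows. It is not the SPSS or R routine used in the study: it drops zero differences, assigns average ranks to ties, and uses a tie-corrected normal approximation without a continuity correction (software packages differ on these choices). The function name and the paired item scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence, Tuple


def wilcoxon_signed_rank(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Return (W_plus, W_minus, z, two_sided_p) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    ordered = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of 1-based ranks start+1..end+1
        for position in range(start, end + 1):
            ranks[ordered[position]] = average_rank
        size = end - start + 1
        tie_term += size ** 3 - size
        start = end + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation with the usual tie correction to the variance.
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean_w) / sqrt(var_w)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, w_minus, z, p


# Hypothetical item scores for 8 participants (not study data).
c_item = [3, 4, 2, 5, 3, 4, 2, 3]
x_item = [2, 2, 2, 3, 4, 3, 1, 3]
w_plus, w_minus, z, p = wilcoxon_signed_rank(c_item, x_item)
print(w_plus, w_minus)  # 18.5 2.5
```

The sign of `z` depends on the order of subtraction and on which rank sum a package reports, so only its magnitude should be compared across tools.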
Results
Phase 1 results
X-DASH scores (median = 2.27, interquartile range = 9.09) were significantly higher than C-DASH scores (median = 0.00, interquartile range = 2.27), Z = −2.354, P = .019, with a strong positive correlation observed between cumulative scores (ρ = 0.724, P < .001; Fig. 2A). Subjects typically reported “no difficulty” or “none” for all questions except item 4, which asks about the ability to wash one's back. A Wilcoxon signed-rank test indicated that X-DASH scores for item 4 were significantly higher than C-DASH scores (Z = −2.887, P = .004) in 9 of 20 subjects (45%). In comparison, no subjects reported higher scores in C-DASH. SUS scores showed “good” usability (M = 77.63, SD = 14.59).
Figure 2.
Scatterplots of X-DASH vs. C-DASH total scores for (a) healthy controls and (b) patients. The dashed line indicates perfect score alignment between the X-DASH and C-DASH modalities. C-DASH, conventional disabilities of the arm, shoulder, and hand; X-DASH, extended reality (XR) disabilities of the arm, shoulder and hand.
Phase 2 results
Validity
The composite score for the C-DASH was higher than that for the X-DASH (mean difference = 5.00, SD = 7.55; 95% confidence interval [CI] = 2.59-7.41; P < .001; Fig. 2B), with a positive correlation between the cumulative scores (r = 0.919, P < .001). The structural validity of the C-DASH and X-DASH showed similar average variance explained (57% vs. 50%), and both achieved >50% of the combined data variance. The analysis of counterbalancing revealed a negligible ordering effect, accounting for 0.7% of the variance.
There was no significant difference in scores across diagnosis groups using analysis of variance (C-DASH, F(3, 36) = 0.685, P = .567; X-DASH, F(3, 36) = 0.504, P = .682) (Fig. 3). Twelve patients had large score discrepancies, with 5 patients (n = 5/40; 12.5%) displaying extreme responses, especially on the last 5 items, which asked about interference with social and work activities, sleep, and pain levels. A significant difference was observed for maximum scores: “unable” for task-related questions and “extreme” for pain-related questions. The Wilcoxon signed-rank test showed that participants were more likely to choose extreme responses in the C-DASH format (Z = 2.623, P = .009).
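The reported interval can be reproduced from the summary statistics if the values of 5.00 and 7.55 are read as the mean and SD of the paired C-DASH minus X-DASH differences (5.00 also equals the gap between the two composite means in Fig. 3). The sketch below assumes a standard paired t interval; the critical value for 39 degrees of freedom is hard-coded from standard tables rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Summary statistics for the paired differences (C-DASH minus X-DASH).
n = 40
mean_diff = 5.00
sd_diff = 7.55

# Two-sided 95% critical value of Student's t with n - 1 = 39 degrees of
# freedom, taken from standard tables (assumed method: paired t interval).
T_CRIT_DF39 = 2.0227

standard_error = sd_diff / sqrt(n)
margin = T_CRIT_DF39 * standard_error
ci_low = mean_diff - margin
ci_high = mean_diff + margin
t_statistic = mean_diff / standard_error

print(f"95% CI = {ci_low:.2f} to {ci_high:.2f}")  # 95% CI = 2.59 to 7.41
print(f"t({n - 1}) = {t_statistic:.2f}")
```

The result agrees with the reported 95% CI of 2.59-7.41, and because the interval excludes zero it is consistent with the reported P < .001 for the paired comparison.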
Figure 3.
(a) Box plots of cumulative C-DASH and X-DASH scores showing C-DASH (M = 43.07, SD = 19.06) and X-DASH (M = 38.07, SD = 16.83). (b) Box plots of score differences according to diagnosis. C-DASH, conventional disabilities of the arm, shoulder, and hand; SD, standard deviation; M, mean; X-DASH, extended reality (XR) disabilities of the arm, shoulder and hand.
Comparative analysis at the question level using a significance threshold of P < .05 shows that cleaning (item 3), hammering (item 6), social interference (item 7), and tingling (item 10) contribute most to the differences between the C-DASH and X-DASH (Fig. 4). For items 3 and 6, following the activity simulations, significantly more participants scored higher on the C-DASH than the X-DASH (item 3: n = 20 vs. n = 2, W = 236, Z = −3.731, P < .001) (item 6: n = 29 vs. n = 2, W = 478, Z = −4.613, P < .001). For items 7 and 10, without activity simulations, significantly more participants scored higher on the C-DASH than the X-DASH (item 7: n = 11 vs. n = 2, W = 74, Z = −2.101, P = .036) (item 10: n = 7 vs. n = 1, W = 32, Z = −2.111, P = .035).
Figure 4.
Phase 2 median differences between C-DASH and X-DASH items, with IQRs and significance levels from the Wilcoxon signed-rank test. Orange bars indicate statistically significant differences (P < .05), while blue bars indicate nonsignificant differences. C-DASH, conventional disabilities of the arm, shoulder, and hand; IQR, interquartile range; X-DASH, extended reality (XR) disabilities of the arm, shoulder and hand.
Reliability
Internal consistency was measured using Cronbach alpha for C-DASH (α = 0.89; 95% CI = 0.84-0.94) and X-DASH (α = 0.87; 95% CI = 0.79-0.92), demonstrating that both surveys measured the same concept equivalently. Both C-DASH (ICC = 0.89; 95% CI = 0.84-0.94) and X-DASH (ICC = 0.87; 95% CI = 0.80-0.92) demonstrated similar measurement reliability.
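Cronbach alpha is computed from the item variances and the variance of the total score, α = k/(k − 1) × (1 − Σ item variances / total-score variance), where k is the number of items. A minimal sketch is shown below; the function name and the small response matrix are hypothetical, and the CIs reported above would require an additional procedure that is not shown.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha for a respondents-by-items matrix of scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer the same items")
    # Sample variance of each item (column) and of each respondent's total.
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents, 4 items rated 1-5 (not study data).
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))  # 0.959
```

Values near 1 indicate that the items vary together, which is the sense in which the two formats are described as internally consistent.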
UX results
All patients (n = 40/40; 100%) reported finding the visual experience comfortable and the XR system easy to use (Table I). When asked “Would you prefer to be given the X-DASH over the C-DASH when preparing for an appointment with your doctor?”, 20 of 40 (50%) responded “yes.” In the open-response fields, patients (n = 12/40, 30%) provided insight into the differences found for cleaning (item 3) and hammering (item 6): 2 stated enjoyment, 3 expressed frustration, and 7 noted differences between the simulation and reality. Eighty percent of patients felt that the simulation looked real, and 52.5% reported that it felt real, supporting concepts of realism and face validity. Similarly, 80% of patients felt that the X-DASH accurately reflected questions from the C-DASH, supporting content validity. Patients averaged 6.78 minutes (SD = 2.21 minutes) to complete the 6 X-DASH tasks and 0.88 minutes (SD = 0.29 minutes) to complete the 5 non–task-related questions (Table II).
Table II.
Time taken, in minutes, to complete the X-DASH tasks and questions.
| Task type | N | Minimum (min) | Maximum (min) | Mean (min) | Standard deviation (min) |
|---|---|---|---|---|---|
| Tasks 1-6 | 40 | 3.70 | 13.69 | 6.78 | 2.21 |
| Questions 7-11 | 40 | 0.43 | 2.03 | 0.88 | 0.29 |
X-DASH, extended reality (XR) disabilities of the arm, shoulder and hand.
Discussion
We designed and examined the X-DASH as an XR-based PROM modeled on the Quick DASH questionnaire. Although the results supported our hypothesis of equivalence, important differences were also observed, such as back washing in healthy subjects and items 3, 6, 7, and 10 in our patient population. The data indicate that XR can be a valid method of PROM delivery, offering additional opportunities to differentiate a patient's actual function from their perception or recollection of abilities using simulated tasks.
There is a known risk of bias with PROMs, and tools such as the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) checklist have been developed to evaluate their delivery. A recent systematic review demonstrated that the DASH performs well on COSMIN metrics of reliability, structural validity, hypothesis testing, and responsiveness.19 However, negative evidence was found for measurement error, and despite multiple studies, evidence is still lacking regarding internal consistency, content validity, and criterion validity metrics, a critique applicable to most PROMs in shoulder surgery.19 Following COSMIN, we demonstrated that the X-DASH was directly comparable to the C-DASH. The X-DASH has high internal consistency (α = 0.87; 95% CI = 0.79-0.92) and reliability (ICC = 0.87; 95% CI = 0.80-0.92), demonstrating its ability to distinguish between healthy and nonhealthy patients with homogeneity. Other domains of COSMIN, such as test-retest reliability, predictive ability, and cross-cultural validity/measurement invariance were not tested.
In healthy subjects, the X-DASH revealed awareness of limitations that the C-DASH did not detect. This suggests that task-based performance can improve the accuracy of self-assessments by testing functional abilities. In our patient population, we found that X-DASH scores were lower (ie, showing less disability) mainly due to 2 items: item 3, difficulty performing household chores, and item 6, difficulty sustaining impact-related forces from recreational activities such as golf, tennis, or hammering. The C-DASH asks about these items in general, while the X-DASH models specific simulated activities. These differences demonstrate the potential of XR to reduce recall bias through standardizing assessment tasks and suggest that the greater specificity inherent in simulations may support more accurate appraisals of functional abilities. Practically, XR-derived PROMs have delivery limitations relative to conventional computer-administered formats, including hardware acquisition and maintenance, longer completion times, and the need for dedicated staff to administer them.
In patients, we observed differences between the C-DASH and X-DASH questions on quality of life and pain (eg, item 7, interference with social activities, and item 10, tingling sensations). Examining this, we observed a tendency for patients to report more extreme responses on the C-DASH, where an anchoring effect or extreme response style bias may be present. Thus, where a patient exhibits extreme responses, those responses may influence subsequent questions.12,29 In addition, differences in presentation format may reduce anchoring or extreme response style bias, and using XR may have a moderating effect on extreme responses. Alternatively, patients may be impressed with the simulations and alter their responses or they may enjoy the simulated activities and recognize fewer limitations in the process. Further study using psychometric tools such as item response theory or differential item functioning analyses could be performed to differentiate item-level variability.
This work has several limitations. While 50% of patients preferred to use the XR system, our sample consisted only of those interested in participating. Patients did not repeat the X-DASH; therefore, the absolute measurement error, through minimal important change or smallest detectable change of the X-DASH, cannot be determined.10 We did not determine test-retest reliability using the X-DASH. The X-DASH may also be limited by ceiling effects in certain shoulder pathologies.18 Question-level ordinal scores may be skewed, which can affect the parametric statistical tests used. While significant differences were not detected between diagnoses, these groups were not balanced in the sample. In addition, the simulated objects have no weight or resistance, limiting their ability to assess strength, endurance, and force-based activities. The X-DASH evaluates both extremities together, which may introduce bias. The X-DASH is a modification of the DASH, and scores are not interchangeable with the DASH or Quick DASH at this stage. Furthermore, we did not evaluate the comprehensiveness of the modifications using qualitative analyses such as cognitive debriefing with included patients. Lastly, as an initial validation of the X-DASH, this study does not describe or validate the further objective performance metrics that the system can capture, such as assessments of patients' motion data.
Conclusion
We found the X-DASH to show certain validity and reliability metrics when compared to the C-DASH in healthy participants and patients; however, in both groups, XR as a delivery modality altered users' perspectives on their functional abilities. Longitudinal study is necessary to determine the responsiveness of the X-DASH as a PROM instrument and how this relates to treatment outcomes. Further study is required to better understand the design of simulated activities in XR as performance measures in functional assessments.
Disclaimers:
Funding: Unfunded.
Conflicts of interest: The authors, their immediate families, and any research foundations with which they are affiliated have not received any financial payments or other benefits from any commercial entity related to the subject of this article.
Footnotes
Institutional review board (IRB) statement: This work received IRB approval through the Mass General Brigham (MGB) Institutional Review Board: 2023P003534.
Supplementary data to this article can be found online at https://doi.org/10.1016/j.xrrt.2026.100685.
Supplementary Data
Appendix A.
A playthrough of the X-DASH, captured from within the XR application, showing a user (represented by the motions of the XR headset and controllers) completing a short tutorial and performing the tasks simulating activities of daily living corresponding to the DASH questionnaire (jar opening, key turning, surface cleaning, back washing, bread cutting, and hammering), and 5 additional PROMs questions from the Quick DASH. The playthrough is followed by figures that summarize the study results. DASH, Disabilities of the Arm, Shoulder, and Hand; XR, extended reality; PROM, patient-reported outcome measure.
References
- 1.Almeida R.F., Pereira N.D., Ribeiro L.P., Barreto R.P.G., Kamonseki D.H., Haik M.N., et al. Is the disabilities of the arm, shoulder and hand (DASH) questionnaire adequate to assess individuals with subacromial pain syndrome? Rasch model and international classification of functioning, disability and health. Phys Ther. 2021;101 doi: 10.1093/ptj/pzab065. [DOI] [PubMed] [Google Scholar]
- 2.Bangor A., Kortum P.T., Miller J.T. An empirical evaluation of the system usability scale. Int J Human–Computer Interact. 2008;24:574–594. doi: 10.1080/10447310802205776. [DOI] [Google Scholar]
- 3.Barton A.C., Sheen J., Byrne L.K. Immediate attention enhancement and restoration from interactive and immersive technologies: a scoping review. Front Psychol. 2020;19:11. doi: 10.3389/fpsyg.2020.02050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Bowman D.A., Gabbard J.L., Hix D. A survey of usability evaluation in virtual environments: classification and comparison of methods. Presence Teleoperators Virtual Environ. 2002;11:404–424. doi: 10.1162/105474602760204309. [DOI] [Google Scholar]
- 5.Brooke J. In: Usability evaluation in industry. Jordan P.W., Thomas B.A., Weerdmeester B., McClelland I.L., editors. Taylor & Francis; London, UK: 1996. SUS: a “quick and dirty” usability scale; pp. 189–194. [Google Scholar]
- 6.Campbell R., Ju A., King M.T., Rutherford C. Perceived benefits and limitations of using patient-reported outcome measures in clinical practice with individual patients: a systematic review of qualitative studies. Qual Life Res. 2022;31:1597–1620. doi: 10.1007/s11136-021-03003-z. [DOI] [PubMed] [Google Scholar]
- 7.Clarke P.M., Fiebig D.G., Gerdtham U.-G. Optimal recall length in survey design. J Health Econ. 2008;27:1275–1284. doi: 10.1016/j.jhealeco.2008.05.012. [DOI] [PubMed] [Google Scholar]
- 8.Cook C.E., Wright A., Wittstein J., Barbero M., Tousignant-Laflamme Y. Five recommendations to address the limitations of patient-reported outcome measures. J Orthop Sports Phys Ther. 2021;51:562–565. doi: 10.2519/jospt.2021.10836. [DOI] [PubMed] [Google Scholar]
- 9.Cowan A., Chen J., Mingo S., Reddy S.S., Ma R., Marshall S., et al. Virtual reality vs dry laboratory models: comparing automated performance metrics and cognitive workload during robotic simulation training. J Endourol. 2021;35:1571–1576. doi: 10.1089/end.2020.1037. [DOI] [PubMed] [Google Scholar]
- 10.Devji T., Carrasco-Labra A., Qasim A., Phillips M., Johnston B.C., Devasenapathy N., et al. Evaluating the credibility of anchor based estimates of minimal important differences for patient reported outcomes: instrument development and reliability study. BMJ. 2020;369 doi: 10.1136/bmj.m1714. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Dixon D., Johnston M., McQueen M., Court-Brown C. The Disabilities of the Arm, Shoulder and Hand Questionnaire (DASH) can measure the impairment, activity limitations and participation restriction constructs from the International Classification of Functioning, Disability and Health (ICF) BMC Musculoskelet Disord. 2008;9:114. doi: 10.1186/1471-2474-9-114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Dowling N.M., Bolt D.M., Deng S., Li C. Measurement and control of bias in patient reported outcomes using multidimensional item response theory. BMC Med Res Methodol. 2016;16:63. doi: 10.1186/s12874-016-0161-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.European Medicines Agency Appendix 2 to the Guideline on the evaluation of anticancer medicinal products in man: the use of patient-reported outcome (PRO) measures in oncology studies [Internet] 2016. https://www.ema.europa.eu/en/documents/scientific-guideline/guideline-clinical-evaluation-anticancer-medicinal-products-revision-6_en.pdf Available at: Accessed December 10, 2025.
- 14.Goh G.S., Lohre R., Parvizi J., Goel D.P. Virtual and augmented reality for surgical training and simulation in knee arthroplasty. Arch Orthop Trauma Surg. 2021;141:2303–2312. doi: 10.1007/s00402-021-04037-1. [DOI] [PubMed] [Google Scholar]
- 15.Guha D., Alotaibi N.M., Nguyen N., Gupta S., McFaul C., Yang V.X.D. Augmented reality in neurosurgery: a review of current concepts and emerging applications. Can J Neurol Sci. 2017;44:235–245. doi: 10.1017/cjn.2016.443. [DOI] [PubMed] [Google Scholar]
- 16.Hoffmann T.C., Del Mar C. Patients’ expectations of the benefits and harms of treatments, screening, and tests: a systematic review. JAMA Intern Med. 2015;175:274. doi: 10.1001/jamainternmed.2014.6016. [DOI] [PubMed] [Google Scholar]
- 17.Hollinshead R.M., Mohtadi N.G.H., Vande Guchte R.A., Wadey V.M.R. Two 6-year follow-up studies of large and massive rotator cuff tears: Comparison of outcome measures. J Shoulder Elbow Surg. 2000;9:373–379. doi: 10.1067/mse.2000.108389. [DOI] [PubMed] [Google Scholar]
- 18.Hsu J.E., Nacke E., Park M.J., Sennett B.J., Huffman G.R. The Disabilities of the Arm, Shoulder, and Hand questionnaire in intercollegiate athletes: validity limited by ceiling effect. J Shoulder Elbow Surg. 2010;19:349–354. doi: 10.1016/j.jse.2009.11.006. [DOI] [PubMed] [Google Scholar]
- 19.Huang H., Grant J.A., Miller B.S., Mirza F.M., Gagnier J.J. A systematic review of the psychometric properties of patient-reported outcome instruments for use in patients with rotator Cuff disease. Am J Sports Med. 2015;43:2572–2582. doi: 10.1177/0363546514565096. [DOI] [PubMed] [Google Scholar]
- 20.Hudak P.L., Amadio P.C., Bombardier C., Beaton D., Cole D., Davis A., et al. Development of an upper extremity outcome measure: the DASH (disabilities of the arm, shoulder, and head) Am J Ind Med. 1996;29:602–608. doi: 10.1002/(SICI)1097-0274(199606)29:6<602::AID-AJIM4>3.0.CO;2-L. [DOI] [PubMed] [Google Scholar]
- 21.Lohre R., Bois A.J., Athwal G.S., Goel D.P., Society CSES Improved complex skill acquisition by immersive virtual reality training: a randomized controlled trial. J Bone Joint Surg Am. 2020;102 doi: 10.2106/JBJS.19.00982.
- 22.Lohre R., Bois A.J., Pollock J.W., Lapner P., McIlquham K., Athwal G.S., et al. Effectiveness of immersive virtual reality on orthopedic surgical skills and knowledge acquisition among senior surgical residents: a randomized clinical trial. JAMA Netw Open. 2020;3 doi: 10.1001/jamanetworkopen.2020.31217.
- 23.Lohre R., Verhofste B., Hedequist D., Jacobson J., Goel D. The use of Immersive Virtual Reality (IVR) in pediatric orthopaedic education. J Pediatr Orthop Soc N Am. 2022;4:522. doi: 10.55275/JPOSNA-2022-0063.
- 24.Lohre R., Wang J.C., Lewandrowski K.-U., Goel D.P. Virtual reality in spinal endoscopy: a paradigm shift in education to support spine surgeons. J Spine Surg. 2020;6(Suppl 1):S208–S223. doi: 10.21037/jss.2019.11.16.
- 25.Lohre R., Warner J.J.P., Athwal G.S., Goel D.P. The evolution of virtual reality in shoulder and elbow surgery. JSES Int. 2020;4:215–223. doi: 10.1016/j.jseint.2020.02.005.
- 26.Mao R.Q., Lan L., Kay J., Lohre R., Ayeni O.R., Goel D.P., et al. Immersive virtual reality for surgical training: a systematic review. J Surg Res. 2021;268:40–58. doi: 10.1016/j.jss.2021.06.045.
- 27.Mergen M., Graf N., Meyerheim M. Reviewing the current state of virtual reality integration in medical education - a scoping review. BMC Med Educ. 2024;24:788. doi: 10.1186/s12909-024-05777-5.
- 28.Neve O.M., Van Benthem P.P.G., Stiggelbout A.M., Hensen E.F. Response rate of patient reported outcomes: the delivery method matters. BMC Med Res Methodol. 2021;21:220. doi: 10.1186/s12874-021-01419-2.
- 29.Ni P., Marino M., Dore E., Sonis L., Ryan C.M., Schneider J.C., et al. Extreme response style bias in burn survivors. PLoS One. 2019;14 doi: 10.1371/journal.pone.0215898.
- 30.Nusser S.M., Beyler N.K., Welk G.J., Carriquiry A.L., Fuller W.A., King B.M.N. Modeling errors in physical activity recall data. J Phys Act Health. 2012;9:S56–S67. doi: 10.1123/jpah.9.s1.s56.
- 31.Richards R.R., An K.-N., Bigliani L.U., Friedman R.J., Gartsman G.M., Gristina A.G., et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg. 1994;3:347–352. doi: 10.1016/S1058-2746(09)80019-0.
- 32.Stasolla F., Passaro A., Di Gioia M., Curcio E., Zullo A. Combined extended reality and reinforcement learning to promote healthcare and reduce social anxiety in fragile X syndrome: a new assessment tool and a rehabilitative strategy. Front Psychol. 2023;20:14. doi: 10.3389/fpsyg.2023.1273117.
- 33.Suzuki K., Mariola A., Schwartzman D.J., Seth A.K. In: Virtual reality in behavioral neuroscience: New insights and methods. Maymon C., Grimshaw G., Wu Y.C., editors. Springer International Publishing; Cham: 2023. Using extended reality to study the experience of presence; pp. 255–285.
- 34.Szczepocka E., Mokros Ł., Kaźmierski J., Nowakowska K., Łucka A., Antoszczyk A., et al. Virtual reality-based training may improve visual memory and some aspects of sustained attention among healthy older adults – preliminary results of a randomized controlled study. BMC Psychiatry. 2024;24:347. doi: 10.1186/s12888-024-05811-2.
- 35.Van Lieshout E.M.M., Mahabier K.C., Tuinebreijer W.E., Verhofstad M.H.J., Den Hartog D., Bolhuis H.W., et al. Rasch analysis of the Disabilities of the Arm, Shoulder and Hand (DASH) instrument in patients with a humeral shaft fracture. J Shoulder Elbow Surg. 2020;29:1040–1049. doi: 10.1016/j.jse.2019.09.026.
- 36.Wechsler M.E., Kelley J.M., Boyd I.O.E., Dutile S., Marigowda G., Kirsch I., et al. Active albuterol or placebo, sham acupuncture, or no intervention in asthma. N Engl J Med. 2011;365:119–126. doi: 10.1056/NEJMoa1103319.
- 37.Zhang T., Booth R., Jean-Louis R., Chan R., Yeung A., Gratzer D., et al. A primer on usability assessment approaches for health-related applications of virtual reality. JMIR Serious Games. 2020;8 doi: 10.2196/18153.
Associated Data
Supplementary Materials
A playthrough of the X-DASH, captured from within the XR application, showing a user (represented by the motions of the XR headset and controllers) completing a short tutorial, performing the tasks that simulate the activities of daily living corresponding to the DASH questionnaire (jar opening, key turning, surface cleaning, back washing, bread cutting, and hammering), and answering 5 additional PROM questions from the QuickDASH. The playthrough is followed by figures that summarize the study results. DASH, Disabilities of the Arm, Shoulder, and Hand; XR, extended reality; PROM, patient-reported outcome measure.





