Using Rasch Analysis to Validate the Michigan Hand Outcomes Questionnaire from the WRIST Trial

Mayank Jayaram; Chang Wang; Melissa J Shauver; Lu Wang; Kevin C Chung

doi:10.1097/PRS.0000000000008317

Author manuscript; available in PMC: 2022 Oct 1.

Published in final edited form as: Plast Reconstr Surg. 2021 Oct 1;148(4):558e–567e. doi: 10.1097/PRS.0000000000008317

PMCID: PMC8462015 NIHMSID: NIHMS1704502 PMID: 34550939

Abstract

Background:

The Michigan Hand Outcomes Questionnaire (MHQ) is a patient-reported outcome measure (PROM) that has been validated in many upper extremity disorders using classical test theory. Rasch measurement analysis is a rigorous method of questionnaire validation that offers several advantages over classical test theory. This study uses Rasch analysis to assess the psychometric properties of the MHQ for distal radius fractures (DRFs) in older adults. The incidence and costs of DRFs are rising, and reliable assessment tools are needed to measure outcomes for this growing problem.

Methods:

Rasch analysis was performed using 6-month assessment data from the Wrist and Radius Injury Surgical Trial (WRIST). Each domain in the MHQ was independently analyzed for threshold ordering, person-item targeting, item fit, differential-item functioning, response dependency, unidimensionality, and internal consistency.

Results:

After collapsing disordered thresholds and removing any misfitting items from the model, five domains (Function, Activities of Daily Living, Work, Pain, and Satisfaction) demonstrated excellent fit to the Rasch model. Aesthetics demonstrated high reliability and internal consistency but had poor fit to the Rasch model.

Conclusions:

Rasch analysis further supports the reliability and validity of using the MHQ to assess hand outcomes in older adults following treatment for DRFs. Results from this study suggest MHQ scores should be interpreted in a condition-specific manner with more emphasis placed on interpreting individual domain scores, rather than the summary MHQ score.

Keywords: Rasch analysis, Michigan Hand Outcomes Questionnaire (MHQ), distal radius fractures (DRFs), Wrist and Radius Injury Surgical Trial (WRIST)

INTRODUCTION

Patient-reported outcome measures (PROMs) are invaluable tools that elicit patient perspectives following medical treatment. These questionnaires apply ordinal scales to quantify health-related quality of life (HRQOL) factors that are difficult to measure.¹ PROMs are important for assessing quality of care in hand conditions where objective outcome measurements such as grip strength and range of motion do not provide complete insight about a patient’s ability to perform daily activities.^2,3

The Michigan Hand Outcomes Questionnaire (MHQ) is a hand-specific PROM which measures outcomes for upper limb musculoskeletal disorders.⁴ Since its development 20 years ago, the MHQ has been translated into 12 languages and cited in over 200 publications.^5,6 The validity, reliability, and responsiveness of the MHQ have been demonstrated in several hand conditions including rheumatoid arthritis, osteoarthritis, nerve compressions, and systemic sclerosis.^7,8,9

Previous methods used to validate PROMs, including the MHQ, were based on classical test theory (CTT). However, CTT is limited because it relies on ordinal scales to validate a questionnaire. Scoring an ordinal scale as if it were an interval measure assumes equal stepwise changes in scores (the difference in score between good and very good is equivalent to the difference in score between fair and good) and cannot discriminate between individuals who select the same response option. Rasch measurement models address this limitation by converting ordinal rating scales into interval-level measurements.¹⁰ Rasch analyses rigorously test each item in a questionnaire for biases, redundancy, and ambiguity and can identify poorly functioning items that could be removed to optimize the PROM in clinical practice.¹¹ The Rasch technique identifies strengths and limitations of an instrument while fulfilling criteria necessary for traditional psychometric analyses.¹²

Rasch analysis has not been used to test the psychometric properties of the MHQ. This paper will apply the Rasch technique to data from the Wrist and Radius Injury Surgical Trial (WRIST), a multicenter, international clinical trial conducted to measure outcomes in older adults with distal radius fractures (DRFs). Data from WRIST will be used to investigate the MHQ’s reliability, validity, and applicability to DRF treatment in older adults.

METHODS

Overview of the MHQ

The MHQ contains 37 items separated into 6 domains (Figure 1). A score for each domain is reported from 0 to 100. For the Pain domain, lower scores indicate less pain; for all other domains, higher scores indicate better hand performance. Each domain measures one construct and will be tested independently.

WRIST

Rasch analysis will be performed using data from WRIST, a randomized clinical trial that assessed outcomes for DRF treatment in older adults. WRIST has been described in detail elsewhere.^13,14 Briefly, this trial enrolled patients with isolated, closed, displaced distal radius fractures for which surgery was indicated and offered as a treatment option. Those who opted for non-surgical treatment were placed in a casting group, whereas those who opted for surgical treatment were randomized into one of three surgical groups: internal fixation with volar plates, percutaneous pinning, or external fixation. Participants completed the MHQ 6 weeks, 3 months, 6 months, 12 months, and 24 months following treatment. For this analysis we used 6-month MHQ responses. By 6 months, there are minimal differences in outcomes among treatments,¹⁵ which enables us to consolidate all participants into one cohort regardless of treatment received.

Rasch Analysis

Rasch analysis was performed in R using the eRm package. The methodology is based on Tennant and Conaghan’s criteria for Rasch analysis.¹⁶

Model Derivation

Polytomous questionnaires (items have 3 or more response options) can be validated using Andrich’s rating scale model (RSM) or Masters’ partial credit model (PCM). RSM assumes equal, step-wise distances between each response option, whereas PCM assumes step-wise differences in responses are not equivalent (Figure 2).^17,18 A likelihood ratio test will be performed to determine which model to use. A p-value <0.05 indicates the use of PCM; otherwise, RSM will be used.¹⁹

Figure 2. — Response Structure for Equidistant and Non-equidistant Thresholds
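For reference, the two candidate models and the comparison between them are commonly written as shown here. The notation is an assumption introduced for illustration and is not taken from the manuscript: θ_n is the location of person n, δ_ik is the k-th threshold of item i, m_i is the highest category score of item i, τ_k is a threshold offset shared by all items, and ℓ denotes a maximized log-likelihood.

```latex
% Partial credit model (PCM): every item has its own thresholds \delta_{ik}
P(X_{ni} = x) =
  \frac{\exp\left( \sum_{k=0}^{x} (\theta_n - \delta_{ik}) \right)}
       {\sum_{h=0}^{m_i} \exp\left( \sum_{k=0}^{h} (\theta_n - \delta_{ik}) \right)},
  \qquad x = 0, 1, \dots, m_i,
  \qquad \sum_{k=0}^{0} (\theta_n - \delta_{ik}) \equiv 0

% Rating scale model (RSM): the PCM with a common threshold structure
\delta_{ik} = \delta_i + \tau_k

% Likelihood ratio statistic for the nested comparison (RSM within PCM)
\Lambda = -2 \left( \ell_{\mathrm{RSM}} - \ell_{\mathrm{PCM}} \right)
  \;\sim\; \chi^2_{\nu},
  \qquad \nu = \text{number of PCM parameters} - \text{number of RSM parameters}
```

Under this formulation the RSM is the special case of the PCM in which threshold spacing is identical across items, which is why a significant likelihood ratio favors the PCM.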

Threshold Ordering

A threshold is the point of intersection between adjacent categories in a questionnaire where either response option is equally likely to occur.²⁰ For example, a threshold for the question “How was the strength in your hand” would occur when a respondent feels that their hand strength lies at the midpoint between good and very good; the respondent then has an equal chance of selecting good or very good. We expect a reliable questionnaire to capture the full range of morbidity for the trait being measured. Thus, we expect a consistent and logical progression of responses across a logit scale, with each successive response category having the highest probability of being selected at some point along the continuum of severity.²¹ Inconsistent responses indicate disordered thresholds and occur when participants have difficulty distinguishing between responses.²² A category response curve will be generated for each item to evaluate threshold ordering. Disordered thresholds will be addressed by collapsing categories for analysis (e.g., combining agree and strongly agree into one option).
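The threshold idea can be made concrete with a minimal Python sketch of the partial credit model's category probabilities. This is an illustration only, not the eRm code used in the study; the function names and the threshold values are hypothetical. It shows that adjacent categories are equally likely when the person location equals a threshold, how ordering can be checked, and how categories might be recoded when collapsing.

```python
import math


def pcm_category_probabilities(theta, thresholds):
    """Return P(X = 0), ..., P(X = m) for one item under the partial credit model.

    theta: person location in logits.
    thresholds: item threshold locations (delta_1 .. delta_m) in logits.
    """
    # Category x accumulates (theta - delta_k) for k = 1..x; category 0 is fixed at 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


def thresholds_are_ordered(thresholds):
    """True when every threshold lies above the previous one."""
    return all(low < high for low, high in zip(thresholds, thresholds[1:]))


def collapse_categories(responses, mapping):
    """Recode raw category scores; {0: 0, 1: 1, 2: 2, 3: 2} merges the top two."""
    return [mapping[score] for score in responses]


# Hypothetical thresholds for a 4-category item (illustrative values only).
example_thresholds = [-1.0, 0.5, 2.0]
# A person located exactly at the second threshold (0.5 logits).
probabilities = pcm_category_probabilities(0.5, example_thresholds)
```

With the person placed at the second threshold, categories 1 and 2 receive the same probability, which is the definition of a threshold given above.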

Item-Fit

Rasch analysis evaluates how well observed data fits the expected behavior of the model.¹² Items misfit when they do not perform as expected. For example, the MHQ asks “How difficult was it to wash dishes.” If individuals with low hand function select “not at all difficult” whereas individuals with high hand function select “very difficult,” the item is not performing as intended and will be removed from analysis.

Item-fit will be assessed using fit residuals (the individual differences in observed and expected behavior of the model). The residual sum of squares will be calculated and Chi-square (X²) statistics will be derived. Statistically significant p-values indicate poor fit and these items will be removed from analysis.²³ Fit residuals measure overall item fit; however, the infit (item fit for questions that are similar in difficulty to person-ability²⁴) and outfit (item fit for all items, regardless of person ability¹⁶) mean square statistics (MNSQ) will be calculated to determine misfit for individual items. MNSQ scores between 0.6 and 1.4 indicate good fit to the Rasch model.²⁵
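As a rough illustration of the two mean square statistics, the sketch below uses their usual textbook definitions: outfit is the unweighted mean of squared standardized residuals, and infit weights each squared residual by the model variance. The function names and input values are hypothetical, and the model-expected scores and variances are assumed to come from an already fitted Rasch model.

```python
def infit_outfit_mnsq(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item.

    observed: observed item scores, one per respondent.
    expected: model-expected scores for the same respondents.
    variance: model variances of those scores (all assumed to be > 0).
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: plain average of squared standardized residuals (outlier-sensitive).
    outfit = sum(r / w for r, w in zip(squared_residuals, variance)) / len(squared_residuals)
    # Infit: squared residuals weighted by the model variance (information-weighted).
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


def within_fit_range(mnsq, lower=0.6, upper=1.4):
    """Check a mean square against the 0.6 to 1.4 range stated in the text."""
    return lower <= mnsq <= upper


# Residuals exactly as large as the model predicts: both statistics equal 1.
infit_ok, outfit_ok = infit_outfit_mnsq([1, 0, 1, 0], [0.5] * 4, [0.25] * 4)

# One unexpected response from a respondent with low model variance inflates
# outfit more than infit.
infit_odd, outfit_odd = infit_outfit_mnsq([1, 0, 1], [0.5, 0.5, 0.1], [0.25, 0.25, 0.09])
```

Values near 1 mean the observed variation matches the model. The second call illustrates why outfit is described as the more outlier-sensitive of the two.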

Targeting

Targeting refers to how well the items capture the full range of severity and ability level for the study population. To evaluate targeting, questions are stratified by item-difficulty from easiest to hardest. Item-difficulty is plotted against a person’s ability to score well on the question. Items with greater difficulty relative to person-ability are less likely to be affirmed by the participant.²⁶ A good questionnaire is composed of items with varying levels of difficulty that encompass the full range of person-ability. A person-item map will be created for each domain to evaluate targeting.
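A person-item map is a graphical tool, but the underlying comparison can be sketched numerically. The summary below is a hypothetical illustration and not a statistic reported in this study: it computes the share of person locations that fall inside the range spanned by the item thresholds, and the offset between the mean person and mean threshold locations. All names and values are assumptions.

```python
def targeting_summary(person_locations, item_thresholds):
    """Summarize how well item thresholds span the person locations (in logits)."""
    lowest, highest = min(item_thresholds), max(item_thresholds)
    covered = sum(1 for location in person_locations if lowest <= location <= highest)
    mean_person = sum(person_locations) / len(person_locations)
    mean_threshold = sum(item_thresholds) / len(item_thresholds)
    return {
        # Fraction of respondents located within the range measured by the items.
        "coverage": covered / len(person_locations),
        # Positive values mean respondents sit above the average item threshold.
        "mean_offset": mean_person - mean_threshold,
    }


# Hypothetical locations: one respondent (3.0 logits) lies above every threshold.
summary = targeting_summary([-1.0, 0.0, 1.0, 3.0], [-2.0, 0.0, 2.0])
```

Respondents outside the covered range are those for whom the domain has no informative items, which is what a poorly targeted person-item map shows visually.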

Differential-Item functioning (DIF)

The property of invariance states that individuals with identical hand performance will respond similarly regardless of gender, age, socioeconomic status (SES), etc. Questions that do not fulfill this criterion exhibit differential-item functioning (DIF) and may be biased toward a specific demographic.²⁷

There are two subcategories of DIF: uniform and non-uniform. Uniform DIF occurs when differences in group responses are consistent across the category and non-uniform DIF occurs when this consistency is absent.²³ For example, the MHQ asks how difficult participants find washing dishes. If dominant hand contributes 1 unit of difficulty to dish washing and injured hand also contributes 1 unit of difficulty, then we expect the total difficulty of both effects combined to be 2 units. This is uniform DIF. Non-uniform DIF occurs if combining the two factors results in a non-additive sum to the task (i.e. combining dominant and injured hands makes washing dishes more difficult than adding their separate difficulty directly).

Uniform DIF is addressed by separating demographic groups and independently calibrating the scale for each group.²⁶ Items with non-uniform DIF are removed from analysis. This study will test DIF by injured hand (dominant or non-dominant). Three models will be generated: one assuming no DIF, one assuming uniform DIF, and one assuming non-uniform DIF. Nonsignificant differences between models demonstrate an absence of DIF.²⁸ A significant difference between models 1 and 2 indicates uniform DIF, and a significant difference between models 2 and 3 indicates non-uniform DIF.
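The three-model comparison can be sketched as two nested likelihood ratio tests. The example below is a simplified illustration under stated assumptions, not the procedure coded for this study: it assumes each step adds exactly one parameter (a group effect, then a group-by-ability interaction), so each test has one degree of freedom. The log-likelihood values and function names are hypothetical.

```python
import math


def lr_test_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models that differ by one parameter."""
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def classify_dif(loglik_no_dif, loglik_uniform, loglik_nonuniform, alpha=0.05):
    """Apply the decision rule: models 1 vs 2 (uniform), models 2 vs 3 (non-uniform)."""
    _, p_uniform = lr_test_one_df(loglik_no_dif, loglik_uniform)
    _, p_nonuniform = lr_test_one_df(loglik_uniform, loglik_nonuniform)
    if p_nonuniform < alpha:
        return "non-uniform DIF"
    if p_uniform < alpha:
        return "uniform DIF"
    return "no DIF"


# Hypothetical log-likelihoods for three items.
item_without_dif = classify_dif(-100.0, -99.9, -99.8)
item_with_uniform_dif = classify_dif(-100.0, -95.0, -94.9)
item_with_nonuniform_dif = classify_dif(-100.0, -99.9, -95.0)
```

Checking the non-uniform comparison first reflects the handling described above: items with non-uniform DIF are removed, whereas uniform DIF can be handled by calibrating groups separately.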

Local Independence

Once the Rasch factor has been extracted and all misfitting items have been removed, a test for local independence is conducted to verify unidimensionality of the domain. Response dependency occurs when one question influences the response to another question, and will be tested by evaluating residual correlations between each item in a domain. The absence of large correlations implies local independence.²⁹ Additionally, the Martin-Löf test will be performed to test unidimensionality.³⁰ A p-value <0.05 suggests the domain measures multiple constructs.
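The residual-correlation check can be sketched as follows. This is a minimal illustration with hypothetical item names and residuals. The cutoff is left as a required argument because the manuscript does not state the value used to define a large correlation; the 0.3 in the example is an arbitrary placeholder.

```python
import math
from itertools import combinations


def pearson_correlation(x, y):
    """Pearson correlation between two equally long lists of residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(spread_x * spread_y)


def flag_dependent_pairs(residuals_by_item, cutoff):
    """Return (item_a, item_b, r) for every item pair with |r| above the cutoff."""
    flagged = []
    for (name_a, res_a), (name_b, res_b) in combinations(residuals_by_item.items(), 2):
        r = pearson_correlation(res_a, res_b)
        if abs(r) > cutoff:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical standardized residuals for three items and four respondents.
residuals = {
    "item_a": [1.0, -1.0, 1.0, -1.0],
    "item_b": [2.0, -2.0, 2.0, -2.0],
    "item_c": [1.0, 1.0, -1.0, -1.0],
}
flagged_pairs = flag_dependent_pairs(residuals, cutoff=0.3)
```

Here only item_a and item_b are flagged, because their residuals move together after the Rasch factor has been accounted for.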

Reliability

Cronbach’s alpha measures the internal consistency of a questionnaire. Cronbach’s α >0.70 indicates good internal consistency and Cronbach’s α >0.90 indicates excellent internal consistency with some redundancy.³¹
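For readers unfamiliar with the statistic, Cronbach's alpha can be computed from item-level scores using the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores). The sketch below uses made-up scores and hypothetical function names, and the labels follow the 0.70 and 0.90 benchmarks given above.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a list of per-item score lists (one score per respondent)."""
    k = len(item_scores)
    # Total score for each respondent across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


def describe_alpha(alpha):
    """Label alpha using the benchmarks stated in the text."""
    if alpha > 0.90:
        return "excellent (possible redundancy)"
    if alpha > 0.70:
        return "good"
    return "below the 0.70 benchmark"


# Two items that rank four respondents identically give the maximum alpha of 1.
alpha_identical = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
```

Perfectly redundant items produce an alpha of 1, which is why values above 0.90 are read as excellent consistency with some redundancy.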

RESULTS

Study population

The analysis cohort included 232 WRIST participants who completed the 6-month MHQ (Table 1). Average age was 71 years; 87% of participants were female and 85% were white. Approximately 24% of participants were treated with a volar locking plate system (VLPS), 22% with external fixation, 21% with pinning, and 33% with casting. By 6 months, participants had good hand function, high satisfaction, and little pain (Table 2). Aesthetics had the highest domain score, whereas Function had the lowest level of hand performance. This is expected because DRFs result in minor changes to hand appearance and major reductions in hand/wrist motion. The likelihood ratio test was significant for all domains, so the PCM was used.

Table 1.

Baseline demographic and clinical characteristics


n	232
Female, n (%)	201 (87%)
Age, mean (SD)	71 (8.8)
median (range)^b	69 (58-97)
Race, n (%)
American Indian or Alaskan Native	1 (0%)
Asian	15 (6%)
Black	13 (5%)
Native Hawaiian or Pacific Islander	1 (0%)
White	198 (85%)
other	3 (1%)
missing	1 (0%)
Education, n (%)
High school diploma/GED or less	77 (33%)
Vocational school/ associate’s degree/some college	70 (30%)
Bachelor’s degree+	80 (34%)
missing	5 (2%)
Employment – baseline, n (%)
Full-time	39 (17%)
Part-time	33 (14%)
Retired	145 (63%)
Receiving disability	5 (2%)
Unemployed	9 (4%)
missing	1 (0%)
Household Income, n (%)
<$20,000	42 (18%)
$20,000 - $39,000	58 (25%)
$40,000 - $59,999	39 (17%)
$60,000+	67 (29%)
missing	26 (11%)
Functional Status – pre-injury, n (%)
Sedentary	21 (9%)
Under-active	114 (49%)
Active	96 (41%)
missing	1 (0%)
No. comorbidities, mean (SD)	3.5 (2.5)
median (range)	3 (0-12)
Treatment, n (%)
VLPS	56 (24%)
External Fixation	51 (22%)
Pinning	48 (21%)
Casting	77 (33%)

Table 2.

6-month MHQ scores

	mean (95% CI)	range
Function	71.5 (68.8-74.1)	5-100
ADLs	81.0 (78.2-83.7)	2-100
Work	76.6 (73.3-80.0)	0-100
Pain	22.7 (20.0-25.5)	0-95
Aesthetics	81.3 (78.6-84.0)	6-100
Satisfaction	76.7 (74.2-79.2)	11-100

Function Domain

Rasch analysis demonstrated excellent item fit (Table 3) with no disordered thresholds and adequate targeting to the study population (Figure 3a). No DIF was observed for dominant/non-dominant hand injury and no unusual patterns were observed in the residuals indicating local independence. Function was unidimensional and showed excellent reliability with a Cronbach’s alpha of 0.94 (Table 4). Overall, Function had excellent fit to the Rasch model.

Table 3:

Updated Fit to the Rasch model

Function Domain
Item	Item Statement	MNSQ Infit^a	MNSQ Outfit^b	Chi Square (X²)	P-value	DOF^c
1	How well did your hand work?	0.539	0.567	112.748	1	208
2	How well did your fingers move?	0.766	0.874	160.168	0.994	208
3	How well did your wrist move?	0.704	0.723	147.084	1	208
4	How was the strength in your hand?	0.883	0.869	184.479	0.878	208
5	How was the sensation (feeling) in your hand?	1.213	1.012	253.52	0.017	208
Activities of Daily Living Domain
Item	Item Statement	MNSQ Infit^a	MNSQ Outfit^b	Chi Square (X²)	P-value	DOF^c
1	Turn a door knob?	0.724	0.768	131.115	0.998	180
2	Pick up a coin?	1.052	0.941	190.406	0.283	180
3	Hold a glass of water?	0.597	0.859	108.124	1	180
4	Turn a key in a lock?	0.981	1.013	177.548	0.538	180
5	Hold a frying pan?	0.959	1.005	173.54	0.622	180
6	Open a Jar?	Item Misfit
7	Button a shirt/blouse?	1.167	1.076	211.213	0.056	180
8	Eat with a knife/fork?	1.11	1.135	200.874	0.137	180
9	Carry a grocery bag?	1.171	1.02	211.91	0.052	180
10	Wash dishes?	0.819	0.845	148.149	0.96	180
11	Wash your hair?	0.665	0.814	120.369	1	180
12	Tie shoelaces/knots?	0.614	0.757	111.102	1	180
Work Domain
Item	Item Statement	MNSQ Infit^a	MNSQ Outfit^b	Chi Square (X²)	P-value	DOF^c
1	How often were you unable to work because of problems with your hand(s)/wrist(s)?	Item Misfit
2	How often did you have to shorten your work day because of problems with your hand(s)/wrist(s)?	1.148	0.99	163.021	0.099	141
3	How often did you have to take it easy at your work because of problems with your hand(s)/wrist(s)?	0.704	0.707	100.032	0.996	141
4	How often did you accomplish less in your work because of problems with your hand(s)/wrist(s)?	0.606	0.611	86.097	1	141
5	How often did you take longer to do the tasks in your work because of problems with your hand(s)/wrist(s)?	0.695	0.738	98.67	0.997	141
Pain Domain
Item	Item Statement	MNSQ Infit^a	MNSQ Outfit^b	Chi Square (X²)	P-value	DOF^c
1	How often did you have pain in your hand(s)/wrist(s)?	0.861	1.067	146.408	0.895	169
2	Please describe the pain you had in your hand(s)/wrist(s).	Non-uniform DIF
3	How often did the pain in your hand(s)/wrist(s) interfere with your sleep?	0.594	0.775	100.988	1	169
4	How often did the pain in your hand(s)/wrist(s) interfere with your daily activities (such as eating or bathing)?	0.537	0.695	91.304	1	169
5	How often did the pain in your hand(s)/wrist(s) make you unhappy?	0.583	0.709	99.085	1	169
Aesthetics Domain
Item	Item Statement	MNSQ Infit^a	MNSQ Outfit^b	Chi Square (X²)	P-value	DOF^c
1	I am satisfied with the appearance (look) of my hand.	0.945	0.862	131.815	0.88	152
2	The appearance (look) of my hand sometimes made me uncomfortable in public.	0.511	0.451	68.938	1	152
3	The appearance (look) of my hand made me depressed.	Non-uniform DIF
4	The appearance (look) of my hand interfered with my normal social activities.	0.744	0.638	97.625	1	152
Satisfaction Domain
Item	Item Statement	MNSQ Infit^a	MNSQ Outfit^b	Chi Square (X²)	P-value	DOF^c
1	Overall function of your hand?	Non-uniform DIF
2	Motion of the fingers in your hand?	0.63	0.727	114.586	1	181
3	Motion of your wrist?	0.811	0.871	147.598	0.967	181
4	Strength of your hand?	0.834	0.866	151.775	0.944	181
5	Pain level of your hand?	0.726	0.737	132.07	0.998	181
6	Sensation (feeling) of your hand?	0.933	0.887	169.79	0.715	181

^a) Infit MNSQ = information-weighted mean square statistic

^b) Outfit MNSQ = outlier-sensitive mean square statistic

^c) DOF = degrees of freedom

Figure 3. — Person-Item Distribution and Updated Threshold Structure for Each Domain After Collapsing Categories and Removing Misfitting Items

Table 4:

Overall Domain Characteristics

Domain	Cronbach’s α	Martin-Löf Test (p-value)
Function	0.941	0.999
ADL	0.942	1
Work	0.963	0.999
Pain	0.889	0.61
Aesthetics^a	0.746	N/A
Satisfaction	0.924	0.745

^a) Data reported for Aesthetics include only items 1, 2, and 4, which are too few to test for a unidimensional construct.

Activities of Daily Living (ADL) Domain

Three single hand items (pick up a coin, hold a glass of water, and hold a frying pan), and all two hand items had disordered thresholds with narrow gaps between “moderately difficult” and “very difficult”. Item 11 (wash your hair) showed difficulty discriminating between “somewhat difficult”, “moderately difficult”, and “very difficult”. Additionally, item 6 (open a jar) had poor fit with a p-value <0.001 (See Table, Supplemental Digital Content 1, which shows Initial Fit to the Rasch Model).

To improve fit, item 6 was removed and disordered thresholds were collapsed. Rasch analysis was repeated and showed excellent item fit (Table 3) with appropriate targeting to the study population (Figure 3b). DIF was absent for the affected hand and no abnormal correlations were observed in the residuals indicating local independence. ADLs was unidimensional and showed high reliability with a Cronbach’s alpha of 0.94 (Table 4). After collapsing disordered thresholds and removing item 6, ADLs demonstrated excellent fit to the Rasch model.

Work Domain

No disordered thresholds were observed in Work. Item 1 (how often were you unable to work because of problems with your hand/wrist) misfit the Rasch model with a p-value <0.001 (See Table, Supplemental Digital Content 1). Following removal of this item, Work showed good item fit (Table 3) and well-targeted person-item distributions (Figure 3c). No DIF occurred for dominance of injured hand and no local dependency was observed. The Martin-Löf test showed unidimensionality and Cronbach’s alpha was 0.96 indicating excellent reliability (Table 4). Overall, after removal of item 1, Work showed excellent fit to the Rasch model.

Pain Domain

All items in Pain, with the exception of item 1 (how often did you have pain in your hand), showed disordered thresholds with difficulty distinguishing between “moderate” and “severe”. Additionally, item 2 (please describe the pain you had in your hand/wrist) demonstrated non-uniform DIF indicating discrepancies in responses based on dominant or non-dominant hand injury. Item 2 was removed and disordered thresholds were collapsed for items 3, 4, and 5.

Rasch analysis was repeated and demonstrated excellent item fit (Table 3) and appropriate targeting to the study population (Figure 3d). Following removal of item 2, no DIF was observed for affected hand and no response dependency occurred. The Martin-Löf test showed unidimensionality and Cronbach’s alpha was 0.89 indicating high reliability (Table 4). Overall, after removing item 2 and collapsing thresholds, Pain showed high fit to the Rasch model.

Aesthetics Domain

Preliminary analysis showed item 4 (the appearance of my hand interfered with my normal social activities) had disordered thresholds with difficulty discriminating among “strongly agree”, “agree”, and “neither agree nor disagree”. Additionally, item 1 (I am satisfied with the appearance (look) of my hand) fit the Rasch model poorly (See Table, Supplemental Digital Content 1), whereas item 3 (the appearance (look) of my hand made me depressed) showed non-uniform DIF. Items 1 and 3 were independently analyzed for misfit. After removing item 3, item 1 fit well to the Rasch model indicating the non-uniform DIF of item 3 may have caused misfit for item 1. Rasch analysis was repeated following removal of item 3 and collapsing of thresholds for item 4.

Secondary analysis showed excellent item fit (Table 3) and good targeting to the study population (Figure 3e). High residual correlations were observed indicating response dependency. Cronbach’s alpha was 0.75 indicating high reliability (Table 4). Following removal of item 3, Aesthetics did not contain enough items to test for unidimensionality or DIF. Overall, Aesthetics poorly fit the Rasch model.

Satisfaction Domain

Rasch analysis showed all items had disordered thresholds with participants having difficulty discriminating between “neither satisfied nor dissatisfied” and “somewhat dissatisfied”. After collapsing categories, all items showed excellent fit to the Rasch model. However, non-uniform DIF was observed for item 1 (satisfaction with the overall function of your hand) indicating unusual discrepancies in responses based on dominant or non-dominant hand injury.

Rasch analysis was repeated following removal of item 1. Results showed excellent item-fit (Table 3) with good targeting to the study population (Figure 3f). No DIF or response dependency was observed. The Martin-Löf test demonstrated unidimensionality and the internal consistency was excellent with a Cronbach’s alpha of 0.92 (Table 4). Overall, after collapsing disordered thresholds and removing item 1, Satisfaction showed excellent fit to the Rasch model.

DISCUSSION

The results of this study demonstrate the MHQ is a valid and reliable tool for assessing hand outcomes following DRFs and is well-targeted to the WRIST population. Verifying the ability of the MHQ to accurately measure outcomes facilitates more confident interpretation of the results from this trial. However, as with many PROMs, several adjustments were needed to convert the MHQ into an interval level measurement. Specifically, 21/37 items had disordered thresholds. Disordered thresholds appear to occur more often at the higher/better performance end of the spectrum. The two domains with the lowest mean scores at 6 months (Work and Function) had no disordered thresholds, indicating appropriate utilization of categories by participants with lower hand performance.

In addition to disordered thresholds, a few items misfit the Rasch model. In ADLs, the item “How difficult was it for you to open a jar” had poor fit to the model. This question has shown similar misfit in other instruments assessing patients with musculoskeletal disorders.^32,33 Due to this consistent misfit, the inclusion of this task in future questionnaires should be carefully considered. In Work, the item “How often were you unable to work because of problems with your hand/wrist” misfit the model as well. This could be because most participants in WRIST were retired and, thus, their injury had little impact on their ability to work.

Item 2 in Pain (please describe the pain in your hand/wrist) and item 1 in Satisfaction (satisfaction with the overall function of your hand) demonstrated non-uniform DIF, indicating that individuals with the same hand performance responded differently to these items based on dominance of the injured hand. Previous studies show right-handed individuals may have higher pain thresholds and increased reliance on their dominant hand for grip strength than left-hand dominant individuals.³⁴ For these individuals, loss of dominant hand function could result in more severe reduction in perceived hand performance for qualities such as pain or functional satisfaction. Finally, Aesthetics showed response dependency, and the item “The appearance of my hand made me depressed” showed non-uniform DIF. Hand appearance is not often changed after DRF, especially 6 months after treatment when casts or fixators have been removed and scars have faded. For this reason, hand appearance may not be as important for measuring DRF outcomes. Although Aesthetics did not fit well, it had a Cronbach’s alpha of 0.75, indicating high reliability and functional utility of this domain, and it could be more applicable in appearance-altering conditions such as rheumatoid arthritis.

Following removal of misfitting items and collapsing of disordered thresholds, five domains (Function, ADL, Work, Pain, and Satisfaction) showed excellent fit and one domain (Aesthetics) showed poor fit to the Rasch model. These results suggest that the domains of the MHQ do not contribute equally to DRF outcomes and that individual domain scores are more pertinent than the summary MHQ score. Additionally, MHQ scores should be interpreted on a condition-specific basis.

We recommend performing additional Rasch analyses in other populations to identify items and domains that consistently misfit the Rasch model and which may not be essential for measuring outcomes related to hand trauma or other conditions. This will allow clinicians to administer domains that are most relevant for measuring each condition. This will increase the clinical applicability of the MHQ by reducing patient burden and leading to more accurate interpretation of the results. We encourage future investigators to consider a domain-specific approach to measuring outcomes for a broad range of conditions and outcome instruments.

One limitation of this study is that the WRIST cohort comprised adults 60 years and older, so results may not be generalizable to other populations. Another limitation is that our DRF population was primarily white and female, so we were unable to perform subgroup analyses based on sex or race. It is possible that these or other demographic factors influence participants’ responses. Finally, model fit was achieved by converting the MHQ into an interval measurement, which is not feasible in clinical practice.

CONCLUSION

The MHQ is a robust analytical tool for assessing hand outcomes following DRFs in older adults. Rasch analysis of the MHQ showed excellent fit to the model for all domains except Aesthetics. Future studies should analyze the MHQ in other conditions to improve clinical interpretation of this widely-used questionnaire.

Acknowledgments

Financial Disclosure: Research reported in this publication was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases and the National Institute on Aging of the National Institutes of Health under Award Number R01 AR062066 (To KCC). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Footnotes

Supplemental Information

Supplemental Digital Content 1. See Table, which shows Initial Fit to the Rasch Model.

REFERENCES

1.Gibbons E, Hewitson P, Morley D, Jenkinson C, Fitzpatrick R. The Outcomes and Experiences Questionnaire: development and validation. Patient Relat Outcome Meas. 2015;6:179–189. Published 2015 July16. doi: 10.2147/PROM.S82784 [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Dacombe PJ, Amirfeyz R, Davis T. Patient-Reported Outcome Measures for Hand and Wrist Trauma: Is There Sufficient Evidence of Reliability, Validity, and Responsiveness?. Hand (N Y). 2016;11(1):11–21. doi: 10.1177/1558944715614855 [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Waljee JF, Curtin C. Quality assessment in hand surgery. Hand Clin. 2014;30(3):329–vi. doi: 10.1016/j.hcl.2014.04.009 [DOI] [PubMed] [Google Scholar]
4.Chung KC, Pillsbury MS, Walters MR, Hayward RA. Reliability and validity testing of the Michigan Hand Outcomes Questionnaire. J Hand Surg Am. 1998;23(4):575–587. doi: 10.1016/S0363-5023(98)80042-7 [DOI] [PubMed] [Google Scholar]
5. https://mchoirresearch.wixsite.com/themhq/translations.
6.Nolte MT, Shauver MJ, Chung KC. Normative Values of the Michigan Hand Outcomes Questionnaire for Patients with and without Hand Conditions. Plast Reconstr Surg. 2017;140(3):425e–433e. doi: 10.1097/PRS.0000000000003581 [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Waljee JF, Chung KC, Kim HM, et al. Validity and responsiveness of the Michigan Hand Questionnaire in patients with rheumatoid arthritis: a multicenter, international study. Arthritis Care Res (Hoboken). 2010;62(11):1569–1577. doi: 10.1002/acr.20274 [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Performance of the Michigan Hand Outcomes Questionnaire in hand osteoarthritis Kroon FPB et al. Osteoarthritis and Cartilage, Volume 26, Issue 12, 1627–1635 [DOI] [PubMed] [Google Scholar]
9.Schouffoer Anne A., van der Giesen Florus J., Beaart-van de Voorde Liesbeth J. J., Wolterbeek Ron, Huizinga Tom W. J., Vliet Vlieland Theodora P. M., Validity and responsiveness of the Michigan Hand Questionnaire in patients with systemic sclerosis, Rheumatology, Volume 55, Issue 8, August2016, Pages 1386–1393, 10.1093/rheumatology/kew016 [DOI] [PubMed] [Google Scholar]
10.Hobart J, Cano S. Improving the evaluation of therapeutic interventions in multiple sclerosis: the role of new psychometric methods. Health Technol Assess 2009;13(12) [DOI] [PubMed] [Google Scholar]
11.Hobart JC, Cano SJ, Zajicek JP, Thompson AJ. Rating scales as outcome measures for clinical trials in neurology: problems, solutions, and recommendations [published correction appears in Lancet Neurol. 2008 Jan;7(1):25]. Lancet Neurol. 2007;6(12):1094–1105. doi: 10.1016/S1474-4422(07)70290-9 [DOI] [PubMed] [Google Scholar]
12.Cano SJ, Mayhew A, Glanzman AM, et al. Rasch analysis of clinical outcome measures in spinal muscular atrophy. Muscle Nerve. 2014;49(3):422–430. doi: 10.1002/mus.23937 [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Levin LS. Wrist Fractures in Patients 60 Years or Older—To Plate or Cast? JAMA Netw Open. 2019;2(1):e187078. doi: 10.1001/jamanetworkopen.2018.7078 [DOI] [PubMed] [Google Scholar]
14. https://clinicaltrials.gov/ct2/show/NCT01589692.
15. Chung KC, Kim HM, Malay S, Shauver MJ; Wrist and Radius Injury Surgical Trial Group. The Wrist and Radius Injury Surgical Trial: 12-Month Outcomes from a Multicenter International Randomized Clinical Trial. Plast Reconstr Surg. 2020;145(6):1054e–1066e. doi: 10.1097/PRS.0000000000006829
16. Tennant A, Conaghan PG. The Rasch measurement model in rheumatology: what is it and why use it? When should it be applied, and what should one look for in a Rasch paper? Arthritis Rheum. 2007;57(8):1358–1362. doi: 10.1002/art.23108
17. Andrich D. A rating formulation for ordered response categories. Psychometrika. 1978;43:561–573. doi: 10.1007/BF02293814
18. Masters GN. A Rasch model for partial credit scoring. Psychometrika. 1982;47:149–174. doi: 10.1007/BF02296272
19. Hamilton CB, Chesworth BM. A Rasch-Validated Version of the Upper Extremity Functional Index for Interval-Level Measurement of Upper Extremity Function. Physical Therapy. 2013;93(11):1507–1519. doi: 10.2522/ptj.20130041
20. Stewart-Brown S, Tennant A, Tennant R, et al. Internal construct validity of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS): a Rasch analysis using data from the Scottish Health Education Population Survey. Health Qual Life Outcomes. 2009;7:15. doi: 10.1186/1477-7525-7-15
21. Pusic AL, Klassen AF, Scott AM, Klok JA, Cordeiro PG, Cano SJ. Development of a new patient-reported outcome measure for breast surgery: the BREAST-Q. Plast Reconstr Surg. 2009;124(2):345–353. doi: 10.1097/PRS.0b013e3181aee807
22. Lundström M, Pesudovs K. Catquest-9SF Patient Outcomes Questionnaire. Journal of Cataract & Refractive Surgery. 2009;35(3):504–513. doi: 10.1016/j.jcrs.2008.11.038
23. Pallant JF, Tennant A. An introduction to the Rasch measurement model: an example using the Hospital Anxiety and Depression Scale (HADS). Br J Clin Psychol. 2007;46(Pt 1):1–18. doi: 10.1348/014466506x96931
24. Tan SK, Chellappan K. Assessing the Validity and Reliability of the Self-Efficacy Questionnaire for Children (SEQ–C) Among Malaysian Adolescents: Rasch Model Analysis. Measurement and Evaluation in Counseling and Development. 51(3):179–192. doi: 10.1080/07481756.2018.1435192
25. Bond TG, Fox CM. Applying the Rasch Model: Fundamental measurement in the human sciences. 3rd ed. New York, NY: Routledge/Taylor and Francis Group.
26. Boone WJ. Rasch Analysis for Instrument Development: Why, When, and How? CBE Life Sci Educ. 2016;15(4):rm4. doi: 10.1187/cbe.16-04-0148
27. Hagquist C, Bruce M, Gustavsson JP. Using the Rasch model in nursing research: an introduction and illustrative example. Int J Nurs Stud. 2009;46(3):380–393. doi: 10.1016/j.ijnurstu.2008.10.007
28. Robinson M, Johnson AM, Walton DM, et al. A comparison of the polytomous Rasch analysis output of RUMM2030 and R (ltm/eRm/TAM/lordif). BMC Med Res Methodol. 2019;19:36. doi: 10.1186/s12874-019-0680-5
29. Hagell P. Testing Rating Scale Unidimensionality Using the Principal Component Analysis (PCA) Test Protocol with the Rasch Model: The Primacy of Theory over Statistics. Open Journal of Statistics. 4(6):456–465. doi: 10.4236/ojs.2014.46044
30. Verguts T, De Boeck P. A note on the Martin-Löf test for unidimensionality. Methods of Psychological Research Online. 5:77–82.
31. Tavakol M, Dennick R. Making sense of Cronbach's alpha. Int J Med Educ. 2011;2:53–55. Published 2011 Jun 27. doi: 10.5116/ijme.4dfb.8dfd
32. Haugen IK, Moe RH, Slatkowsky-Christensen B, Kvien TK, van der Heijde D, Garratt A. The AUSCAN subscales, AIMS-2 hand/finger subscale, and FIOHA were not unidimensional scales. J Clin Epidemiol. 2011;64(9):1039–1046. doi: 10.1016/j.jclinepi.2010.11.013
33. Chen CC, Bode RK. Psychometric validation of the Manual Ability Measure-36 (MAM-36) in patients with neurologic and musculoskeletal disorders. Arch Phys Med Rehabil. 2010;91(3):414–420. doi: 10.1016/j.apmr.2009.11.012
34. Ozcan A, Tulum Z, Pinar L, Başkurt F. Comparison of pressure pain threshold, grip strength, dexterity and touch pressure of dominant and non-dominant hands within and between right-and left-handed subjects. J Korean Med Sci. 2004;19(6):874–878. doi: 10.3346/jkms.2004.19.6.874

Associated Data

Supplementary Materials

Supplemental Digital Content 1

NIHMS1704502-supplement-Supplemental_Digital_Content_1.pdf (64.7 KB, PDF)
