Bone & Joint Open. 2026 Jan 22;7(1):90–101. doi: 10.1302/2633-1462.71.BJO-2025-0226.R1

Total hip arthroplasty restores population health-related quality of life norms

a propensity-matched study with mediation analysis of BMI

Andrew D Ablett 1, Liam Zen Yapp 1, Nick D Clement 1, Chloe E H Scott 1,2,3
PMCID: PMC12824939  PMID: 41568542

Abstract

Aims

This study compares health-related quality of life (HRQoL) between patients undergoing primary total hip arthroplasty (THA) for osteoarthritis (OA) and a propensity-matched general population cohort. We also aimed to clarify the relationship between BMI and postoperative improvements, mediated via preoperative HRQoL.

Methods

In this retrospective study using the Edinburgh Arthroplasty database (1 January 2013 to 31 December 2022; n = 3,495) and Health Survey for England data (2010 to 2012; n = 25,320), propensity score matching (1:1) was performed based on age, sex, and BMI. The primary outcome was EuroQol five-dimension three-level questionnaire (EQ-5D-3L) index score. Secondary outcomes included EuroQol-visual analogue scale (EQ-VAS) and mediation analysis examining how preoperative EQ-5D-3L mediated the relationship between BMI and postoperative improvement.

Results

Preoperatively, THA patients had significantly lower EQ-5D-3L scores than the matched general population (median difference: 0.280, bootstrapped 95% CI 0.258 to 0.306; p < 0.001). At one-year follow-up, THA patients exceeded population norms (THA median: 0.814 vs general population: 0.796, p = 0.014). Patients aged > 85 years showed the greatest magnitude of improvement, restoring EQ-5D-3L scores to a level equivalent to that of their age-matched general population peers (preoperative: 0.189 vs postoperative: 0.796, general population: 0.696). Mediation analysis revealed that BMI’s negative direct effect on improvements in EQ-5D-3L was counterbalanced by stronger indirect effects transmitted through preoperative scores (indirect effects: obesity I (30 to 34.9 kg/m2): β = 0.038, p < 0.001; obesity II (35 to 39.9 kg/m2): β = 0.086, p < 0.001; obesity III (≥ 40 kg/m2): β = 0.123, p < 0.001).

Conclusion

THA was shown to restore HRQoL to that expected of a matched normal population, but in younger patients this was less than expected. Patients aged > 85 years had the greatest magnitude of restoration. Postoperative HRQoL improvement was predominantly influenced by preoperative functional status, rather than BMI alone. These findings challenge current BMI-based eligibility thresholds and support surgical prioritization based on functional impairment severity.

Cite this article: Bone Jt Open 2026;7(1):90–101.

Keywords: Total hip arthroplasty, EQ-5D-3L, EQ-VAS, General population, Propensity score matched, Mediation analysis, BMI, total hip arthroplasty (THA), obesity, propensity score matching, visual analogue scale, arthroplasty, primary total hip arthroplasty, osteoarthritis (OA), comorbidities

Introduction

End-stage hip osteoarthritis (OA) profoundly diminishes health-related quality of life (HRQoL), with one-fifth of patients reporting a subjective health state ‘worse than death’.1,2 While joint registry studies have captured the effects total hip arthroplasty (THA) has on HRQoL,3-5 the magnitude of these improvements relative to population norms remains poorly defined. This represents a critical gap, as such comparisons provide essential context for quantifying both the burden of disease preoperatively and the degree to which THA restores HRQoL to levels of the general population.

The relationship between BMI and arthroplasty outcomes remains contentious, as current practice often employs arbitrary BMI thresholds for surgical eligibility.6,7 Patients with elevated BMI typically begin with lower baseline HRQoL scores, yet experience greater relative improvements compared to patients with a normal BMI.8-10 Arbitrary BMI thresholds may therefore result in the exclusion of patients who substantially benefit from arthroplasty.11 Farrow et al12 suggested that preoperative functional status may better predict postoperative improvements and guide resource allocation. However, the causal mechanisms by which BMI influences THA outcomes remain incompletely understood.

This study addresses these gaps through three objectives: 1) to quantify the HRQoL deficit in patients awaiting THA compared to a propensity score matched general population cohort; 2) to determine whether THA restores HRQoL to age, sex, and BMI-matched population norms; and 3) to clarify the causal pathways through which BMI influences postoperative improvement, transmitted through preoperative scores. Through investigating these relationships, we aim to provide an evidence-based framework to aid in guiding patient prioritization and targeted preoperative optimization, while providing information for expectation-setting across diverse demographic groups.

Methods

Study design

This retrospective study compares EuroQol five-dimension three-level questionnaire (EQ-5D-3L)13 scores in patients undergoing primary THA for OA and the general population. Data for analyses were sourced from two distinct databases: pre- and postoperative patient data were extracted from the Edinburgh Orthopaedic Research Database (2013 to 2022), whereas population-level EQ-5D-3L scores were obtained from the Health Survey for England (HSE, 2010 to 2012). This study follows the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines.14

Primary THA cohort

The Edinburgh Orthopaedic Research Database was prospectively compiled using data collected between 1 January 2013 and 31 December 2022. The THA cohort was composed of patients undergoing primary THA for OA who completed EQ-5D-3L questionnaires during preoperative assessment and at one-year follow-up (Research Ethics Committee approval: 20/SS/0125). A flow diagram of patients included is displayed in Figure 1.

Fig. 1.

A flowchart showing how survey and clinical datasets were processed, filtered, imputed and analysed using propensity score matching and mediation analysis, leading to final tables and figures describing pre- and postoperative outcomes. The figure is a flowchart illustrating how two datasets, a general population health survey and an orthopaedic database of hip arthroplasty patients, were filtered and combined for two analyses. Both datasets show exclusions before forming a general population group and a surgical cohort. Each group undergoes multiple imputations with chained equations. For Analysis 1, the chart follows the surgical and population groups through propensity score matching, pooling of results, and the production of tables and figures comparing pre- and postoperative data with population norms. For Analysis 2, the surgical group proceeds through a mediation analysis examining the relationship between body mass index, preoperative health-related quality of life and postoperative improvement, before results are pooled and additional tables and figures created.

Flow diagram of included patients and analysis pathway. HRQoL, health-related quality of life; HSE, Health Survey for England; OA, osteoarthritis; THA, total hip arthroplasty.

General population cohort

Annually, the National Centre for Social Research conducts a national survey aimed at monitoring trends in health, lifestyle, and wellbeing across adults and children living in England to produce the HSE. The responses from this survey are representative of the general population and can be accessed through the UK Data Service.15-17 HSE data (2010 to 2012) were used for direct comparability due to routine EQ-5D-3L collection. Inclusion was restricted to respondents aged ≥ 16 years (Figure 1).

Patient characteristics

Baseline characteristics of matched groups are displayed in Table I.

Table I.

Propensity score matched cohorts 1:1 – general population (n = 25,320) vs preoperative group (n = 3,495).

| Covariates | General population, unmatched | General population, matched | THA cohort | SMD, unmatched | SMD, matched |
|---|---|---|---|---|---|
| Total, n | 25,320 | 3,495 | 3,495 | | |
| Median age, yrs (IQR) | 49.0 (35.0 to 64.0) | 69.0 (61.0 to 76.0) | 69.1 (61.2 to 76.3) | 0.783 | 0.003 |
| Male | 11,204 | 1,393 | 1,393 | 0.059 | 0.000* |
| Female | 14,116 | 2,102 | 2,102 | 0.059 | 0.000* |
| Median BMI, kg/m2 (IQR) | 26.7 (23.7 to 30.3) | 27.9 (24.9 to 31.4) | 27.6 (24.5 to 31.2) | 0.113 | 0.027 |

Standardized mean difference (SMD) values represent effect sizes indicating balance between groups; SMD < 0.1 indicates good balance.

*Exact matching was used for sex (SMD = 0.000).

THA, total hip arthroplasty.

Primary outcome

Our primary objective was to compare the difference in EQ-5D-3L scores between the THA and general population, pre- and postoperatively. EQ-5D-3L scores were calculated from the five dimensions: mobility, pain/discomfort, anxiety/depression, ability to perform daily activities, and self-care.18 The EQ-5D-3L scores were calculated by applying UK-specific value sets to the health state data, reflecting the relative importance the UK population assigns to different health conditions.19 Scores were generated using the Time Trade-Off value set for the UK,20 with values ranging from -0.594 to 1, where 1 represents ‘perfect health’ and negative values indicate health states considered worse than death.
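The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the decrements are the UK time trade-off tariff as it is commonly published and should be checked against the official value set, and the function name `eq5d3l_index` and the five-digit profile string (ordered mobility, self-care, usual activities, pain/discomfort, anxiety/depression) are assumptions made for the example.

```python
# UK time trade-off tariff decrements for the EQ-5D-3L as commonly published.
# Assumption: verify these numbers against the official value set before use.
DECREMENTS = {
    "mobility":           {1: 0.0, 2: 0.069, 3: 0.314},
    "self_care":          {1: 0.0, 2: 0.104, 3: 0.214},
    "usual_activities":   {1: 0.0, 2: 0.036, 3: 0.094},
    "pain_discomfort":    {1: 0.0, 2: 0.123, 3: 0.386},
    "anxiety_depression": {1: 0.0, 2: 0.071, 3: 0.236},
}
ANY_PROBLEM = 0.081  # subtracted once if any dimension is above level 1
ANY_LEVEL_3 = 0.269  # subtracted once if any dimension is at level 3
DIMENSIONS = tuple(DECREMENTS)  # order of digits in the profile string


def eq5d3l_index(profile: str) -> float:
    """Convert a five-digit EQ-5D-3L profile (e.g. '21221') to an index score."""
    if len(profile) != 5 or any(ch not in "123" for ch in profile):
        raise ValueError("profile must be five digits, each 1, 2 or 3")
    levels = [int(ch) for ch in profile]

    score = 1.0
    if any(level > 1 for level in levels):
        score -= ANY_PROBLEM
    if any(level == 3 for level in levels):
        score -= ANY_LEVEL_3
    for dimension, level in zip(DIMENSIONS, levels):
        score -= DECREMENTS[dimension][level]
    return round(score, 3)


print(eq5d3l_index("11111"))  # full health: 1.0
print(eq5d3l_index("11121"))  # moderate pain only: 0.796
print(eq5d3l_index("33333"))  # worst state: -0.594
```

Under these assumed coefficients the worst state scores -0.594, matching the lower bound quoted above, and a profile with moderate pain alone scores 0.796.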

Secondary outcomes

Secondary outcomes included comparisons of EuroQol visual analogue scale (EQ-VAS) scores between the THA cohort and the general population. As a component of the EQ-5D-3L questionnaire, the EQ-VAS captures respondents’ overall health assessment on a scale from 0 (‘worst health imaginable’) to 100 (‘best health imaginable’).21 We performed a mediation analysis to uncover the underlying relationship between BMI and postoperative improvements in EQ-5D-3L.

Missing data handling

For our main analysis containing both the THA and HSE cohorts, we identified structured patterns of missing data that would have excluded 26.6% of cases under listwise deletion (Figure 1, Supplementary Material). Little’s missing completely at random (MCAR) test was significant (chi-squared = 7,153.8, degrees of freedom (df) = 1,207, p < 0.001), rejecting MCAR. For the mediation analysis, which used only the THA cohort with additional comorbidity covariates (diabetes (DM), myocardial infarction (MI), congestive heart failure (CHF), peripheral vascular disease (PVD), chronic obstructive pulmonary disease (COPD), cerebrovascular disease (CVD), other joint pain), listwise deletion would have excluded 36.6% of cases owing to missing comorbidity data, and Little’s MCAR test was again significant (chi-squared = 1,592.2, df = 421, p < 0.001). In each dataset, significant associations between missingness and observed variables (p < 0.001) supported the assumption that data were missing at random (MAR) rather than missing completely at random.22 We used multiple imputation by chained equations using the mice package in R (R Foundation for Statistical Computing, Vienna, Austria),23 with a fully conditional specification model adapted for HRQoL data. Predictive mean matching was used for continuous variables (height, weight, EQ-VAS),24 while ordinal EQ-5D-3L dimensions were imputed separately using proportional odds models, and comorbidity variables using logistic regression. We generated 100 imputations with 50 iterations for each dataset to ensure stable estimates.25 Convergence was verified using the Gelman-Rubin statistic (potential scale reduction factor (PSRF) = 1.0). Derived variables were calculated post-imputation.26 Sensitivity analysis confirmed consistency between complete-case and imputed datasets, with minimal differences (< 1% change).

Propensity score matching

Using MatchIt,27 we performed 1:1 nearest-neighbour matching without replacement on propensity scores estimated by logistic regression, based on age and BMI with a 0.2 SD calliper and exact matching for sex.28 Two matched cohorts were created following multiple imputation (preoperative THA vs general population and postoperative THA vs general population), producing 3,495 patients in each group in each imputed dataset. Covariate balance was assessed by estimating the standardized mean difference (SMD) for each covariate. Sensitivity analyses compared 1:1, 1:2, and 1:3 matching, with 1:1 matching selected for optimal balance (maximum SMD: 0.027) while maintaining adequate power.
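The matching logic described above can be illustrated with a small standard-library sketch. It is not the MatchIt implementation: it assumes propensity scores have already been estimated, applies the calliper to the raw score (whether the score or its logit was used is not stated here), matches greedily in input order, and uses one common SMD definition with a pooled SD. The record keys (`id`, `sex`, `ps`) and the toy values are hypothetical.

```python
import math
from statistics import mean, stdev


def match_nearest(treated, controls, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Each record is a dict with hypothetical keys 'id', 'sex' and 'ps'
    (a pre-computed propensity score). Sex is matched exactly.
    """
    # Calliper width: a fraction of the SD of the score across both groups.
    caliper = caliper_sd * stdev([u["ps"] for u in treated + controls])
    available = {c["id"]: c for c in controls}
    pairs = []
    for t in treated:
        best_id, best_dist = None, None
        for cid, c in available.items():
            if c["sex"] != t["sex"]:
                continue  # exact matching on sex
            dist = abs(t["ps"] - c["ps"])
            if dist <= caliper and (best_dist is None or dist < best_dist):
                best_id, best_dist = cid, dist
        if best_id is not None:
            pairs.append((t["id"], best_id))
            del available[best_id]  # without replacement
    return pairs


def smd(x_treated, x_control):
    """Absolute standardized mean difference using a pooled SD."""
    pooled_sd = math.sqrt((stdev(x_treated) ** 2 + stdev(x_control) ** 2) / 2)
    return abs(mean(x_treated) - mean(x_control)) / pooled_sd


# Toy data, not study data.
treated = [
    {"id": "t1", "sex": "F", "ps": 0.60},
    {"id": "t2", "sex": "M", "ps": 0.30},
    {"id": "t3", "sex": "F", "ps": 0.90},
]
controls = [
    {"id": "c1", "sex": "F", "ps": 0.58},
    {"id": "c2", "sex": "M", "ps": 0.59},
    {"id": "c3", "sex": "M", "ps": 0.31},
    {"id": "c4", "sex": "F", "ps": 0.20},
]
pairs = match_nearest(treated, controls)
print(pairs)  # t3 has no same-sex control inside the calliper and is dropped
```

After matching, `smd` would be applied to each covariate in the matched groups; values below 0.1 are conventionally read as good balance, as in Table I.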

Mediation analysis

We constructed separate mediation models for each BMI category: underweight (≤ 18.4 kg/m2), overweight (25 to 29.9 kg/m2), obesity class I (30 to 34.9 kg/m2), obesity class II (35 to 39.9 kg/m2), and obesity class III (≥ 40 kg/m2), with normal BMI (18.5 to 24.9 kg/m2) as the reference category. We hypothesized that elevated BMI (exposure) leads to lower preoperative EQ-5D-3L scores (mediator), which in turn leads to altered capacity for improvement post-THA (outcome). This specification is supported by biomechanical evidence,29 observational studies,30,31 and longitudinal data showing that BMI remains relatively stable before and after THA.32,33 We investigated: BMI category’s effect on preoperative EQ-5D-3L (Path A), preoperative EQ-5D-3L’s influence on postoperative change (Path B), and BMI category’s direct effect on postoperative change controlling for preoperative EQ-5D-3L (Path C’). This separated BMI’s direct effect from its indirect effect through preoperative status. All models were adjusted for age, sex, DM, MI, CHF, COPD, CVD, PVD, and any other joint pain.

Indirect effects (A × B) were calculated using the product-of-coefficients method with bias-corrected and accelerated 95% CIs from 5,000 bootstrap resamples.34-36 Analyses were conducted on each of the 100 imputed datasets, with results pooled using Rubin’s rules. The Benjamini-Hochberg procedure controlled for multiple comparisons.37 Model assumptions were tested with Shapiro-Wilk and Breusch-Pagan tests, applying heteroscedasticity-consistent (HC3) robust standard errors where needed.38,39 Multiple statistical approaches showed consistent estimates (< 1.5% variation), and alternative model specifications produced minimal changes (< 6%) despite violations of distributional assumptions. Multicollinearity was assessed using variance inflation factors (VIF) with a threshold of < 5. All VIF values were < 1.1, including the relationship between BMI and baseline EQ-5D-3L, indicating no evidence of problematic collinearity. Sensitivity to unmeasured confounding was quantified using E-values for mediation analysis.40 Our approach follows A Guideline for Reporting Mediation Analyses (AGReMA).41
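Pooling across imputed datasets with Rubin’s rules can be sketched as follows. This is a minimal illustration under the standard formulation (pooled estimate = mean of the per-imputation estimates; total variance = within-imputation variance plus (1 + 1/m) times between-imputation variance). The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates  : per-imputation point estimates
    std_errors : per-imputation standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)  # average sampling variance
    between = variance(estimates)                # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficients from three imputations (the study used m = 100).
estimate, se = pool_rubin([0.10, 0.12, 0.14], [0.02, 0.02, 0.02])
print(round(estimate, 3), round(se, 4))
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates differ between imputations, which is how the uncertainty due to missing data is carried into the final CIs.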

Statistical analysis

All statistical analyses were conducted using R version 4.4.2 (R Foundation for Statistical Computing, Vienna, Austria). Differences between cohorts were assessed based on the median difference and its 95% CI. For our primary analyses, 5,000 bias-corrected and accelerated (BCa) bootstrap resamples were completed in each of the imputed datasets (m = 100), with replicates pooled across imputations. The median of the pooled distribution served as our point estimate; BCa 95% CIs and two-tailed p-values were calculated. If the BCa algorithm failed to converge in any dataset, we employed a hierarchical fallback strategy. For stratified analyses involving multiple comparisons, bootstrap p-values were adjusted using the Benjamini-Hochberg method, and p < 0.05 was considered significant.37 In our primary analysis comparing EQ-5D-3L of the pre- and postoperative THA cohorts with the general population, propensity score matching was used so that meaningful comparisons could be made. Comparisons of EQ-5D-3L dimensions between the THA cohort and general population controls were completed using the Mann-Whitney U test. The mediation analysis was conducted using the THA cohort without propensity score matching with a control group, as this analysis examined within-group relationships rather than between-group comparisons.
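The bootstrap comparison of medians can be sketched as below. Note the substitution: this example uses a simple percentile interval, not the BCa interval used in the study, and it handles a single dataset instead of pooling replicates over 100 imputations. The function name, the seed, and the toy values are assumptions for illustration only.

```python
import random
from statistics import median


def bootstrap_median_difference(group_a, group_b, n_resamples=2000, seed=0):
    """Percentile-bootstrap 95% CI for median(group_a) - median(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    point = median(group_a) - median(group_b)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group independently, with replacement.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(median(resample_a) - median(resample_b))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[min(int(0.975 * n_resamples), n_resamples - 1)]
    return point, lower, upper


# Toy index scores, not study data.
population = [0.691, 0.727, 0.760, 0.796, 0.796, 0.848, 1.000, 1.000]
preop = [-0.016, 0.055, 0.088, 0.159, 0.264, 0.516, 0.587, 0.620]
point, lower, upper = bootstrap_median_difference(population, preop)
print(round(point, 4), lower <= upper)
```

Because EQ-5D-3L index scores take a limited set of discrete values, bootstrap distributions of medians are lumpy, which is one plausible reason a BCa calculation can fail and a fallback interval is needed.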

Results

Matched cohorts: preoperative THA and general population

Patients awaiting THA had significantly lower EQ-5D-3L compared with the matched general population (median difference (MD) 0.280, bootstrapped 95% CI 0.258 to 0.306; p < 0.001) (Table II, Figure 2). This deficit was most pronounced in younger patients (aged < 45 years: MD 0.707, 95% CI 0.484 to 0.912; p < 0.001) and females (MD 0.532, 95% CI 0.280 to 0.601; p < 0.001). Similarly, EQ-VAS scores were significantly lower across all preoperative groups (Table III, Figure 3). Preoperative patients had higher rates of extreme problems across all EQ-5D-3L domains, indicating functional impairment beyond pain and mobility (Table IV, Figure 4).

Table II.

Summary of EuroQol five-dimension three-level index scores: matched cohorts (n = 3,495 in each group).

| Variable | General population (unmatched), median EQ-5D-3L (IQR) | General population (matched), median EQ-5D-3L (IQR) | Preop, median EQ-5D-3L (IQR) | Difference (95% CI)* | p-value | Median postop EQ-5D-3L (IQR) | Difference (95% CI) | p-value |
|---|---|---|---|---|---|---|---|---|
| Sex | | | | | | | | |
| Male | 1.000 (0.796 to 1.000) | 0.848 (0.725 to 1.000) | 0.587 (0.088 to 0.691) | 0.261 (0.255 to 0.307) | < 0.001 | 0.850 (0.691 to 1.000) | -0.002 (-0.087 to 0.033) | 0.578 |
| Female | 0.850 (0.727 to 1.000) | 0.796 (0.691 to 1.000) | 0.264 (0.055 to 0.691) | 0.532 (0.280 to 0.601) | < 0.001 | 0.796 (0.691 to 1.000) | 0.000 (-0.003 to 0.015) | 1.000 |
| Age, yrs | | | | | | | | |
| Under 45 | 1.000 (0.848 to 1.000) | 1.000 (0.809 to 1.000) | 0.293 (-0.016 to 0.691) | 0.707 (0.484 to 0.912) | < 0.001 | 0.796 (0.620 to 1.000) | 0.204 (0.054 to 0.273) | 0.003 |
| 45 to 54 | 1.000 (0.796 to 1.000) | 1.000 (0.727 to 1.000) | 0.516 (0.088 to 0.691) | 0.484 (0.261 to 0.636) | < 0.001 | 0.848 (0.691 to 1.000) | 0.152 (-0.035 to 0.204) | 0.348 |
| 55 to 64 | 0.848 (0.725 to 1.000) | 0.848 (0.725 to 1.000) | 0.516 (0.055 to 0.691) | 0.332 (0.282 to 0.405) | < 0.001 | 0.848 (0.691 to 1.000) | 0.000 (-0.087 to 0.034) | 0.887 |
| 65 to 74 | 0.796 (0.710 to 1.000) | 0.796 (0.691 to 1.000) | 0.587 (0.088 to 0.691) | 0.209 (0.177 to 0.234) | < 0.001 | 0.848 (0.691 to 1.000) | -0.052 (-0.087 to -0.018) | 0.021 |
| 75 to 84 | 0.760 (0.656 to 1.000) | 0.760 (0.656 to 1.000) | 0.260 (0.055 to 0.691) | 0.500 (0.244 to 0.601) | < 0.001 | 0.814 (0.691 to 1.000) | -0.054 (-0.087 to 0.000) | 0.097 |
| 85 and older | 0.725 (0.587 to 0.850) | 0.696 (0.612 to 0.848) | 0.189 (-0.003 to 0.620) | 0.507 (0.175 to 0.568) | < 0.001 | 0.796 (0.620 to 1.000) | -0.100 (-0.123 to -0.019) | 0.142 |
| Total | 1.000 (0.760 to 1.000) | 0.796 (0.691 to 1.000) | 0.516 (0.055 to 0.691) | 0.280 (0.258 to 0.306) | < 0.001 | 0.814 (0.691 to 1.000) | -0.018 (-0.052 to -0.016) | 0.014 |

All analyses were performed on multiply imputed datasets (m = 100). Median differences were calculated based on pooled medians for each group across imputations. CIs and p-values were calculated from pooled bootstrap replicates (from 5,000 resamples per imputation). CIs were primarily derived using the bias-corrected and accelerated (BCa) method, with fallback to alternatives when necessary. p-values smaller than 0.001 are reported as < 0.001. CI, bootstrapped 95% CIs (5,000 resamples).

*General population (matched) vs preoperative THA; positive values indicate higher EQ-5D-3L scores in general population.

Bootstrap-based p-values for stratified comparisons (by sex or age) were adjusted using the Benjamini-Hochberg method.

General population (matched) vs postoperative THA; negative values indicate higher EQ-5D-3L scores in postoperative THA.

EQ-5D-3L, EuroQol five-dimension three-level questionnaire; THA, total hip arthroplasty.

Fig. 2.

Boxplots with density curves show EQ-5D-3L index scores for three groups: Pre-op, Post-op, and General Population. Pre-op scores are lowest, Post-op scores are higher, and General Population scores are highest. The figure displays three horizontal boxplots with accompanying density curves representing EQ-5D-3L index scores for Pre-op, Post-op, and General Population groups. The Pre-op group has a median around 0.5 with a wide spread and lower scores overall. The Post-op group shows improvement, with scores concentrated near 0.8. The General Population group has the highest scores, mostly near 1.0, with a narrower distribution. The x-axis represents EQ-5D-3L index scores ranging from approximately -0.5 to 1.0, and the y-axis lists the three groups. A legend identifies the groups corresponding to each boxplot.

Box and whisker with density plot of EuroQol five-dimension three-level questionnaire (EQ-5D-3L) Index scores by group (matched cohorts).

Table III.

Summary of EuroQol visual analogue scale scores: matched cohorts (n = 3,495 in each group).

| Variable | General population (unmatched), median EQ-VAS (IQR) | General population (matched), median EQ-VAS (IQR) | Preop, median EQ-VAS (IQR) | Difference (95% CI)* | p-value | Median postop EQ-VAS (IQR) | Difference (95% CI) | p-value |
|---|---|---|---|---|---|---|---|---|
| Sex | | | | | | | | |
| Male | 80.0 (70.0 to 90.0) | 80.0 (65.0 to 90.0) | 70.4 (54.0 to 81.0) | 9.6 (8.4 to 10.0) | < 0.001 | 81.2 (70.0 to 90.5) | -1.2 (-4.0 to -0.4) | < 0.001 |
| Female | 80.0 (70.0 to 90.0) | 80.0 (60.0 to 90.0) | 70.0 (50.0 to 80.5) | 10.0 (9.2 to 12.2) | < 0.001 | 80.6 (70.0 to 90.9) | -0.6 (-3.7 to -0.2) | 0.009 |
| Age, yrs | | | | | | | | |
| Under 45 | 80.0 (70.0 to 90.0) | 80.0 (70.0 to 90.0) | 65.0 (50.0 to 80.0) | 15.0 (10.0 to 24.0) | < 0.001 | 80.0 (66.0 to 90.0) | 0.0 (-0.5 to 7.0) | 1.000 |
| 45 to 54 | 80.0 (70.0 to 90.0) | 80.0 (70.0 to 90.0) | 70.0 (53.5 to 84.0) | 10.0 (10.0 to 11.0) | < 0.001 | 86.0 (74.0 to 92.0) | -6.0 (-10.0 to -2.0) | 0.003 |
| 55 to 64 | 80.0 (70.0 to 90.0) | 80.0 (70.0 to 90.0) | 70.2 (50.0 to 81.8) | 9.8 (8.0 to 10.0) | < 0.001 | 86.5 (70.1 to 93.0) | -6.5 (-10.0 to -2.0) | < 0.001 |
| 65 to 74 | 80.0 (69.0 to 90.0) | 80.0 (65.0 to 90.0) | 70.4 (59.0 to 81.0) | 9.7 (8.0 to 10.0) | < 0.001 | 81.1 (70.0 to 91.0) | -1.1 (-4.2 to -0.6) | 0.001 |
| 75 to 84 | 75.0 (60.0 to 85.0) | 75.0 (60.0 to 85.0) | 70.0 (50.2 to 80.0) | 5.0 (0.0 to 8.0) | 0.093 | 80.0 (70.0 to 90.0) | -5.0 (-10.0 to -2.0) | 0.003 |
| Over 85 | 70.0 (50.0 to 80.0) | 70.0 (50.0 to 80.0) | 69.9 (50.0 to 80.0) | 0.2 (-10.0 to 5.0) | 1.000 | 80.0 (69.8 to 90.0) | -10.0 (-20.0 to -10.0) | < 0.001 |
| Total | 80.0 (70.0 to 90.0) | 80.0 (60.0 to 90.0) | 70.0 (50.3 to 80.7) | 10.0 (9.6 to 10.5) | < 0.001 | 81.0 (70.0 to 90.8) | -1.0 (-2.0 to -0.4) | < 0.001 |

All analyses were performed on multiply imputed datasets (m = 100). Median differences were calculated based on pooled medians for each group across imputations. CIs and p-values were calculated from pooled bootstrap replicates (from 5,000 resamples per imputation). CIs were primarily derived using the bias-corrected and accelerated (BCa) method, with fallback to alternatives when necessary. p-values smaller than 0.001 are reported as < 0.001.

CI, bootstrapped 95% CIs (5,000 resamples).

*General population (matched) vs preoperative THA; positive values indicate higher EQ-VAS scores in general population.

Bootstrap-based p-values for stratified comparisons (by sex or age) were adjusted using the Benjamini-Hochberg method.

General population (matched) vs postoperative THA; negative values indicate higher EQ-VAS scores in postoperative THA.

EQ-VAS, EuroQol visual analogue scale; THA, total hip arthroplasty.

Fig. 3.

Boxplots compare EQ-VAS scores for males and females across three groups: General Population, Pre-op, and Post-op. Both sexes show higher scores post-op than pre-op, with General Population scores slightly higher than pre-op. The figure contains two panels side by side. The left panel shows EQ-VAS score distributions for males, and the right panel shows distributions for females. Each panel includes three vertical boxplots labeled General Population, Pre-op, and Post-op. For both sexes, the General Population group has the highest median scores, around 80, with interquartile ranges roughly between 65 and 90. Pre-op groups have lower medians, near 70, and wider spreads. Post-op groups show improvement, with medians close to 80 and ranges similar to the General Population. Outliers appear below 40 in all groups, more frequent in females. The y-axis ranges from 0 to 100, representing EQ-VAS scores.

Box and whisker plot EuroQol visual analogue scale (EQ-VAS) scores by group and sex (matched cohorts).

Table IV.

Descriptive summary of EuroQol five-dimension three-level questionnaire components: matched cohorts.

| Component | General population, n (%) | Preop, n (%) | p-value* | Postop, n (%) | p-value |
|---|---|---|---|---|---|
| Mobility | | | < 0.001 | | 0.565 |
| 1 | 2,249 (64.3) | 399 (11.4) | | 2,222 (63.6) | |
| 2 | 1,236 (35.4) | 2,979 (85.2) | | 1,262 (36.1) | |
| 3 | 10 (0.3) | 117 (3.3) | | 11 (0.3) | |
| Problems | 1,246 (35.7) | 3,096 (88.6) | | 1,273 (36.4) | |
| Self-care | | | < 0.001 | | < 0.001 |
| 1 | 3,130 (89.6) | 2,237 (64.0) | | 2,898 (82.9) | |
| 2 | 342 (9.8) | 1,202 (34.4) | | 570 (16.3) | |
| 3 | 23 (0.7) | 56 (1.6) | | 27 (0.8) | |
| Problems | 365 (10.4) | 1,258 (36.0) | | 597 (17.1) | |
| Usual activities | | | < 0.001 | | < 0.001 |
| 1 | 2,491 (71.3) | 493 (14.1) | | 2,143 (61.3) | |
| 2 | 892 (25.5) | 2,435 (69.7) | | 1,268 (36.3) | |
| 3 | 112 (3.2) | 567 (16.2) | | 84 (2.4) | |
| Problems | 1,004 (28.7) | 3,002 (85.9) | | 1,352 (38.7) | |
| Pain and discomfort | | | < 0.001 | | < 0.001 |
| 1 | 1,654 (47.3) | 78 (2.2) | | 1,944 (55.6) | |
| 2 | 1,577 (45.1) | 1,922 (55.0) | | 1,429 (40.9) | |
| 3 | 264 (7.6) | 1,495 (42.8) | | 122 (3.5) | |
| Problems | 1,841 (52.7) | 3,417 (97.8) | | 1,551 (44.4) | |
| Anxiety and depression | | | < 0.001 | | < 0.001 |
| 1 | 2,613 (74.8) | 2,195 (62.8) | | 2,862 (81.9) | |
| 2 | 811 (23.2) | 1,158 (33.1) | | 564 (16.1) | |
| 3 | 71 (2.0) | 142 (4.1) | | 69 (2.0) | |
| Problems | 882 (25.2) | 1,300 (37.2) | | 633 (18.1) | |

EuroQol five-dimension three-level questionnaire domain level definitions: 1 = no problems, 2 = moderate problems, 3 = extreme problems (n = 3,495 in each group).

*p-value from pooled Mann-Whitney U test comparing distributions between General population and Preoperative groups across 100 imputations.

p-value from pooled Mann-Whitney U test comparing distributions between General population and Postoperative groups across 100 imputations.

Combined moderate and extreme problems (levels 2 and 3).

Fig. 4.

Two radar plots compare EuroQol domain scores. Left shows General Population vs Post-op with similar patterns and slight improvement post-op. Right shows General Population vs Pre-op, where Pre-op has more problems across all domains. The figure contains two radar plots labeled a and b, each with five axes representing EuroQol domains: Mobility, Self-care, Usual activities, Pain/Discomfort, and Anxiety/Depression. Plot a compares General Population and Post-op groups, showing similar patterns with slight improvement in Post-op, indicating fewer problems. Plot b compares General Population and Pre-op groups, where Pre-op shows larger values toward the outer edges, indicating more problems across all domains, especially in Pain/Discomfort and Usual activities. Each axis has three concentric levels: inner for no problems, middle for some problems, and outer for extreme problems. General Population consistently appears closer to the center, reflecting fewer issues.

Radar plots of a) postoperative and b) preoperative EuroQol five-dimension three-level (EQ-5D-3L) questionnaire domain scores: matched cohorts. Each pentagon represents possible domain scores (inner = no problems, middle = some problems, outer = extreme problems).

Matched cohorts: postoperative THA and general population

At one-year follow-up, THA not only restored but exceeded population-normative HRQoL, with patients achieving significantly higher EQ-5D-3L scores than matched controls (median: 0.814 vs 0.796, MD -0.018, 95% CI -0.052 to -0.016; p = 0.014) (Table II). Age-stratified analysis showed older patients (aged ≥ 85 years) demonstrated the greatest magnitude of improvement from preoperative scores (preoperative: 0.189 vs general population: 0.696, postoperative: 0.796), while those aged < 45 years remained below population norms despite substantial improvements (MD 0.204, 95% CI 0.054 to 0.273; p = 0.003). This pattern of exceeding population norms was also observed in EQ-VAS scores (MD -1.0, 95% CI -2.0 to -0.4; p < 0.001) (Table III), with the oldest patients (aged > 85 years) reporting the most substantial improvements compared with their age-matched peers (MD -10.0, 95% CI -20.0 to -10.0; p < 0.001). Dimensional analysis revealed domain-specific recovery patterns: postoperative mobility problems matched population rates, pain/discomfort and anxiety/depression improved beyond population levels (p < 0.001), while self-care and usual activities showed significant improvement but remained below population benchmarks (p < 0.001).

Mediation analysis

Baseline characteristics by BMI category are displayed in the Supplementary Material. Mediation analysis revealed that preoperative EQ-5D-3L scores mediated BMI’s influence on outcomes (Table V, Figure 5 and Figure 6). Baseline scores showed a clear gradient: normal (median 0.587; IQR 0.088 to 0.691), obesity I (0.433; IQR 0.055 to 0.691), obesity II (0.159; IQR -0.016 to 0.689), and obesity III (0.091; IQR -0.004 to 0.620). Path A (BMI to preoperative score) demonstrated significantly lower baseline scores with increasing obesity (class I: β = -0.046, p < 0.01; II: β = -0.105, p < 0.001; III: β = -0.150, p < 0.001).

Table V.

Summary of EuroQol five-dimension three-level questionnaire (EQ-5D-3L) by BMI category and mediation analysis of BMI category (independent) effects on postoperative change in scores (dependent): preoperative EQ-5D-3L as mediator (reference category = normal BMI) (n = 3,495).

| BMI category | N (%) | Median preop EQ-5D (IQR) | Median postop EQ-5D (IQR) | Median difference (95% CI)* | Path A (95% CI) | Path B (95% CI) | Path C’ (95% CI) | A × B indirect effect (95% CI) | Total effect (95% CI) | Proportion mediated, % | Mediation type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Normal | 974 (27.9) | 0.587 (0.088 to 0.691) | 0.850 (0.710 to 1.000) | 0.310 (-0.168 to 1.016) | - | - | - | - | - | - | Reference |
| Underweight | 49 (1.4) | 0.516 (0.088 to 0.691) | 0.805 (0.638 to 1.000) | 0.309 (-0.092 to 0.941) | -0.002 (-0.105 to 0.102) | -0.822 (-0.846 to -0.797)§ | -0.064 (-0.141 to 0.013) | 0.001 (-0.071 to 0.075) | -0.063 (-0.176 to 0.051) | 2.0 | Partial |
| Overweight | 1,295 (37.1) | 0.587 (0.088 to 0.691) | 0.849 (0.691 to 1.000) | 0.309 (-0.175 to 1.074) | -0.006 (-0.036 to 0.023) | -0.822 (-0.846 to -0.797)§ | -0.001 (-0.022 to 0.020) | 0.005 (-0.017 to 0.027) | 0.004 (-0.028 to 0.036) | 122.4 | Inconsistent |
| Obesity I | 737 (21.1) | 0.433 (0.055 to 0.691) | 0.796 (0.691 to 1.000) | 0.380 (-0.250 to 1.068) | -0.046 (-0.080 to -0.012) | -0.822 (-0.846 to -0.797)§ | -0.026 (-0.050 to -0.001)** | 0.038 (0.012 to 0.064)§ | 0.012 (-0.026 to 0.049) | 316.1 | Inconsistent |
| Obesity II | 324 (9.3) | 0.159 (-0.016 to 0.689) | 0.796 (0.620 to 1.000) | 0.428 (-0.279 to 1.177) | -0.105 (-0.151 to -0.060)§ | -0.822 (-0.846 to -0.797)§ | -0.064 (-0.097 to -0.032)§ | 0.086 (0.050 to 0.123)§ | 0.022 (-0.027 to 0.071) | 390.3 | Inconsistent |
| Obesity III | 116 (3.3) | 0.091 (-0.004 to 0.620) | 0.727 (0.587 to 1.000) | 0.526 (-0.428 to 1.083) | -0.150 (-0.222 to -0.079)§ | -0.822 (-0.846 to -0.797)§ | -0.083 (-0.136 to -0.031) | 0.123 (0.072 to 0.174)§ | 0.040 (-0.036 to 0.116) | 307.4 | Inconsistent |

Partial mediation: the direct (Path C’) and indirect (A × B) effects have the same direction, with the mediator explaining only part of the total effect.

Inconsistent mediation: the direct (Path C’) and indirect (A × B) effects have opposite directions, causing the indirect effect to counteract (suppress) the direct effect. This results in a total effect smaller than either the direct or indirect effect, leading to a proportion mediated > 100%.

*Preoperative vs postoperative EQ-5D-3L scores.

Mediation paths A, B, C’, and indirect effects are presented with 95% CIs. All estimates were derived from multiple linear regression models adjusted for age, sex, diabetes, myocardial infarction, congestive heart failure, chronic obstructive pulmonary disease, peripheral vascular disease, cerebrovascular disease, and other joint pain; pooled across 100 imputations using Rubin's rules.

CIs for indirect effects were calculated using bias-corrected bootstrapping with 5,000 resamples.

§p < 0.001.

p < 0.01.

**p < 0.05.

Fig. 5.

Three diagrams show mediation models for obesity classes I, II, and III. Each links BMI category to post-op EQ-5D score change through pre-op EQ-5D score, with path coefficients indicating indirect and direct effects. The figure presents three side-by-side mediation diagrams labeled a, b, and c for obesity classes I (30–34.9 kg/m²), II (35–39.9 kg/m²), and III (≥40 kg/m²). Each diagram includes three boxes: BMI category on the left, Pre-op EQ-5D score in the center, and Post-op EQ-5D score change on the right. Arrows connect BMI category to Pre-op EQ-5D score (Path A), Pre-op EQ-5D score to Post-op EQ-5D score change (Path B), and BMI category directly to Post-op EQ-5D score change (Path C′). Path coefficients are shown: for class I, Path A β = -0.046, Path B β = -0.822, Path C′ β = -0.026; for class II, Path A β = -0.105, Path B β = -0.822, Path C′ β = -0.064; for class III, Path A β = -0.150, Path B β = -0.822, Path C′ β = -0.083. Significance levels are indicated with asterisks, showing all paths significant except Path C′ in class I, which is marginal. These diagrams illustrate that higher BMI is associated with lower pre-op EQ-5D scores, which strongly predict post-op score changes.

a) to c) Mediation analysis pathway diagrams displaying the relationship between BMI category (a: obesity class I; b: obesity class II; c: obesity class III), preoperative EuroQol five-dimension three-level questionnaire (EQ-5D-3L) score, and postoperative change in EQ-5D-3L scores, adjusting for age and sex.

Fig. 6.

Bar chart shows direct and indirect effects for obesity classes I, II, and III. Indirect effects are positive and increase with obesity class, while direct effects are negative and become more pronounced from class I to III. The figure is a grouped bar chart comparing β coefficients for direct and indirect effects across three obesity classes: I, II, and III. The y-axis represents β coefficients ranging from approximately -0.10 to 0.12. For obesity class I, the direct effect is slightly negative, around -0.03, and the indirect effect is positive, near 0.04. For class II, the direct effect is more negative, about -0.06, while the indirect effect rises to roughly 0.09. For class III, the direct effect is the most negative, near -0.08, and the indirect effect is the highest, around 0.12. The chart illustrates that as obesity severity increases, indirect effects strengthen positively, while direct effects become increasingly negative.

Direct compared with indirect mediation effects of obesity classes on postoperative EuroQol five-dimension three-level questionnaire scores following total hip arthroplasty, showing inconsistent mediation.

Path B (preoperative score leads to postoperative score change) showed a consistent negative relationship (β = -0.822, p < 0.001), indicating greater improvement potential with lower baseline scores. Path C’ (BMI leads to postoperative score change) was negative and increased in magnitude with BMI severity (obesity I: β = -0.026, p = 0.040; II: β = -0.064, p < 0.001; III: β = -0.083, p = 0.002), suggesting obesity directly impairs improvement potential.

The indirect effects (Path A × B) operated in the opposite direction to the direct effects (Path C’), becoming increasingly positive with higher BMI (obesity I: β = 0.038, 95% CI 0.012 to 0.064; p < 0.001; II: β = 0.086, 95% CI 0.050 to 0.123; p < 0.001; III: β = 0.123, 95% CI 0.072 to 0.174; p < 0.001), demonstrating that the lower preoperative scores associated with obesity created greater potential for relative improvement. This pattern represents inconsistent mediation, where direct and indirect effects have opposite signs, resulting in suppression of total effects (Figure 6). E-values for the indirect effect showed increasing resistance to unmeasured confounding with higher BMI (obesity class I: 1.71; class II: 2.45; class III: 3.11).
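The arithmetic behind these indirect effects can be checked from the reported path coefficients. The sketch below multiplies Path A by Path B for each obesity class and, as an illustration only, sums the direct and indirect effects to give an implied total effect and proportion mediated. The total effects are not quoted in this section, so those sums rest on rounded coefficients and on the additive decomposition that holds for linear models; the helper name `decompose` is hypothetical.

```python
# Path coefficients as reported in the text and in Fig. 5.
PATH_B = -0.822
PATHS = {
    "obesity I": {"a": -0.046, "c_prime": -0.026},
    "obesity II": {"a": -0.105, "c_prime": -0.064},
    "obesity III": {"a": -0.150, "c_prime": -0.083},
}


def decompose(a, b, c_prime):
    """Return indirect (a * b), direct (c'), implied total, and proportion mediated."""
    indirect = a * b
    total = c_prime + indirect  # additive decomposition for linear models
    return {
        "indirect": indirect,
        "direct": c_prime,
        "total": total,
        "proportion_mediated": indirect / total,
    }


results = {label: decompose(p["a"], PATH_B, p["c_prime"]) for label, p in PATHS.items()}

for label, r in results.items():
    print(
        f"{label}: indirect = {r['indirect']:.3f}, direct = {r['direct']:.3f}, "
        f"implied total = {r['total']:.3f}, "
        f"proportion mediated = {r['proportion_mediated']:.0%}"
    )
```

The products reproduce the reported indirect effects (0.038, 0.086, and 0.123). Because the direct and indirect effects have opposite signs, the implied total is smaller than the indirect effect in every class, which is why the proportion mediated exceeds 100%.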

Discussion

This study quantifies the substantial HRQoL deficits among patients awaiting primary THA for OA, compared with propensity score matched general population controls. At one-year follow-up, THA restored HRQoL to the level expected of a matched general population, and in some groups exceeded it, highlighting the broader impact of THA beyond pain relief and improved mobility alone. Yapp et al42 demonstrated similar restorative effects of total knee arthroplasty (TKA) for OA, collectively reinforcing that both THA and TKA are transformative interventions capable of reversing profound HRQoL deficits and restoring HRQoL to population levels.

Older patients (aged > 65 years), particularly those aged > 85 years, demonstrated the greatest magnitude of improvement from baseline scores, restoring HRQoL to their age-matched general population peers (EQ-5D-3L preoperative: 0.189 vs postoperative: 0.796, general population: 0.696). Aalund et al43 have noted that older adults experience greater improvements in HRQoL than younger populations. In a prospective study as part of the Special Orthopaedic Geriatrics trial, Reinhard et al44 reported similar substantial improvements in HRQoL among 101 geriatric patients (mean age 78.1 years (SD 4.9)) while also acknowledging the increased risk of complications in this multimorbid and frail cohort. Consequently, the authors strongly advocate for preoperative Comprehensive Geriatric Assessment to identify modifiable risk factors and optimize comorbidities, in conjunction with postoperative Geriatric Co-Management, which has been shown to improve outcomes.44-46

We observed that, preoperatively, female patients had considerably lower HRQoL than both male patients, consistent with Steinbeck et al,47 and the matched female general population. Hawker et al48 found that females with more severe symptoms were less likely to undergo arthroplasty. The underlying cause of this sex disparity is unclear, but may stem from female patients with chronic pain receiving inferior care due to sex-related bias,49 contributing to inequalities in referral,50 or from patient concerns that treatment may disrupt their role as family caregiver.51,52 Encouragingly, at one-year follow-up, female patients reported dramatic relative improvements, and THA restored HRQoL to general population levels.

Conversely, younger patients (aged < 45 years) showed the greatest preoperative HRQoL deficit and, despite remarkable improvements, uniquely remained below normative values postoperatively. Rolfson et al5 in their Swedish Registry study reported that patients aged < 44 years had lower relative improvements when compared with older patients, possibly reflecting higher functional demands and expectations of younger adults that exceed even modern prostheses’ capabilities. Nevertheless, despite achieving lower EQ-5D-3L scores, these younger patients reported postoperative EQ-VAS scores equivalent to their age-matched peers, suggesting meaningful perceived improvements even if functional recovery was incomplete.

Our mediation analysis revealed that preoperative functional status significantly mediates the relationship between BMI and improvement after THA. For patients with BMI ≥ 30 kg/m², inconsistent mediation was observed, where obesity directly impairs improvement (negative direct effect) while simultaneously creating greater improvement potential through lower baseline scores (positive indirect effect), with the latter effect predominating in determining relative improvements. Our findings explain the clinical paradox in which obese patients show larger relative HRQoL gains despite achieving lower absolute outcomes.8,12 Elcock et al,11 in their study of patients with higher BMIs undergoing TKA, reported that despite the increased rate of complications in obesity class III patients, TKA remained a cost-effective treatment. Targeted preoperative optimization of high BMI patients may enhance absolute outcomes while reducing complications,53 as demonstrated by recent evidence showing that glucagon-like peptide 1 (GLP-1) receptor agonists before arthroplasty significantly reduced postoperative complications.54 These insights challenge arbitrary BMI-based eligibility thresholds, suggesting that surgical prioritization should instead focus on the severity of functional impairment.

Strengths of our study include its methodological rigour, through propensity matching and statistical validation, and a large representative patient cohort. Additionally, the novel mediation analysis provides valuable insights into BMI’s complex relationship with surgical outcomes, directly informing clinical decision-making. Limitations include the one-year follow-up which, although clinically meaningful, did not capture longer-term trends in functional outcome. While the E-values generated for our mediation analysis suggest acceptable robustness for indirect effects, we acknowledge that unmeasured confounding cannot be definitively excluded in observational research. The EQ-5D-3L instrument, while widely used, is known for its ceiling effects, particularly in patients achieving higher functional outcomes. However, in our study we observed a varied distribution of postoperative scores below the ceiling across age, sex, and BMI categories. We recognize the temporal mismatch between the THA (2013 to 2022) and HSE (2010 to 2012) cohorts. However, the documented relative stability of EQ-5D-3L in the general population of the UK across most age groups, over two decades, suggests this mismatch is unlikely to significantly impact our findings.55 We acknowledge that the temporal relationship between preoperative HRQoL and BMI category cannot be proven given this study design. Patients awaiting THA may gain modest weight due to immobility; however, the combination of BMI stability over time,32,33 the clear dose-response gradient of decreasing EQ-5D-3L with increasing BMI, and consistent observational evidence of lower EQ-5D-3L in higher BMI patients supports our specification of BMI as the exposure rather than the mediator.30,31 The reverse causation hypothesis, that functional impairment causes obesity, would require implausibly precise mechanisms whereby specific EQ-5D-3L scores produced large, predictable, and temporally consistent weight gains sufficient to shift patients across BMI categories. Finally, BMI-based selection bias may exclude patients who could benefit from surgery, potentially underestimating true benefits.

Future studies should include longer follow-up periods and the EQ-5D-5L to enhance clinical applicability. Our mediation analysis has revealed inconsistent mediation, with the direct effect of BMI on postoperative improvement acting in opposition to the indirect effect mediated through preoperative EQ-5D-3L score. Future research should aim to identify whether the suppressive direct effects of higher BMI categories on postoperative improvement may be overcome by structured weight management programmes, including the use of GLP-1 agonists and their variants. These studies should take note of the induced catabolic state associated with the use of these medications, and aim to identify optimal windows and safe rates of weight loss, preoperative medication cessation timing, effects of sarcopenia, and longer-term arthroplasty-related risk profiles.

In conclusion, this study highlights the profound HRQoL deficits experienced by patients awaiting THA for OA, compared to matched general population norms. At one-year follow-up, THA restored HRQoL to normative general population levels, with particularly notable improvements observed among older adults and female patients. Our mediation analysis demonstrates that postoperative improvements are predominantly driven by preoperative functional impairment rather than BMI alone. These findings challenge current BMI-based surgical eligibility thresholds, suggesting instead that clinical prioritization should focus on preoperative functional status.

Take home message

- Total hip arthroplasty restores health-related quality of life to population norms across propensity score matched sex and age stratifications at one year postoperatively.

- Our mediation analysis of BMI reveals that postoperative improvement is driven primarily by preoperative functional impairment, rather than BMI alone.

- These findings challenge BMI-based eligibility thresholds and support improving equitable access to total hip arthroplasty.

Author contributions

A. D. Ablett: Data curation, Formal analysis, Writing – original draft

L. Z. Yapp: Conceptualization, Formal analysis, Investigation, Methodology, Validation, Writing – review & editing

N. D. Clement: Conceptualization, Methodology, Validation, Writing – review & editing

C. E. H. Scott: Conceptualization, Methodology, Validation, Writing – review & editing

Funding statement

The author(s) received no financial or material support for the research, authorship, and/or publication of this article.

ICMJE COI statement

N. D. Clement and C. E. H. Scott report research grants from Stryker, unrelated to this study. C. E. H. Scott also reports consulting fees from Stryker, Smith & Nephew, and Osstec, and teaching payments from Stryker, all of which are unrelated to this study. C. E. H. Scott is also Editor-in-Chief of Bone & Joint Research, an editorial board member of The Bone & Joint Journal, a member of the advisory boards for Osstec and Smith & Nephew, and the Data Safety Monitoring Committee for the PASHION Study. N. D. Clement is an editorial board member of The Bone & Joint Journal and Bone & Joint Research.

Data sharing

The datasets generated and analyzed in the current study are not publicly available due to data protection regulations. Access to data is limited to the researchers who have obtained permission for data processing. Further inquiries can be made to the corresponding author.

Ethical review statement

Research Ethics Committee approval: 20/SS/0125

Supplementary material

Tables showing missing data patterns by group prior to multiple imputations, propensity score matching, and mediation analysis as well as preoperative patient characteristics by BMI category included in mediation analysis.

This paper was presented at the Best of the Best, BOA Congress 2025 in Liverpool, UK.

Social media

Follow the authors on X @EdinOrthopaedic

Follow C. E. H. Scott on X @EdinburghKnee

© 2026 Ablett et al. This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial No Derivatives (CC BY-NC-ND 4.0) licence, which permits the copying and redistribution of the work only, and provided the original author and source are credited. See https://creativecommons.org/licenses/by-nc-nd/4.0/

References

  • 1. Scott CEH, MacDonald DJ, Howie CR. “Worse than death” and waiting for a joint arthroplasty. Bone Joint J. 2019;101-B(8):941–950. doi: 10.1302/0301-620X.101B8.BJJ-2019-0116.R1.
  • 2. Clement ND, Scott CEH, Murray JRD, Howie CR, Deehan DJ, IMPACT-Restart Collaboration. The number of patients “worse than death” while waiting for a hip or knee arthroplasty has nearly doubled during the COVID-19 pandemic. Bone Joint J. 2021;103-B(4):672–680. doi: 10.1302/0301-620X.103B.BJJ-2021-0104.R1.
  • 3. Heath EL, Ackerman IN, Cashman K, Lorimer M, Graves SE, Harris IA. Patient-reported outcomes after hip and knee arthroplasty: results from a large national registry. Bone Jt Open. 2021;2(6):422–432. doi: 10.1302/2633-1462.26.BJO-2021-0053.R1.
  • 4. Peters RM, van Steenbergen LN, Stewart RE, et al. Which patients improve most after total hip arthroplasty? Influence of patient characteristics on patient-reported outcome measures of 22,357 total hip arthroplasties in the Dutch Arthroplasty Register. Hip Int. 2021;31(5):593–602. doi: 10.1177/1120700020913208.
  • 5. Rolfson O, Kärrholm J, Dahlberg LE, Garellick G. Patient-reported outcomes in the Swedish Hip Arthroplasty Register: results of a nationwide prospective observational study. J Bone Joint Surg Br. 2011;93-B(7):867–875. doi: 10.1302/0301-620X.93B7.25737.
  • 6. Springer B, Parvizi J, Austin M, Backe H, Della Valle C, Kolessar D, et al. Obesity and total joint arthroplasty: a literature based review. J Arthroplasty. 2013;28(5):714–721. doi: 10.1016/j.arth.2013.02.011.
  • 7. Springer BD, Roberts KM, Bossi KL, Odum SM, Voellinger DC. What are the implications of withholding total joint arthroplasty in the morbidly obese? Bone Joint J. 2019;101-B(7_Supple_C):28–32. doi: 10.1302/0301-620X.101B7.BJJ-2018-1465.R1.
  • 8. McLawhorn AS, Steinhaus ME, Southren DL, Lee YY, Dodwell ER, Figgie MP. Body mass index class is independently associated with health-related quality of life after primary total hip arthroplasty: an institutional registry-based study. J Arthroplasty. 2017;32(1):143–149. doi: 10.1016/j.arth.2016.06.043.
  • 9. Schatz C, Klein N, Marx A, Buschner P. Preoperative predictors of health-related quality of life changes (EQ-5D and EQ VAS) after total hip and knee replacement: a systematic review. BMC Musculoskelet Disord. 2022;23(1):58. doi: 10.1186/s12891-021-04981-4.
  • 10. Sach TH, Barton GR, Doherty M, Muir KR, Jenkinson C, Avery AJ. The relationship between body mass index and health-related quality of life: comparing the EQ-5D, EuroQol VAS and SF-6D. Int J Obes (Lond). 2007;31(1):189–196. doi: 10.1038/sj.ijo.0803365.
  • 11. Elcock KL, Carter TH, Yapp LZ, et al. Total knee arthroplasty in patients with severe obesity provides value for money despite increased complications. Bone Joint J. 2022;104-B(4):452–463. doi: 10.1302/0301-620X.104B4.BJJ-2021-0353.R3.
  • 12. Farrow L, Redmore J, Talukdar P, Clement N, Ashcroft GP. Prioritisation of patients awaiting hip and knee arthroplasty: lower pre-operative EQ-5D is associated with greater improvement in quality of life and joint function. Musculoskeletal Care. 2022;20(4):892–898. doi: 10.1002/msc.1645.
  • 13. EuroQol Group. EuroQol--a new facility for the measurement of health-related quality of life. Health Policy. 1990;16(3):199–208. doi: 10.1016/0168-8510(90)90421-9.
  • 14. Elm E von, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. Strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. BMJ. 2007;335(7624):806–808. doi: 10.1136/bmj.39335.541782.AD.
  • 15. NatCen Social Research, University College London. Health Survey for England, 2012. UK Data Service. 2012. https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7480 (date last accessed 3 December 2025).
  • 16. NatCen Social Research, University College London. Health Survey for England, 2011. UK Data Service. 2011. https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7260 (date last accessed 3 December 2025).
  • 17. NatCen Social Research, University College London. Health Survey for England, 2010. UK Data Service. 2010. https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=6986 (date last accessed 3 December 2025).
  • 18. Rabin R, de Charro F. EQ-5D: a measure of health status from the EuroQol Group. Ann Med. 2001;33(5):337–343. doi: 10.3109/07853890109002087.
  • 19. Devlin N, Parkin D, Janssen B. Methods for Analysing and Reporting EQ-5D Data. Cham (CH): Springer; 2020.
  • 20. Whitehead SJ, Ali S. Health outcomes in economic evaluation: the QALY and utilities. Br Med Bull. 2010;96:5–21. doi: 10.1093/bmb/ldq033.
  • 21. Feng Y, Parkin D, Devlin NJ. Assessing the performance of the EQ-VAS in the NHS PROMs programme. Qual Life Res. 2014;23(3):977–989. doi: 10.1007/s11136-013-0537-z.
  • 22. Little RJA, Rubin DB. Statistical Analysis with Missing Data. Wiley; 2019.
  • 23. van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate imputation by chained equations in R. J Stat Softw. 2011;45(3):67. doi: 10.18637/jss.v045.i03.
  • 24. Morris TP, White IR, Royston P. Tuning multiple imputation by predictive mean matching and local residual draws. BMC Med Res Methodol. 2014;14:75. doi: 10.1186/1471-2288-14-75.
  • 25. Graham JW, Olchowski AE, Gilreath TD. How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prev Sci. 2007;8(3):206–213. doi: 10.1007/s11121-007-0070-9.
  • 26. van Buuren S. Flexible Imputation of Missing Data. CRC Press; 2018.
  • 27. Ho D, Imai K, King G, Stuart EA. MatchIt: nonparametric preprocessing for parametric causal inference. J Stat Softw. 2011;42(8):28. doi: 10.18637/jss.v042.i08.
  • 28. Austin PC. Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies. Pharm Stat. 2011;10(2):150–161. doi: 10.1002/pst.433.
  • 29. Recnik G, Kralj-Iglič V, Iglič A, et al. The role of obesity, biomechanical constitution of the pelvis and contact joint stress in progression of hip osteoarthritis. Osteoarthritis Cartilage. 2009;17(7):879–882. doi: 10.1016/j.joca.2008.12.006.
  • 30. Mukka S, Rolfson O, Mohaddes M, Sayed-Noor A. The effect of body mass index class on patient-reported health-related quality of life before and after total hip arthroplasty for osteoarthritis: registry-based cohort study of 64,055 patients. JB JS Open Access. 2020;5(4):e20.00100. doi: 10.2106/JBJS.OA.20.00100.
  • 31. Salis Z, Sainsbury A, I Keen H, Gallego B, Jin X. Weight loss is associated with reduced risk of knee and hip replacement: a survival analysis using Osteoarthritis Initiative data. Int J Obes (Lond). 2022;46(4):874–884. doi: 10.1038/s41366-021-01046-3.
  • 32. Ramos MS, Hale ME, Rullán PJ, Kunze KN, Nair N, Piuzzi NS. Do overall weight, body mass index, or clinically meaningful weight changes occur after total joint arthroplasty? A meta-analysis of 60,837 patients. J Arthroplasty. 2025;40(4):1083–1096. doi: 10.1016/j.arth.2024.10.024.
  • 33. Rullán PJ, Oyem PC, Pumo TJ, et al. A longitudinal analysis of weight changes before and after total hip arthroplasty: weight trends, patterns, and predictors. Technol Health Care. 2024;32(5):3747–3760. doi: 10.3233/THC-231404.
  • 34. Preacher KJ, Hayes AF. Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behav Res Methods. 2008;40(3):879–891. doi: 10.3758/brm.40.3.879.
  • 35. Efron B. Better bootstrap confidence intervals. J Am Stat Assoc. 1987;82(397):171–185. doi: 10.1080/01621459.1987.10478410.
  • 36. MacKinnon DP, Fairchild AJ, Fritz MS. Mediation analysis. Annu Rev Psychol. 2007;58(1):593–614. doi: 10.1146/annurev.psych.58.110405.085542.
  • 37. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc Ser B. 1995;57(1):289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x.
  • 38. MacKinnon JG, White H. Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties. J Econom. 1985;29(3):305–325. doi: 10.1016/0304-4076(85)90158-7.
  • 39. Knief U, Forstmeier W. Violating the normality assumption may be the lesser of two evils. Behav Res Methods. 2021;53(6):2576–2590. doi: 10.3758/s13428-021-01587-5.
  • 40. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the e-value. Ann Intern Med. 2017;167(4):268–274. doi: 10.7326/M16-2607.
  • 41. Cashin AG, McAuley JH, Lee H. Advancing the reporting of mechanisms in implementation science: a guideline for reporting mediation analyses (AGReMA). Implement Res Pract. 2022;3:26334895221105568. doi: 10.1177/26334895221105568.
  • 42. Yapp LZ, Scott CEH, MacDonald DJ, Howie CR, Simpson A, Clement ND. Primary knee arthroplasty for osteoarthritis restores patients’ health-related quality of life to normal population levels. Bone Joint J. 2023;105-B(4):365–372. doi: 10.1302/0301-620X.105B4.BJJ-2022-0659.R1.
  • 43. Aalund PK, Glassou EN, Hansen TB. The impact of age and preoperative health-related quality of life on patient-reported improvements after total hip arthroplasty. Clin Interv Aging. 2017;12:1951–1956. doi: 10.2147/CIA.S149493.
  • 44. Reinhard J, Michalk K, Schiegl JS, et al. Impressive short-term improvement in functional outcome and quality of life after primary total hip arthroplasty (THA) in the orthogeriatric patient in a prospective monocentric trial. J Clin Med. 2024;13(9):2693. doi: 10.3390/jcm13092693.
  • 45. Kappenschneider T, Bammert P, Maderbacher G, et al. The impact of primary total hip and knee replacement on frailty: an observational prospective analysis. BMC Musculoskelet Disord. 2024;25(1):78. doi: 10.1186/s12891-024-07210-w.
  • 46. Harari D, Hopper A, Dhesi J, Babic-Illman G, Lockwood L, Martin F. Proactive care of older people undergoing surgery (‘POPS’): designing, embedding, evaluating and funding a comprehensive geriatric assessment service for older elective surgical patients. Age Ageing. 2007;36(2):190–196. doi: 10.1093/ageing/afl163.
  • 47. Steinbeck V, Bischof AY, Schöner L, et al. Gender health gap pre- and post-joint arthroplasty: identifying affected patient-reported health domains. Int J Equity Health. 2024;23(1):44. doi: 10.1186/s12939-024-02131-5.
  • 48. Hawker GA, Wright JG, Coyte PC, et al. Differences between men and women in the rate of use of hip and knee arthroplasty. N Engl J Med. 2000;342(14):1016–1022. doi: 10.1056/NEJM200004063421405.
  • 49. Samulowitz A, Gremyr I, Eriksson E, Hensing G. “Brave men” and “emotional women”: a theory-guided literature review on gender bias in health care and gendered norms towards patients with chronic pain. Pain Res Manag. 2018;2018:6358624. doi: 10.1155/2018/6358624.
  • 50. Borkhoff CM, Hawker GA, Wright JG. Patient gender affects the referral and recommendation for total joint arthroplasty. Clin Orthop Relat Res. 2011;469(7):1829–1837. doi: 10.1007/s11999-011-1879-x.
  • 51. Novicoff WM, Saleh KJ. Examining sex and gender disparities in total joint arthroplasty. Clin Orthop Relat Res. 2011;469(7):1824–1828. doi: 10.1007/s11999-010-1765-y.
  • 52. Karlson EW, Daltroy LH, Liang MH, Eaton HE, Katz JN. Gender differences in patient preferences may underlie differential utilization of elective surgery. Am J Med. 1997;102(6):524–530. doi: 10.1016/s0002-9343(97)00050-8.
  • 53. Lau LCM, Chan PK, Lui TWD, et al. Preoperative weight loss interventions before total hip and knee arthroplasty: a systematic review of randomized controlled trials. Arthroplasty. 2024;6(1):30. doi: 10.1186/s42836-024-00252-4.
  • 54. Kim BI, LaValva SM, Parks ML, Sculco PK, Della Valle AG, Lee GC. Glucagon-like peptide-1 receptor agonists decrease medical and surgical complications in morbidly obese patients undergoing primary TKA. J Bone Joint Surg Am. 2025;107-A(4):348–355. doi: 10.2106/JBJS.24.00468.
  • 55. Hernández Alava M, Pudney S, Wailoo A. Estimating EQ-5D by Age and Sex for the UK. NICE Decision Support Unit; 2022. https://sheffield.ac.uk/nice-dsu/methods-development/estimating-eq-5d


Articles from Bone & Joint Open are provided here courtesy of British Editorial Society of Bone and Joint Surgery
