Author manuscript; available in PMC: 2018 Dec 1.
Published in final edited form as: J Occup Rehabil. 2017 Dec;27(4):576–583. doi: 10.1007/s10926-016-9687-5

Validity and Reliability of the 8-Item Work Limitations Questionnaire

Timothy J Walker 1,2,3, Jessica M Tullar 1,2,3, Pamela M Diamond 1,2,3, Harold W Kohl III 1,2, Benjamin C Amick III 1,2
PMCID: PMC5484749  NIHMSID: NIHMS839265  PMID: 28025750

Abstract

Purpose

To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system.

Methods

A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009–2015 tested research aims. Confirmatory factor analysis (CFA) (n=10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type).

Results

A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68–0.70) and the test-retest reliability was very good (ICC=0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables, while weak associations were observed with the demographic variables.

Conclusions

The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.

Keywords: Presenteeism, Work, Work Performance, Employee Health, Productivity

INTRODUCTION

Poor employee health is a major economic burden to employers [1]. In addition to the direct costs of poor employee health, there are substantial indirect costs related to lost productivity due to work limitations. Health related lost productivity results from days absent and lost productivity while at work. Studies suggest costs due to lost productivity at work have a sizable economic impact and exceed costs due to days absent from work [2,3]. Given the financial implications, multiple tools have been developed to measure the on-the-job impact of poor health among employees [4,5].

One commonly used tool, the Work Limitations Questionnaire (WLQ), was designed to measure the on-the-job impact of chronic conditions and their treatment [6]. The original 25-item version captures four dimensions: time demands, physical demands, mental-interpersonal demands, and output demands. The 25-item WLQ has demonstrated high reliability and validity and has been used with many different samples [7]. An 8-item version has also been developed; however, no published studies to date have tested the validity or reliability of this short-form measure.

The 8-item WLQ was developed from the original 25-item version based on the eight most predictive questions of economic outcomes related to lost productivity [8]. Given the short length, the 8-item WLQ is commonly used in non-research settings such as health assessment (HA) tools [9]. However, multiple research studies have used the 8-item WLQ (often collected from HA data) to evaluate lost productivity at work [10–12]. Many of these studies cite the strong measurement properties from the original 25-item version despite the fact they are two different measures. Given the wide use of the 8-item WLQ, there is a need to evaluate its validity and reliability. This study aims to evaluate the factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item WLQ.

METHODS

Design

A retrospective cohort design using secondary, de-identified HA data was used to evaluate the psychometric properties of the 8-item WLQ. Data were collected over a 6-year period, with baseline data used to test scale reliability and convergent, discriminant, and factorial validity. HA data were provided with a study identification number and stripped of all other identifying information prior to being shared with the authors. Approval for data access and analyses was provided by the UT Health Science Center at Houston ethics review board.

Participants

Employees of a large university system who completed an annual HA during the respective years of study were eligible to be included in analyses. The HA was administered annually (from 2009–2015) to all benefits-eligible employees who were at least 18 years of age. Employees could take the HA at any time during each respective plan year. For analyses testing the factorial validity, scale reliability, convergent validity, and discriminant validity, the sample consisted of employees who completed the 2009 HA. This sample was used because it was adequately powered (n=10,165) and the 2009 HA contained questions most suitable for testing convergent and discriminant validity. The test-retest reliability sample was derived from employees who took the HA twice within a 45-day period and reported no notable changes in health (further defined in the measures section). Employees were excluded from the test-retest sample if their HAs were taken ≤ 7 days apart. A 45-day period was chosen to include more respondents in the test-retest sample while accounting for those who reported health changes.

Measures

WLQ

The 8-item WLQ uses two questions from each of four dimensions related to on-the-job work demands: time management, physical demands, mental-interpersonal demands, and output demands. The recall period is 2 weeks, with response categories capturing the percentage of time an employee has difficulty meeting the respective work demand. Response options include “all of the time (100%),” “most of the time,” “some of the time (about 50%),” “a slight bit of the time,” and “none of the time (0%).” There is also an option, “does not apply to my job,” allowing workers to answer items faithfully; these responses were treated as missing values. The two physical demands questions use a reversed scale and thus were reverse scored. To calculate the percent of time an employee was unable to meet job demands, responses were converted to percentages and averaged together to create a score ranging from 0–100. Thus, an index score of 0 represented an employee limited none of the time, whereas 100 represented an employee limited all of the time.
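The scoring rule described above can be sketched as follows. This is a minimal illustration, not the instrument's official scoring code: the text gives percentages only for three response options, so the 75 and 25 values for "most of the time" and "a slight bit of the time" are assumptions, as are the function name `wlq8_index`, the choice to average over whichever items were answered, and the assumption that the physical demands items are items 3 and 4.

```python
from typing import Optional, Sequence

# Percent of time limited for each response option.
# Only 100, 50 and 0 are stated in the text; 75 and 25 are assumed midpoints.
PERCENT_LIMITED = {
    "all of the time": 100.0,
    "most of the time": 75.0,          # assumption
    "some of the time": 50.0,
    "a slight bit of the time": 25.0,  # assumption
    "none of the time": 0.0,
}
NOT_APPLICABLE = "does not apply to my job"
PHYSICAL_ITEMS = (2, 3)  # zero-based positions of items 3 and 4 (reverse scored)


def wlq8_index(responses: Sequence[Optional[str]]) -> Optional[float]:
    """Return a 0-100 index score, or None when no item can be scored."""
    if len(responses) != 8:
        raise ValueError("expected exactly 8 item responses")
    scores = []
    for position, answer in enumerate(responses):
        if answer is None or answer == NOT_APPLICABLE:
            continue  # treated as a missing value
        percent = PERCENT_LIMITED[answer]
        if position in PHYSICAL_ITEMS:
            percent = 100.0 - percent  # reversed response scale
        scores.append(percent)
    return sum(scores) / len(scores) if scores else None
```

Under these assumptions, a respondent who answers "none of the time" on the six regular items and "all of the time" on the two reversed physical demands items receives an index score of 0.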

Additional Variables

Additional questions from the HA were used to describe the samples, analyze the convergent and discriminant validity, and define the test-retest sample. Key variables included gender, age, institution type (medical vs. academic), health-related work performance, number of chronic conditions, and general health. Health-related work performance was captured by the following question: “During the last month, what percentage of your work performance was affected by an underlying health condition, including allergies, headaches, back pain, depression, arthritis, or any other health condition?” The number of chronic conditions was determined by participants indicating if “a doctor ever diagnosed you with any of the following…allergies, arthritis, asthma, cancer, chronic back pain, chronic neck pain, chronic obstructive pulmonary disease (COPD), depression, diabetes, gastroesophageal reflux disease (GERD), heart disease, migraines, osteoporosis, hypertension, and stroke.” An ordinal variable was created to represent number of chronic conditions (0, 1–2, or ≥3). General health was determined by responses to the statement: “In general, my overall health is…” Response options included excellent, very good, good, fair, and poor.

Discriminant validity was tested using the gender and institution type variables while convergent validity was tested using the health-related work performance, number of chronic conditions, and general health variables. To determine the test-retest sample, changes in health were evaluated between assessments. Health changes were determined by changes in responses to the chronic conditions question or the general health question between HAs.
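The test-retest inclusion rules (two HAs within 45 days, more than 7 days apart, and no reported health change) can be expressed as a small filter. The sketch below is illustrative only: the `Assessment` record, its field names, and the function `eligible_for_retest` are hypothetical, the 45-day limit is treated as inclusive by assumption, and a "change" in chronic conditions is taken to mean any difference in the set of reported diagnoses.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assessment:
    """One completed HA (hypothetical record layout)."""
    taken_on: date
    chronic_conditions: frozenset  # diagnoses the respondent reported
    general_health: str            # e.g. "good"


def eligible_for_retest(first: Assessment, second: Assessment,
                        min_gap_days: int = 7, max_gap_days: int = 45) -> bool:
    """Apply the gap and stable-health rules to a pair of assessments."""
    gap = abs((second.taken_on - first.taken_on).days)
    if gap <= min_gap_days or gap > max_gap_days:
        return False  # taken too close together or too far apart
    same_conditions = first.chronic_conditions == second.chronic_conditions
    same_health = first.general_health == second.general_health
    return same_conditions and same_health
```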

Statistical Procedures

Descriptive statistics were used to describe the demographics and health of the sample. For the WLQ items, means, standard deviations, and frequencies of missing data (both not applicable and missing responses) were evaluated. More specifically, the WLQ items were checked to ensure missing data did not exceed 15% of the sample for each respective item.

Factorial Validity

Confirmatory factor analysis was used to examine the latent structure of the WLQ. Since the 8-item WLQ is commonly used as a scaled score, a 1-factor model was initially tested. Full information maximum likelihood parameter estimation with robust standard errors was used as the estimator to account for missing data and the non-normality of the WLQ items. The collective performance of the following indicators was used to assess model fit: overall chi-square (non-significant value=good fit), comparative fit index (CFI, >0.90=adequate fit and >0.95=good fit), Tucker-Lewis index (TLI, >0.90=adequate fit and >0.95=good fit), root mean square error of approximation (RMSEA, 0.05–0.08=adequate fit, <0.05=good fit), and standardized root mean square residual (SRMR, <0.08=adequate fit and <0.05=good fit) [13]. In addition, the magnitude of factor loadings was evaluated for meaningfulness and statistical significance (p<0.05). CFA models were tested using Mplus 7.31, Los Angeles, CA [14].
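The fit cutoffs listed above can be collected into a small helper for reading model output. This is only a sketch of the stated thresholds: the function name `classify_fit` is hypothetical, and treating values outside the "adequate" band as "poor" is an assumption, since the text does not label that range.

```python
def classify_fit(cfi: float, tli: float, rmsea: float, srmr: float) -> dict:
    """Label each fit index 'good', 'adequate', or 'poor' using the cutoffs in the text."""

    def incremental(value: float) -> str:
        # CFI and TLI: higher values indicate better fit.
        if value > 0.95:
            return "good"
        return "adequate" if value > 0.90 else "poor"

    def rmsea_label(value: float) -> str:
        if value < 0.05:
            return "good"
        return "adequate" if value <= 0.08 else "poor"

    def srmr_label(value: float) -> str:
        if value < 0.05:
            return "good"
        return "adequate" if value < 0.08 else "poor"

    return {
        "CFI": incremental(cfi),
        "TLI": incremental(tli),
        "RMSEA": rmsea_label(rmsea),
        "SRMR": srmr_label(srmr),
    }
```

For example, the abstract's reported values (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, SRMR = 0.01) are labeled "good" on every index.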

Model adjustments were considered if models demonstrated poor fit. Modification indices were evaluated to determine if there were unspecified parameter estimates that could greatly improve model fit. Only theoretically meaningful modifications that greatly reduced the model chi square were added to improve fit. Measurement invariance was tested to evaluate if the model differed between two different respondent groups.

Scale Reliability

Scale reliability was tested using a CFA-based approach rather than Cronbach’s alpha [15,16]. Cronbach’s alpha can underestimate or overestimate scale reliability depending on whether the scale contains correlated measurement errors [17,18]. Scale reliability is represented by the proportion of true-score variance to the total observed variance in a measure [17]. Thus, using the CFA based approach, point estimation of scale reliability was captured using estimates of the true score and error variance (calculated using factor loadings, error variances, and error covariances from the final model). A reliability score ≥0.65 was considered acceptable [19].
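One common form of the CFA-based point estimate divides the squared sum of the factor loadings by that same quantity plus the summed error variances and twice the summed error covariances, with the factor variance fixed to 1. The sketch below assumes this form; the function name and all numeric inputs are hypothetical and are not estimates from the final model.

```python
from typing import Sequence


def cfa_scale_reliability(loadings: Sequence[float],
                          error_variances: Sequence[float],
                          error_covariances: Sequence[float] = ()) -> float:
    """True-score variance over total variance for a 1-factor model.

    Assumes the factor variance is fixed to 1, so the true-score variance
    of the summed scale is the squared sum of the loadings.
    """
    true_variance = sum(loadings) ** 2
    error_variance = sum(error_variances) + 2.0 * sum(error_covariances)
    return true_variance / (true_variance + error_variance)


# Hypothetical three-item illustration (not values from this study).
without_covariance = cfa_scale_reliability([0.7, 0.7, 0.7], [0.51, 0.51, 0.51])
with_covariance = cfa_scale_reliability([0.7, 0.7, 0.7], [0.51, 0.51, 0.51], [0.2])
```

In this illustration a positive error covariance adds to the error term, so the estimate is lower than when the covariance is ignored, which is the situation in which Cronbach's alpha would tend to overestimate reliability.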

Test-Retest Reliability

Test-retest reliability was determined using the intraclass correlation coefficient (ICC). A two-way, random effects model was used as it was determined to be the most appropriate [20]. The ICC value is considered excellent when >0.75, good when between 0.60 and 0.74, moderate when between 0.40 and 0.59, and poor when below 0.40.
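A two-way random-effects ICC for absolute agreement can be computed from the mean squares of a subjects-by-occasions table, following the single-measure form given by McGraw and Wong [20]. The sketch below assumes the single-measure version, which the text does not specify; the function name and the toy data are hypothetical.

```python
from typing import Sequence


def icc_agreement_single(scores: Sequence[Sequence[float]]) -> float:
    """ICC(A,1): two-way random effects, absolute agreement, single measure.

    `scores` holds one row per respondent and one column per occasion.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy data: identical retest scores versus a constant one-point shift.
identical = icc_agreement_single([[1, 1], [2, 2], [3, 3]])
shifted = icc_agreement_single([[1, 2], [2, 3], [3, 4]])
```

Because this is an agreement coefficient, the constant shift in the second toy data set lowers the ICC even though the rank order of respondents is unchanged.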

Convergent/Discriminant Validity

Due to highly skewed distributions, Spearman’s correlation coefficient was used to test convergent validity between the health-related work performance variable and ordinalized WLQ score. We hypothesized a positive correlation ≥0.60 between the variables. One-way analysis of variance (ANOVA) was used to test group differences in WLQ mean scores between people with 0, 1–2, or 3+ conditions, and between people who responded to being in excellent/very good, good, or fair/poor health. Post hoc pairwise comparisons were made using Tukey HSD tests. We hypothesized those in worse health (represented by greater chronic conditions and/or worse general health) would have significantly greater scores on the WLQ. Independent samples t-tests evaluated differences between genders and institution type, respectively. We hypothesized there would be no significant differences between gender or institution type groupings on WLQ scores. Test-retest reliability as well as convergent and discriminant validity tests were conducted using Stata 13.0, College Station, TX [21].

RESULTS

Participants

The 2009 HA respondents consisted of 10,165 employees. Seventy two percent were female and the mean age was 42.1 years. A majority of the respondents were white (53.9%) and employed at a medical institution (67.6%). About 64% had at least one chronic condition and just over half (53%) reported being in excellent/very good health. The test-retest sample consisted of 42 employees (there were no data missing for the test-retest sample). There were 88 employees who completed two HAs within a 45 day period. However, 46 were excluded due to: 1) reporting a change in general health or number of chronic conditions (n=42); or 2) completing two HAs ≤ 7 days apart (n=4). Descriptive statistics for study participants are presented in Table 1.

Table 1.

Descriptive Statistics for Demographic and Health-Related Variables

| Variable | 2009 HA Respondents (n=10,165), Mean ± SD or % | Test-Retest Sample (n=42), Mean ± SD or % |
| --- | --- | --- |
| Demographic Variables | | |
| Sex: Female | 71.8% | 73.8% |
| Age | 42.2 ± 11.3 | 45.6 ± 12.3 |
| Institution Type: Medical | 67.6% | 64.3% |
| Health-Related Variables | | |
| Number of Chronic Conditions: None | 36.4% | 26.2% |
| Number of Chronic Conditions: 1–2 | 44.3% | 35.7% |
| Number of Chronic Conditions: 3 or more | 19.3% | 38.1% |
| General Health*: Excellent/Very Good | 53.5% | 64.3% |
| General Health*: Good | 33.6% | 33.3% |
| General Health*: Fair/Poor | 12.9% | 2.4% |
| Health Related Work Performance | 7.9 ± 11.3 | n/a |

*General Health variable was completed for 10,150 respondents; HA, Health Assessment; SD, Standard Deviation

Factorial Validity

Item correlations, means, standard deviations, and percent missing data are presented in Table 2. All missing data were due to not applicable responses and no items were missing more than 4.4% of responses. Results from fitting the 1-factor model with no correlated residuals suggested poor model fit (χ2 = 8128.48, df = 20, p<0.001, CFI = 0.59, TLI = 0.43, RMSEA = 0.20, and SRMR = 0.13). However, modification indices revealed that the residual correlations left unmodeled between items 1&2, 3&4, and 7&8, respectively, were greatly contributing to model misfit. Each respective item pairing captures a different dimension of work limitations (time, physical, and output demands). Thus, the respective sets of indicator variables have shared components, providing theoretical support for fitting a new model allowing residuals of three item pairs to correlate [22]. Furthermore, the physical demands items (3&4) were reverse scaled on the questionnaire, which represents a method effect that may contribute to the covariation of the items beyond the common factor [17]. Results from the new 1-factor model with three correlated residuals suggested excellent fit (χ2 = 168.77, df = 17, p<0.001, CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01) (Figure 1). Although all factor loadings were statistically significant, the factor loadings for items 3&4 were low (0.22 and 0.21, respectively) relative to the other items.
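The reported degrees of freedom and RMSEA values can be roughly checked by hand. The sketch below uses the textbook RMSEA formula with N − 1 in the denominator and counts parameters for the covariance structure only. Both are assumptions, because software using robust estimators applies scaling corrections and some programs use N instead of N − 1. The function names are hypothetical.

```python
import math


def one_factor_df(n_items: int, n_correlated_residuals: int = 0) -> int:
    """Degrees of freedom for a 1-factor CFA of the covariance matrix.

    Free parameters: one loading and one residual variance per item
    (factor variance fixed to 1), plus any residual covariances.
    """
    unique_moments = n_items * (n_items + 1) // 2
    free_parameters = 2 * n_items + n_correlated_residuals
    return unique_moments - free_parameters


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Textbook RMSEA point estimate; robust estimators may differ slightly."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


initial_model = rmsea(8128.48, one_factor_df(8), 10165)
final_model = rmsea(168.77, one_factor_df(8, 3), 10165)
```

Rounded to two decimals, these come to 0.20 and 0.03, which agrees with the RMSEA values reported for the initial and final models.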

Table 2.

Correlations, Descriptive Statistics, and Missing Data for WLQ Items in the 2009 HA Respondents

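| | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Item 6 | Item 7 | Item 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Item 1 | 1 | | | | | | | |
| Item 2 | 0.73 | 1 | | | | | | |
| Item 3 | 0.17 | 0.16 | 1 | | | | | |
| Item 4 | 0.16 | 0.16 | 0.81 | 1 | | | | |
| Item 5 | 0.57 | 0.56 | 0.18 | 0.18 | 1 | | | |
| Item 6 | 0.44 | 0.44 | 0.15 | 0.14 | 0.61 | 1 | | |
| Item 7 | 0.50 | 0.51 | 0.17 | 0.16 | 0.63 | 0.60 | 1 | |
| Item 8 | 0.45 | 0.50 | 0.16 | 0.15 | 0.59 | 0.56 | 0.84 | 1 |
| Mean | 0.69 | 0.48 | 0.58 | 0.58 | 0.51 | 0.26 | 0.29 | 0.27 |
| SD | 0.97 | 0.85 | 1.06 | 1.05 | 0.76 | 0.64 | 0.65 | 0.64 |
| % Missing | 4.4% | 4.6% | 1.5% | 4.2% | 0.7% | 1.4% | 0.4% | 0.5% |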

WLQ, Work Limitations Questionnaire; SD, Standard Deviation

Fig 1. Final, 1-Factor Model with Three Correlated Residuals and Standardized Estimates from the 2009 HA Respondents

The 1-factor model with three correlated residuals also demonstrated excellent fit with the 2015 HA respondents (χ2 = 202.56, df = 17, p<0.001, CFI = 0.99, TLI = 0.98, RMSEA = 0.03, and SRMR = 0.02). Invariance testing between the 2009 and 2015 respondents, which constrained factor loadings and intercepts to equality between groups, resulted in minimal reductions in model fit based on the minor changes in CFI values (ΔCFI <0.01) (Table 3) [23]. Therefore, results further support the factorial validity of the 1-factor model with correlated residuals.

Table 3.

Measurement Invariance Model Fit Results Comparing 2009 and 2015 Respondents

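| Model Type | χ2 | df | RMSEA | SRMR | CFI | TLI |
| --- | --- | --- | --- | --- | --- | --- |
| Equal Forms | 374.01 | 34 | 0.031 | 0.015 | 0.990 | 0.984 |
| Equal Factor Loadings | 403.24 | 41 | 0.029 | 0.022 | 0.989 | 0.986 |
| Equal Indicator Intercepts | 677.38 | 48 | 0.035 | 0.030 | 0.982 | 0.979 |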

χ2, chi-square; df, degrees of freedom; RMSEA, Root Mean Square Error of Approximation; SRMR, Standardized root mean square residual; CFI, Comparative Fit Index; TLI, Tucker-Lewis Index. Measurement invariance was based on the change in CFI values. A decrease in CFI of no more than 0.01 (ΔCFI ≥ −0.01) indicates the null hypothesis of invariance should not be rejected [23].
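The ΔCFI criterion can be applied directly to the CFI column of Table 3. The sketch below compares each constrained model with the one before it, which is an assumption about how the comparison was sequenced; the names `TABLE3_CFI`, `cfi_changes`, and `invariance_retained` are hypothetical.

```python
# CFI values copied from Table 3, in order of increasing constraint.
TABLE3_CFI = [
    ("Equal Forms", 0.990),
    ("Equal Factor Loadings", 0.989),
    ("Equal Indicator Intercepts", 0.982),
]


def cfi_changes(models):
    """Return (model name, change in CFI from the previous model) pairs."""
    return [
        (later[0], round(later[1] - earlier[1], 3))
        for earlier, later in zip(models, models[1:])
    ]


def invariance_retained(delta_cfi: float, max_drop: float = 0.01) -> bool:
    """Invariance is not rejected when CFI drops by no more than max_drop."""
    return -delta_cfi <= max_drop


changes = cfi_changes(TABLE3_CFI)
decisions = {name: invariance_retained(delta) for name, delta in changes}
```

With the Table 3 values, the CFI drops by 0.001 when loadings are constrained and by a further 0.007 when intercepts are constrained, so both steps stay within the 0.01 limit.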

Scale and Test-Retest Reliability

The point estimate of scale reliability for the 8-item WLQ was 0.69 with a 95% confidence interval from 0.68–0.70. The point estimate is considered in the acceptable range (≥0.65). Test-retest results demonstrated a high degree of reliability between measurements. The ICC agreement was 0.78 with a 95% confidence interval from 0.62 to 0.87 and F(41)=7.81, p<0.001. The mean time between measurements was 30.7 (±10.3) days.

Convergent and Discriminant Validity

Spearman correlation results demonstrated a significant, yet moderate correlation between WLQ scores and the health-related work performance variable (Spearman’s r=0.51, p<0.001). ANOVA results revealed significant group differences between the number of chronic conditions and general health groups, respectively (Table 4). Furthermore, Tukey HSD results revealed significant differences between all respective groups for both variables. The effect sizes, measured by eta-squared, revealed a small effect for the chronic conditions variable and a medium-to-large effect for the general health variable [24]. Overall, results suggest WLQ scores were significantly higher for respondents who reported more chronic conditions or worse general health.

Table 4.

Convergent and Discriminant Validity Results for Group Comparisons

| Variable | Mean | SD | F or t (df) | η2 |
| --- | --- | --- | --- | --- |
| Convergent Validity Variables | | | | |
| Number of Chronic Conditions | | | 156.1 (2, 10,162)* | 0.03 |
| None | 9.5 | 12.9 | | |
| 1–2 | 10.8 | 13.5 | | |
| 3 or more | 16.2 | 16.2 | | |
| General Health | | | 687.1 (2, 10,147)* | 0.12 |
| Excellent/Very Good | 7.5 | 10.8 | | |
| Good | 13.4 | 14.2 | | |
| Fair/Poor | 21.9 | 18.5 | | |
| Discriminant Validity Variables | | | | |
| Sex | | | −4.8 (10,163)* | 0.002 |
| Female | 10.3 | 13.8 | | |
| Male | 11.8 | 14.2 | | |
| Institution Type | | | −3.5 (10,163)* | 0.001 |
| Medical | 11.7 | 13.5 | | |
| Non-Medical | 10.7 | 14.3 | | |

SD, Standard Deviation; F, F-statistic; t, t-statistic; df, degrees of freedom; η2, eta-squared

Results from independent samples t-tests revealed significant differences between males versus females and respondents who worked at a medical versus a non-medical institution. However, eta-squared effect sizes for both group comparisons were very small, 0.002 and 0.001, respectively. Therefore, the significant results are likely due in part to large sample sizes and not meaningful group differences.
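The eta-squared values in Table 4 can be recovered from the reported test statistics and degrees of freedom, which makes the gap between statistical significance and effect size concrete. The sketch below uses the standard conversions for a one-way ANOVA and an independent samples t-test; the function names are hypothetical.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Eta-squared for a one-way ANOVA from its F statistic."""
    explained = f_value * df_between
    return explained / (explained + df_within)


def eta_squared_from_t(t_value: float, df: int) -> float:
    """Eta-squared for an independent samples t-test from its t statistic."""
    return t_value ** 2 / (t_value ** 2 + df)


# Test statistics and degrees of freedom as reported in Table 4.
chronic_conditions = eta_squared_from_f(156.1, 2, 10162)
general_health = eta_squared_from_f(687.1, 2, 10147)
sex = eta_squared_from_t(-4.8, 10163)
institution_type = eta_squared_from_t(-3.5, 10163)
```

After rounding, these reproduce the tabled values of 0.03, 0.12, 0.002, and 0.001. With more than 10,000 degrees of freedom, a t statistic near 4 or 5 in absolute value is clearly significant yet explains only a fraction of a percent of the variance.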

DISCUSSION

Several key findings about the 8-item WLQ were apparent in this study. First, a 1-factor model with three sets of correlated residuals demonstrated excellent model fit with multiple data sets. This finding suggests the measure is capturing one construct and supports the continued use of an index score. Second, the reliability tests were in acceptable ranges. More specifically, the point estimate of scale reliability was sufficient (0.69) and the test-retest results were very good (0.78). Third, all three convergent validity tests demonstrated significant associations suggesting sufficient construct validity. The associations between the chronic conditions and health related work performance variables with the WLQ were weaker than expected but still in the low-to-moderate range, while the general health variable demonstrated a moderate-to-high association as expected. Lastly, the discriminant validity tests revealed weak associations between gender and institution type with the WLQ. Even though associations were statistically significant, the effect sizes were very small further supporting construct validity.

The 8-item WLQ is primarily used as an index score representing health-related work limitations. Therefore, standardized factor loadings for each item were expected to be >0.40. This expectation was confirmed with two exceptions. Loadings for items 3&4 (capturing physical demands) were 0.22 and 0.21, respectively. Other studies evaluating the 25-item WLQ have reported lower than expected correlations between the physical demands items and items capturing other work demands [25,26]. Similar to the 25-item WLQ, the 8-item version uses reversed instructions and answer options for physical demands items. Therefore, lower than expected factor loadings could have resulted from respondents not recognizing the different instructions. However, the overall model fit was excellent, supporting a 1-factor model.

The reliability tests demonstrated sufficient scale reliability and very good test-retest reliability. The 8-item WLQ scale reliability was lower than values reported from the 25-item version [6,26,27]. This finding is likely due to using a CFA-based approach rather than Cronbach’s alpha, which can misestimate results in addition to having other limitations [17,18]. Since Cronbach’s alpha is dependent on the number of items, the 25-item WLQ would have a higher alpha even with a lower mean inter-item correlation value. Even though the 8-item WLQ scale reliability is lower than reported values from the 25-item version, it is still in the adequate range.
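The dependence of alpha on scale length can be illustrated with the standardized form of Cronbach's alpha, which is a function of only the number of items and the mean inter-item correlation. The correlations used below (0.30 and 0.25) are hypothetical values chosen for illustration, not estimates from either version of the WLQ.

```python
def standardized_alpha(n_items: int, mean_inter_item_r: float) -> float:
    """Standardized Cronbach's alpha from item count and mean inter-item correlation."""
    return n_items * mean_inter_item_r / (1 + (n_items - 1) * mean_inter_item_r)


# Hypothetical comparison: a shorter scale with more strongly related items
# versus a longer scale with more weakly related items.
short_scale = standardized_alpha(8, 0.30)
long_scale = standardized_alpha(25, 0.25)
```

In this illustration the 25-item scale reaches an alpha of about 0.89 against about 0.77 for the 8-item scale, even though its items are less strongly correlated on average.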

The test-retest results further support the 8-item WLQ’s sufficient reliability. Results were lower (0.78 vs. 0.93) than a study evaluating the test-retest reliability of the 25-item WLQ index score among employees with chronic conditions [26]. In contrast, our results were higher than results from a different study testing the 25-item WLQ index score reliability among a sample of cancer survivors (0.74) [27]. The unique samples make direct comparisons difficult; however, the 8-item WLQ appears to have good test-retest reliability and be within range of past studies testing the 25-item version.

The 8-item WLQ demonstrated sufficient construct validity based on the convergent and discriminant validity test results. The associations between the 8-item WLQ and the health-related work performance and chronic condition variables were lower than expected. However, the health-related work performance variable was based on a 1-item measure, used a different time frame (2 vs. 4 weeks), and emphasized specific conditions. These differences likely contributed to finding a moderate rather than strong association. The weaker than expected relationship between the 8-item WLQ and number of chronic conditions was likely because the question emphasized existing physician-diagnosed conditions. Therefore, the question did not take into account undiagnosed conditions or existing conditions that are well controlled (and thus have little-to-no symptoms). Both of these scenarios would weaken the expected relationship between the variables.

Results from tests involving the general health question, gender, and institution type provided the strongest support for construct validity. As expected, the 8-item WLQ had a moderate-to-strong relationship with general health. Furthermore, associations between the 8-item WLQ and the gender and institution type variables revealed effect sizes that were very small both in absolute magnitude and relative to the convergent validity test results.

Limitations and Strengths

Study limitations primarily impacted test-retest reliability and convergent validity results. The test-retest sample was derived from respondents who took two HAs within 45 days. There were varying lengths of time between assessments that could influence the consistency between time points. An attempt was made to ensure the test-retest sample had stable health between time points by excluding respondents who reported health changes in general health or number of chronic conditions. However, neither health-related question has published reliability or validity information, meaning we could have excluded respondents from the test-retest sample who were in stable health. Lack of reliability and validity from study questions could have also weakened associations evaluated in convergent validity tests. Ideally, measures with strong psychometric properties such as the SF-36 or the 25-item WLQ would have been used to evaluate construct validity of the 8-item WLQ [6,28]. However, these questions were not available in our dataset. This study was also conducted in a relatively healthy sample of employees from a university system. Therefore, results may not be generalizable to more specific employee subgroups or other work settings.

This is the first study to test the reliability and validity of the 8-item WLQ despite its use in previous research studies. The main strength of this study was our ability to test multiple forms of validity and reliability. We had rare access to longitudinal data from a public university system. This access allowed us to derive a test-retest sample to evaluate measurement consistency over time. In addition, multiple years of data allowed us to cross-validate the final CFA model with a different respondent group, further supporting the appropriateness and generalizability of the 1-factor model. Lastly, we used a CFA-based approach to evaluate scale reliability, which is not subject to the same limitations as Cronbach’s alpha. Therefore, we obtained a more accurate estimate of the measure’s precision.

Conclusion

Based on study results, the 8-item WLQ demonstrated sufficient reliability and validity among a sample of employees from a public university system. Furthermore, results support the continued use of an index score based on all eight items. Past studies have used the 8-item WLQ to assess associations between health-related variables and work limitations [11,12,29–32], determine cost estimates associated with poor health [10], and evaluate intervention effects [33]. Our findings provide additional support for results from these studies by providing evidence the 8-item WLQ is a valid and reliable measure. The 8-item WLQ is a usable alternative to the 25-item version. However, the 25-item WLQ is still the preferred measure in a research setting given its more comprehensive nature and strong psychometric properties that have been demonstrated among multiple populations [7]. Our work provides support for using the 8-item WLQ when more comprehensive measures like the 25-item version are not feasible. Future work should further test the reliability and validity of the 8-item WLQ in different employee samples and work settings.

Acknowledgments

This work was supported by The University of Texas System Office of Employee Benefits; Postdoctoral Fellowship, University of Texas Health Science Center at Houston School of Public Health Cancer Education and Career Development Program – National Cancer Institute/NIH Grant R25 CA57712; and partial support from the Center for Health Promotion and Prevention Research.

Footnotes

Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.

References

  • 1. Rezaee ME, Pollock M. Prevalence and Associated Cost and Utilization of Multiple Chronic Conditions in the Outpatient Setting among Adult Members of an Employer-based Health Plan. Popul Health Manag. 2015;18:421–428. doi: 10.1089/pop.2014.0124.
  • 2. Goetzel RZ, Long SR, Ozminkowski RJ, Hawkins K, Wang S, Lynch W. Health, absence, disability, and presenteeism cost estimates of certain physical and mental health conditions affecting U.S. employers. J Occup Environ Med. 2004;46:398–412. doi: 10.1097/01.jom.0000121151.40413.bd.
  • 3. Loeppke R, Taitel M, Haufle V, Parry T, Kessler RC, Jinnett K. Health and productivity as a business strategy: a multiemployer study. J Occup Environ Med. 2009;51:411–28. doi: 10.1097/JOM.0b013e3181a39180.
  • 4. Mattke S, Balakrishnan A, Bergamo G, Newberry SJ. A review of methods to measure health-related productivity loss. Am J Manag Care. 2007;13:211–7.
  • 5. Lofland JH, Pizzi L, Frick KD. A review of health-related workplace productivity loss instruments. Pharmacoeconomics. 2004;22:165–84. doi: 10.2165/00019053-200422030-00003.
  • 6. Lerner D, Amick B, III, Rogers W, Malspeis S, Bungay K, Cynn D. The Work Limitations Questionnaire. Med Care. 2001;39:72–85. doi: 10.1097/00005650-200101000-00009.
  • 7. Amick B, III, Gimeno D. Measuring work outcomes with a focus on health-related work productivity loss. In: Wittink H, Carr D, editors. Pain management: Evidence, Outcomes and Quality of Life: a sourcebook. New York: Elsevier; 2008. pp. 329–43.
  • 8. Lerner D, Amick BC, Lee JC, Rooney T, Rogers WH, Chang H, et al. Relationship of employee-reported work limitations to work productivity. Med Care. 2003;41:649–59. doi: 10.1097/01.MLR.0000062551.76504.A9.
  • 9. Grossmeier J. Program Measurement and Evaluation Guide: Core Metrics for Employee Health Management. Chapter 7: Productivity and Performance. 2015.
  • 10. Mitchell RJ, Bates P. Measuring health-related productivity loss. Popul Health Manag. 2011;14:93–8. doi: 10.1089/pop.2010.0014.
  • 11. Walsh JA, McFadden ML, Morgan MD, Sawitzke AD, Duffin KC, Krueger GG, et al. Work productivity loss and fatigue in psoriatic arthritis. J Rheumatol. 2014;41:1670–4. doi: 10.3899/jrheum.140259.
  • 12. Burton WN, Chen C-Y, Li X, Schultz AB, Abrahamsson H. The Association of Self-Reported Employee Physical Activity With Metabolic Syndrome, Health Care Costs, Absenteeism, and Presenteeism. J Occup Environ Med. 2014;56:919–26. doi: 10.1097/JOM.0000000000000257.
  • 13. Byrne BM. Structural equation modeling with Mplus: basic concepts, applications, and programming. New York: Routledge; 2012.
  • 14. Muthen L, Muthen B. Mplus user’s guide. 7. Los Angeles: 2014.
  • 15. Raykov T. Estimation of congeneric scale reliability using covariance structure analysis with nonlinear constraints. Br J Math Stat Psychol. 2001;54:315–23. doi: 10.1348/000711001159582.
  • 16. Raykov T. Behavioral scale reliability and measurement invariance evaluation using latent variable modeling. Behav Ther. 2004:299–331.
  • 17. Brown TA. Confirmatory Factor Analysis For Applied Research. 2. New York: Guilford Press; 2015.
  • 18. Streiner DL, Norman GR. Health Measurement Scales: a practical guide to their development and use. 4. New York: Oxford University Press; 2008.
  • 19. Brown TA, White KS, Forsyth JP, Barlow DH. The structure of perceived emotional control: Psychometric properties of a revised anxiety control questionnaire. Behav Ther. 2004;35:75–99.
  • 20. McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychol Methods. 1996;1:30–46.
  • 21. StataCorp. Stata Statistical Software: Release 13. 2013.
  • 22. Landis R, Edwards B, Cortina J. Correlated residuals among items in the estimation of measurement models. In: Lance CE, Vandenberg RJ, editors. Statistical methodological myths and urban legends: Doctrine, verity, and fable in the organizational and social sciences. New York: Routledge; 2009. pp. 195–214.
  • 23. Cheung GW, Rensvold RB. Evaluating Goodness-of-Fit Indexes for Testing Measurement Invariance. Struct Equ Model A Multidiscip J. 2002;9:233–55.
  • 24. Warner R. Applied Statistics: From Bivariate Through Multivariate Techniques. Thousand Oaks, California: Sage Publications, Inc; 2008.
  • 25. Tang K, Beaton DE, Amick BC, Hogg-Johnson S, Côté P, Loisel P. Confirmatory factor analysis of the work limitations questionnaire (wlq-25) in workers’ compensation claimants with chronic upper-limb disorders. J Occup Rehabil. 2013;23:228–38. doi: 10.1007/s10926-012-9397-6.
  • 26. Verhoef JAC, Miedema HS, Bramsen I, Roebroeck ME. Using the Work Limitations Questionnaire in Patients With A Chronic Condition in the Netherlands. J Occup Environ Med. 2012;54:1293–9. doi: 10.1097/JOM.0b013e31825cb68d.
  • 27. Tamminga SJ, Verbeek JHAM, Frings-Dresen MHW, De Boer AGEM. Measurement properties of the Work Limitations Questionnaire were sufficient among cancer survivors. Qual Life Res. 2014;23:515–25. doi: 10.1007/s11136-013-0484-8.
  • 28. Ware JE, Snow KK, Kosinski M, Gandek B. SF-36 Health Survey Manual and Interpretation Guide. Bost New Engl Med Cent. 1993:1.
  • 29. Burton WN, Chen C-Y, Conti DJ, Schultz AB, Edington DW. The association between health risk change and presenteeism change. J Occup Environ Med. 2006;48:252–63. doi: 10.1097/01.jom.0000201563.18108.af.
  • 30. Burton WN, Chen C-Y, Conti DJ, Schultz AB, Pransky G, Edington DW. The association of health risks with on-the-job productivity. J Occup Environ Med. 2005;47:769–77. doi: 10.1097/01.jom.0000169088.03301.e4.
  • 31. Alker HJ, Wang ML, Pbert L, Thorsen N, Lemon SC. Impact of School Staff Health on Work Productivity in Secondary Schools in Massachusetts. J Sch Health. 2015;85:398–404. doi: 10.1111/josh.12266.
  • 32. Cash SW, Beresford SAA, Henderson JA, McTiernan A, Xiao L, Wang CY, et al. Dietary and physical activity behaviours related to obesity-specific quality of life and work productivity: baseline results from a worksite trial. Br J Nutr. 2012:1134–42. doi: 10.1017/S0007114511006258.
  • 33. O’Connell SE, Jackson BR, Edwardson CL, Yates T, Biddle SJH, Davies MJ, et al. Providing NHS staff with height-adjustable workstations and behaviour change strategies to reduce workplace sitting time: protocol for the Stand More AT (SMArT) Work cluster randomised controlled trial. BMC Public Health. 2015;15:1219. doi: 10.1186/s12889-015-2532-5.
